(1) Department of Physics, Indian Institute of Science, Bangalore 560 012,
Karnataka, India.
(2) Jawaharlal Nehru Centre for Advanced Scientific Research, Jakkur P.O.,
Bangalore 560 064, Karnataka, India.
We describe an algorithm that generates plausible three-dimensional structures for a short peptide sequence or protein fragment without aligning the sequence against related sequences and without any prior assignment of residues to a particular secondary structure. The algorithm is essentially geometric. We prepare a dictionary of doublet (amino acid pair) conformations, taking the phi, psi and omega angles from a set of protein structures available in the Protein Data Bank. The conformations are compiled as a set of 400 (20 x 20) tables, each containing the observed conformations of one particular doublet. We view the structure of an amino acid sequence as an amalgamation of these doublet conformations, specifically as a fusion of the physical structures of the doublets that are contiguous in the amino acid chain. For example, to generate structures for an amino acid triplet A-B-C, we first split it into its doublet components A-B and B-C. The algorithm then treats the structure of A-B-C as arising from a process in which the geometries of A-B and B-C are compared at the common residue B and then merged. The binding rule recognizes that B must have the same or a similar conformation (phi, psi values) in both A-B and B-C if the two doublets are to be physically merged. The same procedure is used whenever a new residue is added. Thus the structure of a tetrapeptide A-B-C-D is obtained by amalgamating the geometries of the three doublet units A-B, B-C and C-D. The operation is sequential: the union of A-B with B-C is followed by the union of the result with C-D. The procedure stops either when all the doublets have been processed in this way or when the doublet to be attached fails to provide phi, psi values close to those of the previous residue. Using this simple rule, the algorithm calculates an ensemble of structural alternatives for any given primary sequence segment.
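The merging rule can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `angle_diff`, `compatible` and `build_structures`, the 30-degree tolerance, the choice to keep the shared residue's angles from the earlier doublet, the omission of omega, and the toy dictionary values are all assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

Angles = Tuple[float, float]        # (phi, psi) of one residue, in degrees
Doublet = Tuple[Angles, Angles]     # angles of the first and second residue
Dictionary = Dict[Tuple[str, str], List[Doublet]]


def angle_diff(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def compatible(x: Angles, y: Angles, tol: float) -> bool:
    """True if both phi and psi agree within tol (tol is an assumed cutoff)."""
    return angle_diff(x[0], y[0]) <= tol and angle_diff(x[1], y[1]) <= tol


def build_structures(sequence: str, dictionary: Dictionary,
                     tol: float = 30.0) -> List[List[Angles]]:
    """Grow chains of (phi, psi) values by merging contiguous doublets."""
    if len(sequence) < 2:
        return []
    # Start from every observed conformation of the first doublet.
    chains = [list(d) for d in dictionary.get((sequence[0], sequence[1]), [])]
    for i in range(1, len(sequence) - 1):
        candidates = dictionary.get((sequence[i], sequence[i + 1]), [])
        extended = []
        for chain in chains:
            for shared, new_residue in candidates:
                # Merge only if the shared residue looks alike in both doublets.
                # Assumption: keep the shared residue's angles from the chain.
                if compatible(chain[-1], shared, tol):
                    extended.append(chain + [new_residue])
        if not extended:
            break  # no doublet fits: stop and return the partial chains
        chains = extended
    return chains


# Hypothetical toy dictionary for the triplet A-B-C (values are invented).
toy_dictionary: Dictionary = {
    ("A", "B"): [((-60.0, -45.0), (-60.0, -40.0)),
                 ((-120.0, 130.0), (-120.0, 135.0))],
    ("B", "C"): [((-65.0, -42.0), (-57.0, -47.0)),
                 ((60.0, 45.0), (55.0, 40.0))],
}
structures = build_structures("ABC", toy_dictionary)
```

In this toy case only the first A-B conformation has a B residue that matches a B-C entry, so a single three-residue chain survives. A real dictionary would hold many entries per doublet, and the tolerance would control how large the resulting ensemble becomes.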
Case studies show that for short peptides and protein fragments the algorithm yields a small number of three-dimensional structures, one of which is usually close to the corresponding crystal structure. We will present the application of the algorithm to Melittin (2mlt, 26 residues), Avian Pancreatic polypeptide (ppt, 36 residues) and Glutaredoxin (1aba, 87 residues).
Given the flexibility of the peptide backbone with respect to bond rotations, the number of possible conformations for an amino acid sequence can run into the tens of thousands. It is therefore revealing that the method restricts the plausible structural alternatives to a small number, and we hope that it will find application in the protein folding problem.