Unite de Bioinformatique, INRA, 78352 Jouy-en-Josas, France
This poster presents a simple and robust secondary structure prediction
scheme, Simpa96 (Levin, 1996), based on an updated version of the nearest
neighbour method (Levin and Garnier, 1988).
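
As an illustration of the general nearest neighbour idea only, and not of the Simpa96 implementation, the sketch below slides a window along a query sequence, compares it with every window of a small database of sequences of known structure, and assigns to the central residue the most frequent state among the best-scoring windows. The toy database, the identity score (a real application would use a substitution matrix), the window width, the number of neighbours and the coil default at the sequence ends are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical toy database of (sequence, secondary structure) pairs.
# H = helix, E = strand, C = coil. These are made-up examples, not real data.
DATABASE = [
    ("AAAAKAAAAK", "HHHHHHHHHH"),
    ("VTVTVTVTVT", "EEEEEEEEEE"),
    ("GPGSGPGSGP", "CCCCCCCCCC"),
]


def residue_score(a, b):
    """Toy similarity (assumption): 1 for identical residues, 0 otherwise."""
    return 1 if a == b else 0


def window_score(w1, w2):
    """Sum of residue similarities over two windows of equal length."""
    return sum(residue_score(a, b) for a, b in zip(w1, w2))


def predict(query, database=DATABASE, width=5, k=3):
    """Predict one state per residue by voting among the k nearest windows."""
    half = width // 2
    # Every database window, paired with the state of its central residue.
    windows = []
    for seq, ss in database:
        for i in range(half, len(seq) - half):
            windows.append((seq[i - half:i + half + 1], ss[i]))

    prediction = []
    for i in range(len(query)):
        if i < half or i >= len(query) - half:
            # No full window at the ends: default to coil (assumption).
            prediction.append("C")
            continue
        qwin = query[i - half:i + half + 1]
        ranked = sorted(windows, key=lambda w: window_score(qwin, w[0]),
                        reverse=True)
        votes = Counter(state for _, state in ranked[:k])
        prediction.append(votes.most_common(1)[0][0])
    return "".join(prediction)


print(predict("AAAAKAAAA"))
```

Because each prediction depends only on residues inside the window, a scheme of this kind uses purely local sequence information.
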
Using a larger database, the BLOSUM62 substitution matrix and a
regularization algorithm, the three-state prediction accuracy is increased
by 4.7 percentage points to 67.7% for a single sequence, and up to 72.8%
when using multiple alignments. The increase in prediction accuracy with
respect to the previous version can be almost entirely ascribed to the
sevenfold increase in the size of the database. A more detailed analysis
of the results shows that badly predicted regions of a protein sequence are
randomly distributed throughout the database and that the goal of perfect
secondary structure prediction by methods which use only local sequence
information is illusory.
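
For reference, three-state accuracy is usually taken to be the percentage of residues whose predicted state (helix, strand or coil) matches the observed one. The abstract does not define its measure explicitly, so the sketch below assumes this usual per-residue definition; the function name `q3` and the example strings are hypothetical. The last lines only restate the arithmetic implied by the figures quoted above.

```python
def q3(predicted, observed):
    """Percentage of residues whose predicted state equals the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Made-up example: 7 of 8 residues agree.
example = q3("HHHEECCC", "HHHEEECC")

# Arithmetic implied by the quoted figures (percent and percentage points).
previous_single = round(67.7 - 4.7, 1)   # accuracy of the previous version
alignment_gain = round(72.8 - 67.7, 1)   # gain from multiple alignments

print(example, previous_single, alignment_gain)
```
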
References
Levin J.M. 1996 Submitted.
Levin J.M. and Garnier J. 1988 Biochim. Biophys. Acta 955: 283-295.