Groupe de Bioinformatique Moleculaire, Centre de Bioinformatique, INSERM U155, Universite Paris 7, tour 53, 1er etage, 2, place Jussieu, 75251 Paris, France
Side-chain interactions play an essential role in the formation and stability of compact protein structures. On the assumption that a large part of these interactions is deterministic, energy potentials for the atomic interactions of all amino acid residue pairs, expressed as a function of the distance between the atoms involved, have been derived [1, 2] and are widely used in structure modelling methods.
The aim of this study is to build a simplified energy potential that takes into account only the statistically significant side-chain interactions. The underlying hypothesis is that statistically over-represented side-chain interactions strongly influence folding, whereas the other observed interactions are merely a consequence of folding and are therefore largely random.
A previous statistical study [3] extracted the preferential associations between side chains, first independently of their location in the 3D structure, and then taking into account the solvent accessibility of the two side chains concerned and their location in secondary structures. From this analysis, a potential of mean force was defined that accounts for side-chain interactions, residue burial and location in secondary structure.
To define this simplified potential, we assume that the interaction frequency fij between two residues i and j follows a multiplicative model: fij = Aij Fi Fj, where Fi and Fj are the marginal interaction frequencies of the residues and Aij denotes a coefficient that we call the "amplification coefficient". When a given side-chain interaction occurs at random, this coefficient is equal to 1. The potential of mean force is then defined by: Eij = -kT ln(Aij) + E0, where E0 denotes a reference value. We simplify this potential of mean force by keeping only the strongest interactions, i.e. by eliminating those that are not statistically significant. The approach consists in optimizing a statistical criterion, either the Akaike Information Criterion (AIC) or the Bayesian Information Criterion (BIC), to reduce the number of parameters (Fi and Aij) involved in the model. Certain amplification coefficients are thereby fixed to 1, which removes the associated potential values.
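As an illustration only, the multiplicative model and the pruning step can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the procedure used in the study: the function names, the Poisson deviance used to score each pair, and the one-pass pair-by-pair comparison against a fixed penalty (2 for AIC, ln N for BIC) are all hypothetical choices, and the marginal frequencies Fi are not re-estimated after coefficients are fixed to 1. The count matrix is assumed to be square, with every residue type involved in at least one contact.

```python
import math


def amplification(counts):
    """Return (A, F, total) for a square matrix of contact counts.

    counts[i][j] is the number of observed contacts between residue
    types i and j (ordered pairs; every row must have a non-zero sum).
    """
    n = len(counts)
    total = sum(sum(row) for row in counts)
    # Marginal interaction frequency F_i of each residue type.
    F = [sum(counts[i]) / total for i in range(n)]
    # A_ij = f_ij / (F_i * F_j); equal to 1 when pairing is random.
    A = [[(counts[i][j] / total) / (F[i] * F[j]) for j in range(n)]
         for i in range(n)]
    return A, F, total


def simplified_potential(counts, criterion="BIC", kT=1.0, E0=0.0):
    """Return E_ij = -kT ln(A_ij) + E0 for the pairs that are retained.

    Assumption (not from the source): a pair is retained when its
    Poisson deviance against the random model exceeds the cost of one
    extra parameter, i.e. 2 for AIC or ln(total) for BIC.
    """
    A, F, total = amplification(counts)
    n = len(counts)
    penalty = 2.0 if criterion == "AIC" else math.log(total)
    # Pairs whose coefficient is fixed to 1 keep the reference value E0.
    E = [[E0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            observed = counts[i][j]
            expected = total * F[i] * F[j]
            if observed == 0:
                continue  # ln(0) is undefined; the pair is left at E0
            gain = 2.0 * (observed * math.log(observed / expected)
                          - (observed - expected))
            if gain > penalty:
                E[i][j] = -kT * math.log(A[i][j]) + E0
    return E


# Toy example with two residue types that prefer self-association.
potential = simplified_potential([[50, 10], [10, 50]], criterion="BIC")
```

In this toy example the over-represented pairs receive a potential below E0 and the under-represented pairs a potential above it, while a count matrix compatible with random pairing yields E0 everywhere.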
Consequently, this simplified potential of mean force makes it possible to identify the relevant interactions. The notion of protein cohesion, defined by this reduced set of side-chain interactions, will be addressed in a further study.
[1] Avbelj, F., (1992). Use of a potential of mean force to analyse free
energy contributions in protein folding. Bioch., 31, 6290-6297.
[2] Sippl, M.J., (1990). Calculation of conformational ensembles from
potentials of mean force. J. Mol. Biol., 213, 859-888.
[3] Mucchielli-Giorgi, M.H., Tuffery, P. and Hazout, S., (1996). Statistical
assessment of the major contributions driving side-chain interactions.
Submitted.