Research Institute of Molecular Biology, State Research Center of Virology and Biotechnology "Vector", Koltsovo, Novosibirsk region, 633159 Russia
A new method is described for finding functionally or structurally important amino acid changes that influence activity or other properties (specificity, stability, LD50, pK, Kd, etc.) in a set of evolutionarily related (or mutant) proteins. The input data are the aligned sequences, the values of the protein activities or properties, and the tertiary structure of one of the homologs (for the analysis of spatial sites). The method searches for sites whose amino acid differences correlate with changes in protein activity. A new Structure-Activity Determination Coefficient (SADC) is proposed for finding activity-modulating sites. In the first step, the protein set is automatically divided into groups by amino acid similarity at the site in question (linear, discrete or spatial). SADC is then defined as the square of the correlation coefficient between the given partition of proteins into groups and their activities. Quantitatively, SADC reflects the maximal attainable value of the multiple correlation coefficient between protein activities and various physico-chemical properties of the amino acids at a given site. To reveal which physico-chemical properties of a site are important for protein activity, several amino acid alphabets are used, constructed on the basis of amino acid similarity in charge, hydrophobicity, volume, etc. The results of sequence and structure analysis are presented as SADC profiles. The method is applied to the analysis of disintegrins, alpha-interferons, luciferases, Kunitz proteinase inhibitors and M2 proteins from influenza A virus. Three types of sites were analyzed: linear (continuous along the sequence), discrete and spatial. For the analysis of spatial sites, the tertiary structure of one of the homologs was used. Positions and sites were identified in these proteins whose amino acid changes correlated with changes in the protein's activities (or in virus phenotype, in the case of the M2 protein).
The results of the analysis show that these sites are located within or close to functionally important protein regions. In some cases, changes in activity or phenotype were found to be highly correlated with simple physico-chemical properties of the amino acids at the given sites. The method may be used in protein structure-function studies and in designing protein mutants with directionally altered activity, which is important for medicine and biotechnology.
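
The SADC definition above (the squared correlation between a partition of proteins into groups and their activities) can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the original work: the squared correlation is computed as the correlation ratio (between-group sum of squares divided by total sum of squares), proteins are grouped by exact identity of their residues after mapping through a reduced alphabet, and the charge alphabet shown (including the assignment of histidine to the positive class) is purely illustrative. The names `CHARGE_ALPHABET`, `site_groups`, `sadc` and `sadc_profile` are hypothetical.

```python
from collections import defaultdict

# Hypothetical reduced alphabet by charge (illustrative; not from the source).
CHARGE_ALPHABET = {
    **{aa: "pos" for aa in "KRH"},
    **{aa: "neg" for aa in "DE"},
    **{aa: "neutral" for aa in "ACFGILMNPQSTVWY"},
}

def site_groups(sequences, positions, alphabet=None):
    """Label each aligned sequence by its residues at the given columns.

    `positions` may be contiguous (linear site) or arbitrary (discrete site).
    Residues absent from `alphabet` (e.g. gaps) are kept unchanged.
    """
    labels = []
    for seq in sequences:
        residues = [seq[p] for p in positions]
        if alphabet is not None:
            residues = [alphabet.get(r, r) for r in residues]
        labels.append(tuple(residues))
    return labels

def sadc(labels, activities):
    """Assumed SADC: between-group sum of squares over total sum of squares."""
    n = len(activities)
    mean = sum(activities) / n
    total_ss = sum((a - mean) ** 2 for a in activities)
    if total_ss == 0:
        return 0.0  # activities do not vary, nothing to explain
    groups = defaultdict(list)
    for label, activity in zip(labels, activities):
        groups[label].append(activity)
    between_ss = sum(
        len(vals) * (sum(vals) / len(vals) - mean) ** 2 for vals in groups.values()
    )
    return between_ss / total_ss

def sadc_profile(sequences, activities, alphabet=None, window=1):
    """SADC for every linear site of `window` columns along the alignment."""
    length = len(sequences[0])
    return [
        sadc(site_groups(sequences, range(i, i + window), alphabet), activities)
        for i in range(length - window + 1)
    ]

# Toy alignment and activities (invented for illustration only).
sequences = ["AKL", "ARL", "ADV", "AEV"]
activities = [10.0, 12.0, 2.0, 4.0]
profile = sadc_profile(sequences, activities, alphabet=CHARGE_ALPHABET)
# Column 1 separates positive from negative residues: SADC = 64/68, about 0.94.
# Columns 0 and 2 are uniform under the charge alphabet: SADC = 0.
```

In this toy case only the charged column yields a high value under the charge alphabet, while the third column scores high only when residues are compared by identity, which mirrors the idea of using several alphabets to see which property of a site tracks activity. Spatial sites would require selecting `positions` from a tertiary structure, which is not covered by this sketch.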