Protein structures are key to understanding biomolecular mechanisms and diseases, yet their interpretation is hampered by limited knowledge of their biologically relevant quaternary structure (QS). Between 30% and 50% of proteins adopt a stable QS in the form of a dimer, trimer, tetramer, or larger homo-oligomer, as illustrated below:
Main symmetry classes found in homo-oligomers
To gain knowledge of this basic level of protein assembly, we have been developing approaches to predict (QSbio; Dey et al., Nature Methods, 2018), browse (3DComplex), and curate (PiQSi) QS information.
We now leverage this wealth of structural data and combine it with omics data to discover new principles of evolution, assembly, and regulation of proteins, complexes, and networks.
Quaternary structure conservation across species points to biologically relevant crystal contacts. Top: Protein X-ray diffraction requires the formation of a crystal. At the molecular level, a crystal is a lattice whose repeating unit is the unit cell. Here, the unit cell contains eight copies of the protein (PDB code 1EX2) in contact with one another. Identifying which of these contacts are biologically relevant is challenging. Bottom: Tyvelose epimerase is a tetrameric enzyme in Salmonella typhi (PDB code 1ORR). A similar tetramer is found in Arabidopsis thaliana (PDB code 1I2B, r.m.s. deviation = 3.55 Å), although the sequences of these two tetramers share only 22% identity. Such conservation suggests that both tetramers are biologically relevant. This information in turn enables the correction of entries that have an identical sequence but a different QS (e.g., PDB code 1I24).
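The conservation argument above can be sketched in a few lines of Python. This is only an illustrative toy, not the QSbio method: the entry names, family labels, species labels, and subunit counts are hypothetical placeholders, and the rule used here (a QS counts as conserved when it is observed in at least two species of the same homolog family) is an assumed simplification. It ignores geometry entirely, whereas a real comparison would also superpose the assemblies.

```python
from collections import defaultdict


def flag_qs(entries):
    """Label each entry by whether its QS is conserved across species.

    entries: list of dicts with keys 'id', 'family', 'species', 'n_subunits'.
    Returns a dict mapping entry id to 'supported', 'conflict', or 'unverified'.
    All field names are hypothetical and chosen for this sketch only.
    """
    by_family = defaultdict(list)
    for entry in entries:
        by_family[entry["family"]].append(entry)

    labels = {}
    for members in by_family.values():
        # Collect the set of species in which each oligomeric state is seen.
        species_per_state = defaultdict(set)
        for m in members:
            species_per_state[m["n_subunits"]].add(m["species"])

        # A state is "conserved" if at least two species share it.
        conserved = [s for s, sp in species_per_state.items() if len(sp) >= 2]

        for m in members:
            if len(conserved) != 1:
                # No conserved state, or several competing ones: no verdict.
                labels[m["id"]] = "unverified"
            elif m["n_subunits"] == conserved[0]:
                labels[m["id"]] = "supported"
            else:
                labels[m["id"]] = "conflict"
    return labels


# Hypothetical data: two homologous tetramers from different species,
# one entry from the same family annotated with a different QS,
# and one entry with no homolog to compare against.
entries = [
    {"id": "entry_A", "family": "F1", "species": "species_1", "n_subunits": 4},
    {"id": "entry_B", "family": "F1", "species": "species_2", "n_subunits": 4},
    {"id": "entry_C", "family": "F1", "species": "species_2", "n_subunits": 2},
    {"id": "entry_D", "family": "F2", "species": "species_3", "n_subunits": 2},
]

labels = flag_qs(entries)
for entry_id, label in sorted(labels.items()):
    print(entry_id, label)
```

In this toy data set, the two tetramers support each other, the entry with a differing QS is flagged as a candidate for correction, and the entry without homologs stays unverified.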