Preliminary II Examination, Aug 04, 2008, 10:00AM – 12:00PM, Wachman 447
Identify biological interfaces from crystal structures of homologous proteins
Qifang Xu
Committee:
Dr. Zoran Obradovic (Advisor)
Dr. Roland L. Dunbrack, Jr.
Dr. Slobodan Vucetic
Dr. Longin Jan Latecki
Many proteins function as homooligomers and are regulated via their oligomeric state. For some proteins, the stoichiometry of homooligomeric states under various conditions has been studied using gel filtration or analytical ultracentrifugation experiments. The interfaces involved in these assemblies may be identified using crosslinking and mass spectrometry, solution-state NMR, and other experiments. For most proteins, however, the actual interfaces involved in oligomerization are inferred from X-ray crystallographic structures using assumptions about interface surface areas and physical properties. The PDB, PQS, and PISA provide biological units and hence the interfaces. Our study showed that the inconsistency within and between these databases is significant. Most of the biological units are inferred from individual entries. Examination of interfaces across different PDB entries in a protein family reveals several important features. First, similarity of space group, asymmetric unit size, and cell dimensions and angles (within 1%) does not guarantee that two crystals are actually the same crystal form, that is, that they contain similar relative orientations and interactions within the crystal. Conversely, two crystals in different space groups may be quite similar in terms of all of the interfaces within each crystal. Second, NMR structures and an existing benchmark of PDB crystallographic entries, consisting of 126 dimers and larger structures and 132 monomers, were used to determine whether the presence or absence of common interfaces across multiple crystal forms can be used to predict whether a protein is an oligomer. Monomeric proteins tend to have common interfaces across only a minority of crystal forms, while higher-order structures exhibit common interfaces across a majority of available crystal forms. The data can be used to estimate the probability that an interface is biological if two or more crystal forms are available.
Third, evolutionary information was used to evaluate interfaces observed in more than one crystal form. An interface shared in two different crystal forms by divergent proteins is very likely to be biologically important, while some interfaces are restricted to one branch of a family, indicating the evolution of an interface in one branch of the family and/or its loss in another. Finally, the PISA database available from the EBI is more consistent in identifying interfaces observed in many crystal forms than is the PDB or the EBI’s Protein Quaternary Structure (PQS) server. The PDB in particular is missing highly likely biological interfaces in its biological unit files for about 10% of PDB entries.
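The idea of counting how many crystal forms share an interface can be illustrated with a minimal sketch. It assumes that interfaces have already been compared and grouped upstream, so that each crystal form is described by a set of interface cluster labels. The function names, the example labels, and the 0.5 threshold are hypothetical choices made for illustration and are not taken from the study.

```python
from collections import defaultdict


def interface_form_fractions(forms):
    """Return the fraction of crystal forms in which each interface occurs.

    `forms` maps a crystal-form ID to the set of interface cluster labels
    observed in that form (the clustering itself is assumed to be done
    elsewhere).
    """
    counts = defaultdict(int)
    for labels in forms.values():
        for label in labels:
            counts[label] += 1
    total = len(forms)
    return {label: count / total for label, count in counts.items()}


def majority_interfaces(forms, threshold=0.5):
    """List interfaces seen in more than `threshold` of the crystal forms.

    The threshold is an illustrative assumption, not a published cutoff.
    """
    fractions = interface_form_fractions(forms)
    return sorted(label for label, frac in fractions.items() if frac > threshold)


# Hypothetical family with four crystal forms and three interface clusters.
example_forms = {
    "CF1": {"A", "B"},
    "CF2": {"A"},
    "CF3": {"A", "C"},
    "CF4": {"B"},
}

fractions = interface_form_fractions(example_forms)
common = majority_interfaces(example_forms)
```

In this toy input, interface "A" appears in three of four crystal forms and is the only one above the threshold, which mirrors the observation that oligomeric proteins tend to share an interface across a majority of crystal forms.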