Identification of direct residue contacts in protein-protein interaction by message passing

Martin Weigt et al. Proc Natl Acad Sci U S A. 2009.

Abstract

Understanding the molecular determinants of specificity in protein-protein interaction is an outstanding challenge of postgenome biology. The availability of large protein databases generated from sequences of hundreds of bacterial genomes enables various statistical approaches to this problem. In this context covariance-based methods have been used to identify correlation between amino acid positions in interacting proteins. However, these methods have an important shortcoming, in that they cannot distinguish between directly and indirectly correlated residues. We developed a method that combines covariance analysis with global inference analysis, adopted from use in statistical physics. Applied to a set of >2,500 representatives of the bacterial two-component signal transduction system, the combination of covariance with global inference successfully and robustly identified residue pairs that are proximal in space without resorting to ad hoc tuning parameters, both for heterointeractions between sensor kinase (SK) and response regulator (RR) proteins and for homointeractions between RR proteins. The spectacular success of this approach illustrates the effectiveness of the global inference approach in identifying direct interaction based on sequence information alone. We expect this method to be applicable soon to interaction surfaces between proteins present in only 1 copy per genome as the number of sequenced genomes continues to expand. Use of this method could significantly increase the potential targets for therapeutic intervention, shed light on the mechanism of protein-protein interaction, and establish the foundation for the accurate prediction of interacting protein partners.
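To make the covariance step concrete, the following is a minimal sketch of the total mutual information (MI) between two alignment columns, computed from plain empirical frequencies. It is an illustration only, not the authors' implementation: the function name `mutual_information` and the toy columns are hypothetical, no pseudocounts or sequence reweighting are applied, and the global inference step that yields direct information (DI) is not reproduced here.

```python
import math
from collections import Counter


def mutual_information(col_i, col_j):
    """Return the mutual information (in nats) between two alignment columns.

    col_i and col_j hold one residue per sequence, in the same sequence order.
    Frequencies are raw counts; this is an assumption made for illustration.
    """
    if len(col_i) != len(col_j):
        raise ValueError("columns must have the same number of sequences")
    if len(col_i) == 0:
        raise ValueError("columns must not be empty")

    n = len(col_i)
    count_i = Counter(col_i)                # single-site counts, position i
    count_j = Counter(col_j)                # single-site counts, position j
    count_ij = Counter(zip(col_i, col_j))   # joint counts for the residue pair

    mi = 0.0
    for (a, b), c in count_ij.items():
        p_ab = c / n
        # p_ab / (p_a * p_b), written with counts to avoid extra divisions
        ratio = c * n / (count_i[a] * count_j[b])
        mi += p_ab * math.log(ratio)
    return mi


# Toy columns (hypothetical data): the first pair covaries perfectly,
# the second pair is statistically independent.
covarying = mutual_information("AAKK", "EERR")
independent = mutual_information("AAKK", "ERER")
```

In this toy case the perfectly covarying pair gives MI = ln 2 and the independent pair gives MI = 0. A high MI alone does not show that two residues are in contact, which is the shortcoming the abstract attributes to covariance-based methods.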

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.

The combined covariance/message-passing approach detects 2 groups of correlated pairs. (A) Scatter plot of direct mutual information (DI) versus total mutual information (MI) reveals 2 classes of covarying residue pairs, those with strong direct correlation found in the upper red quadrant (group I) and those with low direct correlation found in the lower green quadrant (group II). A group of pairings just around the border of the MI and/or the DI cutoff is highlighted in blue. (B) Direct and indirect interaction pairs depicted on exemplary structures of the HisKA and RR domain. All residues that appear in the network of pairings with MI > MI(t) were mapped onto the structures of HK853 from T. maritima (HisKA domain) and Spo0F from B. subtilis (RR domain). Those pairings showing strong direct correlation are depicted in red and connected by a red line and those that show low direct correlation are depicted in green and connected by a green line. Green lines connecting red residues represent low direct correlation for that particular residue pairing. For orientation, the N and C termini and relevant structural elements are labeled. The phosphotransfer sites H260 in HK853 and D54 in Spo0F are shown in yellow.

Fig. 2.

Direct information is inversely correlated with residue distance of pairs in the Spo0B/Spo0F cocrystal structure. (A) The minimal atom distance, in ångströms, was determined for all 408 pairings that could be mapped to the Spo0B/Spo0F cocrystal structure and plotted against either direct information DI (red symbols) or total mutual information MI (blue symbols). (B) Specificity vs. rank percentile for predicting contact pairs via DI (red curve) and MI (blue curve). Specificity is defined as the fraction of pairings at the given rank percentile that are within 6 Å in the Spo0B/Spo0F cocrystal structure.
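The specificity measure in panel B can be sketched as follows. This is a hedged illustration under stated assumptions, not the authors' code: the function name `specificity_at_percentile` and the example scores and distances are hypothetical, "pairings at the given rank percentile" is read here as all pairings ranked at or above that percentile, and a pairing at exactly the cutoff distance is counted as a contact.

```python
import math


def specificity_at_percentile(scores, distances, percentile, cutoff=6.0):
    """Fraction of top-ranked pairings whose minimal atom distance is <= cutoff.

    scores:     one score (for example DI or MI) per residue pairing
    distances:  minimal atom distance in angstroms for the same pairings
    percentile: top share of the ranking to evaluate, in (0, 100]
    """
    if len(scores) != len(distances):
        raise ValueError("scores and distances must have the same length")
    if not scores:
        raise ValueError("at least one pairing is required")
    if not 0 < percentile <= 100:
        raise ValueError("percentile must be in (0, 100]")

    # Rank pairings from highest to lowest score.
    ranked = sorted(zip(scores, distances), key=lambda pair: pair[0], reverse=True)
    # Number of pairings inside the requested top percentile (at least one).
    k = max(1, math.ceil(len(ranked) * percentile / 100))
    contacts = sum(1 for _, distance in ranked[:k] if distance <= cutoff)
    return contacts / k


# Hypothetical scores and distances for four pairings.
toy_scores = [0.9, 0.5, 0.4, 0.1]
toy_distances = [3.2, 5.5, 12.0, 20.0]
top_half = specificity_at_percentile(toy_scores, toy_distances, 50)
all_pairs = specificity_at_percentile(toy_scores, toy_distances, 100)
```

With these toy values, both pairings in the top half lie within 6 Å, so `top_half` is 1.0, while `all_pairs` is 0.5. Running the same function once with DI scores and once with MI scores over a range of percentiles would give two curves of the kind described in the caption.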

Fig. 3.

Direct interaction between the identified dimer contact pairs. Four dimer contact pairings (red entries in Table S1) are localized to the α4- and α5-helices and are shown as dashed lines on the exemplary OmpR class RR structures of ArcA, PhoP, and MicA. Whereas contact pairing 89:109 (ArcA numbering; subtract 2 for PhoP and MicA numbering) happens to represent a salt bridge in all 3 structural examples shown here, the pairing 94:115 and the cluster involving pairings 86:106 and 86:108 demonstrate the type of residue variation that is the basis of the covariance method (see text). A detailed analysis of the covariance among the residues involved in these 4 pairings is given in Fig. S8.
