Identification of direct residue contacts in protein-protein interaction by message passing
Martin Weigt et al. Proc Natl Acad Sci U S A. 2009.
Abstract
Understanding the molecular determinants of specificity in protein-protein interaction is an outstanding challenge of postgenome biology. The availability of large protein databases generated from sequences of hundreds of bacterial genomes enables various statistical approaches to this problem. In this context covariance-based methods have been used to identify correlation between amino acid positions in interacting proteins. However, these methods have an important shortcoming, in that they cannot distinguish between directly and indirectly correlated residues. We developed a method that combines covariance analysis with global inference analysis, adopted from use in statistical physics. Applied to a set of >2,500 representatives of the bacterial two-component signal transduction system, the combination of covariance with global inference successfully and robustly identified residue pairs that are proximal in space without resorting to ad hoc tuning parameters, both for heterointeractions between sensor kinase (SK) and response regulator (RR) proteins and for homointeractions between RR proteins. The spectacular success of this approach illustrates the effectiveness of the global inference approach in identifying direct interaction based on sequence information alone. We expect this method to be applicable soon to interaction surfaces between proteins present in only 1 copy per genome as the number of sequenced genomes continues to expand. Use of this method could significantly increase the potential targets for therapeutic intervention, shed light on the mechanism of protein-protein interaction, and establish the foundation for the accurate prediction of interacting protein partners.
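The covariance step described in the abstract can be illustrated with a minimal sketch of total mutual information (MI) between two alignment columns. This is an assumption-laden toy, not the authors' implementation: the function name `mutual_information` and the toy columns are hypothetical, frequencies are raw counts with no sequence weighting or pseudocount regularization, and the global inference (message-passing) step that yields direct information (DI) is not shown.

```python
from collections import Counter
from math import log


def mutual_information(col_i, col_j):
    """Return MI (in nats) between two alignment columns.

    Each column is a string with one residue per sequence; position k in
    both strings must come from the same (paired) sequence.
    Hypothetical helper for illustration only.
    """
    if len(col_i) != len(col_j) or not col_i:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_i)
    f_i = Counter(col_i)               # single-site counts, column i
    f_j = Counter(col_j)               # single-site counts, column j
    f_ij = Counter(zip(col_i, col_j))  # joint counts for residue pairs
    mi = 0.0
    for (a, b), count in f_ij.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with counts to avoid extra divisions
        mi += p_ab * log(p_ab * n * n / (f_i[a] * f_j[b]))
    return mi


# Toy columns (made-up data): the first pair covaries perfectly,
# the second pair is statistically independent.
perfect = mutual_information("AAKK", "DDEE")
independent = mutual_information("AAKK", "DEDE")
```

A high MI only says that two positions covary. As the abstract notes, it cannot tell whether that correlation is direct or mediated by other positions, which is the gap the global inference step is meant to close.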
Conflict of interest statement
The authors declare no conflict of interest.
Figures
Fig. 1.
The combined covariance/message-passing approach detects 2 groups of correlated pairs. (A) Scatter plot of direct mutual information (DI) versus total mutual information (MI) reveals 2 classes of covarying residue pairs, those with strong direct correlation found in the upper red quadrant (group I) and those with low direct correlation found in the lower green quadrant (group II). A group of pairings just around the border of the MI and/or the DI cutoff is highlighted in blue. (B) Direct and indirect interaction pairs depicted on exemplary structures of the HisKA and RR domain. All residues that appear in the network of pairings with MI > MI(t) were mapped onto the structures of HK853 from T. maritima (HisKA domain) and Spo0F from B. subtilis (RR domain). Those pairings showing strong direct correlation are depicted in red and connected by a red line and those that show low direct correlation are depicted in green and connected by a green line. Green lines connecting red residues represent low direct correlation for that particular residue pairing. For orientation, the N and C termini and relevant structural elements are labeled. The phosphotransfer sites H260 in HK853 and D54 in Spo0F are shown in yellow.
Fig. 2.
Direct information is inversely correlated with residue distance of pairs in the Spo0B/Spo0F cocrystal structure. (A) Minimal atom distance for all 408 pairings that could be mapped to the Spo0B/Spo0F cocrystal structure was determined in ångströms and plotted against either direct information DI (red symbols) or total mutual information MI (blue symbols). (B) Specificity vs. rank percentile for predicting contact pairs via DI (red curve) and MI (blue curve). Specificity is defined as the fraction of pairings at the given rank percentile that are within 6 Å in the Spo0B/Spo0F cocrystal structure.
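The specificity measure defined in panel B can be sketched as follows. This is one possible reading, stated as an assumption: "at the given rank percentile" is taken to mean the top-scoring fraction of pairs, and "within 6 Å" is taken as a distance less than or equal to 6 Å. The function name `specificity_at_percentile` and the toy scores and distances are hypothetical and are not data from the paper.

```python
def specificity_at_percentile(scores, distances, percentile, cutoff=6.0):
    """Fraction of the top-scoring pairs that lie within `cutoff` angstroms.

    scores     -- one score (e.g. DI or MI) per residue pair
    distances  -- minimal atom distance per pair, same order as `scores`
    percentile -- size of the top-ranked set, in percent (0 < p <= 100)
    Hypothetical helper for illustration only.
    """
    if len(scores) != len(distances) or not scores:
        raise ValueError("scores and distances must be non-empty and aligned")
    if not 0 < percentile <= 100:
        raise ValueError("percentile must be in (0, 100]")
    # Rank pair indices from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    # Keep at least one pair so the fraction is always defined.
    n_top = max(1, round(len(order) * percentile / 100))
    top = order[:n_top]
    return sum(distances[k] <= cutoff for k in top) / n_top


# Toy data (made up): four pairs with scores and minimal atom distances.
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_distances = [3.5, 5.0, 12.0, 4.0]
top_half = specificity_at_percentile(toy_scores, toy_distances, 50)
all_pairs = specificity_at_percentile(toy_scores, toy_distances, 100)
```

Evaluating the function over a range of percentiles, once with DI scores and once with MI scores, would give two curves of the kind compared in panel B.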
Fig. 3.
Direct interaction between the identified dimer contact pairs. Four dimer contact pairings (red entries in Table S1) are localized to the α4- and α5-helices and are shown as dashed lines on the exemplary OmpR class RR structures of ArcA, PhoP, and MicA. Whereas contact pairing 89:109 (ArcA numbering; for PhoP and MicA numbering, deduct 2) happens to represent a salt bridge in all 3 structural examples shown here, the pairing 94:115 and the cluster involving pairings 86:106 and 86:108 demonstrate nicely the type of residue variation that is the basis of the covariance method (see text). Detailed analysis of the covariance among the residues involved in these 4 pairings is given in Fig. S8.
Similar articles
- Inference of direct residue contacts in two-component signaling.
Lunt B, Szurmant H, Procaccini A, Hoch JA, Hwa T, Weigt M. Methods Enzymol. 2010;471:17-41. doi: 10.1016/S0076-6879(10)71002-8. Epub 2010 Mar 1. PMID: 20946840
- Accurate prediction of protein-protein interactions from sequence alignments using a Bayesian method.
Burger L, van Nimwegen E. Mol Syst Biol. 2008;4:165. doi: 10.1038/msb4100203. Epub 2008 Feb 12. PMID: 18277381 Free PMC article.
- Interaction fidelity in two-component signaling.
Szurmant H, Hoch JA. Curr Opin Microbiol. 2010 Apr;13(2):190-7. doi: 10.1016/j.mib.2010.01.007. Epub 2010 Feb 3. PMID: 20133181 Free PMC article. Review.
- Computational prediction of protein-protein interactions.
Obenauer JC, Yaffe MB. Methods Mol Biol. 2004;261:445-68. doi: 10.1385/1-59259-762-9:445. PMID: 15064475 Review.
Cited by
- AI-integrated network for RNA complex structure and dynamic prediction.
Liu H, Zhuo C, Gao J, Zeng C, Zhao Y. Biophys Rev (Melville). 2024 Nov 5;5(4):041304. doi: 10.1063/5.0237319. eCollection 2024 Dec. PMID: 39512332 Review.
- Efficient epistasis inference via higher-order covariance matrix factorization.
Shimagaki KS, Barton JP. bioRxiv [Preprint]. 2024 Oct 14:2024.10.14.618287. doi: 10.1101/2024.10.14.618287. PMID: 39464126 Free PMC article. Preprint.
- Understanding epistatic networks in the B1 β-lactamases through coevolutionary statistical modeling and deep mutational scanning.
Chen JZ, Bisardi M, Lee D, Cotogno S, Zamponi F, Weigt M, Tokuriki N. Nat Commun. 2024 Sep 30;15(1):8441. doi: 10.1038/s41467-024-52614-w. PMID: 39349467 Free PMC article.
- Impact of phylogeny on the inference of functional sectors from protein sequence data.
Dietler N, Abbara A, Choudhury S, Bitbol AF. PLoS Comput Biol. 2024 Sep 23;20(9):e1012091. doi: 10.1371/journal.pcbi.1012091. eCollection 2024 Sep. PMID: 39312591 Free PMC article.
- Protein interactions in human pathogens revealed through deep learning.
Humphreys IR, Zhang J, Baek M, Wang Y, Krishnakumar A, Pei J, Anishchenko I, Tower CA, Jackson BA, Warrier T, Hung DT, Peterson SB, Mougous JD, Cong Q, Baker D. Nat Microbiol. 2024 Oct;9(10):2642-2652. doi: 10.1038/s41564-024-01791-x. Epub 2024 Sep 18. PMID: 39294458 Free PMC article.