Characterization of protein hubs by inferring interacting motifs from protein interactions
Ramon Aragues et al. PLoS Comput Biol. 2007 Sep.
Abstract
The characterization of protein interactions is essential for understanding biological systems. While genome-scale methods are available for identifying interacting proteins, they do not pinpoint the interacting motifs (e.g., a domain, sequence segments, a binding site, or a set of residues). Here, we develop and apply a method for delineating the interacting motifs of hub proteins (i.e., highly connected proteins). The method relies on the observation that proteins with common interaction partners tend to interact with these partners through a common interacting motif. The sole input for the method is a set of binary protein interactions; neither sequence nor structure information is needed. The approach is evaluated by comparing the inferred interacting motifs with domain families defined for 368 proteins in the Structural Classification of Proteins (SCOP). The positive predictive value of the method for detecting proteins with common SCOP families is 75% at a sensitivity of 10%. Most of the inferred interacting motifs were significantly associated with sequence patterns, which could be responsible for the common interactions. We find that yeast hubs with multiple interacting motifs are more likely to be essential than hubs with one or two interacting motifs, thus rationalizing the previously observed correlation between essentiality and the number of interacting partners of a protein. We also find that yeast hubs with multiple interacting motifs evolve more slowly than the average protein, unlike hubs with one or two interacting motifs. The proposed method will help us discover unknown interacting motifs and provide biological insights about protein hubs and their roles in interaction networks.
Conflict of interest statement
Competing interests. The authors have declared that no competing interests exist.
Figures
Figure 1. Assigning iMotifs to Proteins and Identifying iMotif–iMotif Interactions
First, the protein interaction network is built. Second, a cluster interaction network is created by placing each protein in a different cluster. Third, clustering is performed until the similarity score drops below a certain threshold. Fourth, an iMotif label is assigned to each cluster with more than one protein, and iMotif assignments and interactions are derived.
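The four steps in this caption can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the similarity score is assumed to be the number of interaction partners shared by all members of two clusters, the greedy merge order is an assumption, and the function name `cluster_by_shared_partners` and the `iMotif_k` labels are hypothetical. Deriving iMotif–iMotif interactions is not covered here.

```python
from itertools import combinations


def cluster_by_shared_partners(interactions, n_threshold):
    """Greedy agglomerative clustering of proteins by shared partners.

    interactions: iterable of (protein, protein) binary interactions.
    n_threshold: minimum number of shared partners needed to merge (assumed score).
    """
    # Step 1: build the protein interaction network as adjacency sets.
    partners = {}
    for a, b in interactions:
        partners.setdefault(a, set()).add(b)
        partners.setdefault(b, set()).add(a)

    # Step 2: place each protein in its own cluster.
    # A cluster is (members, partners shared by every member).
    clusters = [(frozenset([p]), frozenset(nbrs))
                for p, nbrs in sorted(partners.items())]

    # Step 3: merge the most similar pair until the best score
    # drops below the threshold.
    while len(clusters) > 1:
        best_score, best_pair = -1, None
        for i, j in combinations(range(len(clusters)), 2):
            score = len(clusters[i][1] & clusters[j][1])
            if score > best_score:
                best_score, best_pair = score, (i, j)
        if best_score < n_threshold:
            break
        i, j = best_pair
        merged = (clusters[i][0] | clusters[j][0],
                  clusters[i][1] & clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)

    # Step 4: label only the clusters that contain more than one protein.
    multi = sorted(sorted(members) for members, _ in clusters if len(members) > 1)
    return {f"iMotif_{idx + 1}": members for idx, members in enumerate(multi)}


toy_network = [("A", "X"), ("A", "Y"), ("B", "X"), ("B", "Y"), ("C", "Z")]
assignments = cluster_by_shared_partners(toy_network, n_threshold=2)
```

In the toy network, A and B share the partners X and Y, so they end up in one cluster; by symmetry X and Y form a second cluster, while C and Z remain unlabeled singletons.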
Figure 2. Definition of an Interacting Motif (iMotif)
The definition of an iMotif depends on the minimum number of common partners required in order to consider the given binary protein interactions mediated through a common interacting motif. (A) From the protein interaction network perspective, proteins with common partners (two in the example provided) are considered to interact with these partners through a similar feature, and, therefore, are classified as being of the same iMotif. (B) The same process is shown from a structural perspective: proteins interacting through a similar feature (regardless of the feature being two structural domains or a single binding site) are considered to have a common iMotif. To further illustrate the method, we also describe a sample iMotif assignment for prothrombin (UniProt code THRB_HUMAN) (Figure S3).
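The network-level definition in panel (A) can be illustrated with a pairwise check. This is only a sketch under the assumption that "minimum number of common partners" means "at least that many shared partners"; the helper names `common_partners` and `share_imotif` and the protein identifiers are hypothetical.

```python
def common_partners(interactions, protein_a, protein_b):
    """Return the set of proteins that interact with both protein_a and protein_b."""
    partners_a, partners_b = set(), set()
    for x, y in interactions:
        # Interactions are undirected, so check both orientations.
        for source, target in ((x, y), (y, x)):
            if source == protein_a:
                partners_a.add(target)
            if source == protein_b:
                partners_b.add(target)
    return partners_a & partners_b


def share_imotif(interactions, protein_a, protein_b, min_common):
    """True if the two proteins have at least min_common partners in common."""
    return len(common_partners(interactions, protein_a, protein_b)) >= min_common


toy_network = [("P1", "Q1"), ("P1", "Q2"), ("P2", "Q1"), ("P2", "Q2"), ("P3", "Q1")]
same_motif = share_imotif(toy_network, "P1", "P2", min_common=2)       # two shared partners
different_motif = share_imotif(toy_network, "P1", "P3", min_common=2)  # only one shared partner
```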
Figure 3. Performance of the Method in Detecting Proteins with Common SCOP Families
The positive predictive value, sensitivity, and applicability (Methods) are plotted as a function of the number of common interaction partners threshold (N) used for the clustering.
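For orientation, positive predictive value and sensitivity can be computed with their standard definitions, TP / (TP + FP) and TP / (TP + FN). The sketch below assumes the evaluation is done over protein pairs (a pair is "predicted" if the two proteins share an iMotif and "true" if they share a SCOP family); the exact protocol, and the definition of applicability, are given in the paper's Methods and are not reproduced here. All names and toy labels are hypothetical.

```python
from itertools import combinations


def pairs_sharing_label(labels):
    """labels: dict mapping protein -> set of labels.

    Returns the set of unordered protein pairs that share at least one label.
    """
    return {frozenset((a, b))
            for a, b in combinations(sorted(labels), 2)
            if labels[a] & labels[b]}


def ppv_and_sensitivity(predicted_pairs, reference_pairs):
    """Standard PPV and sensitivity from predicted and reference pair sets."""
    tp = len(predicted_pairs & reference_pairs)
    fp = len(predicted_pairs - reference_pairs)
    fn = len(reference_pairs - predicted_pairs)
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    return ppv, sensitivity


# Toy labels, invented purely for illustration.
imotifs = {"A": {"m1"}, "B": {"m1"}, "C": {"m1"}, "D": {"m2"}}
scop = {"A": {"f1"}, "B": {"f1"}, "C": {"f2"}, "D": {"f2"}}
ppv, sensitivity = ppv_and_sensitivity(pairs_sharing_label(imotifs),
                                       pairs_sharing_label(scop))
```

Raising the threshold N typically makes predictions stricter, which is why the two measures are plotted against N rather than reported as single values.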
Figure 4. Correlation between the Number of Binding Interfaces and the Number of iMotifs
Each point corresponds to a protein from the test set for which a number of binding interfaces was assigned by Kim et al. [15], and a number of iMotifs was inferred with N set to 20. Both variables were found to be significantly correlated (rs = 0.57, p-value = 0.01). The correlation between the number of interfaces and the number of iMotifs is significant for all N values lower than 23 (Figure S5).
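Assuming that rs denotes Spearman's rank correlation coefficient (the caption does not say so explicitly), the statistic can be computed as the Pearson correlation of the ranks, with ties given their average rank. The sketch below uses only the standard library and does not compute a p-value; the function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

For example, `spearman(n_interfaces, n_imotifs)` would take two equal-length lists with one entry per protein; both list names are placeholders.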
Figure 5. Correlation between the Number of iMotifs and Protein Essentiality
Proteins from PIANA were binned according to their number of iMotifs (A) and to their number of interactions (B), and the fraction of essential proteins was calculated for each bin. Bins with only one protein were not considered for calculating the correlations. (A) Correlation between the number of iMotifs assigned to yeast hub proteins (≥20 interactions) in PIANA and the fraction of essential proteins (rs = 0.61, p-value = 1.6 × 10⁻⁵). iMotifs were assigned to yeast hubs using an N threshold of 20. (B) Correlation between the number of interactions of yeast hub proteins in Figure 5A and the fraction of essential proteins (rs = 0.51, p-value = 1.1 × 10⁻⁶).
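The binning described in this caption can be sketched as follows: group proteins by a count (number of iMotifs or number of interactions), compute the fraction of essential proteins per bin, and drop bins with a single protein. The function name, the `min_bin_size` parameter, and the toy data are hypothetical, and one bin per distinct count is an assumption.

```python
from collections import defaultdict


def essential_fraction_by_bin(proteins, min_bin_size=2):
    """proteins: iterable of (bin_value, is_essential) pairs.

    Returns {bin_value: fraction of essential proteins}, omitting bins
    with fewer than min_bin_size proteins.
    """
    bins = defaultdict(list)
    for value, essential in proteins:
        bins[value].append(bool(essential))
    return {value: sum(flags) / len(flags)
            for value, flags in sorted(bins.items())
            if len(flags) >= min_bin_size}


# Toy data: (number of iMotifs, essential?), invented for illustration only.
toy_hubs = [(1, False), (1, True), (2, True), (2, True), (3, True)]
fractions = essential_fraction_by_bin(toy_hubs)
```

The resulting bin values and fractions are the two variables that would then be correlated.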
Figure 6. Sequence Patterns in iMotifs
Relationship between the percentage of iMotifs for which a significant sequence pattern was found and the percentage of proteins within the iMotif that contained the pattern. Three different significance cutoffs were used for associating sequence patterns to iMotifs: 10⁻⁵ (long dashed line), 10⁻⁸ (solid line), and 10⁻¹⁰ (short dashed line).
References
- Gavin AC, Bosche M, Krause R, Grandi P, Marzioch M, et al. Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature. 2002;415:141–147.
- Pawson T, Gish GD, Nash P. SH2 domains, interaction modules and cellular wiring. Trends Cell Biol. 2001;11:504–511.
- Aloy P, Russell RB. Structural systems biology: Modelling protein interactions. Nat Rev Mol Cell Biol. 2006;7:188–197.
- Parrish JR, Gulyas KD, Finley RL Jr. Yeast two-hybrid contributions to interactome mapping. Curr Opin Biotechnol. 2006;17:387–393.
- Stelzl U, Worm U, Lalowski M, Haenig C, Brembeck FH, et al. A human protein–protein interaction network: A resource for annotating the proteome. Cell. 2005;122:957–968.