Why do hubs tend to be essential in protein networks?

Xionglei He et al. PLoS Genet. 2006.

Abstract

The protein-protein interaction (PPI) network has a small number of highly connected protein nodes (known as hubs) and many poorly connected nodes. Genome-wide studies show that deletion of a hub protein is more likely to be lethal than deletion of a non-hub protein, a phenomenon known as the centrality-lethality rule. This rule is widely believed to reflect the special importance of hubs in organizing the network, which in turn suggests the biological significance of network architectures, a key notion of systems biology. Despite the popularity of this explanation, the underlying cause of the centrality-lethality rule has never been critically examined. We here propose the concept of essential PPIs, which are PPIs that are indispensable for the survival or reproduction of an organism. Our network analysis suggests that the centrality-lethality rule is unrelated to the network architecture, but is explained by the simple fact that hubs have large numbers of PPIs, therefore high probabilities of engaging in essential PPIs. We estimate that approximately 3% of PPIs are essential in the yeast, accounting for approximately 43% of essential genes. As expected, essential PPIs are evolutionarily more conserved than nonessential PPIs. Considering the role of essential PPIs in determining gene essentiality, we find the yeast PPI network functionally more robust than random networks, yet far less robust than the potential optimum. These and other findings provide new perspectives on the biological relevance of network structure and robustness.

Conflict of interest statement

Competing interests. The authors have declared that no competing interests exist.

Figures

Figure 1. Essential Edges (Interactions) in PPI Networks

(A) A hypothetical PPI network of 12 proteins. Black and white nodes refer to essential and nonessential proteins, respectively. Thick and thin edges depict essential and nonessential interactions, respectively. Proteins linked by an essential interaction must be essential, whereas an interaction between essential proteins (IBEP) may or may not be essential. (B) More IBEPs in the yeast PPI network than in randomly rewired networks. “Observed” indicates the observed number (807) of IBEPs in the real network. The gray bars show the distribution of the number (m) of IBEPs in 10,000 randomly rewired networks.
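The comparison in panel B can be sketched as follows: count the interactions whose two endpoints are both essential, then repeat the count on rewired copies of the network. This is a minimal sketch, not the authors' procedure. The caption does not say how the networks were rewired, so the degree-preserving double-edge swap below is an assumption, and the function names and the toy network are hypothetical.

```python
import random

def count_ibeps(edges, essential):
    """Count interactions between essential proteins (IBEPs)."""
    return sum(1 for u, v in edges if u in essential and v in essential)

def degrees(edges):
    """Return the connectivity (number of edges) of every node."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg

def rewire(edges, n_swaps, rng):
    """Attempt n_swaps double-edge swaps; each swap keeps every node's degree.

    ASSUMPTION: degree-preserving rewiring is one common null model; the
    caption does not state which rewiring scheme was actually used.
    """
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if rng.random() < 0.5:          # pick one of the two swap orientations
            c, d = d, c
        if len({a, b, c, d}) < 4:       # would create a self-loop
            continue
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present:  # would duplicate an edge
            continue
        present.discard(frozenset((a, b)))
        present.discard(frozenset((c, d)))
        present.add(new1)
        present.add(new2)
        edges[i], edges[j] = (a, d), (c, b)
    return edges

def null_ibep_counts(edges, essential, n_networks, swaps_per_network, seed=0):
    """IBEP counts in rewired networks, for comparison with the observed count."""
    rng = random.Random(seed)
    return [count_ibeps(rewire(edges, swaps_per_network, rng), essential)
            for _ in range(n_networks)]

# Hypothetical toy network, not data from the paper.
toy_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F"), ("A", "C")]
toy_essential = {"A", "B", "C"}
observed = count_ibeps(toy_edges, toy_essential)
null_counts = null_ibep_counts(toy_edges, toy_essential, 100, 50)
```

Comparing `observed` with the spread of `null_counts` mirrors the comparison of the observed 807 IBEPs with the gray bars.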

Figure 2. The Relationship between the Probability That a Protein Is Essential (P_E) and the Connectivity (k) of the Protein

(A) Observed and predicted P_E values. The observed values were estimated from the yeast PPI network, and the predicted values were computed using Equation 1 with parameters α = 2.92% and β = 12.6%. Error bars show one standard (sampling) error of the observed values. (B) Linear regression between ln(1 − P_E) and k. Using Equation 2, we estimated from the regression that α = 3.29% and β = 12.8%. The 95% confidence interval is 2.23% to 4.35% for α and 6.7% to 18.6% for β. Proteins with k > 10 (~5% of all proteins) are not considered because of small sample sizes.
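Equations 1 and 2 are not reproduced in this record. The sketch below assumes they take the form P_E = 1 − (1 − α)^k (1 − β) and ln(1 − P_E) = k ln(1 − α) + ln(1 − β), which is consistent with the linear regression described in panel B. It also assumes that α is the probability that a single PPI is essential and β is the probability that a protein is essential for reasons other than its PPIs. The function names are hypothetical.

```python
import math

def predicted_pe(k, alpha, beta):
    """Assumed form of Equation 1: probability that a protein with k PPIs is essential."""
    return 1.0 - (1.0 - alpha) ** k * (1.0 - beta)

def fit_alpha_beta(ks, pes):
    """Assumed form of Equation 2: least-squares line through (k, ln(1 - P_E)).

    slope = ln(1 - alpha) and intercept = ln(1 - beta), so both parameters
    follow by exponentiating the fitted coefficients.
    """
    ys = [math.log(1.0 - p) for p in pes]
    n = len(ks)
    mean_k = sum(ks) / n
    mean_y = sum(ys) / n
    sxy = sum((k - mean_k) * (y - mean_y) for k, y in zip(ks, ys))
    sxx = sum((k - mean_k) ** 2 for k in ks)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_k
    return 1.0 - math.exp(slope), 1.0 - math.exp(intercept)

# Synthetic check using the panel A parameters; these are not the observed data.
ks = list(range(1, 11))                      # k <= 10, as in the caption
pes = [predicted_pe(k, 0.0292, 0.126) for k in ks]
alpha_hat, beta_hat = fit_alpha_beta(ks, pes)
```

Because the synthetic points lie exactly on the assumed curve, the fit returns the input parameters; with real data the estimates would carry sampling error, as the confidence intervals in the caption indicate.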

Figure 3. Effects of Random Removal of Edges on the Global Structure of the Yeast PPI Network

(A) Effects on network diameter, which is the mean shortest path length among all reachable pairs of nodes in the network. (B) Effects on the proportion of unreachable pairs of nodes in the network. Note that the total number of IBEPs is 807 in the network.
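The two quantities tracked in this figure can be computed with a breadth-first search from every node. The sketch below is illustrative only: the function names are hypothetical, and sampling the removed edges uniformly without replacement is an assumption about what "random removal" means here.

```python
import random
from collections import deque

def path_stats(nodes, edges):
    """Return (mean shortest path length over reachable pairs,
    proportion of unreachable pairs) for an undirected network."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    total_len = 0
    reachable = 0                       # ordered reachable pairs (s, t), s != t
    for s in nodes:
        dist = {s: 0}
        queue = deque([s])
        while queue:                    # breadth-first search from s
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total_len += sum(dist.values())
        reachable += len(dist) - 1
    n = len(nodes)
    all_pairs = n * (n - 1)
    mean_len = total_len / reachable if reachable else float("nan")
    unreachable = 1.0 - reachable / all_pairs if all_pairs else 0.0
    return mean_len, unreachable

def remove_random_edges(edges, n_remove, rng):
    """Drop n_remove edges chosen uniformly at random without replacement."""
    keep = rng.sample(range(len(edges)), len(edges) - n_remove)
    return [edges[i] for i in sorted(keep)]

# Hypothetical toy example: a 4-node path 0-1-2-3.
toy_nodes = [0, 1, 2, 3]
toy_edges = [(0, 1), (1, 2), (2, 3)]
before = path_stats(toy_nodes, toy_edges)
after = path_stats(toy_nodes, remove_random_edges(toy_edges, 1, random.Random(0)))
```

Repeating the removal for increasing `n_remove` and averaging over many random draws would trace curves of the kind shown in panels A and B.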

Figure 4. Robustness of PPI Networks

(A) Numbers of essential nodes generated by 220 essential edges in various networks. Black and gray bars depict the distribution of the number of essential nodes from 10,000 replications of random assignments of 220 essential edges to the real yeast PPI network and simulated ER networks, respectively. An ER network has the same number of nodes and edges as in the real network, but the distribution of node connectivity follows a Poisson distribution. Also shown are the minimal and maximal numbers of essential nodes produced by 220 essential edges in any possible network that has the same numbers of nodes and edges as the yeast PPI network. The minimum is 22, because the number of edges among 22 nodes can be as high as 21 × 22/2 = 231 > 220. The maximum is 220 × 2 = 440. (B) Proportions of essential nodes generated by given numbers of essential edges in scale-free (power-law) and ER networks. Both networks contain 4,000 nodes and 4,352 edges. The scale-free network has its node connectivity following the power-law distribution P(k) ∝ k^(−γ), where P(k) is the probability that a node has k edges. We used γ = 2.29, the same as in the real yeast PPI network (see Figure S4). The ER network has a connectivity distribution following the Poisson distribution with mean connectivity per node being 2.176. The result that more essential nodes are produced in ER networks than in scale-free networks by a given number of essential edges applies to other γ values (see Figure S5).
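The bounds and the mean connectivity quoted in this caption follow from simple counting, which the short sketch below reproduces. The function names are hypothetical, and the maximum assumes that the network has enough nodes for all essential edges to be disjoint.

```python
def min_essential_nodes(n_essential_edges):
    """Smallest n such that n nodes can hold that many edges, i.e. n(n-1)/2 >= edges."""
    n = 2
    while n * (n - 1) // 2 < n_essential_edges:
        n += 1
    return n

def max_essential_nodes(n_essential_edges):
    """Every essential edge touches two distinct, otherwise unused nodes (assumed possible)."""
    return 2 * n_essential_edges

def mean_connectivity(n_nodes, n_edges):
    """Each edge contributes to the connectivity of two nodes."""
    return 2 * n_edges / n_nodes

lower = min_essential_nodes(220)         # 21 nodes allow only 210 edges, 22 allow 231
upper = max_essential_nodes(220)
mean_k = mean_connectivity(4000, 4352)   # Poisson mean used for the ER network in panel B
```

The minimum packs the essential edges into the densest possible cluster of nodes, while the maximum spreads them so that no two share a node; the simulated distributions in panel A lie between these extremes.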
