Functional divergence of duplicated genes formed by polyploidy during Arabidopsis evolution

Plant Cell. 2004 Jul;16(7):1679-91.

doi: 10.1105/tpc.021410. Epub 2004 Jun 18.

Guillaume Blanc et al. Plant Cell. 2004 Jul.

Abstract

To study the evolutionary effects of polyploidy on plant gene functions, we analyzed functional genomics data for a large number of duplicated gene pairs formed by ancient polyploidy events in Arabidopsis thaliana. Genes retained in duplicate are not distributed evenly among Gene Ontology or Munich Information Center for Protein Sequences functional categories, which indicates a nonrandom process of gene loss. Genes involved in signal transduction and transcription have been preferentially retained, and those involved in DNA repair have been preferentially lost. Although the two members of each gene pair must originally have had identical transcription profiles, less than half of the pairs formed by the most recent polyploidy event still retain significantly correlated profiles. We identified several cases where groups of duplicated gene pairs have diverged in concert, forming two parallel networks, each containing one member of each gene pair. In these cases, the expression of each gene is strongly correlated with the other nonhomologous genes in its network but poorly correlated with its paralog in the other network. We also find that the rate of protein sequence evolution has been significantly asymmetric in >20% of duplicate pairs. Together, these results suggest that functional diversification of the surviving duplicated genes is a major feature of the long-term evolution of polyploids.

Figures

Figure 1.

Frequency Distributions of Functional Categories as a Function of the P Value Attached to the Hypothesis That Polyploidy-Derived Duplicated Genes Are Lost Randomly. (A) Frequency distributions for GO functional categories. (B) Frequency distributions for MIPS functional categories. Analyses with the Blanc and Bowers datasets are shown in white and black, respectively.
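
One way to attach a P value to the random-loss hypothesis for a single functional category is a hypergeometric tail test. The sketch below is illustrative only: this record does not state which test the authors used, and the function names and the toy counts are hypothetical. It asks how surprising it is to see k retained genes in a category of n genes when K of all N genes are retained in duplicate.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P(exactly k retained genes in a category of n genes), given that
    K of the N genes in the genome are retained in duplicate.
    Assumes retention is random with respect to category (the null)."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def retention_tails(N, K, n, k):
    """Return (lower, upper) tail probabilities under random loss.

    lower = P(X <= k): small values suggest preferential loss.
    upper = P(X >= k): small values suggest preferential retention.
    """
    lo = max(0, n - (N - K))  # smallest feasible count
    hi = min(n, K)            # largest feasible count
    lower = sum(hypergeom_pmf(i, N, K, n) for i in range(lo, k + 1))
    upper = sum(hypergeom_pmf(i, N, K, n) for i in range(k, hi + 1))
    return lower, upper


if __name__ == "__main__":
    # Toy numbers, not taken from the paper.
    lower, upper = retention_tails(N=1000, K=250, n=40, k=20)
    print(f"P(X <= k) = {lower:.4g}, P(X >= k) = {upper:.4g}")
```

Running such a test once per GO or MIPS category yields one P value per category, which is the kind of quantity whose frequency distribution the figure plots.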

Figure 2.

Correlations of Expression Profiles between Pairs of Arabidopsis Genes. Frequency distributions of the correlation coefficient (r) values obtained from the expression profiles of pairs of duplicated genes formed by the most recent polyploidy event (red), pairs of duplicated genes formed by the old duplication events (blue), and randomly chosen genes (black). Ninety-five percent of the r values obtained from random pairs of genes are smaller than r = 0.52 and appear at the left of the vertical dotted line.
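
The comparison described in this caption can be sketched as follows: compute r for many randomly chosen gene pairs, take the 95th percentile as a cutoff, and count how many duplicate pairs exceed it. This is a minimal sketch that assumes Pearson correlation (the record does not name the coefficient) and uses hypothetical function names and toy profiles.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant profile: r is undefined, treated as 0 here
    return sxy / math.sqrt(sxx * syy)


def random_pair_cutoff(profiles, n_pairs=1000, quantile=0.95, seed=0):
    """Empirical quantile of r over randomly drawn gene pairs."""
    rng = random.Random(seed)
    genes = list(profiles)
    rs = sorted(
        pearson_r(profiles[g1], profiles[g2])
        for g1, g2 in (rng.sample(genes, 2) for _ in range(n_pairs))
    )
    return rs[min(len(rs) - 1, int(quantile * len(rs)))]


def fraction_above_cutoff(pairs, profiles, cutoff):
    """Fraction of gene pairs whose profiles correlate above the cutoff."""
    hits = sum(1 for a, b in pairs if pearson_r(profiles[a], profiles[b]) > cutoff)
    return hits / len(pairs)


if __name__ == "__main__":
    # Toy profiles, not real microarray data.
    toy = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8],
           "g3": [4, 3, 2, 1], "g4": [1, 3, 2, 4]}
    cutoff = random_pair_cutoff(toy, n_pairs=100)
    print(cutoff, fraction_above_cutoff([("g1", "g2"), ("g1", "g3")], toy, cutoff))
```

With real data, the cutoff plays the role of the r = 0.52 line in the figure, and the returned fraction corresponds to the share of duplicate pairs that still have significantly correlated profiles.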

Figure 3.

Illustration of the Concept of Concerted Divergence of Pairs of Paralogs. In an ancestor, genes 1 and 2 interacted and were coexpressed. After genome duplication, the two pairs of paralogs (1A/1B and 2A/2B) form two separate interacting networks, each with its own expression profile.

Figure 4.

Groups of Duplicated Genes Showing Evidence of Concerted Divergence in Their Expression Profiles. (A) Schematic representation of clusters of codiverging duplicate pairs. Thick gray lines link paralogous gene pairs (formed by genome duplication) with r ≤ 0.1. Red and green lines link nonhomologous genes with r ≥ 0.7. The r values are indicated beside each line. (B) Clusters of codiverging duplicate pairs for which all interpair r values are ≥ 0.7. Gene names are indicated for each cluster at the right. Paralogous gene pairs are identified by numbers in parentheses with suffixes A and B. Normalized expression intensities for each gene in 62 public microarray experiments are represented by colored squares in the middle grid. Bright red and bright blue correspond to relative high and low expression levels, respectively. The complete hierarchical clustering of the genes is shown to the left of each cluster.
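
The thresholds in this caption suggest a simple screen for concerted divergence between two duplicate pairs: both paralog pairs must have r ≤ 0.1, and the genes must split into two networks whose nonhomologous members have r ≥ 0.7. The sketch below is one possible reading of that criterion, not the authors' pipeline; the function name, the `r` callable, and the toy correlation table are hypothetical.

```python
from itertools import combinations


def codiverging_pairs(paralog_pairs, r, max_paralog_r=0.1, min_network_r=0.7):
    """Find pairs of duplicate pairs that appear to have diverged in concert.

    paralog_pairs: list of (geneA, geneB) tuples formed by genome duplication.
    r: callable returning the expression correlation of two genes.
    Returns a list of (network1, network2) tuples, each network holding one
    member of each paralog pair.
    """
    # Keep only paralog pairs whose own expression profiles have diverged.
    diverged = [(a, b) for a, b in paralog_pairs if r(a, b) <= max_paralog_r]

    hits = []
    for (a1, b1), (a2, b2) in combinations(diverged, 2):
        # The A/B labels are arbitrary, so test both possible pairings.
        if r(a1, a2) >= min_network_r and r(b1, b2) >= min_network_r:
            hits.append(((a1, a2), (b1, b2)))
        elif r(a1, b2) >= min_network_r and r(b1, a2) >= min_network_r:
            hits.append(((a1, b2), (b1, a2)))
    return hits


if __name__ == "__main__":
    # Toy correlation table, not taken from the paper.
    table = {
        frozenset(("1A", "1B")): 0.05, frozenset(("2A", "2B")): -0.20,
        frozenset(("1A", "2A")): 0.85, frozenset(("1B", "2B")): 0.75,
        frozenset(("1A", "2B")): 0.00, frozenset(("1B", "2A")): 0.10,
    }
    lookup = lambda x, y: table[frozenset((x, y))]
    print(codiverging_pairs([("1A", "1B"), ("2A", "2B")], lookup))
```

Larger clusters such as those in panel (B) could be assembled by chaining these pairwise hits, for example by treating each hit as an edge in a graph and collecting connected groups.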

Figure 5.

Schematic Representation of the Two Competing Trees Used to Detect Differences in the Rates of Sequence Divergence among Duplicated Proteins. Ath1, Ath2, and O stand for two paralogous Arabidopsis proteins formed by genome duplication and an outgroup protein. In Tree A, the branch lengths are unconstrained, whereas in Tree B, the branches leading from node N to Ath1 and Ath2 are forced to be equal. A significantly better maximum likelihood of the triplet of sequences under the model of Tree A compared with the model of Tree B (tested with the Likelihood Ratio Test) indicates that the two paralogous proteins evolved at different rates.
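
The Likelihood Ratio Test described here can be sketched numerically. Tree B has one fewer free branch length than Tree A, so the sketch assumes the statistic 2(lnL_A − lnL_B) is compared with a chi-square distribution with one degree of freedom; that reference distribution, the function name, and the log-likelihood values are assumptions, since the record does not give them. The log-likelihoods themselves would come from a separate phylogenetics program.

```python
import math


def lrt_rate_asymmetry(lnL_free, lnL_equal, alpha=0.05):
    """Likelihood ratio test for unequal rates in two paralogs.

    lnL_free:  log-likelihood with unconstrained branches (Tree A).
    lnL_equal: log-likelihood with the two paralog branches forced
               to be equal (Tree B).
    Returns (statistic, p_value, is_asymmetric).
    """
    # The constrained model cannot fit better, so clamp numerical noise at 0.
    stat = max(0.0, 2.0 * (lnL_free - lnL_equal))
    # Survival function of chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value, p_value < alpha


if __name__ == "__main__":
    # Hypothetical log-likelihoods for one Ath1/Ath2/outgroup triplet.
    stat, p, asymmetric = lrt_rate_asymmetry(-2310.4, -2315.4)
    print(f"LRT = {stat:.2f}, p = {p:.4g}, asymmetric = {asymmetric}")
```

Applying this to every triplet and counting the significant cases gives the proportion of duplicate pairs with asymmetric protein evolution; when many pairs are tested, a multiple-testing correction is usually worth considering.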
