Comparative Study

Lineage-specific gene expansions in bacterial and archaeal genomes

I K Jordan et al. Genome Res. 2001 Apr.

Abstract

Gene duplication is an important mechanistic antecedent to the evolution of new genes and novel biochemical functions. In an attempt to assess the contribution of gene duplication to genome evolution in archaea and bacteria, clusters of related genes that appear to have expanded subsequent to the diversification of the major prokaryotic lineages (lineage-specific expansions) were analyzed. Analysis of 21 completely sequenced prokaryotic genomes shows that lineage-specific expansions comprise a substantial fraction (approximately 5%-33%) of their coding capacities. A positive correlation exists between the fraction of the genes taken up by lineage-specific expansions and the total number of genes in a genome. Consistent with the notion that lineage-specific expansions are made up of relatively recently duplicated genes, >90% of the detected clusters consist of only two to four genes. The more common smaller clusters tend to include genes with higher pairwise similarity (as reflected by average score density) than larger clusters. Regardless of size, cluster members tend to be located closer together on bacterial chromosomes than expected by chance, which could reflect a history of tandem gene duplication. In addition to the small clusters, almost all genomes also contain rare large clusters of size ≥20. Several examples of the potential adaptive significance of these large clusters are explored. The presence or absence of clusters and their related genes was used as the basis for the construction of a similarity graph for completely sequenced prokaryotic genomes. The topology of the resulting graph seems to reflect a combined effect of common ancestry, horizontal transfer, and lineage-specific gene loss.
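The abstract does not say how "closer than expected by chance" was tested. One way such a comparison could be set up is a permutation test on gene order, sketched below in Python. Everything in it is an assumption made for illustration, not the authors' procedure: the function names, the use of gene-order indices on a circular chromosome, the mean pairwise distance as the statistic, and the toy positions.

```python
import random
from itertools import combinations


def circular_distance(a, b, n_genes):
    """Distance in gene-order steps between two genes on a circular chromosome."""
    d = abs(a - b) % n_genes
    return min(d, n_genes - d)


def mean_pairwise_distance(positions, n_genes):
    """Average circular distance over all pairs of cluster members (needs >= 2 genes)."""
    pairs = list(combinations(positions, 2))
    return sum(circular_distance(a, b, n_genes) for a, b in pairs) / len(pairs)


def proximity_p_value(positions, n_genes, trials=1000, seed=0):
    """Estimate how often a random gene set of the same size is at least as compact.

    Hypothetical test design: draw `trials` random sets of gene positions and
    count those whose mean pairwise distance is <= the observed value.
    """
    rng = random.Random(seed)
    observed = mean_pairwise_distance(positions, n_genes)
    hits = 0
    for _ in range(trials):
        sample = rng.sample(range(n_genes), len(positions))
        if mean_pairwise_distance(sample, n_genes) <= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (trials + 1)


# Toy example (made-up positions): three adjacent genes in a 1000-gene genome.
p_tandem = proximity_p_value([10, 11, 12], n_genes=1000)
```

A small estimated p-value for a cluster would be consistent with tandem duplication, although other processes can also keep related genes adjacent.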

Figures

Figure 1

Cluster sizes in number of genes (_Y_-axis, gray bars) for four representative species, (A) Campylobacter jejuni, (B) Methanococcus jannaschii, (C) Mycoplasma pneumoniae, and (D) Treponema pallidum, compared to the average number of best hits (BeTs) for each cluster (_Y_-axis, black bars) in all other completely sequenced bacterial genomes.

Figure 2

Linear correlation between genome size (in number of genes) and the parameters of lineage-specific expansions. Correlation coefficients (r) and significance levels (P) were determined using ordinary least squares linear regression. (A) For completely sequenced prokaryotic genomes, genome size (_X_-axis) is plotted against the number of genes in lineage-specific clusters (diamonds) and the number of such clusters (squares). (B) Genome size (_X_-axis) is plotted against the percentage of the genome made up of lineage-specific clusters (triangles).
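As a reminder of what an ordinary least squares fit and its correlation coefficient involve, here is a minimal Python sketch. The function name `ols_fit` and the input values are hypothetical and are not data from the paper; the significance level P reported in the figure is not computed here.

```python
import math


def ols_fit(xs, ys):
    """Return (slope, intercept, r) for a simple OLS regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r


# Made-up illustration: x could stand for genome size and y for the number
# of genes in lineage-specific clusters.
slope, intercept, r = ols_fit([1, 2, 3, 4], [3, 5, 7, 9])
```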

Figure 3

Frequency distribution (99% quantile) of lineage-specific expansion cluster sizes (_X_-axis in numbers of genes). Observed data are shown with diamonds. These data were fit using the logarithmic approximation (line).

Figure 4

Linear correlation between cluster size in number of genes (_X_-axis) and average score density per cluster (_Y_-axis). Correlation coefficients (r) and significance levels (P) were determined using ordinary least squares linear regression. Removal of the two largest clusters (size 67 and 90) results in a greater magnitude of r and a lower P value (i.e., a stronger negative correlation).
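The sensitivity of r to a few very large clusters can be illustrated with a short Python sketch. The values below are invented for illustration (only the sizes 67 and 90 echo the caption), and the helper names `pearson_r` and `drop_largest` are assumptions, not part of the original analysis.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def drop_largest(xs, ys, k):
    """Remove the k points with the largest x values, keeping pairs aligned."""
    kept = sorted(zip(xs, ys))[: len(xs) - k]
    return [x for x, _ in kept], [y for _, y in kept]


# Invented toy data: cluster sizes and average score densities.
sizes = [2, 3, 4, 5, 6, 67, 90]
densities = [2.0, 1.8, 1.6, 1.4, 1.2, 1.1, 1.15]

r_all = pearson_r(sizes, densities)
trimmed_sizes, trimmed_densities = drop_largest(sizes, densities, k=2)
r_trimmed = pearson_r(trimmed_sizes, trimmed_densities)
```

In this toy set the two large clusters sit far from the trend of the small ones, so dropping them increases the magnitude of the negative correlation, which is the kind of effect the caption describes.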

Figure 5

Maximum parsimony graph for completely sequenced archaeal and bacterial genomes. The root was provisionally placed between archaea and bacteria.
