Gain and loss of multiple genes during the evolution of Helicobacter pylori

Helga Gressmann et al. PLoS Genet. 2005 Oct.

Abstract

Sequence diversity and gene content distinguish most isolates of Helicobacter pylori. Even greater sequence differences differentiate distinct populations of H. pylori from different continents, but it was not clear whether these populations also differ in gene content. To address this question, we tested 56 globally representative strains of H. pylori and four strains of Helicobacter acinonychis with whole genome microarrays. Of the weighted average of 1,531 genes present in the two sequenced genomes, 25% were absent in at least one strain of H. pylori and 21% were absent or variable in H. acinonychis. We extrapolate that the core genome present in all isolates of H. pylori contains 1,111 genes. Variable genes tend to be small and possess unusual GC content; many of them have probably been imported by horizontal gene transfer. Phylogenetic trees based on the microarray data differ from those based on sequences of seven genes from the core genome. These discrepancies are due to homoplasies resulting from independent gene loss by deletion or recombination in multiple strains, which distort phylogenetic patterns. The patterns of these discrepancies versus population structure allow a reconstruction of the timing of the acquisition of variable genes within this species. Variable genes that are located within the cag pathogenicity island were apparently first acquired en bloc after speciation. In contrast, most other variable genes are of unknown function or encode restriction/modification enzymes, transposases, or outer membrane proteins. These seem to have been acquired prior to speciation of H. pylori and were subsequently lost by convergent evolution within individual strains. Thus, the use of microarrays can reveal patterns of gene gain or loss when examined within a phylogenetic context that is based on sequences of core genes.

Conflict of interest statement

Competing interests. The authors have declared that no competing interests exist.

Figures

Figure 1. Genes Present and Absent in 56 Strains of H. pylori and Four Strains of H. acinonychis

CDSs used in microarrays are shown to scale along a virtual genome consisting of CDSs from both 26695 and J99 in the gene order found within 26695. Circle contents from outside to inside: (1) virtual chromosome (1.76 Mb) with ticks every 220 kb; (2) GC content indicated in colors (orange, < 39%; purple, > 39%; green, rRNA genes); (3–9) numbers of missing CDSs from individual strains according to population, color-coded according to presence in both 26695 and J99 (gray) or specific to either 26695 (red) or J99 (blue). Circle, population: 3, hpAfrica2; 4, hpAfrica1; 5, hpEurope; 6, hpAsia2; 7, hpEastAsia; 8, AmerindB; 9, H. acinonychis.

Figure 2. Extrapolated Number of Universally Present CDSs in H. pylori

The fraction of CDSs present in a sample of strains (“common CDSs”) was calculated on random samples of one to 56 strains taken without replacement. Mean fractions of common CDSs were calculated from 100 iterations of this sampling procedure. The graph shows the results of fitting an exponential decay model to these calculations, in which y0 approaches the minimum number of universally common CDSs at infinity (0.674 × 1,649 CDSs = 1,111 universally present CDSs).
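The sampling procedure described in this caption can be sketched in Python as follows. This is only an illustration under stated assumptions: the toy presence/absence data, the function name `mean_common_fraction`, and the fixed random seed are hypothetical and do not come from the paper. The exponential fit itself is not reproduced, because the caption does not give the exact model form or fitting method; only the final arithmetic (0.674 × 1,649) is taken from the source.

```python
import random

def mean_common_fraction(presence, sample_size, iterations=100, seed=0):
    """Mean fraction of CDSs present in every strain of a random sample.

    presence: dict mapping strain name -> list of booleans (one per CDS).
    Strains are drawn without replacement, as in the caption.
    """
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    strains = sorted(presence)
    n_cds = len(presence[strains[0]])
    total = 0.0
    for _ in range(iterations):
        sample = rng.sample(strains, sample_size)  # sampling without replacement
        common = sum(all(presence[s][i] for s in sample) for i in range(n_cds))
        total += common / n_cds
    return total / iterations

# Hypothetical toy data: 6 strains x 10 CDSs. CDSs 0-5 are present in every
# strain; each strain lacks exactly one of CDSs 6-9.
toy = {}
for k in range(6):
    calls = [True] * 10
    calls[6 + k % 4] = False
    toy[f"strain{k}"] = calls

# Mean fraction of common CDSs for sample sizes 1..6
curve = [mean_common_fraction(toy, n) for n in range(1, 7)]

# Arithmetic from the caption: asymptote y0 = 0.674 of 1,649 CDSs
extrapolated_core = round(0.674 * 1649)  # → 1111
```

With real data, the `curve` values would be the points to which a decay model is fitted, and the fitted asymptote would play the role of y0.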

Figure 3. GC Content of CDSs That Are Universally Present or Variable within H. pylori

CDSs were binned according to GC content in steps of 2% (24–26, 26–28, etc.). Top: Fraction of all CDSs within a bin that are variable. Bottom: Fraction of universally present (n = 1,150) or variable (n = 499) CDSs by GC content. One universally present CDS with a GC content of 62% (HP0359) has been excluded from the figure.
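The binning described here can be illustrated with a short Python sketch. The helper names and the toy values are hypothetical, and the treatment of bin boundaries (a CDS at exactly 26% falls in the 26–28 bin) is an assumption, since the caption does not specify it.

```python
from collections import defaultdict

def gc_percent(seq):
    """GC content of a nucleotide sequence, in percent."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)

def bin_start(gc, width=2):
    """Lower edge of the GC bin (24, 26, 28, ...) that contains gc."""
    return int(gc // width) * width

def variable_fraction_by_bin(cdss, width=2):
    """Fraction of CDSs in each GC bin that are variable.

    cdss: iterable of (gc_percent, is_variable) pairs.
    """
    totals = defaultdict(int)
    variable = defaultdict(int)
    for gc, is_variable in cdss:
        b = bin_start(gc, width)
        totals[b] += 1
        variable[b] += int(is_variable)
    return {b: variable[b] / totals[b] for b in sorted(totals)}

# Hypothetical toy input: (GC %, variable?) for five CDSs
toy_cdss = [(25.0, True), (25.5, True), (27.0, False), (39.0, False), (39.9, True)]
fractions = variable_fraction_by_bin(toy_cdss)
# → {24: 1.0, 26: 0.0, 38: 0.5}

# Consistency check on the counts given in the caption
total_cdss = 1150 + 499  # → 1649
```

The two counts in the caption sum to the 1,649 CDSs used for the extrapolation in Figure 2.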

Figure 4. Phylogenetic Structure (Neighbor-Joining Trees) According to (A) Sequences of Seven Core Genes, (B) Microarray Data Excluding cag PAI, and (C) Microarray Data Including the cag PAI for 56 Strains of H. pylori and Four Strains of H. acinonychis

Filled triangles indicate strains possessing the cag PAI, open circles indicate strains lacking it, and filled circles indicate hspAmerind strains that lack HP0536–0548 from the cag PAI. Colors indicate population assignments by Structure based on the sequence data (B. Linz, unpublished data). Numbers at the tips of the twigs are strain numbers (Table S3), while blue numbers next to nodes are bootstrap values over 75% after 250 iterations.
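The abstract attributes the disagreement between the sequence-based and microarray-based trees to homoplasy from independent gene loss. A toy Python example can show the effect on pairwise distances. Everything here is hypothetical: the four profiles, the lineage labels, and the use of a simple mismatch fraction as the distance. The source does not state which distance measure was used for the neighbor-joining trees.

```python
def mismatch_fraction(a, b):
    """Fraction of CDSs whose presence/absence call differs between two strains."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same CDSs")
    return sum(x != y for x, y in zip(a, b)) / len(a)

# Hypothetical profiles over 8 CDSs (1 = present, 0 = absent):
#   CDS 0: specific to lineage A; CDS 1: specific to lineage B
#   CDSs 2-3: core; CDSs 4-7: ancestral genes lost independently in A1 and B1
profiles = {
    "A1": [1, 0, 1, 1, 0, 0, 0, 0],
    "A2": [1, 0, 1, 1, 1, 1, 1, 1],
    "B1": [0, 1, 1, 1, 0, 0, 0, 0],
    "B2": [0, 1, 1, 1, 1, 1, 1, 1],
}

within_lineage = mismatch_fraction(profiles["A1"], profiles["A2"])   # → 0.5
between_lineage = mismatch_fraction(profiles["A1"], profiles["B1"])  # → 0.25
```

In this toy case A1 looks closer to B1 than to A2, its own lineage-mate, purely because both strains lost the same genes independently. A tree built from such distances would group them together, which is the kind of distortion the abstract describes.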
