Evolutionary history of the Clostridium difficile pathogenicity locus

Kate E Dingle, Briony Elliott, Esther Robinson, David Griffiths, David W Eyre, Nicole Stoesser, Alison Vaughan, Tanya Golubchik, Warren N Fawley, Mark H Wilcox, Timothy E Peto, A Sarah Walker, Thomas V Riley, Derrick W Crook, Xavier Didelot


Kate E Dingle et al. Genome Biol Evol. 2014 Jan.

Abstract

The symptoms of Clostridium difficile infection are caused by toxins expressed from its 19 kb pathogenicity locus (PaLoc). Stable integration of the PaLoc is suggested by its single chromosomal location and the clade specificity of its different genetic variants. However, the PaLoc is variably present, even among closely related strains, and thus resembles a mobile genetic element. Our aim was to explain these apparently conflicting observations by reconstructing the evolutionary history of the PaLoc. Phylogenetic analyses and annotation of the regions spanning the PaLoc were performed using C. difficile population-representative genomes chosen from a collection of 1,693 toxigenic (PaLoc present) and nontoxigenic (PaLoc absent) isolates. Comparison of the core genome and PaLoc phylogenies demonstrated an eventful evolutionary history, with distinct PaLoc variants acquired clade specifically after divergence. In particular, our data suggest a relatively recent PaLoc acquisition in clade 4. Exchanges and losses of the PaLoc DNA have also occurred, via long homologous recombination events involving flanking chromosomal sequences. The most recent loss event occurred ∼30 years ago within a clade 1 genotype. The genetic organization of the clade 3 PaLoc was unique in containing a stably integrated novel transposon (designated Tn6218), variants of which were found at multiple chromosomal locations. Tn6218 elements were Tn916-related but nonconjugative and occasionally contained genes conferring resistance to clinically relevant antibiotics. The evolutionary histories of two contrasting but clinically important genetic elements were thus characterized: the PaLoc, mobilized rarely via homologous recombination, and Tn6218, mobilized frequently through transposition.

Keywords: Clostridium difficile; PaLoc; bacterial evolution; mobile genetic element; pathogenicity locus; toxin.


Figures

Fig. 1.

Phylogenetic relationship between toxigenic and nontoxigenic Clostridium difficile isolates. Maximum likelihood tree generated from the genomes of 72 representative isolates, plus CD630 (Sebaihia et al. 2006). Clades are indicated by their designated number. Nontoxigenic isolates are indicated by black branches. Toxigenic isolates are indicated by branches colored according to clade. The ST and PCR-ribotype (in brackets) of a well characterized representative of each clade is indicated.

Fig. 2.

Cross-population phylogeny of the PaLoc. Phylogenies constructed from the catalytic and protease domains of tcdB (A), from the translocation and receptor binding domains of tcdB (B), and from the catalytic, protease, and part of the translocation domain of tcdA (C). Breaks in assembly caused by repetitive sequences in the receptor-binding domain of tcdA precluded its inclusion. Colored shapes indicate clade as in figure 1. Strain labels and bootstrap values are shown in supplementary figure S2, Supplementary Material online.

Fig. 3.

Dating PaLoc acquisition, loss, and exchange. (A) Time-scaled ClonalFrame tree dating the acquisition of the PaLoc by clade 4 to between 466 and 554 years ago. Genomes of 7 toxigenic isolates (blue) representing 3 STs (indicated above branches), including 2 from GenBank FN668375 (ST37, C00010875) and FN665652 (ST86, C00013999) (He et al. 2010), and 11 nontoxigenic isolates (black) representing 5 STs were included. (B) Time-scaled ClonalFrame tree dating the loss of the PaLoc in ST7 to between 1971 and 1995; 27 nontoxigenic genomes (black) and 23 toxigenic genomes (pale blue) were included. (C) Time-scaled ClonalFrame tree dating the exchange of clade 1 PaLocs within ST58 to between 208 and 417 years ago. Five genomes containing one PaLoc variant (green) and 16 containing the other (red) were included. (D) Time-scaled ClonalFrame tree dating the exchange of clade 1 PaLocs with ST44 to between 196 and 401 years ago; 36 genomes containing one PaLoc variant (purple) and 10 containing the other (turquoise) were included. The four pairs of genomes compared in figure 4 are boxed in each of parts (A)–(D).

Fig. 4.

PaLoc acquisition, loss, and exchange by homologous recombination involving long fragments of chromosomal DNA. (A) PaLoc acquisition and loss. Whole-genome distributions of indels between the two pairs of isolates marked by boxes in figure 3A (toxigenic ST37 and nontoxigenic ST109 outer black ring) and 3B (toxigenic and nontoxigenic ST7 inner black ring). The location of the PaLoc is indicated by blue shading. The two outer rings composed of small red lines indicate the open reading frames annotated on the forward and reverse strands of reference genome CD630 (Sebaihia et al. 2006). (B) PaLoc exchange within clade 1. Whole-genome distributions of polymorphism between the two pairs of isolates marked by boxes in figure 3C (toxigenic ST58, outer black ring) and 3D (toxigenic ST44, inner black ring). (C) Distribution of polymorphism between the four pairs of genomes shown in (A) and (B) within the region of the genome containing the PaLoc. Each row represents a pairwise comparison, and polymorphisms are shown in red. Enlarging the region of the genome flanking the PaLoc in this way allowed the distribution of polymorphisms to be used to estimate the size of the recombination events (black boxes) as ∼55 kb replaced by ∼36 kb during PaLoc loss by ST7, and ∼95 kb or ∼232 kb during PaLoc exchange within ST58 and ST44.

Fig. 5.

The chromosome flanking the PaLoc insertion site in nontoxigenic isolates follows the five clades population structure. (A) Schematic depiction of the PaLoc insertion site of nontoxigenic isolates representing the four clades in which they have been identified; from the top, clade C-I ST177, clade 5 ST168, clade 5 ST167, clade 4 ST39, and clade 1 ST7. The PaLoc (pink), which replaces 115 bp (red box) in toxigenic strains, is represented for ST7. The five genes identified in this location in a single clade 5 strain, WA12 (Elliott et al. 2009), are also found in clade C-I (orange). (B) Maximum likelihood tree constructed from the 75 bp of the “PaLoc replacing” 115 bp sequence common to all nontoxigenic isolates and indicated as a red box in (A). Bootstrap values are indicated. (C) Maximum likelihood trees constructed from the PaLoc flanking genes cdu1 and cdd3, using the isolates shown in figure 1. These genes were chosen because they contained sufficient polymorphism to discriminate clades 1 and 2.

Fig. 6.

Genetic organization of the 9-kb insertion within the clade 3 PaLoc. (A) Schematic depiction of the genetic organization of a typical PaLoc as found in the reference genome CD630 (Sebaihia et al. 2006). A short fragment of endolysin-like sequence is indicated in yellow. (B) Genetic organization of the 9-kb clade 3 PaLoc insertion. Putative functions of the predicted genes were identified on the basis of Blast searches of GenBank. The orientation of each gene is indicated by an arrow. The endolysin, int, and rep genes were fragmented, hence they contain multiple arrows. The endolysin sequence found in the insertion is indicated in dark pink to distinguish it from the fragment common to typical PaLoc variants (yellow). The int, xis, and rep are referred to in the text as a recombination module. xre indicates a putative DNA-binding protein belonging to the xenobiotic (stress) response element family of transcriptional regulators. It occurs upstream of a gene cluster predicted to function in resisting oxidative stress. The genes showing homology to sigma 70 region 4 may be concerned with redirecting promoter recognition by the host RNA polymerase. The 3′ terminal gene is hypothetical but conserved among certain conjugative transposons. (C) Predicted hairpin structure formed by the 317 nt imperfect palindrome, generated using the RNAfold web server (http://rna.tbi.univie.ac.at/cgi-bin/RNAfold.cgi, last accessed December 20, 2013) (Gruber et al. 2008). The structure is oriented sideways, the top of hairpin to the right. The colors of the bases (as per the key) indicate their probability of being paired or unpaired as shown.

Fig. 7.

A group of closely related novel mobile genetic elements: Tn6218. (A) Comparison of three mobile elements with the clade 3 PaLoc (top, colored as in fig. 6) generated using the Artemis Comparison Tool (Carver et al. 2005). Genes shown in black are common to the mobile elements, PaLoc insertion (fig. 6), and conjugative transposons (supplementary fig. S4, Supplementary Material online). The accessory genes (colored) putatively confer resistance to polyketide antibiotics, chloramphenicol (cfr), aminoglycosides (AacA-AphD), and erythromycin (ermAB). (B) Comparison of three mobile elements with the clade 3 PaLoc (top), but the elements are distinguished from the PaLoc insertion by a distinct Rep protein variant. Accessory genes include a multidrug and toxic compound extrusion (MATE) family protein, cfr, and an N-terminal nucleophile (Ntn) hydrolase superfamily protein (includes penicillin acylase). BlastP data used to assign putative functions to accessory genes are summarized in supplementary table S2, Supplementary Material online. Accession numbers of the sequences submitted to European Nucleotide Archive are indicated.

Fig. 8.

Evidence of recent transposition by three Tn6218 elements. Time-scaled ClonalFrame trees constructed using multiple genomes of the same ST. (A) Presence of the ST6(005) element (fig. 7B) is indicated by blue branches. The polymorphism of the element and its seven different chromosomal locations (D) are consistent with multiple independent insertion events occurring since 1995. (B) Presence of ST3(001) element (fig. 7B) is indicated by red branches; sequence identity of the element and its single chromosomal insertion site were consistent with one integration prior to 1970 and five subsequent losses. (C) Presence of ST54(012) element (fig. 7A) is indicated by green branches, consistent with a single insertion event around 2001. (D) Chromosomal locations of the elements shown in (A), (B), and (C) are colored accordingly.


