ISC, a Novel Group of Bacterial and Archaeal DNA Transposons That Encode Cas9 Homologs

Vladimir V Kapitonov et al. J Bacteriol. 2015.

Abstract

Bacterial genomes encode numerous homologs of Cas9, the effector protein of the type II CRISPR-Cas systems. The homology region includes the arginine-rich helix and the HNH nuclease domain that is inserted into the RuvC-like nuclease domain. These genes, however, are not linked to cas genes or CRISPR. Here, we show that Cas9 homologs represent a distinct group of nonautonomous transposons, which we denote ISC (insertion sequences Cas9-like). We identify many diverse families of full-length ISC transposons and demonstrate that their terminal sequences (particularly 3' termini) are similar to those of IS605 superfamily transposons that are mobilized by the Y1 tyrosine transposase encoded by the TnpA gene and often also encode the TnpB protein containing the RuvC-like endonuclease domain. The terminal regions of the ISC and IS605 transposons contain palindromic structures that are likely recognized by the Y1 transposase. The transposons from these two groups are inserted either exactly in the middle or upstream of specific 4-bp target sites, without target site duplication. We also identify autonomous ISC transposons that encode TnpA-like Y1 transposases. Thus, the nonautonomous ISC transposons could be mobilized in trans either by Y1 transposases of other, autonomous ISC transposons or by Y1 transposases of the more abundant IS605 transposons. These findings imply an evolutionary scenario in which the ISC transposons evolved from IS605 family transposons, possibly via insertion of a mobile group II intron encoding the HNH domain, and Cas9 subsequently evolved via immobilization of an ISC transposon.

Importance: Cas9 endonucleases, the effectors of type II CRISPR-Cas systems, represent the new generation of genome-engineering tools. Here, we describe in detail a novel family of transposable elements that encode the likely ancestors of Cas9 and outline the evolutionary scenario connecting different varieties of these transposons and Cas9.

Copyright © 2016 Kapitonov et al.

Figures

FIG 1

Multiple-sequence alignment of conserved motifs in selected representatives of IscB, TnpB, and Cas9 families. The catalytic residues are shaded black; conserved hydrophobic residues are highlighted in yellow; conserved small residues are highlighted in green; in the bridge helix alignment, positively charged residues are in red. Secondary-structure prediction is shown below the aligned sequences: H denotes α-helix, and E denotes extended conformation (β-strand). The poorly conserved spacers between the alignment blocks are shown by numbers. The bottom sequence shows the RuvC nuclease from Thermus thermophilus (Protein Data Bank [PDB] ID 4EP5), with the catalytic amino acid residues denoted.

FIG 2

Structures of nonautonomous ISC transposons. (A) Organizations of ISC transposons. Green rectangles, ORFs encoding IscB; RuvC and HNH, nuclease domains; RR, arginine-rich region. The bacterial hosts and the length of each transposon are indicated on the left and right of the transposon schematics, respectively. In the names of transposons, the KR, GS, CC, YF, and SM suffixes denote K. racemifer DSM 44963, Geitlerinema sp. PCC 7105, C. chthonoplastes PCC 7420, Y. fragilis 232.1, and S. mucosus DSM 16094. (B) Termini and target site specificity among diverse groups of IscB transposons. Target sites are shown in green. The conserved nucleotides at the 5′ and 3′ termini are shown in boldface. The unusual nucleotides are shown in red. Subterminal inverted repeats forming imperfect hairpins recognized by the Y1 transposase are marked by arrows above the corresponding sequences.
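
The subterminal inverted repeats mentioned in this caption can be searched for computationally. The following is a minimal Python sketch, not the authors' method: it scans a DNA string for a stem whose reverse complement reappears a short loop distance downstream, tolerating a few mismatches so that imperfect hairpins are also reported. The function names, the default stem and loop lengths, and the mismatch tolerance are all illustrative assumptions.

```python
# Illustrative sketch only: stem/loop sizes and mismatch tolerance are assumptions,
# not values taken from the paper.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase or lowercase DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_hairpins(seq, stem=6, min_loop=3, max_loop=8, max_mismatches=1):
    """List candidate hairpins as (left_start, left_arm, loop_len, right_arm, mismatches).

    A candidate is a left arm of length `stem` followed, after a loop of
    min_loop..max_loop bases, by a right arm that matches the reverse
    complement of the left arm with at most `max_mismatches` differences.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        left = seq[i:i + stem]
        expected_right = reverse_complement(left)
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            right = seq[j:j + stem]
            if len(right) < stem:
                break  # ran off the end of the sequence
            mismatches = sum(a != b for a, b in zip(expected_right, right))
            if mismatches <= max_mismatches:
                hits.append((i, left, loop, right, mismatches))
    return hits
```

On a made-up sequence such as `TTGGCACCAAAAGGTGCCTT`, the arm `GGCACC` at position 2 pairs with `GGTGCC` across a 4-base loop. Real transposon termini would need longer, tuned parameters and a proper folding tool to confirm the structure.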

FIG 3

ISC2-1_KR and IS605B-1_KR transposons share similar termini. (A) Map of ISC2-1_KR copies in the K. racemifer DSM 44963 genome obtained by using Censor. The first three columns from the left contain the GenBank accession numbers and coordinates of GenBank DNA sequences similar to the query sequence (column 4); columns 5 and 6 give the first and last positions of a region in the query sequence that is similar to the corresponding GenBank sequence; column 7 shows the orientation of the GenBank sequence (d, direct; c, complementary); column 8 shows DNA identity (0 to 1). Two 34-bp sequences similar to the 5′ terminus of ISC2-1_KR are shaded in light brown. These two sequences are the 5′ termini of two 95% identical copies of the IS605B-1_KR transposon. The 1,357-bp IS605B-1_KR transposon encodes the TnpB protein in the antisense orientation. The two terminal hairpins are rendered in black and blue. (B) Pairwise alignment of the 5′ and 3′ termini in both transposons; the terminal hairpins are formed by inverted repeats (marked by arrows). The identical GTGG target sites are shown in green.

FIG 4

Organizations of the major groups of transposons of the IS605 superfamily. Transposition of nonautonomous transposons encoding IscB and TnpB is predicted to be catalyzed by the Y1 tyrosine transposase encoded by autonomous transposons IS200, IS605, and ISC2Y. The Y1 transposase binds to the transposon termini, which contain imperfect hairpins formed by the subterminal inverted repeats (green and red arrows). Short nonautonomous TEs transposed by the IS605-like Y1 transposase that do not encode any proteins are known as PATEs and REPs. Four types of specific 4-bp target sites are listed for transposons from different groups in the 4 columns on the right. In the target site sequences, the vertical lines indicate the exact positions of transposon insertion. Previously reported target sites and the transposon insertion positions are shown in the last column. The dashed box denotes new types of target sites described in this work.
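
The two insertion positions described for these transposons (immediately upstream of a 4-bp target site, or exactly in its middle, with no target site duplication) can be illustrated with a toy string model. This is a hedged sketch under simplifying assumptions: the host and element are plain strings, only the first occurrence of the target on the given strand is used, "upstream" is taken to mean directly 5′ of the target on that strand, and the function name and `mode` argument are hypothetical.

```python
# Toy model only: strand/orientation conventions here are assumptions
# made for illustration, not a statement of the paper's coordinates.
def insert_element(host: str, target: str, element: str, mode: str = "upstream") -> str:
    """Insert `element` at the first occurrence of `target` in `host`.

    mode="upstream": the element is placed immediately 5' of the target site.
    mode="middle":   the element splits the target site into two equal halves.
    No host bases are copied, so there is no target site duplication.
    """
    site = host.find(target)
    if site < 0:
        raise ValueError("target site not found in host sequence")
    if mode == "upstream":
        cut = site
    elif mode == "middle":
        cut = site + len(target) // 2
    else:
        raise ValueError("mode must be 'upstream' or 'middle'")
    return host[:cut] + element + host[cut:]
```

For example, with the GTGG target site named in the Fig. 3 caption and a made-up host `AAAGTGGCCC`, the "upstream" mode yields `AAA` + element + `GTGGCCC`, while the "middle" mode yields `AAAGT` + element + `GGCCC`. In both cases the host bases are conserved exactly; removing the element restores the original sequence.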

FIG 5

Phylogenetic tree of Y1 transposases encoded by ISC2Y and IS605 transposons (IscA and TnpA, respectively). In ISC2Y transposons, IscA (red arrows) and IscB (blue arrows) ORFs overlap by 12 to 82 nucleotides. In IS605 transposons, TnpA and TnpB are depicted as red and black arrows. The right and left arrows indicate ORFs encoded by the sense and antisense strands, respectively. The light-green and brown shading highlights the two distinct clades, each of which combines IscA and TnpA. The tree was obtained using the PhyML program (automatic model selection): LG model, discrete gamma model with 6 categories, and estimated gamma shape parameter. The support for internal branches is indicated by Bayes approximation values above 95%. Species abbreviations: KR, K. racemifer DSM 44963; CS, Coprobacillus sp. 3_3_56FAA; EC, E. cecorum DSM 20682 (ATCC 43198); AA, A. acidaminophila DSM 3853; CH, C. haemolyticum NCTC 9693; MA, M. marina ATCC 23134; VB, Vibrio breoganii; BC, Bacteroides coprophilus; MM, Methanosarcina mazei; MZ, Methanosalsum zhilinae; EH, Eubacterium hallii DSM 3353; BSp, Butyrivibrio sp. strain MB2005; BMT2, Bacillus sp. strain MT2; FP, Francisella philomiragia; HP, Helicobacter pylori Hp H-16; RI, Roseburia inulinivorans.

FIG 6

Phylogenetic tree of IscB proteins encoded by the ISC transposons. The unrooted maximum-likelihood phylogenetic tree was constructed using the FastTree program from a multiple-sequence alignment built for a nonredundant set of 443 full-size IscB sequences and containing 207 informative positions. The tree is shown schematically; the complete tree is available in Fig. S8 in the supplemental material. The ISC transposons described in this work are mapped to the respective collapsed branches. The FastTree program was also used to compute bootstrap values indicated for the branches with more than 70% support.
