Diversity and evolution of class 2 CRISPR-Cas systems
Nat Rev Microbiol. 2017 Mar;15(3):169-182.
doi: 10.1038/nrmicro.2016.184. Epub 2017 Jan 23.
Sergey Shmakov, Aaron Smargon, David Scott, David Cox, Neena Pyzocha, Winston Yan, Omar O Abudayyeh, Jonathan S Gootenberg, Kira S Makarova, Yuri I Wolf, Konstantin Severinov, Feng Zhang, Eugene V Koonin
- PMID: 28111461
- PMCID: PMC5851899
- DOI: 10.1038/nrmicro.2016.184
Sergey Shmakov et al. Nat Rev Microbiol. 2017 Mar.
Abstract
Class 2 CRISPR-Cas systems are characterized by effector modules that consist of a single multidomain protein, such as Cas9 or Cpf1. We designed a computational pipeline for the discovery of novel class 2 variants and used it to identify six new CRISPR-Cas subtypes. The diverse properties of these new systems provide potential for the development of versatile tools for genome editing and regulation. In this Analysis article, we present a comprehensive census of class 2 types and class 2 subtypes in complete and draft bacterial and archaeal genomes, outline evolutionary scenarios for the independent origin of different class 2 CRISPR-Cas systems from mobile genetic elements, and propose an amended classification and nomenclature of CRISPR-Cas.
Conflict of interest statement
The authors declare competing interests: see Web version for details.
Figures
Figure 1. The updated classification scheme for class 2 CRISPR–Cas systems
The class 1 systems are collapsed; all other systems shown are class 2 systems. New class 2 systems that were discovered using the computational pipeline in this study (see BOX 1) are indicated with blue circles for those that were described previously and with red circles for those that are presented here for the first time. For each class 2 system subtype, as well as for the five distinct variants of the provisional V-uncharacterized (V-U) subtype, the locus organization and the domain architecture of the effector and accessory proteins are schematically shown. RuvC-I, RuvC-II and RuvC-III are the three distinct motifs that contribute to the nuclease catalytic centre; numerals in the figure correspond to the respective RuvC motif. The portions of Cas9 proteins that roughly correspond to the recognition lobe and the protospacer-adjacent motif (PAM)-interacting domain are shown by maroon and pink shapes, respectively. The proposed new systematic gene names are shown in bold type in red boxes. Provisional gene names for effector protein candidates are shown below the respective shapes as follows: C2c1–10, class 2 candidate proteins 1–10; for subtype V-A, the previously introduced vernacular cpf1 is indicated. For subtype VI-A, cas1 and cas2 are shown with dashed contours to indicate that only some of these loci include the adaptation module. For the V-U5 variant, the inactivation of the RuvC-like nuclease domain is indicated by a cross. The specific strains of bacteria in which these systems were identified and locus tags for the respective protein-coding genes are also indicated. The abbreviation TM indicates a predicted transmembrane helix. The predicted type of target, namely DNA or RNA, is indicated for each subtype. A question mark next to the target indicates that the activity is only predicted and has not been demonstrated experimentally. 
The target is not indicated for the type V-U systems because their RNA-guided interference capacity is questionable, which is additionally emphasized by shading. tracrRNA, _trans_-acting CRISPR RNA.
Figure 2. The domain architecture of class 2 CRISPR effector proteins
For the type II and subtype V-A effectors, the crystal structures (indicated here by their RCSB Protein Data Bank (PDB) accession numbers (5CZZ and 5B43, respectively)) are available and the corresponding domain architectures are shown in detail. For the remainder of the proteins, the grey areas indicate structurally and functionally uncharacterized portions. RuvC-I, RuvC-II and RuvC-III, as well as higher eukaryotes and prokaryotes nucleotide-binding I (HEPN I) and HEPN II, denote the catalytic motifs of the respective nuclease domains of the CRISPR effectors. The bridge helix corresponds to an arginine-rich region that follows the RuvC-I motif. Other domains shown in the figure are denoted as follows: PAM interacting, protospacer-adjacent motif (PAM)-interacting domain; HNH, HNH family endonuclease domain, zinc finger domain with a CXXC..CXXC motif (dots represent the variable distance between the two pairs of cysteines); HTH, putative DNA-binding helix–turn–helix domain; NUC, nuclease domain. The proteins and domains are shown approximately to scale. For each protein, the corresponding number of amino acids is indicated, and a ruler is shown on top of the figure to guide the eye. For the functionally characterized full-length effectors, the proposed new nomenclature (Cas12 and Cas13) is indicated, whereas for the uncharacterized putative effectors of type V-uncharacterized (V-U), only the provisional names are indicated. When, and if, functional evidence of a bona fide CRISPR response is reported for these effectors, they should be referred to as Cas12 proteins with the corresponding specifying letters. The putative V-U1, V-U2 and V-U5 effectors are larger than the typical TnpB proteins, whereas the V-U3 and V-U4 effectors are in the characteristic size range of TnpB. The asterisk at C2c5 indicates that this putative effector protein contains replacements of the catalytic residues of the RuvC-like nuclease domain and lacks the zinc finger.
Figure 3. Phylogenies of the type V and type VI-B effectors
a | A maximum-likelihood phylogenetic tree of TnpB nucleases, including the putative type V-uncharacterized (V-U) effectors that have a predicted active RuvC domain (Supplementary information S1 (box)). The major subtrees of transposon-encoded TnpB proteins are collapsed and indicated by triangles; some of these large groups include tnpB genes that are adjacent to CRISPR arrays, but these do not show evolutionary stability and thus cannot be identified as effectors. The four distinct evolutionarily stable groups of CRISPR-associated TnpB assigned to subtype V-U are shown by red triangles. Altogether, the tree includes 1,770 unique TnpB sequences, 403 of which are TnpB proteins that are encoded next to TnpA (autonomous transposons); 168 of these tnpB genes are adjacent to CRISPR arrays, and of these, 49 are assigned to four variants of subtype V-U (none of these belongs to autonomous transposons). In the subtrees that include the subtype V-U variants, bootstrap values (percentages) are shown for those subtrees that include the distinct V-U variants. For each type V-U variant, the bacterial taxa that harbour the majority of the respective loci are indicated. Dominant bacterial or archaeal lineages, if there are any, are indicated in the triangles. For the complete tree and accession numbers of all sequences, see Supplementary information S2 (box), part c and part h. b | Phylogenetic tree of the subtype VI-B Cas13b effector proteins. The tree was constructed as in part a, and the bootstrap values that are larger than 70% are indicated. The organization of typical cas13b loci for selected representatives (specifically those that are shown in bold) is schematically shown on the right. 
Variant 1 and variant 2 correspond to the two major branches of the tree and differ with respect to the domain architectures of the second smaller protein encoded in the locus; the domain architectures of these putative accessory proteins are shown above (for variant 1) and below (for variant 2) the respective loci schematics. The CRISPR arrays are shown schematically in brackets. TM indicates a predicted transmembrane domain, shown by blue boxes. Higher eukaryotes and prokaryotes nucleotide-binding (HEPN) domains are shown as maroon boxes. A, diverse archaea; B, diverse bacteria.
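The sequence counts quoted for the tree in part a can be turned into rough proportions. The snippet below is only an illustrative sketch: the variable names are hypothetical, the numbers are copied from the caption, and no share is computed for the 168 CRISPR-adjacent genes because the caption does not state unambiguously which set they are counted against.

```python
# Counts quoted in the Figure 3a caption (variable names are hypothetical).
total_tnpb = 1770        # unique TnpB sequences in the tree
next_to_tnpa = 403       # TnpB encoded next to TnpA (autonomous transposons)
crispr_adjacent = 168    # tnpB genes adjacent to CRISPR arrays
assigned_vu = 49         # CRISPR-adjacent tnpB genes assigned to subtype V-U variants


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


shares = {
    # Share of all TnpB sequences that sit in autonomous transposons.
    "autonomous_of_total": percent(next_to_tnpa, total_tnpb),
    # Share of CRISPR-adjacent tnpB genes that fall into the V-U variants.
    "vu_of_crispr_adjacent": percent(assigned_vu, crispr_adjacent),
    # Share of all TnpB sequences that fall into the V-U variants.
    "vu_of_total": percent(assigned_vu, total_tnpb),
}

for name, value in shares.items():
    print(f"{name}: {value}%")
```

On these counts, fewer than a third of the CRISPR-adjacent tnpB genes belong to the evolutionarily stable V-U groups, which is consistent with the caption's point that adjacency to an array alone does not identify an effector.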
Figure 4. Possible routes of evolution for class 2 CRISPR–Cas systems
The figure depicts the three-step pathway of the evolutionary ‘maturation’ of type II, type V and type VI CRISPR–Cas systems. The systematic and/or provisional gene names are indicated below the respective ‘mature’ effector protein schematics and the proposed intermediate forms of type V systems. The first step involves the random insertion of a TnpB-encoding or insertion sequences Cas9-like protein B (IscB)-encoding transposon or a higher eukaryotes and prokaryotes nucleotide-binding (HEPN) domain RNase-encoding gene next to a CRISPR cassette for type II, type V and type VI systems, respectively. During the second step, the functional connection between this protein and the CRISPR array is established and co-evolution begins, in particular, in the form of the accumulation of specific insertions that facilitate CRISPR RNA (crRNA) binding. For type V systems, the intermediate forms that correspond to the first and second step are identified as different type V-uncharacterized (V-U) variants. Additional components of the system could have originated during the second step, such as _trans_-acting CRISPR RNA (tracrRNA) in the case of type II systems. During the third step, further insertions lead to increased specificity of crRNA and target binding, and enable interactions with accessory proteins, such as Csn2 for type II-A and a protein with predicted transmembrane (TM) domains for type VI-B. The adaptation module is only inserted into some of the class 2 CRISPR–_cas_ loci during the third step. TS, target site.
Figure 5. Functional diversity of the experimentally characterized class 2 CRISPR–Cas systems
For each type of the class 2 CRISPR–Cas systems (and two subtypes in the case of type V), a schematic of the complex between the effector protein, the target, crRNA and, in the case of type II and type V-B systems, _trans_-acting CRISPR RNA (tracrRNA), is shown. The position of the protospacer adjacent motif (PAM) or the protospacer flanking site (PFS) is indicated by a red bar. The small red triangles show the position of the cut, or cuts, in the target DNA or RNA molecule. dsDNA, double-stranded DNA; ssRNA, single-stranded RNA.
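The comparison drawn in this figure can be summarized as a small lookup table. This is a minimal sketch, not the authors' data: the class and field names are hypothetical, the tracrRNA column follows the caption (type II and type V-B), and the effector names and target assignments for each row are filled in from general knowledge of these systems rather than from the caption itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Class2System:
    """One row of the summary; all field names are illustrative."""
    system: str           # type or subtype label
    effector: str         # systematic name, provisional name in parentheses
    target: str           # "dsDNA" or "ssRNA"
    needs_tracrrna: bool  # per the caption: type II and type V-B
    flanking_motif: str   # "PAM" for DNA targets, "PFS" for RNA targets


# Assumed assignments (not spelled out in the caption): Cas9, Cas12a and
# Cas12b cut dsDNA next to a PAM; Cas13 cuts ssRNA next to a PFS.
SYSTEMS = [
    Class2System("II", "Cas9", "dsDNA", True, "PAM"),
    Class2System("V-A", "Cas12a (Cpf1)", "dsDNA", False, "PAM"),
    Class2System("V-B", "Cas12b (C2c1)", "dsDNA", True, "PAM"),
    Class2System("VI", "Cas13", "ssRNA", False, "PFS"),
]


def systems_with_tracrrna(systems):
    """Return the labels of systems whose complex includes a tracrRNA."""
    return [s.system for s in systems if s.needs_tracrrna]


def systems_by_target(systems, target):
    """Return the labels of systems that cleave the given target type."""
    return [s.system for s in systems if s.target == target]


print(systems_with_tracrrna(SYSTEMS))
print(systems_by_target(SYSTEMS, "ssRNA"))
```

Keeping the flanking motif as its own field reflects the distinction made in the caption between the PAM and the PFS.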
Similar articles
- Mobile Genetic Elements and Evolution of CRISPR-Cas Systems: All the Way There and Back.
Koonin EV, Makarova KS. Genome Biol Evol. 2017 Oct 1;9(10):2812-2825. doi: 10.1093/gbe/evx192. PMID: 28985291 Free PMC article. Review.
- CRISPR Arrays Away from cas Genes.
Shmakov SA, Utkina I, Wolf YI, Makarova KS, Severinov KV, Koonin EV. CRISPR J. 2020 Dec;3(6):535-549. doi: 10.1089/crispr.2020.0062. PMID: 33346707 Free PMC article.
- Characterization and applications of Type I CRISPR-Cas systems.
Hidalgo-Cantabrana C, Barrangou R. Biochem Soc Trans. 2020 Feb 28;48(1):15-23. doi: 10.1042/BST20190119. PMID: 31922192
- CRISPR-Cas systems: beyond adaptive immunity.
Westra ER, Buckling A, Fineran PC. Nat Rev Microbiol. 2014 May;12(5):317-26. doi: 10.1038/nrmicro3241. Epub 2014 Apr 7. PMID: 24704746 Review.
- Diversity, classification and evolution of CRISPR-Cas systems.
Koonin EV, Makarova KS, Zhang F. Curr Opin Microbiol. 2017 Jun;37:67-78. doi: 10.1016/j.mib.2017.05.008. Epub 2017 Jun 9. PMID: 28605718 Free PMC article. Review.
Cited by
- Detection of Porcine Circovirus (PCV) Using CRISPR-Cas12a/13a Coupled with Isothermal Amplification.
Wang H, Zhou G, Liu H, Peng R, Sun T, Li S, Chen M, Wang Y, Shi Q, Xie X. Viruses. 2024 Sep 30;16(10):1548. doi: 10.3390/v16101548. PMID: 39459882 Free PMC article. Review.
- The find of COVID-19 vaccine: Challenges and opportunities.
ElBagoury M, Tolba MM, Nasser HA, Jabbar A, Elagouz AM, Aktham Y, Hutchinson A. J Infect Public Health. 2021 Mar;14(3):389-416. doi: 10.1016/j.jiph.2020.12.025. Epub 2020 Dec 30. PMID: 33647555 Free PMC article. Review.
- CRISPR/Cas9 mediated genome editing tools and their possible role in disease resistance mechanism.
Kumari D, Prasad BD, Dwivedi P, Hidangmayum A, Sahni S. Mol Biol Rep. 2022 Dec;49(12):11587-11600. doi: 10.1007/s11033-022-07851-x. Epub 2022 Sep 14. PMID: 36104588 Review.
- CRISPR-Cas9: A History of Its Discovery and Ethical Considerations of Its Use in Genome Editing.
Gostimskaya I. Biochemistry (Mosc). 2022 Aug;87(8):777-788. doi: 10.1134/S0006297922080090. PMID: 36171658 Free PMC article. Review.
- Current understanding of osteoarthritis pathogenesis and relevant new approaches.
Tong L, Yu H, Huang X, Shen J, Xiao G, Chen L, Wang H, Xing L, Chen D. Bone Res. 2022 Sep 20;10(1):60. doi: 10.1038/s41413-022-00226-9. PMID: 36127328 Free PMC article. Review.
Grants and funding
- DP1 MH100706/MH/NIMH NIH HHS/United States
- R01 GM104071/GM/NIGMS NIH HHS/United States
- R01 MH110049/MH/NIMH NIH HHS/United States
- T32 GM007753/GM/NIGMS NIH HHS/United States