The basic building blocks and evolution of CRISPR-CAS systems

Review

Kira S Makarova et al. Biochem Soc Trans. 2013 Dec.

Abstract

CRISPR (clustered regularly interspaced short palindromic repeats)-Cas (CRISPR-associated) is an adaptive immunity system in bacteria and archaea that functions via a distinct self/non-self recognition mechanism that involves unique spacers homologous with viral or plasmid DNA and integrated into the CRISPR loci. Most of the Cas proteins evolve under relaxed purifying selection and some underwent dramatic structural rearrangements during evolution. In many cases, CRISPR-Cas system components are replaced either by homologous or by analogous proteins or domains in some bacterial and archaeal lineages. However, recent advances in comparative sequence analysis, structural studies and experimental data suggest that, despite this remarkable evolutionary plasticity, all CRISPR-Cas systems employ the same architectural and functional principles, and given the conservation of the principal building blocks, share a common ancestry. We review recent advances in the understanding of the evolution and organization of CRISPR-Cas systems. Among other developments, we describe for the first time a group of archaeal cas1 gene homologues that are not associated with CRISPR-Cas loci and are predicted to be involved in functions other than adaptive immunity.

Figures

Figure 1. The principal building blocks of CRISPR–Cas system types

Gene names and other identifiers follow the current nomenclature and classification [7,30]. An asterisk indicates the putative small subunit that might be fused to the large subunit in several Type I subtypes [30]. Dispensable genes are indicated by broken outlines.

Figure 2. The Cas1 family evolution and functional associations

(A) The Cas1 family phylogeny. The PSIBLAST program [50] was used to retrieve homologues of the Cas1 family from 2262 completely sequenced genomes in the NCBI nr database. The BLASTCLUST program [51] (length coverage cut-off, 0.8; score density threshold, 1.0) was used to select 205 representative sequences. The multiple alignment was built using the MUSCLE program [52] with default parameters. The FastTree program [53] (JTT evolutionary model, discrete gamma model for site rates with 20 rate categories) was used for tree reconstruction. The branches are coloured according to the assignment of Cas1 genes to system subtypes based on the analysis of ten upstream and ten downstream genes. X denotes systems of unknown type or those that are predicted to be derivatives of the respective system (when coloured). (B) Phylogeny of the two solo Cas1 families. The conserved gene neighbourhood of the second group is shown underneath the trees. (C) Cas1 fusions and operonic associations.
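The neighbourhood step mentioned in the caption (inspecting ten upstream and ten downstream genes around each Cas1 gene) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `neighbourhood` and `assign_type`, the gene-list representation, and the simplified marker table (one signature gene per type, whereas the caption refers to subtypes) are all assumptions made for illustration, and the contig is treated as linear.

```python
from typing import Dict, List, Sequence

# Hypothetical, simplified marker table: one signature gene per system type.
# The analysis in the caption assigns subtypes and is not reproduced here.
SIGNATURE_GENES: Dict[str, str] = {"cas3": "I", "cas9": "II", "cas10": "III"}


def neighbourhood(genes: Sequence[str], index: int, flank: int = 10) -> List[str]:
    """Return up to `flank` genes on each side of genes[index].

    The anchor gene itself is excluded. The gene list is treated as a
    linear contig, so windows are truncated at the ends (an assumption).
    """
    if not 0 <= index < len(genes):
        raise IndexError("anchor index outside the gene list")
    start = max(0, index - flank)
    upstream = list(genes[start:index])
    downstream = list(genes[index + 1:index + 1 + flank])
    return upstream + downstream


def assign_type(neighbours: Sequence[str]) -> str:
    """Return the system type if exactly one type's signature gene is present.

    Returns "X" otherwise, mirroring the caption's label for unknown systems.
    """
    found = {SIGNATURE_GENES[g] for g in neighbours if g in SIGNATURE_GENES}
    return found.pop() if len(found) == 1 else "X"


if __name__ == "__main__":
    # Toy contig: 30 placeholder genes with cas1 at position 15 and cas3 nearby.
    contig = [f"g{i}" for i in range(30)]
    contig[15] = "cas1"
    contig[12] = "cas3"
    window = neighbourhood(contig, contig.index("cas1"))
    print(len(window), assign_type(window))
```

In this toy setup a Cas1 gene with no signature gene inside its window falls into the "X" category, which is how a "solo" Cas1 would surface under these simplifying assumptions.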

Figure 3. Putative common architectural organization of central Cascade components and the evolutionary scenario for the Cascade origin

(A) The subdomain architectures of Cse1, Cas10 and Cas7. The similarly coloured domains could be homologous but dramatically rearranged. The small subunit of subtype III-B that could be homologous with the C-terminal helical domain of Cas10 is also shown. (B) Evolutionary scenario for the origin of the Cascade-like complex. Homologous domains or subdomains are colour-coded and identified by a family name, which follows the modified classification [30]. This scenario is a modification of the previously described one [28].

Figure 4. COG1517, a dormancy/programmed cell death system component associated with CRISPR–Cas systems of Type III

(A) CRISPR–Cas-associated ‘effector’ domains. The bottom pie chart shows the breakdown of 314 CRISPR–Cas-positive genomes by the system types. The pie charts at the top show the breakdown of the respective sectors with respect to the presence of at least one COG1517 family protein and the presence or absence of an ‘effector/toxin’ domain in these proteins. (B) Effector/toxin domain fusions. HD and PD-(D/E)xK are DNA nucleases, and PIN, RelE and HEPN are ribonucleases. The number of fusions identified is indicated in parentheses. (C) Uncharacterized gene families associated with COG1517 genes in putative operons. The number of operons predicted is indicated in parentheses.

References

    1. Makarova KS, Wolf YI, Koonin EV. Comparative genomics of defense systems in archaea and bacteria. Nucleic Acids Res. 2013;41:4360–4377.
    2. Roberts RJ, Belfort M, Bestor T, Bhagwat AS, Bickle TA, Bitinaite J, Blumenthal RM, Degtyarev S, Dryden DT, Dybvig K, et al. A nomenclature for restriction enzymes, DNA methyltransferases, homing endonucleases and their genes. Nucleic Acids Res. 2003;31:1805–1812.
    3. Xu T, Yao F, Zhou X, Deng Z, You D. A novel host-specific restriction system associated with DNA backbone S-modification in Salmonella. Nucleic Acids Res. 2010;38:7133–7141.
    4. Barrangou R, Horvath P. CRISPR: new horizons in phage resistance and strain identification. Annu Rev Food Sci Technol. 2012;3:143–162.
    5. Wiedenheft B, Sternberg SH, Doudna JA. RNA-guided genetic silencing systems in bacteria and archaea. Nature. 2012;482:331–338.
