A CRISPRi screen in E. coli reveals sequence-specific toxicity of dCas9

Lun Cui et al. Nat Commun. 2018.

Abstract

High-throughput CRISPR-Cas9 screens have recently emerged as powerful tools to decipher gene functions and genetic interactions. Here we use a genome-wide library of guide RNAs to direct the catalytically dead Cas9 (dCas9) to block gene transcription in Escherichia coli. Using a machine-learning approach, we reveal that guide RNAs sharing specific 5-nucleotide seed sequences can produce strong fitness defects or even kill E. coli regardless of the other 15 nucleotides of guide sequence. This effect occurs at high dCas9 concentrations and can be alleviated by tuning the expression of dCas9 while maintaining strong on-target repression. Our results also highlight the fact that off-targets with as little as nine nucleotides of homology to the guide RNA can strongly block gene expression. Altogether this study provides important design rules to safely use dCas9 in E. coli.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1

Effect of dCas9 binding position and orientation. a Distribution of the fitness effect of guide RNAs in our library depending on target gene essentiality and target strand. b Rolling average of the fitness effect produced by guides targeting the promoter of essential genes in both orientations (rolling window size of 50 bp). c Examples of polar effects seen in the cydDC and ycaR-kdsB operons. Genes cydC and kdsB (highlighted in red) are essential, whereas cydD and ycaR are not. Guides binding to the coding strand are shown as blue dots. d Examples of operons containing an essential gene followed by a non-essential gene. Targeting the downstream non-essential gene usually does not produce a fitness defect. e Rolling average of the fitness effect produced by guides targeting the end of essential genes (rolling window size of 50 bp). f Rolling average of the fitness effect produced by guides along the length of essential genes. Gene start is 0 and gene end is 1 (rolling window size is 5% of the gene length). In all rolling average plots, the shaded area represents the standard deviation
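The rolling averages with a standard-deviation band described in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `rolling_mean_std`, the `step` parameter, and the choice of a window centered on each position are assumptions made for the example.

```python
from statistics import mean, pstdev

def rolling_mean_std(points, window=50, step=10):
    """Rolling mean and standard deviation of fitness effects along a coordinate.

    points: iterable of (position_bp, log2fc) pairs, one per guide.
    window: full window width in bp (50 bp in the caption).
    step:   spacing between window centers (hypothetical parameter).
    Returns a list of (center, mean, std); empty windows are skipped.
    """
    points = sorted(points)
    if not points:
        return []
    first, last = points[0][0], points[-1][0]
    out = []
    center = first
    while center <= last:
        # Collect every guide whose position falls inside the centered window
        vals = [v for p, v in points if abs(p - center) <= window / 2]
        if vals:
            out.append((center, mean(vals), pstdev(vals)))
        center += step
    return out
```

For panel f, the same function could be applied to positions rescaled to the 0 to 1 range with `window=0.05`.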

Fig. 2

Analysis of guides producing unexpected fitness defects. a Fitness effect of guides targeting the non-essential lpoB and hisI genes. Guides T-lpoB, T-lpoB2, C-lpoB, T-hisI, and C-hisI are highlighted. b The highlighted guides were cloned in plasmid psgRNA and introduced in strain LC-E18 carrying dCas9 under the control of a Ptet promoter in the chromosome. Cells were grown overnight and plated on LB agar with 1 nM of aTc. Representative figure from n = 4 independent experiments. c Expression levels of lpoB and hisI when repressed by the T-lpoB, C-lpoB, T-hisI, and C-hisI guide RNAs, as measured by RT-qPCR. Points show biological replicates (n = 4), the black bar is the median. d Effect of the T-lpoB, T-lpoB2, and control guides on the plating efficiency of LC-E18 and LC-E18ΔlpoB. Cells were grown overnight and plated on LB agar with 1 nM of aTc. Representative figure from n = 4 independent experiments

Fig. 3

A machine-learning approach reveals a toxicity effect determined by the seed sequence. a A locally connected neural network was trained to predict the fitness effect of guide RNAs that target neutral regions, using the one-hot-encoded 60 nt sequence window around the target. Comparison of predicted and actual log2FC values on a held-out test set. b To identify the positions used by the model to make its predictions, we generated a set of 1000 random sequences, mutated each position in silico, and computed the effect of each mutation on the model prediction. The standard deviation of the effect of mutations at each position is plotted. The red bar indicates the position of the GG bases of the PAM. c The model was trained again using only the 20 nt of the guide sequence. The box plots show the distribution of the effects that mutations to all possible bases have on the model prediction. One can see that the effect of specific mutations can be either positive or negative, revealing a strong dependence on the rest of the sequence. d To measure the level of interaction between positions, we generated all possible pairs of mutations for each sequence in a set of 100 random sequences and compared the effect of individual mutations to that of pairs of mutations. Positions are interacting if the effect of a double mutation (Eij) is different from the sum of the effect of the single mutations (Ei + Ej). The heat map shows the average Euclidean distance between Eij and Ei + Ej for all pairs of positions (see Supplementary Fig. 5). Note the strong network of interacting bases in the 5 nt of the seed sequence
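The in silico mutagenesis of panels b and d can be sketched as below. Everything here is illustrative: `toy_model` is a hypothetical stand-in for the trained network, the guide is assumed to be written 5' to 3' with the PAM-proximal seed as its last 5 nt, and because the toy prediction is a scalar, the Euclidean distance between Eij and Ei + Ej reduces to an absolute difference.

```python
from statistics import pstdev

BASES = "ACGT"

def toy_model(guide):
    # Hypothetical predictor: a strong fitness defect only for one full seed
    return -4.0 if guide[-5:] == "ACCCA" else 0.0

def mutate(seq, pos, base):
    """Return seq with the base at index pos replaced."""
    return seq[:pos] + base + seq[pos + 1:]

def single_effects(model, seq):
    """For each position, map each alternative base to its effect on the prediction."""
    ref = model(seq)
    return [
        {b: model(mutate(seq, i, b)) - ref for b in BASES if b != seq[i]}
        for i in range(len(seq))
    ]

def positional_spread(model, seqs):
    """Standard deviation of mutation effects at each position, pooled over seqs."""
    pooled = [[] for _ in range(len(seqs[0]))]
    for s in seqs:
        for i, effects in enumerate(single_effects(model, s)):
            pooled[i].extend(effects.values())
    return [pstdev(v) for v in pooled]

def interaction(model, seq, i, base_i, j, base_j):
    """|E_ij - (E_i + E_j)|: zero when the two mutations act additively."""
    ref = model(seq)
    e_i = model(mutate(seq, i, base_i)) - ref
    e_j = model(mutate(seq, j, base_j)) - ref
    e_ij = model(mutate(mutate(seq, i, base_i), j, base_j)) - ref
    return abs(e_ij - (e_i + e_j))
```

With this toy model, two mutations inside the seed each abolish the defect on their own, so their combined effect is far from the sum of the single effects, whereas mutations outside the seed show no interaction.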

Fig. 4

Specific 5 nt seed sequences produce strong fitness defects that can be alleviated by reducing dCas9 concentration. a Distribution of the fitness effect of guide RNAs that share specific 5 nt seed sequences as compared with the distribution of all the guides in the library targeting the template strand of genes. b Plasmid psgRNA was programmed with various guide RNAs sharing either the TGGAA seed sequence or the ACCCA seed sequence and introduced in strain LC-E18. Guides named R1 and R2 have sequences that do not match any position in the chromosome of E. coli. Cells were grown overnight, serially diluted, and plated on LB agar with 1 nM of aTc. Representative figure from four independent experiments. c Distribution of the fitness effect of guide RNAs in our library depending on target gene essentiality and target strand in strain LC-E75, which expresses dCas9 at a lower concentration than strain LC-E18. d Distribution of the fitness effect of guide RNAs sharing the ACCCA or TGGAA seed sequences in strains LC-E18 and LC-E75. e Average fitness effect of guides sharing specific 5 nt seed sequences in strain LC-E18 (x axis) and LC-E75 (y axis). The red line shows a linear regression (slope = 0.2, R² = 0.47). f Fitness effect of guides targeting gene lpoB and gene hisI in strain LC-E75. The strong fitness defect produced by T-lpoB2 can be explained by the presence of an off-target position in the def essential gene
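The comparison in panel e (average fitness effect per seed in two strains, followed by a linear fit) can be sketched as follows. The helper names are hypothetical, the seed is assumed to be the last 5 nt of a guide written 5' to 3', and the values in the usage example are made-up toy numbers, not data from the study.

```python
from collections import defaultdict
from statistics import mean

def mean_by_seed(guides, seed_len=5):
    """Average log2FC of guides grouped by their PAM-proximal seed.

    guides: iterable of (guide_sequence, log2fc) pairs.
    """
    groups = defaultdict(list)
    for seq, log2fc in guides:
        groups[seq[-seed_len:]].append(log2fc)
    return {seed: mean(vals) for seed, vals in groups.items()}

def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept, r_squared)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy ** 2 / (sxx * syy)

# Toy values only: the same guides scored at high and low dCas9 expression
high = [("G" * 15 + "ACCCA", -6.0), ("T" * 15 + "ACCCA", -4.0),
        ("G" * 15 + "TGGAA", -3.0), ("G" * 20, 0.0)]
low = [("G" * 15 + "ACCCA", -1.5), ("T" * 15 + "ACCCA", -0.5),
       ("G" * 15 + "TGGAA", -0.6), ("G" * 20, 0.0)]

by_seed_high = mean_by_seed(high)
by_seed_low = mean_by_seed(low)
seeds = sorted(by_seed_high)
slope, intercept, r2 = fit_line([by_seed_high[s] for s in seeds],
                                [by_seed_low[s] for s in seeds])
```

A slope well below 1 in such a fit would indicate that the seed-dependent defect is weaker in the strain on the y axis.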

Fig. 5

Off-targets with only 9 nt of identity to the seed sequence can produce strong fitness defects. a We plot here the proportion of guide RNAs that have an off-target position in a region where guides consistently produce a strong fitness defect. This proportion is shown for guides that target the template strand of non-essential genes but produce an unexpected fitness defect (orange), as well as for guides in the same genes and orientation but that do not produce a fitness defect (blue). The blue curve can be interpreted as the false-positive rate. The green curve is the difference between the orange and blue curves. It can be interpreted as an estimate of the proportion of guides whose fitness defect is due to an off-target effect. The maximum is obtained for a perfect match of 9 nt in the seed sequence, which indicates that 9 nt of identity in the seed sequence is enough to produce a strong fitness defect, but mostly false-positive off-target positions are detected when going down to 8 nt of identity. b Fitness effect of guides targeting gene bioC in strain LC-E75. The strong fitness defect produced by the T-bioC guide can be explained by the presence of an off-target position in the promoter of essential gene lgt. c Off-target position of the T-bioC guide in the promoter of the essential gene lgt. d Repression of bioC and lgt expression by the T-bioC guide in strains LC-E18 and LC-E75 as measured by RT-qPCR. Points show biological replicates (n = 4), the black bar shows the median
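A search for candidate off-target positions of the kind counted in panel a can be sketched as below. This is a simplified illustration under stated assumptions: the guide is written 5' to 3' with its PAM-proximal end last, a site counts only when the matched seed is immediately followed by an NGG PAM, and the function names are hypothetical. It does not reproduce the authors' filtering for regions where guides consistently produce a fitness defect.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def seed_off_targets(guide, genome, n=9):
    """Find sites matching the last n nt of the guide next to an NGG PAM.

    Returns (strand, index) pairs. For "-" hits the index refers to the
    reverse-complemented genome. The on-target site is also returned when
    present, so callers should exclude it.
    """
    seed = guide[-n:]
    hits = []
    for strand, seq in (("+", genome), ("-", revcomp(genome))):
        start = seq.find(seed)
        while start != -1:
            pam = seq[start + n:start + n + 3]
            # NGG: any base followed by two guanines
            if len(pam) == 3 and pam[1:] == "GG":
                hits.append((strand, start))
            start = seq.find(seed, start + 1)
    return hits
```

Lowering `n` from 9 to 8 would return more sites, consistent with the caption's remark that shorter matches are dominated by false positives.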
