CRISPR-DO for genome-wide CRISPR design and optimization
Jian Ma, Johannes Köster, Qian Qin, Shengen Hu, Wei Li, Chenhao Chen, Qingyi Cao, Jinzeng Wang, Shenglin Mei, Qi Liu, Han Xu, Xiaole Shirley Liu
1 School of Life Science and Technology, Tongji University, Shanghai 200092, China
2 Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute and Harvard School of Public Health, Boston, MA 02115, USA
3 Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA 02215, USA
4 Division of Molecular and Cellular Oncology, Department of Medical Oncology, Dana-Farber Cancer Institute
5 Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
6 State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, the First Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou 310003, China
*To whom correspondence should be addressed.
Received: 05 November 2015
Revision received: 14 June 2016
Bioinformatics, Volume 32, Issue 21, November 2016, Pages 3336–3338, https://doi.org/10.1093/bioinformatics/btw476
Abstract
Motivation: Despite the growing popularity of CRISPR/Cas9 technology for genome editing and gene knockout, its performance still relies on well-designed single guide RNAs (sgRNAs). In this study, we propose a web application for the Design and Optimization (CRISPR-DO) of guide sequences that target both coding and non-coding regions in the SpCas9 CRISPR system across human, mouse, zebrafish, fly and worm genomes. CRISPR-DO uses a computational sequence model to predict sgRNA efficiency, and employs a specificity scoring function to evaluate potential off-target effects. It also provides information on functional conservation of target sequences, as well as their overlaps with exons, putative regulatory sequences and single-nucleotide polymorphisms (SNPs). The web application has a user-friendly genome browser interface to facilitate the selection of the best target DNA sequences for experimental design.
Availability and Implementation: CRISPR-DO is available at http://cistrome.org/crispr/
Contact: qiliu@tongji.edu.cn or hanxu@jimmy.harvard.edu or xsliu@jimmy.harvard.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
1 Introduction
CRISPR/Cas9 (clustered regularly interspaced short palindromic repeats/CRISPR-associated protein 9) originated in bacterial host defense and has provided new insight into site-specific genome editing (Hsu et al., 2014). The CRISPR/Cas9 technology requires an sgRNA with a ∼20 bp guide sequence that pairs with the target DNA, which directs the Cas9 protein to the correct location, where it introduces a DNA double-strand break (DSB) (Cho et al., 2013; Cong et al., 2013; Jinek et al., 2012; Mali et al., 2013).
To fully utilize the CRISPR genome editing technology, one must consider two essential factors: the off-target effect and the cleavage efficiency of the sgRNA. On the one hand, it has been reported that mismatches between the guide and off-target sites, especially those in the 10–12 bases proximal to the protospacer adjacent motif (PAM), reduce off-target binding (Cong et al., 2013; Hsu et al., 2013; Jinek et al., 2012; Mali et al., 2013). Truncated sgRNAs with guides of about 17–19 bp are more sensitive to mismatches and thus more specific (Fu et al., 2014). CRISPR design tools like CRISPR-P (Lei et al., 2014), E-CRISP (Heigwer et al., 2014) and CasOT (Xiao et al., 2014) are mainly focused on the prediction of off-target effects. Cas-OFFinder (Bae et al., 2014) is another off-target detection tool that supports multiple CRISPR systems. On the other hand, for many CRISPR/Cas9 applications, especially CRISPR screens, sgRNA-induced Cas9 cleavage efficiency is also important. The sgRNA efficiency is predominantly determined by the sequence of the guide and its 3′ flanking region (Wang et al., 2014; Xu et al., 2015). Currently, CRISPR design tools such as CRISPR-ERA (Liu et al., 2015), Benchling (Benchling, RRID:SCR_013955) and the sgRNA designer from the Broad Institute (Doench et al., 2016) consider both on- and off-target effects in sgRNA design. Meanwhile, other genomic features of an sgRNA target, such as its evolutionary conservation, regulatory potential and genetic variations, should also be considered in functional analysis using CRISPR/Cas9 systems (Shi et al., 2015).
To address this need, we propose a web application for the Design and Optimization (CRISPR-DO) of sgRNA sequences in human, mouse, zebrafish, fly and worm genomes. CRISPR-DO integrates an sgRNA efficiency prediction model and an off-target scoring function, which allows the users to evaluate the ‘goodness’ of an sgRNA in both sensitivity and specificity.
In CRISPR-DO, we annotate each target sequence with the PhastCons conservation score as well as its overlaps with exons, DNase I hypersensitive sites (DHSs), and single-nucleotide polymorphisms (SNPs) for better functional characterization when such data are available. We also integrated our target sequence search results with the WashU Epigenome Browser (Zhou et al., 2011) to enable loading of other genomic tracks and to facilitate the visualization and selection of target sequences. Details of the online CRISPR-DO can be found in the Supplementary materials.
2 Methods
2.1 CRISPR target sequence scan in the whole genome
The workflow to generate the target sequence database is shown in Supplementary Figure S1. We first obtained the full human (GRCh37/hg19 and GRCh38/hg38), mouse (NCBI37/mm9 and GRCm38/mm10), zebrafish (danRer7), fly (dm6) and worm (ce10) genome sequences from the UCSC genome database. We removed alternate loci as well as unlocalized and unplaced (random) sequences. Next, we performed a genome-wide sgRNA target sequence scan for PAM sequences on both the forward and the reverse strand in each genome. Here, we considered only 5′-NGG-3′ as a PAM sequence, since PAM-like NAG sequences have much lower Cas9 loading efficiency (Hsu et al., 2013). We used a 19 bp or 20 bp target together with its PAM and a 7 bp 3′ flanking sequence (29 bp and 30 bp in total, respectively) to build our primary sgRNA target sequence library for further evaluation. The total number of target sequences in each genome is shown in Supplementary Tables S1, S2 and S3.
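The scan described above can be illustrated with a minimal sketch. It is not CRISPR-DO's actual code: the function names, the 0-based coordinate convention, and the choice to skip sites containing N are assumptions made for illustration. Each hit is a guide followed by an NGG PAM and a 7 bp 3′ flank, reported in the orientation of the strand on which it was found.

```python
# Illustrative sketch of an NGG PAM scan on both strands (names are hypothetical).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def scan_strand(seq, guide_len=20, flank_len=7):
    """Yield (start, site) for every guide + NGG PAM + 3' flank on one strand."""
    site_len = guide_len + 3 + flank_len
    for start in range(len(seq) - site_len + 1):
        pam = seq[start + guide_len:start + guide_len + 3]
        site = seq[start:start + site_len]
        # Assumption: sites with unknown bases (N) are skipped.
        if pam[1:] == "GG" and "N" not in site:
            yield start, site


def scan_targets(seq, guide_len=20, flank_len=7):
    """Return (forward_start, strand, site) tuples for both strands."""
    seq = seq.upper()
    site_len = guide_len + 3 + flank_len
    hits = [(start, "+", site) for start, site in scan_strand(seq, guide_len, flank_len)]
    rc = reverse_complement(seq)
    for start, site in scan_strand(rc, guide_len, flank_len):
        # Convert the offset on the reverse strand to a forward-strand start.
        hits.append((len(seq) - start - site_len, "-", site))
    return hits
```

Calling `scan_targets(sequence, guide_len=19)` yields the 29 bp sites, and the default `guide_len=20` yields the 30 bp sites.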
2.2 Score sgRNA efficiency
The nucleotide composition of the 3′ end of the target sequence influences Cas9 loading (Wang et al., 2014). We recently developed a model to predict the efficiency of Cas9 cleavage based on the DNA sequence of an sgRNA target and its flanking regions (Xu et al., 2015). We showed that this model effectively predicts the efficiency of guide RNAs in high-throughput CRISPR/Cas9 knockout screens. The coefficients of each nucleotide in the model can be represented as a sequence logo (Supplementary Fig. S2). We applied this model to all target sequences to compute genome-wide efficiency scores. The overall efficiency score distributions are shown in Supplementary Figure S3.
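As a rough illustration of how a model with one coefficient per position and nucleotide can be applied to a target site, the sketch below sums the coefficients that match the sequence. The function name and the toy weights are invented for illustration; they are not the published coefficients of Xu et al. (2015), and the real model may include terms not shown here.

```python
def efficiency_score(site, coefficients, intercept=0.0):
    """Sum position-specific nucleotide coefficients over a target site.

    coefficients maps (position, nucleotide) to a weight; pairs that are
    absent contribute 0. Positions are 0-based from the 5' end of the site.
    """
    site = site.upper()
    return intercept + sum(
        coefficients.get((position, nucleotide), 0.0)
        for position, nucleotide in enumerate(site)
    )


# Made-up weights, for illustration only (NOT the published model).
toy_coefficients = {(0, "A"): 0.25, (19, "G"): 0.5, (19, "T"): -0.5}

example_site = "A" * 19 + "G" + "TGG" + "A" * 7  # 20 bp guide + PAM + 7 bp flank
score = efficiency_score(example_site, toy_coefficients)
```

Under this scheme, scoring every site in a genome is a single pass over the target sequence library.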
2.3 Measure sgRNA off-target effect
For each target in our primary database, we first used BWA (Li and Durbin, 2009) to map it back to the genome, allowing a maximum of three mismatches and no gaps. When examining sgRNAs with mismatched mappings, we removed, on both strands, those mappings not followed by NGG/NAG. For the remaining mismatched mappings, we calculated a specificity score based on the Zhang Lab’s formula (Hsu et al., 2013) (more details in the Supplementary materials). The distributions of the specificity scores in hg38 and mm10 are shown in Supplementary Figure S4.
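The filtering rule for candidate off-target sites can be sketched as follows. This is a simplified stand-in, not the BWA-based pipeline: it assumes the candidate genomic site has already been extracted in the same orientation as the guide, and the function names are hypothetical. It covers only the mismatch and PAM criteria, not the specificity score itself.

```python
def count_mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def off_target_mismatches(guide, site, max_mismatches=3):
    """Return the mismatch count if `site` is a candidate hit, else None.

    `site` is a genomic protospacer immediately followed by its 3-nt PAM.
    A hit must carry an NGG or NAG PAM and at most `max_mismatches`
    mismatches (no gaps). A count of 0 means a perfect match, which may be
    the intended target itself.
    """
    guide, site = guide.upper(), site.upper()
    protospacer = site[:len(guide)]
    pam = site[len(guide):len(guide) + 3]
    if len(protospacer) != len(guide) or len(pam) != 3:
        return None
    if pam[1:] not in ("GG", "AG"):
        return None
    mismatches = count_mismatches(guide, protospacer)
    return mismatches if mismatches <= max_mismatches else None
```

Returning the count rather than a boolean keeps the information a downstream scoring function would need.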
2.4 CRISPR target sequence annotation
We annotated each target sequence to characterize its evolutionary conservation and to examine its overlap (≥1 bp) with annotated exons, regulatory elements or SNPs. The average conservation score is calculated using UCSC PhastCons (Supplementary Table S4). The exon annotation is from the UCSC refGene tables. The SNP annotation is from the NCBI dbSNP database. Peaks from each ENCODE DNase-seq dataset were merged to form the union of DNase I hypersensitivity regions, representing a comprehensive repertoire of putative regulatory elements in the genome (ENCODE Project Consortium, 2012). These annotation features give experimentalists additional guidance in selecting sgRNAs with the best balance of specificity, efficiency and function.
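Two operations from this annotation step, forming the union of peaks and testing for an overlap of at least 1 bp, can be sketched as below. The sketch assumes half-open (start, end) intervals on a single chromosome and uses hypothetical function names; a production pipeline would more likely rely on an indexed interval tool.

```python
def merge_intervals(intervals):
    """Merge (start, end) intervals into their union (touching ones are joined)."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(interval) for interval in merged]


def overlaps(target, features, min_overlap=1):
    """Return True if `target` shares at least `min_overlap` bp with any feature."""
    t_start, t_end = target
    return any(
        min(t_end, f_end) - max(t_start, f_start) >= min_overlap
        for f_start, f_end in features
    )


# Toy peaks standing in for DNase-seq peak calls from several datasets.
dhs_union = merge_intervals([(10, 20), (15, 30), (40, 50)])
```

The same `overlaps` check applies unchanged to exon and SNP intervals.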
Acknowledgements
We thank Anya Zhang for polishing the writing of this manuscript. We also thank Xin Zhou and Ting Wang for their help in setting up the WashU Epigenome Browser.
Funding
The project was partially supported by the National Natural Science Foundation of China [31329003], NIH R01 HG008728, and the Claudia Adams Barr Award in Innovative Basic Cancer Research from Dana-Farber Cancer Institute.
Conflict of Interest: none declared.
References
Bae et al. (2014) Cas-OFFinder: a fast and versatile algorithm that searches for potential off-target sites of Cas9 RNA-guided endonucleases. Bioinformatics, 30, 1473–1475.
Cho et al. (2013) Targeted genome engineering in human cells with the Cas9 RNA-guided endonuclease. Nat. Biotechnol., 31, 230–232.
Cong et al. (2013) Multiplex genome engineering using CRISPR/Cas systems. Science, 339, 819–823.
ENCODE Project Consortium (2012) An integrated encyclopedia of DNA elements in the human genome. Nature, 489, 57–74.
Doench et al. (2016) Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9. Nat. Biotechnol., 34, 184–191.
Fu et al. (2014) Improving CRISPR-Cas nuclease specificity using truncated guide RNAs. Nat. Biotechnol., 32, 279–284.
Heigwer et al. (2014) E-CRISP: fast CRISPR target site identification. Nat. Methods, 11, 122–123.
Hsu et al. (2013) DNA targeting specificity of RNA-guided Cas9 nucleases. Nat. Biotechnol., 31, 827–832.
Hsu et al. (2014) Development and applications of CRISPR-Cas9 for genome engineering. Cell, 157, 1262–1278.
Jinek et al. (2012) A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science, 337, 816–821.
Lei et al. (2014) CRISPR-P: a web tool for synthetic single-guide RNA design of CRISPR-system in plants. Mol. Plant, 7, 1494–1496.
Li and Durbin (2009) Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics, 25, 1754–1760.
Liu et al. (2015) CRISPR-ERA: a comprehensive design tool for CRISPR-mediated gene editing, repression and activation. Bioinformatics, 31, 3676–3678.
Mali et al. (2013) RNA-guided human genome engineering via Cas9. Science, 339, 823–826.
Shi et al. (2015) Discovery of cancer drug targets by CRISPR-Cas9 screening of protein domains. Nat. Biotechnol., 33, 661–667.
Wang et al. (2014) Genetic screens in human cells using the CRISPR-Cas9 system. Science, 343, 80–84.
Xiao et al. (2014) CasOT: a genome-wide Cas9/gRNA off-target searching tool. Bioinformatics, [Epub ahead of print].
Xu et al. (2015) Sequence determinants of improved CRISPR sgRNA design. Genome Res., 25, 1147–1157.
Zhou et al. (2011) The human epigenome browser at Washington University. Nat. Methods, 8, 989–990.
Author notes
Associate Editor: John Hancock
© The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com