CRISPR-DO for genome-wide CRISPR design and optimization

1 School of Life Science and Technology, Tongji University, Shanghai 200092, China

2 Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute and Harvard School of Public Health, Boston, MA 02115, USA

3 Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA 02215, USA

4 Division of Molecular and Cellular Oncology, Department of Medical Oncology, Dana-Farber Cancer Institute

5 Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA

6 State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, the First Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou 310003, China

*To whom correspondence should be addressed.

Received: 05 November 2015

Revision received: 14 June 2016

Jian Ma, Johannes Köster, Qian Qin, Shengen Hu, Wei Li, Chenhao Chen, Qingyi Cao, Jinzeng Wang, Shenglin Mei, Qi Liu, Han Xu, Xiaole Shirley Liu, CRISPR-DO for genome-wide CRISPR design and optimization, Bioinformatics, Volume 32, Issue 21, November 2016, Pages 3336–3338, https://doi.org/10.1093/bioinformatics/btw476

Abstract

Motivation: Despite the growing popularity of CRISPR/Cas9 technology for genome editing and gene knockout, its performance still relies on well-designed single guide RNAs (sgRNAs). In this study, we propose a web application for the Design and Optimization (CRISPR-DO) of guide sequences that target both coding and non-coding regions in the spCas9 CRISPR system across the human, mouse, zebrafish, fly and worm genomes. CRISPR-DO uses a computational sequence model to predict sgRNA efficiency, and employs a specificity scoring function to evaluate the potential for off-target effects. It also provides information on the functional conservation of target sequences, as well as their overlaps with exons, putative regulatory sequences and single-nucleotide polymorphisms (SNPs). The web application has a user-friendly genome-browser interface to facilitate the selection of the best target DNA sequences for experimental design.

Availability and Implementation: CRISPR-DO is available at http://cistrome.org/crispr/

Contact: qiliu@tongji.edu.cn or hanxu@jimmy.harvard.edu or xsliu@jimmy.harvard.edu

Supplementary information: Supplementary data are available at Bioinformatics online.

1 Introduction

CRISPR/Cas9 (clustered regularly interspaced short palindromic repeats/CRISPR-associated protein 9) originated as a bacterial host defense system and has provided new insight into site-specific genome editing (Hsu et al., 2014). The CRISPR/Cas9 technology requires an sgRNA with a ∼20 bp guide sequence to pair with the target DNA, which directs the Cas9 protein to the correct location, where it introduces a DNA double-strand break (DSB) (Cho et al., 2013; Cong et al., 2013; Jinek et al., 2012; Mali et al., 2013).

To fully utilize the CRISPR genome editing technology, one must consider two essential factors: the off-target effect and the cleavage efficiency of the sgRNA. On the one hand, it has been reported that mismatches in off-target sites, especially those in the 10–12 bases proximal to the PAM, reduce off-target binding (Cong et al., 2013; Hsu et al., 2013; Jinek et al., 2012; Mali et al., 2013). Truncated sgRNAs of about 17–19 bp are more sensitive to mismatches and thus more specific (Fu et al., 2014). CRISPR design tools such as CRISPR-P (Lei et al., 2014), E-CRISP (Heigwer et al., 2014) and CasOT (Xiao et al., 2014) focus mainly on predicting off-target effects. Cas-OFFinder (Bae et al., 2014) is another off-target detection tool that supports multiple CRISPR systems. On the other hand, for many CRISPR/Cas9 applications, especially CRISPR screens, sgRNA-induced Cas9 cleavage efficiency is also important. The sgRNA efficiency is predominantly determined by the sequence of the guide and its 3′ flanking region (Wang et al., 2014; Xu et al., 2015). Currently, CRISPR design tools such as CRISPR-ERA (Liu et al., 2015), Benchling (Benchling, RRID:SCR_013955) and the sgRNA designer from the Broad Institute (Doench et al., 2016) consider both on- and off-target activity in sgRNA design. Meanwhile, other genomic features of an sgRNA target, such as its evolutionary conservation, regulatory potential and genetic variations, should also be considered in functional analysis using CRISPR/Cas9 systems (Shi et al., 2015).

To address this need, we propose a web application for the Design and Optimization (CRISPR-DO) of sgRNA sequences in human, mouse, zebrafish, fly and worm genomes. CRISPR-DO integrates an sgRNA efficiency prediction model and an off-target scoring function, which allows the users to evaluate the ‘goodness’ of an sgRNA in both sensitivity and specificity.

In CRISPR-DO, we annotate each target sequence with the PhastCons conservation score as well as the overlaps with exons, DNase I hypersensitive sites (DHSs), and single-nucleotide polymorphisms (SNPs) for better functional characterization when such data is available. We also integrated our target sequence search result with the powerful WashU Epigenome Browser (Zhou et al., 2011) to enable loading of other genomic tracks and facilitate the visualization and selection of target sequences. Details of online CRISPR-DO can be found in Supplementary materials.

2 Methods

2.1 CRISPR target sequence scan in the whole genome

The workflow to generate the target sequence database is shown in Supplementary Figure S1. We first obtained the full human (GRCh37/hg19 and GRCh38/hg38), mouse (NCBI37/mm9 and GRCm38/mm10), zebrafish (danRer7), fly (dm6) and worm (ce10) genome sequences from the UCSC genome database. We removed alternate loci as well as unlocalized and unplaced (random) sequences. Next, we performed a genome-wide sgRNA target sequence scan for PAM sequences on both the forward and the reverse strand in each genome. Here, we considered only 5'-NGG-3' as a PAM sequence, since PAM-like NAG sequences have much lower Cas9 loading efficiency (Hsu et al., 2013). We used each 19 bp or 20 bp target together with its PAM and 7 bp of 3' flanking sequence (29 bp and 30 bp in total, respectively) to build our primary sgRNA target sequence library for further evaluation. The total number of target sequences in each genome is shown in Supplementary Tables S1, S2 and S3.
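The scan described above can be illustrated with a minimal Python sketch. It is not the CRISPR-DO implementation: the function names, the 0-based forward-strand coordinates and the choice to skip contexts containing N are all assumptions made for the example.

```python
import re

# Translation table for complementing DNA; N stays N.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def scan_targets(seq, guide_len=20, flank_len=7):
    """List (start, strand, context) for every 5'-NGG-3' PAM in seq.

    context is guide + PAM + 3' flank read on the strand carrying the PAM
    (30 nt for a 20 nt guide, 29 nt for a 19 nt guide). start is the
    0-based forward-strand coordinate of the leftmost base of the context.
    """
    seq = seq.upper()
    n = len(seq)
    total = guide_len + 3 + flank_len
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        # A zero-width lookahead reports overlapping GG dinucleotides too.
        for match in re.finditer(r"(?=GG)", s):
            pam_start = match.start() - 1      # position of the N in NGG
            begin = pam_start - guide_len      # first base of the guide
            end = begin + total
            if begin < 0 or end > n:
                continue                       # context runs off the sequence
            context = s[begin:end]
            if "N" in context:
                continue                       # assumption: skip ambiguous bases
            start = begin if strand == "+" else n - end
            hits.append((start, strand, context))
    return hits
```

Calling `scan_targets(chromosome_sequence, guide_len=19)` would give the 29 nt contexts for truncated guides. A whole-genome run would stream each chromosome rather than hold all hits in memory.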

2.2 Score sgRNA efficiency

The nucleotide composition of the 3' end of the target sequence influences Cas9 loading (Wang et al., 2014). Recently, we have developed a model to predict the efficiency of Cas9 cleavage based on the DNA sequence of an sgRNA target and its flanking regions (Xu et al., 2015). We showed that this model effectively predicts the efficiency of guide RNA in high-throughput CRISPR/Cas9 knockout screens. The coefficients of each nucleotide in the model can be represented as a sequence logo (Supplementary Fig. S2). We applied this model to all target sequences to compute genome-wide efficiency scores. The overall efficiency score distributions are shown in Supplementary Figure S3.
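A model with one coefficient per nucleotide per position can be scored by summing the coefficients that match a target context. The Python sketch below assumes that additive form. The function name, the dictionary layout and the toy coefficients are hypothetical placeholders, not the published values from Xu et al. (2015).

```python
def efficiency_score(context, coefficients, intercept=0.0):
    """Sum position-specific nucleotide coefficients over a target context.

    coefficients maps (position, nucleotide) to a weight, where position is
    the 0-based index into context; combinations not listed contribute 0.
    """
    return intercept + sum(
        coefficients.get((pos, nt), 0.0)
        for pos, nt in enumerate(context.upper())
    )


# Toy coefficients for illustration only (hypothetical values).
toy_coefficients = {(0, "A"): 0.5, (1, "C"): -0.25}
score = efficiency_score("ACG", toy_coefficients)
```

With real coefficients, the same function could be applied to each 29 or 30 nt context in the target library to produce a genome-wide score table.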

2.3 Measure sgRNA off-target effect

For each target in our primary database, we first used BWA (Li and Durbin, 2009) to map it back to the genome, allowing a maximum of three mismatches and no gaps. When examining sgRNAs with mismatched mappings, we removed those not followed by NGG/NAG on both strands. For the remaining mismatched mappings, we calculated a specificity score based on the Zhang Lab’s formula (Hsu et al., 2013) (more details in Supplementary materials). The distributions of the specificity scores in hg38 and mm10 are shown in Supplementary Figure S4.
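The filtering step before scoring can be sketched in Python as follows. This is an illustration under stated assumptions, not the CRISPR-DO pipeline: the alignment itself (done with BWA in the text) is taken as given, the function names are hypothetical, and a site counts as a mismatched mapping only if it has between one and three mismatches.

```python
def count_mismatches(guide, site):
    """Count mismatching positions between two equal-length sequences (no gaps)."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(a != b for a, b in zip(guide, site))


def is_candidate_off_target(guide, site, pam, max_mismatches=3):
    """Decide whether an aligned genomic site is kept for specificity scoring.

    site is the genomic sequence aligned to the guide, and pam is the three
    bases immediately 3' of it on the same strand. The site is kept when it
    has 1..max_mismatches mismatches and is followed by NGG or NAG.
    """
    pam = pam.upper()
    has_pam = len(pam) == 3 and pam[1:] in ("GG", "AG")
    mismatches = count_mismatches(guide.upper(), site.upper())
    return has_pam and 1 <= mismatches <= max_mismatches
```

Sites that pass this filter would then be passed to the specificity formula, which is not reproduced here because its details are given in the Supplementary materials.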

2.4 CRISPR target sequence annotation

We annotated each target sequence to characterize its evolutionary conservation and to examine its overlap (≥1 bp) with annotated exons, regulatory elements or SNPs. The average conservation score is calculated using UCSC PhastCons (Supplementary Table S4). The exon annotation is from the UCSC refGene tables. The SNP annotation is from the NCBI dbSNP database. Peaks from each ENCODE DNase-seq dataset were merged to form the union of DNase I hypersensitivity regions, representing a comprehensive repertoire of putative regulatory elements in the genome (ENCODE Project Consortium, 2012). These annotation features give experimentalists additional guidance in selecting sgRNAs with the best balance of specificity, efficiency and function.
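The two interval operations in this section, taking the union of peaks and testing for an overlap of at least 1 bp, can be sketched in Python. The sketch assumes half-open, 0-based coordinates on a single chromosome (as in BED files), and the function names are hypothetical. A production pipeline would more likely use an interval tree or a tool built for genomic intervals.

```python
def merge_intervals(intervals):
    """Return the union of (start, end) intervals as sorted, disjoint tuples.

    Overlapping and directly adjacent intervals are merged into one.
    """
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extend the previous interval when the new one touches it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(interval) for interval in merged]


def overlaps(target, features):
    """Return True if target shares at least 1 bp with any feature interval."""
    t_start, t_end = target
    return any(f_start < t_end and t_start < f_end for f_start, f_end in features)


# Example: peaks from several datasets are merged, then a target is tested.
peaks = merge_intervals([(5, 10), (1, 3), (8, 12), (2, 4)])
in_dhs = overlaps((11, 31), peaks)
```

The same `overlaps` test applies unchanged to exon and SNP intervals.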

Acknowledgements

We thank Anya Zhang for polishing the writing of this manuscript. We also want to thank Xin Zhou and Ting Wang for their help setting up the WashU EpiGenome Browser.

Funding

The project was partially supported by the National Natural Science Foundation of China [31329003], NIH R01 HG008728, and the Claudia Adams Barr Award in Innovative Basic Cancer Research from Dana-Farber Cancer Institute.

Conflict of Interest: none declared.

References

Bae et al. (2014) Cas-OFFinder: a fast and versatile algorithm that searches for potential off-target sites of Cas9 RNA-guided endonucleases. Bioinformatics, 30, 1473–1475.

Cho et al. (2013) Targeted genome engineering in human cells with the Cas9 RNA-guided endonuclease. Nat. Biotechnol., 31, 230–232.

Cong et al. (2013) Multiplex genome engineering using CRISPR/Cas systems. Science, 339, 819–823.

ENCODE Project Consortium (2012) An integrated encyclopedia of DNA elements in the human genome. Nature, 489, 57–74.

Doench et al. (2016) Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9. Nat. Biotechnol., 34, 184–191.

Fu et al. (2014) Improving CRISPR-Cas nuclease specificity using truncated guide RNAs. Nat. Biotechnol., 32, 279–284.

Heigwer et al. (2014) E-CRISP: fast CRISPR target site identification. Nat. Methods, 11, 122–123.

Hsu et al. (2013) DNA targeting specificity of RNA-guided Cas9 nucleases. Nat. Biotechnol., 31, 827–832.

Hsu et al. (2014) Development and applications of CRISPR-Cas9 for genome engineering. Cell, 157, 1262–1278.

Jinek et al. (2012) A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science, 337, 816–821.

Lei et al. (2014) CRISPR-P: a web tool for synthetic single-guide RNA design of CRISPR-system in plants. Mol. Plant, 7, 1494–1496.

Li and Durbin (2009) Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics, 25, 1754–1760.

Liu et al. (2015) CRISPR-ERA: a comprehensive design tool for CRISPR-mediated gene editing, repression and activation. Bioinformatics, 31, 3676–3678.

Mali et al. (2013) RNA-guided human genome engineering via Cas9. Science, 339, 823–826.

Shi et al. (2015) Discovery of cancer drug targets by CRISPR-Cas9 screening of protein domains. Nat. Biotechnol., 33, 661–667.

Wang et al. (2014) Genetic screens in human cells using the CRISPR-Cas9 system. Science, 343, 80–84.

Xiao et al. (2014) CasOT: a genome-wide Cas9/gRNA off-target searching tool. Bioinformatics, [Epub ahead of print].

Xu et al. (2015) Sequence determinants of improved CRISPR sgRNA design. Genome Res., 25, 1147–1157.

Zhou et al. (2011) The human epigenome Browser at Washington University. Nat. Methods, 8, 989–990.

Author notes

Associate Editor: John Hancock

© The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
