CHANGE-seq reveals genetic and epigenetic effects on CRISPR–Cas9 genome-wide activity

Data availability

High-throughput sequencing data generated during the study (CHANGE-seq, GUIDE-seq, ATAC-seq, histone modification ChIP–seq) are available from the NCBI Sequence Read Archive and Gene Expression Omnibus under accession numbers PRJNA625995 and GSE149295, respectively.

Code availability

Open-source code for analyzing CHANGE-seq or CIRCLE-seq data and GUIDE-seq data, and for performing related machine learning analysis, is available through GitHub.


Acknowledgements

We thank Y. Chao and R. Heath from the St. Jude Protein Production Core Facility for recombinant Tn5 production, G. Wu for variant calling, S. Olsen from the St. Jude Hartwell Center for whole-genome sequencing and RNA-seq, and J. Zook for discussions on Genome-in-a-Bottle data. Standard mapping and variant calling were performed by the Center for Applied Bioinformatics, a centralized shared resource, partly funded by NIH award no. P30CA021765. This work was supported by St. Jude Children’s Research Hospital and ALSAC, National Institutes of Health Common Fund Somatic Cell Genome Editing award no. U01EB029373 (to S.Q.T., Y.C., J.M., G.K. and S.M.), St. Jude Children’s Research Hospital Collaborative Research Consortium on Novel Gene Therapies for Sickle Cell Disease and the Doris Duke Charitable Foundation (grant no. 2017093), National Institute of General Medical Sciences award no. R35GM133614 (to Y.C.) and National Heart, Lung, and Blood Institute award no. P01HL053749 (to S.Q.T.). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. E.S. acknowledges support under the Cooperative Research Agreement between the University of Maryland and the National Institute of Standards and Technology Center for Nanoscale Science and Technology, award no. 70NANB14H209, through the University of Maryland. NIST disclaimer: selected commercial equipment, instruments or materials are identified to specify the adequacy of experimental conditions or reported results. Such identification does not imply recommendation or endorsement by the National Institute of Standards and Technology, nor does it imply that the equipment, instruments or materials identified are necessarily the best available for the purpose.

Author information

Author notes

  1. Yanghua He
    Present address: Department of Human Nutrition, Food and Animal Sciences, College of Tropical Agriculture and Human Resources, University of Hawaii at Manoa, Honolulu, HI, USA

Authors and Affiliations

  1. Department of Hematology, St Jude Children’s Research Hospital, Memphis, TN, USA
    Cicera R. Lazzarotto, Nikolay L. Malinin, Yichao Li, GaHyun Lee, Yanghua He, Xin Lan, Kasey Jividen, Varun Katta, Qian Qi, Yong Cheng & Shengdar Q. Tsai
  2. Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
    Ruochi Zhang, Yang Yang & Jian Ma
  3. Roche Sequencing & Life Science, Roche Diagnostics, Indianapolis, IN, USA
    Eleanor Cowley
  4. National Institute of Standards and Technology, Gaithersburg, MD, USA
    Natalia G. Kolmakova & Samantha Maragh
  5. Department of Bone Marrow Transplantation & Cellular Therapy, St Jude Children’s Research Hospital, Memphis, TN, USA
    Christopher T. Petersen & Giedre Krenciute
  6. Physical Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
    Evgheni Strelcov
  7. Maryland NanoCenter, University of Maryland, College Park, MD, USA
    Evgheni Strelcov


Contributions

C.R.L. and S.Q.T. conceived of and designed the study. C.R.L., N.L.M., G.L., E.C., Y.H., X.L., K.J., V.K., N.G.K., E.S. and C.T.P. performed experiments. Y.L., Y.Y., R.Z. and Y.H. performed computational analyses. S.M., G.K., J.M., Y.C. and S.Q.T. supervised the project. C.R.L. and S.Q.T. wrote the paper with input from all authors.

Corresponding author

Correspondence to Shengdar Q. Tsai.

Ethics declarations

Competing interests

C.R.L. and S.Q.T. have filed a patent application on CHANGE-seq. S.Q.T. is a co-inventor on patents covering CIRCLE-seq and GUIDE-seq. S.Q.T. is a member of the scientific advisory board of Kromatid.

Additional information

Extended data

Extended Data Fig. 1 Detailed overview of CHANGE-seq method.

Genomic DNA is randomly tagmented to an average of ~400 bp with a custom Tn5 transposome carrying a uracil-containing adapter. The 9-nt Tn5-generated gaps in the DNA are filled in with a high-fidelity uracil-tolerant U+ polymerase and sealed with Taq DNA ligase. 4-bp overhangs are released with a mixture of USER enzyme and T4 PNK. DNA molecules are circularized at low concentrations that favor intramolecular ligation. Unwanted linear DNA is degraded with an exonuclease cocktail (comprising Exonuclease I, Lambda exonuclease and Plasmid-Safe ATP-dependent DNase). Purified circular DNA is treated with Cas9:sgRNA RNP, and cleaved DNA ends at on- and off-target sites are released for NGS library preparation, PCR amplification and paired-end high-throughput sequencing.

Extended Data Fig. 2 Schematic comparison of CIRCLE-seq and CHANGE-seq workflows.

CHANGE-seq eliminates the requirement for specialized equipment for physical DNA shearing along with 9 additional enzymatic or purification steps. The simplified workflow substantially streamlines the process, decreases the requirement of input genomic DNA for circularization by approximately 5-fold and reduces the number of reactions to process each sample by 10- to 20-fold to a single reaction per sample.

Extended Data Fig. 3 CHANGE-seq detects all or nearly all sites detected by GUIDE-seq.

Venn diagrams showing the number of overlapping off-target sites captured by CHANGE-seq (blue) and GUIDE-seq (clear). The top six comparisons are of standard targets; the bottom four comparisons are of repetitive targets commonly used to benchmark genome-wide off-target activity detection methods.

Extended Data Fig. 4 GUIDE-seq optimization for human primary CD4+/CD8+ T cells.

a, Viability of cell population assessed by FACS analysis with DAPI staining 3 days post nucleofection with dsODN with phosphorothioate modifications at 5’ end, 3’ end, both ends or without dsODN (n=3). b, Indel rates at the intended target sites 3 days post nucleofection with dsODN with phosphorothioate modifications at 5’ end, 3’ end, both ends or without dsODN (n=3). c, Integration rates of dsODNs with phosphorothioate modifications at 5’ end, 3’ end, both ends or without dsODN (n=3). d, Viability of cell population assessed by FACS analysis with DAPI staining 3 days post nucleofection with different doses of dsODN with 3’ end modifications (n=3). e, Indel rates at the intended target sites 3 days post nucleofection with different doses of dsODN with 3’ end modifications (n=3). f, dsODN integration rates 3 days post nucleofection with different doses of dsODN with 3’ end modifications (n=3). g, Scatterplots of GUIDE-seq read counts (log scale) between two independently prepared GUIDE-seq libraries for 3 target sites, showing GUIDE-seq technical reproducibility. Correlation between two samples was calculated using Pearson’s correlation coefficient.
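
The reproducibility comparison in panel g can be illustrated with a minimal sketch. It assumes that read counts are log10-transformed after adding a pseudocount of 1 before the correlation is computed; the caption states only that counts are plotted on a log scale, so the transformation, the pseudocount and the function names `pearson_r` and `log_count_correlation` are hypothetical choices, not the authors' code.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def log_count_correlation(rep1, rep2, pseudocount=1.0):
    """Correlate per-site read counts from two libraries on a log10 scale.

    The pseudocount (an assumption) keeps sites with zero reads finite.
    """
    log1 = [math.log10(c + pseudocount) for c in rep1]
    log2 = [math.log10(c + pseudocount) for c in rep2]
    return pearson_r(log1, log2)


# Made-up read counts for five sites in two replicate libraries.
replicate_1 = [1200, 340, 55, 12, 3]
replicate_2 = [1100, 390, 60, 9, 4]
r = log_count_correlation(replicate_1, replicate_2)
```

Pearson's coefficient does not depend on the base of the logarithm, so base 10 here is only a display convention.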

Extended Data Fig. 5 Detailed characterization of a specific and active sgRNA targeting the TRAC region.

a, Manhattan plot showing the genome-wide distribution of sites identified in vitro by CHANGE-seq (arrow indicates the on-target site). b, Visualization of sites detected by CHANGE-seq. The intended target sequence is shown in the top line. Cleaved sites (on- and off-target) are shown underneath and are ordered top to bottom by CHANGE-seq read count, with mismatches to the intended target sequence indicated by colored nucleotides. Note that output is truncated to top sites with a full listing in Supplementary Table 3. c, Manhattan plot showing the on-target site detected for TRAC site 3 by GUIDE-seq, with no off-target sites being identified (arrow indicates the on-target site). d, Visualization of sites detected by GUIDE-seq. e, Indel rates at the intended target site 3 days post nucleofection (n=3). f, Flow plot showing distribution of TCR αβ expression in control (red) versus cells edited with sgRNA targeting TRAC site 3 (light blue). These experiments were performed three times with similar results. g, Barplot showing the percentage of TCR disruption 14 days after nucleofection with sgRNA:Cas9 complex measured by flow cytometry analysis (n=3).

Extended Data Fig. 6 GUIDE-seq dsODN tag independent indel frequencies are strongly correlated with tag integration frequencies.

a, Comparison of standard targeted sequencing and rhAmpSeq, a multiplex targeted sequencing method used in our study to validate on- and off-target site mutations. Scatterplots of indel mutation frequencies (top) and tag integration frequencies (bottom), between standard amplicon sequencing and rhAmpSeq, for sgRNAs targeted against CTLA4 site 9 and TRAC site 2 (see Methods). b, Scatterplots showing correlation between indel frequencies (in cells edited with Cas9 RNPs and no dsODN tag) and tag integration frequencies (in cells edited with Cas9 RNP and dsODN tag) at on- and off-target sites measured by targeted amplicon sequencing. (a-b) Correlation between two samples was calculated using Pearson’s correlation coefficient.

Extended Data Fig. 7 Targeted tag sequencing validation of CHANGE-seq detected off-target sites.

Targeted tag integration frequencies evaluated by standard targeted sequencing (triangle shape) and/or rhAmpSeq (circle shape) (see Methods) at on- and off-target sites detected by both GUIDE-seq and CHANGE-seq, or detected by CHANGE-seq only (classes A-D), for sgRNAs targeted to TRAC site 2 and CTLA4 site 9. Panels for sites identified by both GUIDE-seq and CHANGE-seq and classes A and B for TRAC site 2 are duplicated from main Fig. 4f for completeness.

Extended Data Fig. 8 GUIDE-seq read counts are strongly correlated with indel and tag integration frequencies in human primary T cells.

a, Scatterplots showing correlation between indel frequencies and GUIDE-seq read counts at on- and off-target sites, and b, tag integration and GUIDE-seq read counts at on- and off-target sites. (a-b) Correlation between two samples was calculated using Pearson’s correlation coefficient.

Extended Data Fig. 9 Influence of chromatin state on CRISPR-Cas9 genome-wide off-target activity.

a, Barplots showing the enrichment of individual epigenetic features in GUIDE-seq (n=1,196), CHANGE-seq (n=11,000) and Cas-OFFinder (n=11,000). Statistical significance was calculated using two-tailed Welch’s t-test. Error bars indicate 95% confidence interval, estimated from 1,000 bootstrap samples. b, Heatmap showing fold enrichment for various genomic annotations computed by ChromHMM for validation of chromatin state annotations. Darker colors represent higher fold enrichment.
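
As an illustration of how error bars of the kind described in panel a can be obtained, the sketch below estimates a 95% confidence interval for a mean from 1,000 bootstrap resamples. The percentile method, the fixed random seed and the function name `bootstrap_ci` are assumptions made for this example; the caption does not say which bootstrap variant was used.

```python
import random
import statistics


def bootstrap_ci(values, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`.

    Each bootstrap sample draws len(values) items with replacement.
    The percentile method is an assumption, not taken from the source.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(values)
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_boot)
    )
    lower_index = round((1 - level) / 2 * n_boot)
    upper_index = round((1 + level) / 2 * n_boot) - 1
    return means[lower_index], means[upper_index]


# Made-up signal values for one epigenetic feature at a handful of sites.
feature_signal = [0.8, 1.4, 2.1, 0.3, 1.9, 1.1, 0.7, 2.6]
lower, upper = bootstrap_ci(feature_signal)
```

With `n_boot=1000` and `level=0.95`, the interval runs from the 26th to the 975th sorted bootstrap mean.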

Extended Data Fig. 10 CHANGE-seq enables detection of effects of individual genetic variation on genome-wide activity of genome editors.

a, Scatterplots of CHANGE-seq read counts (log scale) between two CHANGE-seq libraries independently prepared from the same source of genomic DNA, evaluating 6 target sites in 7 different genomes, showing that CHANGE-seq is highly reproducible. Correlation between two samples was calculated using Pearson’s correlation coefficient. b, Pairwise M/A plots for visualizing read count differences. The ratio (M) versus the average (A) of CHANGE-seq read counts (log scale) performed on the indicated GIAB or human T-cell sample versus a GM12878 GIAB reference sample. Each point represents an off-target site, and off-target sites that contain a non-reference single-nucleotide variant (SNV) are labelled in red.
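
The M and A values plotted in panel b can be sketched as follows, using the conventional definitions M = log2(sample) − log2(reference) and A = the mean of the two log2 counts. The base-2 logarithm, the pseudocount of 1 and the function name `ma_values` are assumptions for illustration; the caption specifies only a log-scale ratio and average.

```python
import math


def ma_values(sample_counts, reference_counts, pseudocount=1.0):
    """Return one (M, A) pair per site for paired read counts.

    M is the log2 ratio of sample to reference; A is the average of the
    two log2 counts. The pseudocount (an assumption) avoids log(0).
    """
    pairs = []
    for sample, reference in zip(sample_counts, reference_counts):
        log_sample = math.log2(sample + pseudocount)
        log_reference = math.log2(reference + pseudocount)
        m = log_sample - log_reference
        a = 0.5 * (log_sample + log_reference)
        pairs.append((m, a))
    return pairs


# Made-up counts: the second site has far fewer reads in the sample,
# as might happen when a variant disrupts an off-target site.
sample = [511, 15, 127]
reference = [511, 255, 127]
points = ma_values(sample, reference)
```

Sites with similar activity in both genomes fall near M = 0, while sites whose activity differs between genomes, for example because of a single-nucleotide variant, appear as outliers along the M axis.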

Supplementary information

Supplementary Information

Supplementary Protocol, Supplementary Note, Supplementary Table 1, Supplementary Figures 1–3.

Reporting Summary

Supplementary Tables 2–8

This file contains Supplementary Tables 2–8. Supplementary Table 2 contains a list of sgRNA target sites and NGS primers; Supplementary Table 3 contains a complete list of CHANGE-seq-detected sites; Supplementary Table 4 contains a list of CHANGE-seq target site specificity ratios; Supplementary Table 5 contains a list of on-target site targeted sequencing counts; Supplementary Table 6 contains a complete list of GUIDE-seq-detected sites; Supplementary Table 7 contains a list of off-target site targeted sequencing counts (standard targeted sequencing); Supplementary Table 8 contains a list of off-target site targeted sequencing counts (rhAmpSeq).

Cite this article

Lazzarotto, C.R., Malinin, N.L., Li, Y. et al. CHANGE-seq reveals genetic and epigenetic effects on CRISPR–Cas9 genome-wide activity. Nat Biotechnol 38, 1317–1327 (2020).

