Measuring error rates in genomic perturbation screens: gold standards for human functional genomics

Comparative Study

Traver Hart et al. Mol Syst Biol. 2014.

Abstract

Technological advancement has opened the door to systematic genetics in mammalian cells. Genome-scale loss-of-function screens can assay fitness defects induced by partial gene knockdown, using RNA interference, or complete gene knockout, using new CRISPR techniques. These screens can reveal the basic blueprint required for cellular proliferation. Moreover, comparing healthy to cancerous tissue can uncover genes that are essential only in the tumor; these genes are targets for the development of specific anticancer therapies. Unfortunately, progress in this field has been hampered by off-target effects of perturbation reagents and poorly quantified error rates in large-scale screens. To improve the quality of information derived from these screens, and to provide a framework for understanding the capabilities and limitations of CRISPR technology, we derive gold-standard reference sets of essential and nonessential genes, and provide a Bayesian classifier of gene essentiality that outperforms current methods on both RNAi and CRISPR screens. Our results indicate that CRISPR technology is more sensitive than RNAi and that both techniques have nontrivial false discovery rates that can be mitigated by rigorous analytical methods.

Keywords: CRISPR; RNAi; cancer; essential genes; shRNA.

© 2014 The Authors. Published under the terms of the CC BY 4.0 license.

Figures

Figure 1. Analytical overview

Half of the matrix of shRNA hairpins was decomposed using linear algebra techniques to identify a set of reference essential genes. Reference nonessentials were derived from genes with low expression across a compendium of RNA-seq experiments. For each cell line/timepoint in the second half of the shRNA data, the empirical distributions of training essentials and nonessentials were determined. For each remaining gene, a Bayes Factor (BF) was then calculated, measuring which of the two distributions its cognate hairpin data more closely match.
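
The Bayes Factor step described in this caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a fixed-bandwidth Gaussian kernel density estimate for each training distribution and sums log2 likelihood ratios over a gene's hairpins. The function names, the bandwidth, and the toy fold-change values are all hypothetical.

```python
import math

def kde_pdf(x, sample, bandwidth=0.5):
    """Gaussian kernel density estimate of `sample`, evaluated at x."""
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample)

def log2_bayes_factor(hairpin_fc, ess_fc, noness_fc, bandwidth=0.5, floor=1e-12):
    """Sum over hairpins of log2[P(fc | essential) / P(fc | nonessential)]."""
    total = 0.0
    for fc in hairpin_fc:
        # Floor the densities so a far-tail observation cannot divide by zero.
        p_ess = max(kde_pdf(fc, ess_fc, bandwidth), floor)
        p_non = max(kde_pdf(fc, noness_fc, bandwidth), floor)
        total += math.log2(p_ess / p_non)
    return total

# Toy log2 fold changes for the two training sets (hypothetical values).
ess_training = [-3.1, -2.6, -2.2, -1.9, -2.8, -1.5]
noness_training = [0.2, -0.1, 0.4, -0.3, 0.0, 0.1]

# A gene whose hairpins drop out, and one whose hairpins do not.
bf_dropout = log2_bayes_factor([-2.5, -2.0, -1.8], ess_training, noness_training)
bf_neutral = log2_bayes_factor([0.1, -0.2, 0.3], ess_training, noness_training)
```

In this sketch a positive score means the hairpin data look more like the essential training set, and a negative score means they look more like the nonessential set.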

Figure 2. Screen quality and core essentials

  1. For each screen, genes are ranked by BF and evaluated against a test set of reference essentials and nonessentials, and a precision vs. recall (PR) curve is calculated. Three screens representing the variability in global performance are shown.
  2. Distribution of _F_-measures of the 68 screens used in this study. Screens with _F_-measure > 0.75 (n = 48) were considered high-performing and were retained for downstream analyses.
  3. Histogram of essential gene observations across the 48 high-performing cell lines. Genes essential in at least 24 of the 48 lines (n = 291) were considered core essentials. Genes observed in only 1–3 cell lines are highly enriched for false positives.
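
The evaluation in panel 1 can be illustrated with a short sketch. It assumes genes are already ranked from highest to lowest BF, that genes outside both reference sets are simply skipped, and that a screen's _F_-measure is taken as the best value along the curve (the caption does not say at which point it is computed, so that choice is an assumption). Gene labels and function names are hypothetical.

```python
def precision_recall_curve(ranked_genes, ref_essential, ref_nonessential):
    """Walk down a ranked gene list and return (recall, precision) points."""
    tp = fp = 0
    points = []
    for gene in ranked_genes:
        if gene in ref_essential:
            tp += 1
        elif gene in ref_nonessential:
            fp += 1
        else:
            continue  # genes in neither reference set are not scored
        recall = tp / len(ref_essential)   # TP / (TP + FN)
        precision = tp / (tp + fp)         # TP / (TP + FP)
        points.append((recall, precision))
    return points

def f_measure(recall, precision):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy ranking: E* are reference essentials, N* reference nonessentials,
# X1 belongs to neither set (all labels hypothetical).
ranked = ["E1", "E2", "X1", "N1", "E3", "N2", "E4"]
ess = {"E1", "E2", "E3", "E4"}
non = {"N1", "N2"}

curve = precision_recall_curve(ranked, ess, non)
best_f = max(f_measure(r, p) for r, p in curve)
```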

Figure 3. The cumulative model of essential genes

  1. The top 36 cell lines were rank-ordered by _F_-measure, and the cumulative count of classified essential genes was plotted (black curve). Simulated repeat experiments sampling a population of 1,025 essential genes at 15% FDR yield a similar cumulative count (red curve).
  2. In simulated repeat experiments across parameter space, models sampling 875–1,175 essential genes at 13.5–16.5% FDR (1-Precision) yielded cumulative observation curves similar to what was observed experimentally.
  3. Histogram of observations of essential genes in the 12 top-ranked screens (black), genes exclusive to the next set of 12 (blue), and genes exclusive to the third set of 12 (red). Genes observed in at least 3 of the top 12 screens are classified as global essentials.
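
A simulation in the spirit of panels 1 and 2 might look like the sketch below. The number of screens (36), the essential population (1,025), and the FDR (15%) come from the caption. Everything else is an assumption made for illustration: the number of hits per screen, the size of the nonessential pool, uniform random sampling, and the function name are hypothetical and are not taken from the paper.

```python
import random

def simulate_cumulative(n_screens=36, n_essential=1025, fdr=0.15,
                        hits_per_screen=600, n_nonessential=15000, seed=0):
    """Cumulative count of distinct genes called essential over repeat screens.

    Each simulated screen reports `hits_per_screen` genes: a fraction
    (1 - fdr) drawn from the true essential population and the rest drawn
    from the nonessential pool (false positives).
    """
    rng = random.Random(seed)
    essentials = range(n_essential)
    nonessentials = range(n_essential, n_essential + n_nonessential)
    n_false = round(hits_per_screen * fdr)
    n_true = hits_per_screen - n_false

    seen = set()
    cumulative = []
    for _ in range(n_screens):
        seen.update(rng.sample(essentials, n_true))
        seen.update(rng.sample(nonessentials, n_false))
        cumulative.append(len(seen))
    return cumulative

curve = simulate_cumulative()
```

Because false positives are drawn from a much larger pool than true essentials, the cumulative count keeps rising after the true essentials saturate, which is the qualitative behaviour the cumulative model relies on.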

Figure 4. Characteristics of essential genes

  1. Essential genes are highly enriched for core protein complexes. Seventeen representative nonoverlapping complexes are shown, with the core essentials (black) and total essentials (gray) shown relative to the total number of subunits in the complex.
  2. Total essentials are separated into categories: those in complexes enriched for essential genes, those in other complexes but which fail enrichment tests, and those not annotated to be in any protein complex. The remaining genes are classified as nonessential.
  3. Fraction of genes in each category whose mouse orthologs are also essential; colors as in (B).
  4. Fraction of genes in each category whose yeast orthologs are also essential; colors as in (B).
  5. Fraction of genes in each category with one or more human paralogs; colors as in (B).

Figure 5. Biological drivers of variation in RNAi screen efficacy

  1. Plotting Ago2 gene expression (measured by microarrays; _y_-axis) versus cell line _F_-measure (_x_-axis) for pancreatic and ovarian cancer cell lines reveals a strong correlation (Pearson's r = 0.59). Inset, distribution of correlations of expressed genes (n = 10,673) versus _F_-measure; Ago2 is the top-ranked gene.
  2. Pearson's correlation coefficient of absolute copy number vs. Bayes Factor was determined for all genes across 30 pancreatic and ovarian cancer cell lines. Core essential genes show a negative correlation between copy number and essentiality.
  3. Core essential genes were binned by absolute copy number across the 30 samples. In each bin, the fraction of core essentials that were accurately classified in the corresponding screens is plotted. High copy number among core essentials reduces sensitivity to RNAi.
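
The correlation analysis in panel 1 amounts to computing Pearson's r between each gene's expression and the per-cell-line _F_-measure, then ranking genes by r. The sketch below shows that calculation on toy data. The gene labels and all numeric values are hypothetical and do not reproduce the reported r = 0.59.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# One F-measure per cell line (hypothetical values).
f_measures = [0.60, 0.70, 0.80, 0.90]

# Expression of each gene across the same cell lines (hypothetical values).
expression = {
    "GENE_A": [5.0, 5.5, 6.1, 6.8],
    "GENE_B": [7.0, 6.9, 7.1, 7.0],
    "GENE_C": [8.0, 7.0, 6.5, 5.0],
}

# Rank genes by how strongly their expression tracks screen quality.
ranked = sorted(
    ((gene, pearson_r(values, f_measures)) for gene, values in expression.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
```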

Figure 6. Evaluating other shRNA data and methods

  1. (A) Analytical approach. The CCE reference set was derived from the initial analysis; the NE set is identical throughout.
  2. (B, C) Evaluating other RNAi data sets. (B) LOD scores were calculated for the pooled library shRNA screens in the HCT116 background from Vizeacoumar et al (2013) and evaluated against CCE-test and NE-test. Recall, TP/(TP+FN); Precision, TP/(TP+FP). All six screens showed very high accuracy. The filled circle indicates the point on the curve where LOD = 0. (C) LOD scores were calculated for the pooled library shRNA screens in 102 cancer cell lines from Cheung et al (2011). Blue points represent recall and precision at LOD = 0 as measured against CCE-test and NE-test. Red, recall and precision for the same cell lines and same reference sets from ATARiS gene solutions at phenotype score = −1.
  3. (D) Integrating gene expression into the Bayesian classifier. For RNAi screens with matched gene expression data (in this example, PDAC cell line CAPAN-2, black curve), genes are binned by expression level and the fraction of reference essentials in each bin (right _y_-axis) is plotted against the mean expression of genes in the bin (green points). A linear fit on the log-log plot (green dashed line) can be integrated into the Bayesian classifier as an informative prior.
  4. (E) Integrating expression data improves the performance of the classifier (green) over the base algorithm (blue). Both forms show better performance than other algorithms such as GARP (red) and RIGER (gold).
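
One way to picture the expression prior of panel D is sketched below. It fits a least-squares line to log2(fraction of reference essentials) against log2(mean expression) per bin, converts the fitted value into a prior probability, and adds the prior log odds to the log2 Bayes Factor. The caption does not specify how the fit enters the classifier, so combining the two as log odds is an assumption here, as are the clipping bounds, the function names, and the toy bin values.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def prior_essential(expr, slope, intercept, lo=1e-4, hi=0.99):
    """Prior P(essential) from the log-log fit, clipped to (lo, hi)."""
    p = 2.0 ** (intercept + slope * math.log2(expr))
    return min(max(p, lo), hi)

def posterior_log2_odds(log2_bf, expr, slope, intercept):
    """Log2 posterior odds = log2 Bayes Factor + log2 prior odds."""
    p = prior_essential(expr, slope, intercept)
    return log2_bf + math.log2(p / (1.0 - p))

# Toy bins: mean expression and fraction of reference essentials (hypothetical).
bin_mean_expr = [1.0, 4.0, 16.0, 64.0]
bin_frac_ess = [0.01, 0.02, 0.04, 0.08]

slope, intercept = fit_line(
    [math.log2(x) for x in bin_mean_expr],
    [math.log2(f) for f in bin_frac_ess],
)

# The same Bayes Factor yields higher posterior odds for a highly expressed gene.
high = posterior_log2_odds(5.0, 64.0, slope, intercept)
low = posterior_log2_odds(5.0, 1.0, slope, intercept)
```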

Figure 7. Evaluating CRISPR negative selection screens

  1. The fold-change distributions of gRNA targeting reference essential and nonessential genes in Shalem et al (2013) are similar to those shown by shRNA hairpins (see Fig 1) and enable the application of the Bayes Factor approach.
  2. Published results from Shalem et al (2013), evaluated against CCE-test and NE-test. The dashed line shows that the Bayes Factor approach more accurately captures essential genes in the A375 screen, the only screen for which raw data are available.
  3. Whole-screen results from Wang et al (2013), evaluated against the same sets. NE-test genes are underrepresented in the Wang et al gRNA library, which gives the appearance of an artificial boost in precision when compared to the Shalem et al (2013) results.
  4. Comparing shRNA to CRISPR. Genes are rank-ordered by expression (gray curve, left axis) and binned. For four shRNA screens in pancreatic cancer cell lines withheld from the original analysis (red), the fraction of essential genes (by BF, no prior) in each bin (± s.d., right axis) is plotted against the mean expression of all genes in the bin. Genes with trace expression (log2(FPKM) < −2) are not essential and can therefore be used to estimate the background error rate (dashed line). Comparing CRISPR results demonstrates that, for the one dataset available, CRISPR can yield a similar number of essential genes at ~10-fold lower FPR (green, BF ≥ 20, 660 genes), or double the number of essential genes at similar error rates (blue, BF ≥ 10, 1,319 genes).
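
The background error estimate in panel 4 rests on a simple idea: genes with trace expression should not be essential, so any of them called essential count as false positives. A minimal sketch of that calculation follows. The cutoff log2(FPKM) < −2 and the BF thresholds 10 and 20 come from the caption; the function names, gene labels, and toy values are hypothetical.

```python
def call_essential(bayes_factors, threshold):
    """Map each gene to True if its Bayes Factor meets the threshold."""
    return {gene: bf >= threshold for gene, bf in bayes_factors.items()}

def background_error_rate(calls, log2_fpkm, trace_cutoff=-2.0):
    """Fraction of trace-expression genes that were called essential."""
    trace = [gene for gene, expr in log2_fpkm.items() if expr < trace_cutoff]
    if not trace:
        return float("nan")
    false_calls = sum(1 for gene in trace if calls.get(gene, False))
    return false_calls / len(trace)

# Toy Bayes Factors and expression values (hypothetical).
bayes_factors = {"G1": 35, "G2": 12, "G3": -4, "G4": 11, "G5": -20, "G6": 25}
log2_fpkm = {"G1": 5.0, "G2": 3.2, "G3": -3.0, "G4": -2.5, "G5": -4.1, "G6": -3.3}

# A stricter threshold trades fewer calls for a lower background error rate.
rate_strict = background_error_rate(call_essential(bayes_factors, 20), log2_fpkm)
rate_loose = background_error_rate(call_essential(bayes_factors, 10), log2_fpkm)
```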

Figure 8. The Daisy model of gene essentiality

  1. The Daisy model, where each petal represents a tissue or context in which a gene is essential. Petals overlap to varying degrees but all share a core set of essential housekeeping genes that should be detectable in any cell-based assay. Whole-organism studies will sample from the whole flower, not specific petals.
  2. Human orthologs of mouse essential genes were divided into core and noncore (‘peripheral’) essentials. Peripheral essentials show strong enrichment for disease genes while core essentials do not.
  3. Frequency of putative deleterious mutation by gene class, normalized for transcript length, derived from population exome studies (Tennessen et al, 2012). Inset, fraction of genes by class in which no variant was observed. Little variation is tolerated among core essentials, probably explaining the infrequency with which they are associated with disease.
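
The Daisy model of panel 1 can be expressed with set operations: each petal is the set of genes essential in one context, the core is the intersection of all petals, and the whole flower is their union. The sketch below is purely illustrative; the context names and gene labels are hypothetical placeholders.

```python
# Each petal: genes essential in one tissue or context (hypothetical labels).
petals = {
    "context_1": {"core_a", "core_b", "ctx1_only", "shared_12"},
    "context_2": {"core_a", "core_b", "ctx2_only", "shared_12"},
    "context_3": {"core_a", "core_b", "ctx3_only"},
}

# Core essentials are shared by every petal.
core = set.intersection(*petals.values())

# Whole-organism studies sample from the whole flower (the union of petals).
whole_flower = set.union(*petals.values())

# Peripheral essentials are essential in some contexts but not all.
peripheral = whole_flower - core
```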
