Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples

Kristian Cibulskis et al. Nat Biotechnol. 2013 Mar.

Abstract

Detection of somatic point substitutions is a key step in characterizing the cancer genome. However, existing methods typically miss low-allelic-fraction mutations that occur in only a subset of the sequenced cells owing to either tumor heterogeneity or contamination by normal cells. Here we present MuTect, a method that applies a Bayesian classifier to detect somatic mutations with very low allele fractions, requiring only a few supporting reads, followed by carefully tuned filters that ensure high specificity. We also describe benchmarking approaches that use real, rather than simulated, sequencing data to evaluate the sensitivity and specificity as a function of sequencing depth, base quality and allelic fraction. Compared with other methods, MuTect has higher sensitivity with similar specificity, especially for mutations with allelic fractions as low as 0.1 and below, making MuTect particularly useful for studying cancer subclones and their evolution in standard exome and genome sequencing data.

Figures

Figure 1

Overview of somatic point mutation detection using MuTect. MuTect takes as input tumor (T) and normal (N) next-generation sequencing data and, after removing low-quality reads (Supplementary Methods), determines whether there is evidence for a variant beyond the expected random sequencing errors. Candidate variant sites are then passed through six filters to remove artifacts (Table 1). Next, a Panel of Normals is used to screen out remaining false positives caused by rare error modes that are detectable only in additional samples. Finally, the somatic or germline status of passing variants is determined using the matched normal.

Figure 2

Sensitivity as a function of sequencing depth and allelic fraction. (a) Sensitivity and specificity of MuTect for mutations with an allele fraction of 0.2, tumor depth of 30x and normal depth of 30x using various values of the LOD threshold (θT) (0.1 ≤ θT ≤ 100). Results using a model of independent sequencing errors with uniform Q35 base quality scores and accurate read placement (solid grey) are shown, as well as results from the virtual tumor approach for the standard (STD, dashed green) and high-confidence (HC, solid green) configurations. A typical setting of θT = 6.3 is marked with black circles. (b) Sensitivity as a function of tumor sequencing depth and allele fraction (indicated by color) using θT = 6.3. The calculated sensitivity using a model of independent sequencing errors and accurate read placement with uniform Q35 base quality scores (solid lines) is shown, as well as results from the virtual tumor approach (circles) and the downsampling of validated colorectal mutations (diamonds). Error bars represent 95% CIs.
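
The "model of independent sequencing errors" referred to in this caption can be illustrated with a short calculation. The sketch below is not MuTect's code. It assumes that the tumor LOD score is the log10 likelihood ratio between a variant model at the observed allele fraction and a reference-only model, that every base has the same Q35 error rate, that read placement is perfect, and that no filters are applied. The function names `lod_tumor` and `sensitivity` are hypothetical and chosen only for this illustration.

```python
import math

# Assumption: uniform Q35 base qualities, i.e. error rate 10^(-35/10).
Q35_ERROR = 10 ** (-35 / 10)


def lod_tumor(n_alt, depth, base_error=Q35_ERROR):
    """Assumed LOD: log10 likelihood ratio of a variant model (allele
    fraction f = n_alt / depth) versus a reference-only model (f = 0)."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    n_ref = depth - n_alt
    f = n_alt / depth
    e = base_error
    # Per-read probabilities under the variant model with allele fraction f.
    p_ref_variant = f * e / 3 + (1 - f) * (1 - e)
    p_alt_variant = f * (1 - e) + (1 - f) * e / 3
    # Per-read probabilities under the null model: alt reads are pure errors.
    p_ref_null = 1 - e
    p_alt_null = e / 3
    return (n_ref * (math.log10(p_ref_variant) - math.log10(p_ref_null))
            + n_alt * (math.log10(p_alt_variant) - math.log10(p_alt_null)))


def sensitivity(depth, allele_fraction, theta=6.3, base_error=Q35_ERROR):
    """Probability that the LOD reaches theta when the number of alt reads
    is Binomial(depth, allele_fraction). Simplification: sequencing errors
    are ignored when drawing the alt-read count."""
    total = 0.0
    for k in range(depth + 1):
        if lod_tumor(k, depth, base_error) >= theta:
            total += (math.comb(depth, k)
                      * allele_fraction ** k
                      * (1 - allele_fraction) ** (depth - k))
    return total


if __name__ == "__main__":
    for depth in (10, 30, 60, 100):
        print(depth, round(sensitivity(depth, 0.1), 3))
```

Under these assumptions, the calculated sensitivity rises with tumor depth for a fixed allele fraction, which is the qualitative trend the solid lines in panel b describe. The absolute values depend on the simplifications listed above.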

Figure 3

Specificity of variant detection and variant classification using the virtual tumor approach. (a) Somatic miscall error rate for true reference sites as a function of tumor sequencing depth for the STD (red), HC (blue) and HC+PON (green) configurations of MuTect. Error bars represent 95% CIs. (b) Distribution of allele fraction for all miscalls as a function of tumor sequencing depth. (c) Fraction of events rejected by each filter; hashed regions indicate events rejected exclusively by that filter. (d) Somatic miscall error rate for true germline SNP sites by sequencing depth in the normal when the site is known to be variant in the population (blue) and when it is novel (red). Error bars represent 95% CIs. (e,f) Mean power, as a function of sequencing depth in the normal, to have classified these events as germline or somatic at novel germline sites (e) and known germline variant sites (f).

Figure 4

Benchmarking mutation detection methods. (a) Comparison of sensitivity as a function of tumor sequencing depth and mutation allele fraction for different mutation detection methods and configurations. (b) Comparison of somatic miscall error rate for true germline sites as a function of sequencing depth in the normal. (c) Comparison of somatic miscall error rate for true reference sites as a function of tumor sequencing depth. (d) Sensitivity as a function of specificity for mutations with an allele fraction of 0.1, tumor depth of 30x and normal depth of 30x for different methods and configurations. Black dotted lines indicate the change in sensitivity and specificity between the STD and HC configurations of a method. Grey solid lines show the MuTect results from the virtual tumor approach in Supplementary Figure 3. (a–c) Error bars represent 95% CIs.
