VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing

Daniel C Koboldt et al. Genome Res. 2012 Mar.

Abstract

Cancer is a disease driven by genetic variation and mutation. Exome sequencing can be utilized for discovering these variants and mutations across hundreds of tumors. Here we present an analysis tool, VarScan 2, for the detection of somatic mutations and copy number alterations (CNAs) in exome data from tumor-normal pairs. Unlike most current approaches, our algorithm reads data from both samples simultaneously; a heuristic and statistical algorithm detects sequence variants and classifies them by somatic status (germline, somatic, or LOH); while a comparison of normalized read depth delineates relative copy number changes. We apply these methods to the analysis of exome sequence data from 151 high-grade ovarian tumors characterized as part of the Cancer Genome Atlas (TCGA). We validated some 7790 somatic coding mutations, achieving 93% sensitivity and 85% precision for single nucleotide variant (SNV) detection. Exome-based CNA analysis identified 29 large-scale alterations and 619 focal events per tumor on average. As in our previous analysis of these data, we observed frequent amplification of oncogenes (e.g., CCNE1, MYC) and deletion of tumor suppressors (NF1, PTEN, and CDKN2A). We searched for additional recurrent focal CNAs using the correlation matrix diagonal segmentation (CMDS) algorithm, which identified 424 significant events affecting 582 genes. Taken together, our results demonstrate the robust performance of VarScan 2 for somatic mutation and CNA detection and shed new light on the landscape of genetic alterations in ovarian cancer.
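The classification of each variant by somatic status can be illustrated with a minimal sketch. The abstract only describes a "heuristic and statistical algorithm", so everything below is an assumption made for illustration: the use of Fisher's exact test on tumor and normal read counts, the frequency thresholds, the significance cutoff, and the function names `fisher_exact_two_sided` and `classify_site` are all hypothetical and are not VarScan 2's actual defaults or code.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, total = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


def classify_site(normal_ref, normal_var, tumor_ref, tumor_var,
                  min_var_freq=0.10, max_normal_freq=0.05, alpha=0.05):
    """Assign an illustrative somatic status from read counts (assumed thresholds)."""
    if normal_ref + normal_var == 0 or tumor_ref + tumor_var == 0:
        raise ValueError("both samples need at least one read")
    normal_freq = normal_var / (normal_ref + normal_var)
    tumor_freq = tumor_var / (tumor_ref + tumor_var)

    # Heuristic step: no convincing variant in either sample.
    if normal_freq < min_var_freq and tumor_freq < min_var_freq:
        return "Reference"

    # Statistical step: do the two samples differ in variant read support?
    p_value = fisher_exact_two_sided(normal_ref, normal_var, tumor_ref, tumor_var)
    if p_value >= alpha:
        return "Germline"  # variant present, no significant tumor/normal difference
    if normal_freq <= max_normal_freq and tumor_freq >= min_var_freq:
        return "Somatic"   # variant essentially confined to the tumor
    normal_het = 0.25 <= normal_freq <= 0.75
    tumor_hom = tumor_freq <= 0.15 or tumor_freq >= 0.85
    if normal_het and tumor_hom:
        return "LOH"       # heterozygous in normal, one allele lost in tumor
    return "Unknown"


if __name__ == "__main__":
    print(classify_site(30, 0, 20, 15))   # variant reads only in the tumor
    print(classify_site(20, 20, 2, 38))   # heterozygous normal, near-homozygous tumor
```

Reading both samples at the same position is what makes the direct comparison possible: the test contrasts tumor and normal read counts in one table rather than calling each sample separately and subtracting the results.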

Figures

Figure 1.

The VarScan 2 mutation and copy number alteration detection algorithms. Alignments in BAM format for a tumor–normal pair are read simultaneously to identify inherited (germline), loss-of-heterozygosity (LOH), and somatic mutation events. Variants in each category are further classified as high confidence (HC) or low confidence (LC). HC variants are filtered to remove false positives from common sequencing- and alignment-related artifacts (see Table 1). The resulting variants are annotated and organized by tier; the average number of “tier 1” coding variants per tumor is shown for each category. At positions with at least 20× coverage (default), copy number alterations are detected by comparison of Q20 read depths from matched tumor–normal pairs, normalized based on the amount of input data for each sample. Raw contiguous regions from VarScan 2 are processed by circular binary segmentation (CBS). A subsequent merging procedure joins adjacent segments and yields a set of somatic copy number alterations, which are further classified as large-scale (>25% of chromosome arm) or focal (<25%) events. Shown are the average numbers of events detected in 142 ovarian exomes.
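The read-depth comparison and the large-scale versus focal split described in this caption can be sketched as follows. This is a simplified illustration under stated assumptions, not the VarScan 2 implementation: the function names, the choice to apply the 20× minimum to both samples, the use of total sequenced bases as the measure of "input data", and the ±0.25 log2 thresholds for calling a gain or loss are all hypothetical. Only the 20× default and the 25% chromosome-arm cutoff come from the caption.

```python
from math import log2


def normalized_log2_ratio(normal_depth, tumor_depth,
                          normal_total_bases, tumor_total_bases,
                          min_coverage=20):
    """Log2 tumor/normal depth ratio, corrected for unequal input data.

    Returns None when either sample falls below min_coverage
    (applying the cutoff to both samples is an assumption).
    """
    if normal_depth < min_coverage or tumor_depth < min_coverage:
        return None
    # Rescale tumor depth as if both samples had the same amount of data.
    scale = normal_total_bases / tumor_total_bases
    return log2(tumor_depth * scale / normal_depth)


def classify_segment(start, end, arm_length, mean_log2,
                     gain_threshold=0.25, loss_threshold=-0.25,
                     large_scale_fraction=0.25):
    """Label a segment by direction and size (log2 thresholds are assumed)."""
    if mean_log2 >= gain_threshold:
        direction = "amplification"
    elif mean_log2 <= loss_threshold:
        direction = "deletion"
    else:
        direction = "neutral"
    # Segments covering more than 25% of the arm count as large-scale;
    # the caption does not say how exactly 25% is handled, so it is focal here.
    fraction = (end - start) / arm_length
    size = "large-scale" if fraction > large_scale_fraction else "focal"
    return direction, size


if __name__ == "__main__":
    ratio = normalized_log2_ratio(40, 80, 1_000_000_000, 1_000_000_000)
    print(ratio, classify_segment(0, 30_000_000, 100_000_000, ratio))
```

The normalization step matters because a tumor sequenced twice as deeply as its normal would otherwise appear amplified everywhere; after rescaling, a doubled raw depth in that case gives a log2 ratio of 0.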

Figure 2.

Detection of large-scale and focal copy number alterations by sequencing- and array-based approaches. (A) Deletions and focal amplifications of chromosome 4 in sample TCGA-24-1103. Copy number estimates from array (gray), WGS (light blue), and exome (dark blue) indicate two regions of deletion as well as a focal amplification (window). Red lines indicate segmented exome CBS calls. (Below) Variant allele frequencies in the normal (blue) and tumor (green) indicate regions of loss of heterozygosity (LOH) in deleted segments. (B) Intersection of large-scale copy number alterations detected by SNP array, whole-genome sequencing, and exome sequencing approaches for five HGS-OVCa cases. For details, see Supplemental Table 3. (C) Intersection of gene-level (focal) copy number alterations detected by SNP array, whole-genome sequencing, and exome sequencing approaches for five HGS-OVCa cases.

Figure 3.

Recurrent chromosome-arm gains and losses in ovarian cancer. Eight significant gains and 22 significant losses of chromosome arms identified by TCGA in SNP array data for 489 cases were recapitulated using exome data for 142 cases. Observed frequencies were highly correlated between data sets for both gains (r² = 0.84) and losses (r² = 0.86).
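For readers who want to reproduce this kind of comparison, the squared correlation between per-arm event frequencies in two data sets can be computed as in the sketch below. It assumes the reported statistic is the square of the Pearson correlation coefficient, which the caption does not state explicitly. The function name `r_squared` is hypothetical, and the frequencies in the usage example are made-up placeholders, not values from the study.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Sums of squares and cross-products around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy * sxy / (sxx * syy)


if __name__ == "__main__":
    # Placeholder per-arm gain frequencies (illustrative only).
    array_freq = [0.10, 0.25, 0.40, 0.55]
    exome_freq = [0.12, 0.22, 0.43, 0.50]
    print(round(r_squared(array_freq, exome_freq), 3))
```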

Figure 4.

Global copy number alteration profile of ovarian cancer. Average log2 of copy number difference is plotted for chromosomes 1–22 and X. Amplifications are shown in red, deletions in blue, and neutral regions in gray. Significant peaks associated with known oncogenes or tumor suppressor genes are indicated.

Figure 5.

Frequent copy number alteration of ovarian cancer genes. Exome-based copy number estimates were used to compute the proportion of ovarian cancer tumors (n = 142) exhibiting amplification or deletion of key ovarian cancer genes. Asterisks (*) indicate significantly altered genes identified from SNP array data in our previous study.
