Exome sequencing-based copy-number variation and loss of heterozygosity detection: ExomeCNV
Jarupon Fah Sathirapongsasuti et al. Bioinformatics. 2011.
Abstract
Motivation: The ability to detect copy-number variation (CNV) and loss of heterozygosity (LOH) from exome sequencing data extends the utility of this powerful approach that has mainly been used for point or small insertion/deletion detection.
Results: We present ExomeCNV, a statistical method to detect CNV and LOH using depth-of-coverage and B-allele frequencies, from mapped short sequence reads, and we assess both the method's power and the effects of confounding variables. We apply our method to a cancer exome resequencing dataset. As expected, accuracy and resolution are dependent on depth-of-coverage and capture probe design.
Availability: CRAN package 'ExomeCNV'.
Contact: fsathira@fas.harvard.edu; snelson@ucla.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
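The depth-of-coverage comparison described in the Results can be illustrated with a minimal sketch. It is not the ExomeCNV implementation: the function name `coverage_log_ratios`, the normalization by total depth, and the `pseudocount` value are assumptions made for illustration only.

```python
from math import log2

def coverage_log_ratios(tumor_depth, normal_depth, pseudocount=0.5):
    """Per-exon log2 ratio of tumor to normal coverage.

    Each sample is scaled by its own total depth (an assumed, simple
    normalization), and a hypothetical pseudocount avoids log(0).
    """
    tumor_total = sum(tumor_depth)
    normal_total = sum(normal_depth)
    ratios = []
    for t, n in zip(tumor_depth, normal_depth):
        t_norm = (t + pseudocount) / tumor_total
        n_norm = (n + pseudocount) / normal_total
        ratios.append(log2(t_norm / n_norm))
    return ratios

# Toy data: exon 2 looks like a loss, exon 4 like a gain.
tumor = [100, 50, 100, 150]
normal = [100, 100, 100, 100]
ratios = coverage_log_ratios(tumor, normal)
```

Under this sketch, a ratio near 0 suggests no copy-number change, a clearly negative ratio suggests a loss, and a clearly positive ratio suggests a gain. Any calling threshold would have to be chosen separately.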
Figures
Fig. 1.
Overview of ExomeCNV analysis workflows. Two workflows are presented: CNV detection and LOH detection. Each involves similar steps: exon-, position- or segment-wise CNV/LOH calling, Circular Binary Segmentation, and interval merging. User inputs and parameters are listed at each step.
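The interval-merging step named in the caption could look roughly like the sketch below. This is a naive merge of consecutive calls that share a chromosome and a state. The tuple layout, the state labels and the function name `merge_calls` are hypothetical, and ExomeCNV's actual merging rules are not described here.

```python
def merge_calls(calls):
    """Merge consecutive calls with the same chromosome and state.

    calls: list of (chrom, start, end, state) tuples, assumed to be
    sorted by chromosome and start position.
    """
    merged = []
    for chrom, start, end, state in calls:
        if merged and merged[-1][0] == chrom and merged[-1][3] == state:
            prev = merged[-1]
            # Extend the previous interval to cover this one.
            merged[-1] = (chrom, prev[1], max(prev[2], end), state)
        else:
            merged.append((chrom, start, end, state))
    return merged

calls = [
    ("chr1", 100, 200, "loss"),
    ("chr1", 300, 400, "loss"),
    ("chr1", 500, 600, "normal"),
    ("chr2", 100, 200, "normal"),
]
merged = merge_calls(calls)
```

In this toy input the two adjacent "loss" exons on chr1 collapse into one interval, while the "normal" calls on chr1 and chr2 stay separate because they lie on different chromosomes.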
Fig. 2.
Correlation of depth-of-coverage across exome sequencing samples. To demonstrate the consistency of capture and sequencing efficiency of individual exons, as represented by the depth-of-coverage per exon, the normalized individual exon coverage was plotted for all pairs of six independent exomes. All 6 samples were captured using the Agilent SureSelect Human All Exon G3362. Samples 1–5 had mean base coverages of 36–39× as a result of 2 (Samples 1–4) or 3 (Sample 5) lanes of GAIIx single-end sequencing per sample. Sample 6 had a mean base coverage of 60× as a result of 1 lane of GAIIx paired-end sequencing and demonstrates substantially different biases in individual exonic depth-of-coverage.
Fig. 3.
Examples of the power of ExomeCNV to detect segmental duplication, deletion and LOH based on an analytical calculation. Power is plotted relative to mean depth-of-coverage in the genomic segment, with false positives set to 1 per genome, based on an analytical model of genome-wide power of detection at different window sizes (inset, a–d). Windows are the total length of a given sequence at a given exon or the sum of the lengths of exons adjacent to each other in the genome. The effect of admixture (rate of 30%) on the power to detect deletions and single-copy duplications is shown in (c) and (d), respectively. (e) plots the power of LOH detection versus depth-of-coverage at an individual polymorphic position (single base pair) with variable admixture rates (inset). The periodicity of the power curve is due to the discrete nature of the binomial test. The 35× depth-of-coverage is chosen because it is a typical minimal average depth-of-coverage for exome sequencing and thus gives a conservative view of power within typical exome sequencing datasets.
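The relationship between depth, admixture and the power of a binomial test at a single position can be explored with the sketch below. It rests on several assumptions that are not taken from the paper: a two-sided exact binomial test against a B-allele frequency of 0.5, copy-neutral LOH (so the expected BAF is 1 - admixture/2), and a hypothetical per-position `alpha` of 0.05 rather than the genome-wide false-positive setting used in the figure.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def two_sided_pvalue(k, n):
    """Exact two-sided p-value for H0: BAF = 0.5 (doubled smaller tail)."""
    lower = sum(binom_pmf(i, n, 0.5) for i in range(0, k + 1))
    upper = sum(binom_pmf(i, n, 0.5) for i in range(k, n + 1))
    return min(1.0, 2 * min(lower, upper))

def loh_power(depth, admixture, alpha=0.05):
    """Power to reject BAF = 0.5 at one heterozygous position.

    Assumes copy-neutral LOH: tumor cells carry only the B allele and a
    fraction `admixture` of normal cells contributes BAF = 0.5.
    """
    expected_baf = 1.0 - admixture / 2.0
    return sum(
        binom_pmf(k, depth, expected_baf)
        for k in range(depth + 1)
        if two_sided_pvalue(k, depth) <= alpha
    )

# Power at a few depths for 30% admixture (toy exploration).
powers = {depth: loh_power(depth, 0.3) for depth in (10, 20, 35, 50)}
```

Because the rejection region can only change in whole read counts, the achieved test size jumps as depth increases, which is one way to see why a power curve from a discrete test is not perfectly smooth.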
Fig. 4.
Analysis of melanoma and paired normal samples. Interpretation of deletion, duplication and LOH from exonic sequence data using ExomeCNV and plotted with Circos. The outermost ring shows the chromosome ideograms in a pter–qter orientation, clockwise, with the centromeres in red. From inside to outside, each data track represents (A) B-allele frequency (BAF) from the Omni-1 genotyping array with the region of LOH highlighted in blue underneath the track; (B) Log R Ratio (LRR) from the genotyping array with the region of gain highlighted in red and the region of loss highlighted in green; (C) BAF from ExomeCNV output from ~40× depth-of-coverage exome sequencing with the region of LOH highlighted in blue; (D) log ratio of tumor and normal depth-of-coverage with the segment mean shown as a red line, the region of gain highlighted in red and the region of loss highlighted in green. LOH and CNV for chromosomes X and Y were not called for the genotyping data because genoCN (the algorithm used to call CNV from Omni-1) is not designed to analyze chromosomes X and Y. The table in the middle summarizes the best achievable specificity and sensitivity of ExomeCNV in detecting CNV and LOH relative to CNV/LOH calls from the Omni-1 array assessment.
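Sensitivity and specificity relative to a reference call set, as summarized in the figure's table, can be computed along the lines of the sketch below. The per-unit boolean encoding (for example one value per exon, `True` meaning "called as altered") and the function name are assumptions for illustration; the paper's exact comparison procedure is not given here, and the values are toy data, not results.

```python
def sensitivity_specificity(predicted, truth):
    """Return (sensitivity, specificity) for paired boolean calls.

    predicted: calls from the method under test.
    truth: reference calls (for example, array-based calls).
    A value of None is returned where the denominator is zero.
    """
    tp = sum(1 for p, t in zip(predicted, truth) if p and t)
    fn = sum(1 for p, t in zip(predicted, truth) if not p and t)
    tn = sum(1 for p, t in zip(predicted, truth) if not p and not t)
    fp = sum(1 for p, t in zip(predicted, truth) if p and not t)
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity

# Toy example with five units.
predicted = [True, True, False, False, True]
truth = [True, False, False, False, True]
sens, spec = sensitivity_specificity(predicted, truth)
```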
Grants and funding
- 1RC2 HL101715/HL/NHLBI NIH HHS/United States
- RC2 HL101715/HL/NHLBI NIH HHS/United States
- R01 MH071852/MH/NIMH NIH HHS/United States
- P30 CA016042/CA/NCI NIH HHS/United States
- P30 CA16042/CA/NCI NIH HHS/United States
- P30 AR057230/AR/NIAMS NIH HHS/United States