TumorBoost: normalization of allele-specific tumor copy numbers from a single pair of tumor-normal genotyping microarrays

Henrik Bengtsson et al. BMC Bioinformatics. 2010.

Abstract

Background: High-throughput genotyping microarrays assess both total DNA copy number and allelic composition, which makes them a tool of choice for copy number studies in cancer, including total copy number and loss of heterozygosity (LOH) analyses. Even after state-of-the-art preprocessing, allelic signal estimates from genotyping arrays still suffer from systematic effects that make them difficult to use effectively for such downstream analyses.

Results: We propose a method, TumorBoost, for normalizing allelic estimates of one tumor sample based on estimates from a single matched normal. The method applies to any paired tumor-normal estimates from any microarray-based technology, combined with any preprocessing method. We demonstrate that it increases the signal-to-noise ratio of allelic signals, making it significantly easier to detect allelic imbalances.

Conclusions: TumorBoost increases the power to detect somatic copy-number events (including copy-neutral LOH) in the tumor from allelic signals of Affymetrix or Illumina origin. We also conclude that high-precision allelic estimates can be obtained from a single pair of tumor-normal hybridizations, if TumorBoost is combined with single-array preprocessing methods such as (allele-specific) CRMA v2 for Affymetrix or BeadStudio's (proprietary) XY-normalization method for Illumina. A bounded-memory implementation is available in the open-source and cross-platform R package aroma.cn, which is part of the Aroma Project (http://www.aroma-project.org/).
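The normalization idea summarized in the Results can be illustrated with a minimal Python sketch. It is not the aroma.cn implementation. It assumes a per-SNP correction in which the deviation of the normal allele B fraction from its expected genotype value (0, 1/2 or 1) is subtracted from the tumor allele B fraction, with a rescaling of that deviation at heterozygous SNPs. The function names, the fixed genotype thresholds (1/3 and 2/3) and the scaling rule are assumptions made for illustration only.

```python
from typing import List, Sequence


def call_naive_genotypes(beta_n: Sequence[float],
                         low: float = 1 / 3,
                         high: float = 2 / 3) -> List[float]:
    """Return the expected normal allele B fraction (0, 0.5 or 1) per SNP.

    Assumption: fixed thresholds stand in for a data-driven genotype caller.
    """
    mus = []
    for b in beta_n:
        if b < low:
            mus.append(0.0)   # called AA
        elif b > high:
            mus.append(1.0)   # called BB
        else:
            mus.append(0.5)   # called AB
    return mus


def normalize_tumor_baf(beta_t: Sequence[float],
                        beta_n: Sequence[float],
                        mu_n: Sequence[float]) -> List[float]:
    """Subtract the normal's per-SNP deviation from the tumor allele B fraction.

    Assumption: at heterozygous SNPs the deviation is rescaled so that the
    corrected value moves toward, but not past, the nearest boundary (0 or 1).
    """
    normalized = []
    for bt, bn, mu in zip(beta_t, beta_n, mu_n):
        delta = bn - mu       # systematic deviation seen in the normal
        scale = 1.0
        if mu == 0.5:
            if bt < bn and bn > 0:
                scale = bt / bn
            elif bt >= bn and bn < 1:
                scale = (1 - bt) / (1 - bn)
        normalized.append(bt - scale * delta)
    return normalized


if __name__ == "__main__":
    beta_n = [0.05, 0.55, 0.95]   # made-up normal allele B fractions
    beta_t = [0.08, 0.80, 0.93]   # made-up tumor allele B fractions
    mu_n = call_naive_genotypes(beta_n)
    print(normalize_tumor_baf(beta_t, beta_n, mu_n))
```

In this sketch, a heterozygous SNP whose tumor and normal allele B fractions are identical is mapped exactly to 1/2, which is the behavior one would want in a region where the tumor has no allelic imbalance.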

Figures

Figure 1

Genomic signals from genotyping microarrays in two chromosomal regions. Total (relative) copy numbers (a, e) and allele B fractions for the normal (b, f), the tumor (c, g) and the normalized tumor (d, h) for all SNPs on chromosome 2 (left) and chromosome 10 (right) in sample TCGA-23-1027. Homozygous SNPs (SNPs genotyped as AA or BB) are in gray, and heterozygous SNPs (AB) in black. Data are from the Affymetrix platform.

Figure 2

Raw paired allele B fractions. Paired observed allele B fractions, (β_N, β_T), of tumor TCGA-23-1027 versus its matched normal in six regions of constant PCN for the tumor. Top panels: normal (left), gained (middle), and copy-neutral LOH (right) regions from chromosome 2. Bottom panels: normal (left), deleted (middle), and copy-neutral LOH (right) regions from chromosome 10. SNPs called homozygous (AA and BB) are in gray. Linear models were robustly fitted to the heterozygous SNPs above and below the diagonal (black lines). Black discs mark the center of each cloud.

Figure 3

Paired allele B fractions after TumorBoost normalization. Paired allele B fractions, (β_N, normalized β_T), and empirical densities of the raw (β_T; dashed) and the normalized (solid) allele B fractions for sample TCGA-23-1027. The same regions, SNPs and annotation as in Figure 2 are used.

Figure 4

Differences in (true) decrease in heterozygosity. Differences in (true) decrease in heterozygosity (for heterozygous SNPs) between different pairs of flanking PCN regions as a function of tumor purity (κ).

Figure 5

ROC evaluation (Chr 2). (a) Left panels: The region 108.0-140.0 Mb on Chr 2 in tumor-normal sample TCGA-23-1027 has a change point at approximately 124.0 Mb, which separates a normal diploid state from a gain. 1,171 loci in each of these two states are used for the evaluation. All 79 loci in the safety region have been excluded. (b) Right panels: The region 125.0-157.0 Mb on Chr 2 in tumor-normal sample TCGA-23-1027 has a change point at approximately 141.0 Mb, which separates a normal diploid state from a gain. 986 loci in each of these two states are used for the evaluation. All 64 loci in the safety region have been excluded. The top three rows show the total CNs (C), the raw heterozygous DHs (ρ), and the normalized heterozygous DHs, respectively. A 1000 kb safety region (dashed gray frame) around the change point is excluded from the evaluation. The full resolution data points are colored black and the binned (H = 4) ones are colored blue. The three panels in the bottom row show the ROC performance of the TCNs (dotted green) and the raw (dashed black) and normalized (solid red and dot-dashed blue for naive and population-based genotypes, respectively) DHs at the full resolution (H = 1; no binning), and after binning in non-overlapping windows of size H = 2 and H = 4 SNPs, respectively.

Figure 6

ROC evaluation (Chr 10). (a) Left panels: The region 80.0-109.0 Mb on Chr 10 in tumor-normal sample TCGA-23-1027 has a change point at approximately 94.0 Mb, which separates a normal diploid state from a deletion. 1,276 loci in each of these two states are used for the evaluation. All 53 loci in the safety region have been excluded. (b) Right panels: The region 106.5-113.5 Mb on Chr 10 in tumor-normal sample TCGA-23-1027 has a change point at approximately 110.0 Mb, which separates a copy-neutral LOH region from a deletion. 254 loci in each of these two states are used for the evaluation. All 59 loci in the safety region have been excluded. The outline is the same as in Figure 5.

Figure 7

Influence of TumorBoost normalization on allele-specific copy numbers. Allele-specific CNs, (C_TA, C_TB), of tumor TCGA-23-1027 before (top panels) and after (bottom panels) TumorBoost normalization in a normal region (column 1), in a copy-neutral LOH region (column 2), in a gain (column 3), and in a deletion (column 4). These are some of the regions in Figure 2 using the same SNPs and annotations.
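
The relationship plotted in Figure 4, between tumor purity (κ) and the difference in true decrease in heterozygosity (DH) across two flanking PCN regions, can be explored with a small Python sketch. It assumes that DH at a heterozygous SNP is defined as 2|β - 1/2|, that the contaminating normal cells are diploid with PCN (1, 1), and that there is no measurement noise. The function names are hypothetical and the mixture model is an illustration, not the paper's code.

```python
from typing import Tuple


def true_dh(c1: float, c2: float, kappa: float) -> float:
    """Noise-free DH at a heterozygous SNP for tumor PCN (c1, c2).

    Assumption: a fraction kappa of cells are tumor cells with parental copy
    numbers (c1, c2); the rest are normal cells with PCN (1, 1).
    """
    if not 0.0 <= kappa <= 1.0:
        raise ValueError("kappa must be in [0, 1]")
    total = kappa * (c1 + c2) + 2.0 * (1.0 - kappa)   # total copies in the mix
    if total == 0:
        raise ValueError("no DNA at this locus (pure tumor, zero copies)")
    # beta = (kappa * c_B + (1 - kappa)) / total, so 2 * |beta - 1/2| is:
    return kappa * abs(c1 - c2) / total


def dh_difference(pcn_a: Tuple[float, float],
                  pcn_b: Tuple[float, float],
                  kappa: float) -> float:
    """Absolute difference in true DH between two flanking PCN regions."""
    return abs(true_dh(pcn_a[0], pcn_a[1], kappa)
               - true_dh(pcn_b[0], pcn_b[1], kappa))


if __name__ == "__main__":
    normal, gain, deletion, cn_loh = (1, 1), (1, 2), (0, 1), (0, 2)
    for kappa in (0.25, 0.5, 0.75, 1.0):
        print(kappa,
              round(dh_difference(normal, gain, kappa), 3),
              round(dh_difference(normal, deletion, kappa), 3),
              round(dh_difference(cn_loh, deletion, kappa), 3))
```

Under these assumptions the DH of a normal (1, 1) region is zero at every purity, so the separation between a normal region and an altered one shrinks as κ decreases.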
