A faster circular binary segmentation algorithm for the analysis of array CGH data
E S Venkatraman et al. Bioinformatics. 2007.
Abstract
Motivation: Array CGH technologies enable the simultaneous measurement of DNA copy number for thousands of sites on a genome. We developed the circular binary segmentation (CBS) algorithm to divide the genome into regions of equal copy number. The algorithm tests for change-points using a maximal t-statistic with a permutation reference distribution to obtain the corresponding P-value. The number of computations required for the maximal test statistic is O(N^2), where N is the number of markers. This makes the full permutation approach computationally prohibitive for the newer arrays that contain tens of thousands of markers and highlights the need for a faster algorithm.
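The quadratic cost described above can be illustrated with a minimal sketch. It is not the DNAcopy implementation: the function names `max_arc_statistic` and `permutation_p_value` are hypothetical, and the statistic is scaled by the overall sample standard deviation instead of a per-split pooled variance, which is a simplifying assumption. The sketch scans every arc (i, j) of the marker sequence, which is where the O(N^2) work per evaluation comes from, and then repeats that scan for each permutation.

```python
import numpy as np


def max_arc_statistic(x):
    """Return (max |z|, i, j) over all arcs x[i:j] versus the rest of x.

    Simplification (assumption): the mean difference is scaled by the
    overall sample standard deviation, not a per-split pooled variance.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    # s[i] is the sum of the first i values, so s[j] - s[i] is the arc sum.
    s = np.concatenate(([0.0], np.cumsum(x)))
    # All index pairs 0 <= i < j <= n: O(N^2) candidate arcs.
    i, j = np.triu_indices(n + 1, k=1)
    width = j - i
    keep = width < n  # drop the arc that covers everything
    i, j, width = i[keep], j[keep], width[keep]
    arc_sum = s[j] - s[i]
    diff = arc_sum / width - (s[n] - arc_sum) / (n - width)
    scale = x.std(ddof=1) * np.sqrt(1.0 / width + 1.0 / (n - width))
    z = np.abs(diff) / scale
    best = int(np.argmax(z))
    return float(z[best]), int(i[best]), int(j[best])


def permutation_p_value(x, n_perm=200, seed=0):
    """Full-permutation P-value for the maximal statistic.

    Each permutation repeats the O(N^2) scan, which is the cost that
    becomes prohibitive for arrays with tens of thousands of markers.
    """
    rng = np.random.default_rng(seed)
    observed, i, j = max_arc_statistic(x)
    exceed = 0
    for _ in range(n_perm):
        stat, _, _ = max_arc_statistic(rng.permutation(x))
        if stat >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_perm + 1), (i, j)
```

Calling `permutation_p_value(x)` on a vector of log-ratios returns an estimated P-value together with the arc boundaries that maximize the statistic. The total work grows as the number of permutations times N^2, which is why a linear-time P-value calculation matters for large arrays.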
Results: We present a hybrid approach to obtain the P-value of the test statistic in linear time. We also introduce a rule for stopping early when there is strong evidence for the presence of a change. We show through simulations that the hybrid approach provides a substantial gain in speed with only a negligible loss in accuracy and that the stopping rule further increases speed. We also present the analyses of array CGH data from breast cancer cell lines to show the impact of the new approaches on the analysis of real data.
Availability: An R version of the CBS algorithm has been implemented in the "DNAcopy" package of the Bioconductor project. The proposed hybrid method for the P-value is available in version 1.2.1 or higher and the stopping rule for declaring a change early is available in version 1.5.1 or higher.
Similar articles
- Robust smooth segmentation approach for array CGH data analysis.
Huang J, Gusnanto A, O'Sullivan K, Staaf J, Borg A, Pawitan Y. Bioinformatics. 2007 Sep 15;23(18):2463-9. doi: 10.1093/bioinformatics/btm359. Epub 2007 Jul 27. PMID: 17660206
- A fast and flexible method for the segmentation of aCGH data.
Ben-Yaacov E, Eldar YC. Bioinformatics. 2008 Aug 15;24(16):i139-45. doi: 10.1093/bioinformatics/btn272. PMID: 18689815
- A segmentation/clustering model for the analysis of array CGH data.
Picard F, Robin S, Lebarbier E, Daudin JJ. Biometrics. 2007 Sep;63(3):758-66. doi: 10.1111/j.1541-0420.2006.00729.x. PMID: 17825008
- Recent advances in array comparative genomic hybridization technologies and their applications in human genetics.
Lockwood WW, Chari R, Chi B, Lam WL. Eur J Hum Genet. 2006 Feb;14(2):139-48. doi: 10.1038/sj.ejhg.5201531. PMID: 16288307 Review.
- Array comparative genomic hybridization copy number profiling: a new tool for translational research in solid malignancies.
Costa JL, Meijer G, Ylstra B, Caldas C. Semin Radiat Oncol. 2008 Apr;18(2):98-104. doi: 10.1016/j.semradonc.2007.10.005. PMID: 18314064 Review.
Cited by
- Genomic profiling of circulating tumor DNA for childhood cancers.
Lei S, Jia S, Takalkar S, Chang TC, Ma X, Szlachta K, Xu K, Cheng Z, Hui Y, Koo SC, Mead PE, Gao Q, Kumar P, Bailey CP, Sunny J, Pappo AS, Federico SM, Robinson GW, Gajjar A, Rubnitz JE, Jeha S, Pui CH, Inaba H, Wu G, Klco JM, Tatevossian RG, Mullighan CG. Leukemia. 2024 Nov 10. doi: 10.1038/s41375-024-02461-x. Online ahead of print. PMID: 39523434
- Genomic Amplification of TBC1D31 Promotes Hepatocellular Carcinoma Through Reducing the Rab22A-Mediated Endolysosomal Trafficking and Degradation of EGFR.
Cao P, Chen H, Zhang Y, Zhang Q, Shi M, Han H, Wang X, Jin L, Guo B, Hao R, Zhao X, Li Y, Gao C, Liu X, Wang Y, Yang A, Yang C, Si A, Li H, Song Q, He F, Zhou G. Adv Sci (Weinh). 2024 Oct;11(40):e2405459. doi: 10.1002/advs.202405459. Epub 2024 Aug 29. PMID: 39206796 Free PMC article.
- DTDHM: detection of tandem duplications based on hybrid methods using next-generation sequencing data.
Yuan T, Dong J, Jia B, Jiang H, Zhao Z, Zhou M. PeerJ. 2024 Jul 26;12:e17748. doi: 10.7717/peerj.17748. eCollection 2024. PMID: 39076774 Free PMC article.
- A benchmark of computational methods for correcting biases of established and unknown origin in CRISPR-Cas9 screening data.
Vinceti A, Iannuzzi RM, Boyle I, Trastulla L, Campbell CD, Vazquez F, Dempster JM, Iorio F. Genome Biol. 2024 Jul 19;25(1):192. doi: 10.1186/s13059-024-03336-1. PMID: 39030569 Free PMC article.
- Deconstructing Intratumoral Heterogeneity through Multiomic and Multiscale Analysis of Serial Sections.
Schupp PG, Shelton SJ, Brody DJ, Eliscu R, Johnson BE, Mazor T, Kelley KW, Potts MB, McDermott MW, Huang EJ, Lim DA, Pieper RO, Berger MS, Costello JF, Phillips JJ, Oldham MC. Cancers (Basel). 2024 Jul 1;16(13):2429. doi: 10.3390/cancers16132429. PMID: 39001492 Free PMC article.