Computational methods for discovering structural variation with next-generation sequencing
Iafrate, A.J. et al. Detection of large-scale variation in the human genome. Nat. Genet. 36, 949–951 (2004).
Tuzun, E. et al. Fine-scale structural variation of the human genome. Nat. Genet. 37, 727–732 (2005).
Korbel, J.O. et al. Paired-end mapping reveals extensive structural variation in the human genome. Science 318, 420–426 (2007). One of the first studies to use NGS data to detect structural variants, including using the linking signature for detecting insertions larger than the insert size.
Feuk, L., Carson, A.R. & Scherer, S.W. Structural variation in the human genome. Nat. Rev. Genet. 7, 85–97 (2006).
McCarroll, S.A. et al. Integrated detection and population-genetic analysis of SNPs and copy number variation. Nat. Genet. 40, 1166–1174 (2008).
Cooper, G.M., Zerr, T., Kidd, J.M., Eichler, E.E. & Nickerson, D.A. Systematic assessment of copy number variant detection via genome-wide SNP genotyping. Nat. Genet. 40, 1199–1203 (2008).
Lee, S., Cheran, E. & Brudno, M. A robust framework for detecting structural variations in a genome. Bioinformatics 24, i59–i67 (2008).
Bentley, D.R. et al. Accurate whole human genome sequencing using reversible terminator chemistry. Nature 456, 53–59 (2008). The first high-coverage NGS dataset of an individual. This dataset has been used in many subsequent studies.
Hormozdiari, F., Alkan, C., Eichler, E.E. & Sahinalp, S.C. Combinatorial algorithms for structural variation detection in high-throughput sequenced genomes. Genome Res. 19, 1270–1278 (2009). One of the first comprehensive tools for structural variant detection; supports most basic signatures and uses soft clustering.
Lee, S., Hormozdiari, F., Alkan, C. & Brudno, M. MoDIL: detecting small indels from clone-end sequencing with mixtures of distributions. Nat. Methods 6, 473–474 (2009). The first method to use a distribution-based clustering approach, allowing the detection of smaller indels, and explicitly modeling heterozygosity.
McCarroll, S.A. & Altshuler, D.M. Copy-number variation and association studies of human disease. Nat. Genet. 39, S37–S42 (2007).
Cooper, G.M., Nickerson, D.A. & Eichler, E.E. Mutational and selective effects on copy-number variants in the human genome. Nat. Genet. 39, S22–S29 (2007).
Volik, S. et al. End-sequence profiling: Sequence-based analysis of aberrant genomes. Proc. Natl. Acad. Sci. USA 100, 7696–7701 (2003).
Raphael, B.J., Volik, S., Collins, C. & Pevzner, P.A. Reconstructing tumor genome architectures. Bioinformatics 19 (suppl. 2), 162–171 (2003).
Singleton, A.B. et al. Alpha-synuclein locus triplication causes Parkinson's disease. Science 302, 841 (2003).
Kim, P.M. et al. Analysis of copy number variants and segmental duplications in the human genome: Evidence for a change in the process of formation in recent evolutionary history. Genome Res. 18, 1865–1874 (2008).
Pinkel, D. et al. High resolution analysis of DNA copy number variation using comparative genomic hybridization to microarrays. Nat. Genet. 20, 207–211 (1998).
Sebat, J. et al. Large-scale copy number polymorphism in the human genome. Science 305, 525–528 (2004).
International HapMap Consortium. The International HapMap Project. Nature 437, 1299–1320 (2005).
Conrad, D.F., Andrews, T.D., Carter, N.P., Hurles, M.E. & Pritchard, J.K. A high-resolution survey of deletion polymorphism in the human genome. Nat. Genet. 38, 75–81 (2006).
Hinds, D.A., Kloek, A.P., Jen, M., Chen, X. & Frazer, K.A. Common deletions and SNPs are in linkage disequilibrium in the human genome. Nat. Genet. 38, 82–85 (2006).
McCarroll, S.A. et al. Common deletion polymorphisms in the human genome. Nat. Genet. 38, 86–92 (2006).
Sindi, S. & Raphael, B. Identification and frequency estimation of inversion polymorphisms from haplotype data. in Research in Computational Molecular Biology: Proc. RECOMB 2009 vol. 5541 (ed. Batzoglou, S.) 418–433 (Springer, Berlin, 2009).
Bashir, A., Volik, S., Collins, C., Bafna, V. & Raphael, B.J. Evaluation of paired-end sequencing strategies for detection of genome rearrangements in cancer. PLOS Comput. Biol. 4, e1000051 (2008).
Campbell, P.J. et al. Identification of somatically acquired rearrangements in cancer using genome-wide massively parallel paired-end sequencing. Nat. Genet. 40, 722–729 (2008). The first study to use the DOC signatures in NGS data, detecting CNVs in tumor samples.
Ye, K., Schulz, M.H., Long, Q., Apweiler, R. & Ning, Z. Pindel: a pattern growth approach to detect breakpoints of large deletions and medium sized insertions from paired-end short reads. Bioinformatics published online, doi:10.1093/bioinformatics/btp394 (26 June 2009). A method that is able to detect indels with base-pair breakpoint resolution using NGS data, on the basis of the anchored split mapping signature.
Bailey, J.A. et al. Recent segmental duplications in the human genome. Science 297, 1003–1007 (2002).
Cheng, Z. et al. A genome-wide comparison of recent chimpanzee and human segmental duplications. Nature 437, 88–93 (2005).
Dohm, J.C., Lottaz, C., Borodina, T. & Himmelbauer, H. Substantial biases in ultra-short read data sets from high-throughput DNA sequencing. Nucleic Acids Res. 36, e105 (2008).
Harismendy, O. et al. Evaluation of next generation sequencing platforms for population targeted sequencing studies. Genome Biol. 10, R32 (2009).
Korbel, J.O. et al. PEMer: a computational framework with simulation-based error models for inferring genomic structural variants from massive paired-end sequencing data. Genome Biol. 10, R23 (2009).
McKernan, K.J. et al. Sequence and structural variation in a human genome uncovered by short-read, massively parallel ligation sequencing using two-base encoding. Genome Res. 19, 1527–1541; doi:10.1101/gr.091868.109 (22 June 2009).
Chen, K. et al. BreakDancer: An algorithm for high resolution mapping of genomic structural variation. Nat. Methods 6, 677–681; doi:10.1038/nmeth.1363 (9 August 2009).
Chiang, D.Y. et al. High-resolution mapping of copy-number alterations with massively parallel sequencing. Nat. Methods 6, 99–103 (2009).
Eid, J. et al. Real-time DNA sequencing from single polymerase molecules. Science 323, 133–138 (2009).
Wong, K.K. et al. A comprehensive analysis of common copy-number variations in the human genome. Am. J. Hum. Genet. 80, 91–104 (2007).
Locke, D.P. et al. Refinement of a chimpanzee pericentric inversion breakpoint to a segmental duplication cluster. Genome Biol. 4, R50 (2003).