Detecting and annotating genetic variations using the HugeSeq pipeline
- Correspondence
- Published: 07 March 2012
- Cuiping Pan
- Michael J Clark
- Phil Lacroute
- Rui Chen
- Rajini Haraksingh
- Maeve O'Huallachain
- Mark B Gerstein
- Jeffrey M Kidd
- Carlos D Bustamante
- …
- Michael Snyder
Nature Biotechnology volume 30, pages 226–229 (2012)
To the Editor:
Deciphering genome sequences is important for mapping genetic diseases and predicting disease risk. Advances in high-throughput, short-read DNA sequencing technologies have enabled rapid sequencing of entire human genomes and unlocked the potential for comprehensive identification of the genetic variations they carry. Various computational algorithms for identifying and characterizing variants have been developed; however, most of these methods are neither integrated nor interoperable, making it difficult for biologists to extract all the genetic information from the billions of sequences these technologies generate. Here, we present HugeSeq, an integrated computational pipeline that fully automates variant analysis, from alignment of the genomic sequences through detection and annotation of all types of genetic variation: single nucleotide polymorphisms (SNPs), short insertions or deletions (indels) and larger structural variations (SVs).
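To make the three variant classes named above concrete, the following is a minimal sketch of how a variant could be labeled from its reference and alternate alleles. It is not HugeSeq's implementation: the function names `classify_variant` and `summarize` are hypothetical, and the 50 bp cutoff separating indels from SVs is an assumption chosen for illustration, not a value taken from the letter.

```python
from collections import Counter

# Assumed size cutoff (in base pairs) between short indels and larger SVs.
# This value is illustrative only and does not come from the source text.
SV_MIN_LEN = 50


def classify_variant(ref: str, alt: str, sv_min_len: int = SV_MIN_LEN) -> str:
    """Label a variant as 'SNP', 'indel', 'SV' or 'other' from its alleles."""
    if len(ref) == 1 and len(alt) == 1:
        # A single-base change is a SNP; identical alleles are not a variant.
        return "SNP" if ref != alt else "other"
    size = abs(len(ref) - len(alt))
    if size == 0:
        # Same-length multi-base substitution: none of the three classes.
        return "other"
    return "SV" if size >= sv_min_len else "indel"


def summarize(variants):
    """Count variants per class for an iterable of (ref, alt) pairs."""
    return Counter(classify_variant(ref, alt) for ref, alt in variants)


if __name__ == "__main__":
    calls = [("A", "G"), ("A", "ATG"), ("ATG", "A"), ("A", "A" + "T" * 60)]
    print(summarize(calls))
```

Real callers report SVs with richer evidence (read pairs, split reads, read depth) than allele length alone, so a length-based label like this is only a rough first pass.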
Acknowledgements
We acknowledge support from the US National Institutes of Health. We also thank K. Ye, K. Chen and A. Abyzov for helpful discussions.
Author information
Author notes
- Hugo Y K Lam
Present address: Personalis, Inc., Palo Alto, California, USA.
Authors and Affiliations
- Department of Genetics, Stanford University, Stanford, California, USA
Hugo Y K Lam, Cuiping Pan, Michael J Clark, Phil Lacroute, Rui Chen, Rajini Haraksingh, Maeve O'Huallachain, Jeffrey M Kidd, Carlos D Bustamante & Michael Snyder
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, USA
Mark B Gerstein
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, USA
Mark B Gerstein
- Department of Computer Science, Yale University, New Haven, Connecticut, USA
Mark B Gerstein
Corresponding author
Correspondence to Michael Snyder.
Ethics declarations
Competing interests
M.S. is a scientific advisory board member for Genapsys, Inc.; a scientific advisory board member and cofounder of Personalis, Inc.; and a scientific advisory board member for DNA Nexus.
About this article
Cite this article
Lam, H., Pan, C., Clark, M. et al. Detecting and annotating genetic variations using the HugeSeq pipeline. Nat Biotechnol 30, 226–229 (2012). https://doi.org/10.1038/nbt.2134
- Issue Date: March 2012