Detecting and annotating genetic variations using the HugeSeq pipeline

Correspondence
Published: 07 March 2012
Hugo Y K Lam 1,
Cuiping Pan 1,
Michael J Clark 1,
Phil Lacroute 1,
Rui Chen 1,
Rajini Haraksingh 1,
Maeve O'Huallachain 1,
Mark B Gerstein 2,3,4,
Jeffrey M Kidd 1,
Carlos D Bustamante 1 &
Michael Snyder 1

Nature Biotechnology volume 30, pages 226–229 (2012)

To the Editor:

Deciphering genome sequences is important for the mapping of genetic diseases and prediction of their risks. Advances in high-throughput DNA sequencing technologies using short read lengths have enabled rapid sequencing of entire human genomes and unlocked the potential for comprehensive identification of their underlying genetic variations. Various computational algorithms for identifying and characterizing variants have been developed; however, most of these computational methods are neither integrated nor interoperable, making it difficult for biologists to extract all the genetic information from billions of sequences generated by these sequencing technologies. Here, we present HugeSeq, an integrated computational pipeline to fully automate the process of variant detection from alignment of these genomic sequences to detection and annotation of all types of genetic variations (single nucleotide polymorphisms (SNPs), short insertions or deletions (indels) and larger structural variations (SVs)).
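
As a rough illustration of the three variant classes named above, the following Python sketch sorts a variant into SNP, indel or SV from its reference and alternate alleles. The function name `classify_variant` and the 50-bp cutoff between indels and SVs are assumptions made for this example; the text above does not state a size threshold, and this is not the HugeSeq code itself.

```python
# Illustrative sketch only: names and the size cutoff are assumptions,
# not taken from the HugeSeq pipeline.
SV_MIN_LENGTH = 50  # assumed boundary (in bp) between indels and SVs


def classify_variant(ref: str, alt: str, sv_min_length: int = SV_MIN_LENGTH) -> str:
    """Classify a ref/alt allele pair as 'SNP', 'indel', 'SV' or 'other'."""
    if not ref or not alt:
        raise ValueError("ref and alt alleles must be non-empty")
    if ref == alt:
        raise ValueError("ref and alt alleles are identical; not a variant")

    # A single-base substitution is a SNP.
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"

    # Equal-length multi-base substitutions fit none of the three classes.
    size = abs(len(ref) - len(alt))
    if size == 0:
        return "other"

    # Length-changing variants are split by size into indels and SVs.
    return "SV" if size >= sv_min_length else "indel"


if __name__ == "__main__":
    for ref, alt in [("A", "G"), ("A", "ATG"), ("A" * 101, "A")]:
        print(len(ref), len(alt), classify_variant(ref, alt))
```

Passing a different `sv_min_length` changes where the indel/SV boundary falls, which is a matter of convention rather than something fixed by the text.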

Acknowledgements

We acknowledge support from the US National Institutes of Health. We also thank K. Ye, K. Chen and A. Abyzov for helpful discussions.

Author information

Author notes

Hugo Y K Lam
Present address: Personalis, Inc., Palo Alto, California, USA.

Authors and Affiliations

Department of Genetics, Stanford University, Stanford, California, USA
Hugo Y K Lam, Cuiping Pan, Michael J Clark, Phil Lacroute, Rui Chen, Rajini Haraksingh, Maeve O'Huallachain, Jeffrey M Kidd, Carlos D Bustamante & Michael Snyder
Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, USA
Mark B Gerstein
Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, USA
Mark B Gerstein
Department of Computer Science, Yale University, New Haven, Connecticut, USA
Mark B Gerstein

Corresponding author

Correspondence to Michael Snyder.

Ethics declarations

Competing interests

M.S. is a scientific advisory board member for Genapsys, Inc.; a scientific advisory board member and cofounder of Personalis, Inc.; and a scientific advisory board member for DNA Nexus.

Cite this article

Lam, H., Pan, C., Clark, M. et al. Detecting and annotating genetic variations using the HugeSeq pipeline. Nat Biotechnol 30, 226–229 (2012). https://doi.org/10.1038/nbt.2134

Issue Date: March 2012
DOI: https://doi.org/10.1038/nbt.2134