Identification of genetic variants using bar-coded multiplexed sequencing

Nature Methods volume 5, pages 887–893 (2008)

Abstract

We developed a generalized framework for multiplexed resequencing of targeted human genome regions on the Illumina Genome Analyzer using degenerate indexed DNA bar codes ligated to fragmented DNA before sequencing. Using this method, we simultaneously sequenced the DNA of multiple HapMap individuals at several Encyclopedia of DNA Elements (ENCODE) regions. We then evaluated the use of Bayes factors for discovering and genotyping polymorphisms. For polymorphisms that were either previously identified within the Single Nucleotide Polymorphism database (dbSNP) or visually evident upon re-inspection of archived ENCODE traces, we observed a false positive rate of 11.3% using strict thresholds for predicting variants and 69.6% for lax thresholds. Conversely, false negative rates were 10.8–90.8%, with false negatives at stricter cut-offs occurring at lower coverage (<10 aligned reads). These results suggest that >90% of genetic variants are discoverable using multiplexed sequencing provided sufficient coverage at the polymorphic base.
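
The interplay between threshold strictness and coverage described above can be illustrated with a toy calculation. The sketch below is not the authors' pipeline. It assumes a simplified two-hypothesis model (heterozygous variant versus homozygous reference), a fixed per-base error rate of 1%, and illustrative strict and lax cut-offs on the log10 Bayes factor. The function names, the error rate, and both threshold values are hypothetical. With two point hypotheses the Bayes factor reduces to a likelihood ratio, and the binomial coefficient cancels.

```python
from math import log10

# Hypothetical cut-offs on the log10 Bayes factor; not taken from the paper.
STRICT_THRESHOLD = 3.0
LAX_THRESHOLD = 1.0


def log10_bayes_factor(alt_reads: int, depth: int, error_rate: float = 0.01) -> float:
    """log10 Bayes factor for 'heterozygous variant' versus 'homozygous reference'.

    Assumes reads are independent, that half the reads carry the alternate
    allele at a heterozygous site, and that alternate reads at a reference
    site arise only from sequencing error at the given (assumed) rate.
    """
    if not 0 <= alt_reads <= depth:
        raise ValueError("alt_reads must lie between 0 and depth")
    ref_reads = depth - alt_reads
    # Each alternate read favours the variant model by 0.5 / error_rate;
    # each reference read favours the reference model by (1 - error_rate) / 0.5.
    return (alt_reads * log10(0.5 / error_rate)
            + ref_reads * log10(0.5 / (1.0 - error_rate)))


def call_variant(alt_reads: int, depth: int, threshold: float) -> bool:
    """Report a variant when the evidence exceeds the chosen threshold."""
    return log10_bayes_factor(alt_reads, depth) > threshold


if __name__ == "__main__":
    # Same 50% alternate-allele fraction at shallow and deeper coverage.
    for alt, depth in [(1, 2), (5, 10), (10, 20)]:
        print(depth,
              round(log10_bayes_factor(alt, depth), 2),
              call_variant(alt, depth, STRICT_THRESHOLD),
              call_variant(alt, depth, LAX_THRESHOLD))
```

Under these assumptions, a true heterozygous site covered by only two reads clears the lax cut-off but not the strict one, which is one way a strict threshold can produce false negatives at low coverage while a lax threshold admits more false positives.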

Acknowledgements

We acknowledge funding from the state of Arizona, the US National Heart, Lung, and Blood Institute (U01 HL086528), the Stardust Foundation, Science Foundation Arizona, and the National Institute of Neurological Disorders and Stroke (R01 N5059873).

Author information

Author notes

  1. David W Craig, John V Pearson and Szabolcs Szelinger: These authors contributed equally to this work.

Authors and Affiliations

  1. The Translational Genomics Research Institute, 445 N. 5th St. 5th Floor, Phoenix, 85004, Arizona, USA
    David W Craig, John V Pearson, Szabolcs Szelinger, Aswin Sekar, Margot Redman, Jason J Corneveaux, Traci L Pawlowski, Trisha Laub, Dietrich A Stephan, Nils Homer & Matthew J Huentelman
  2. Illumina, 9885 Town Centre Drive, San Diego, 92121, California, USA
    Gary Nunn

Contributions

D.W.C., J.V.P., M.J.H., G.N. and D.A.S. contributed to initial experimental design. S.S., A.S., M.R., J.J.C., T.L. and T.L.P. contributed to development and execution of exact experimental protocols. J.V.P., D.W.C. and N.H. contributed to the development of bioinformatics and analysis pipelines.

Corresponding author

Correspondence to David W Craig.

Ethics declarations

Competing interests

G.N. is an employee of Illumina.

Cite this article

Craig, D., Pearson, J., Szelinger, S. et al. Identification of genetic variants using bar-coded multiplexed sequencing. Nat Methods 5, 887–893 (2008). https://doi.org/10.1038/nmeth.1251
