Targeted capture and massively parallel sequencing of 12 human exomes

Nature volume 461, pages 272–276 (2009)

Abstract

Genome-wide association studies suggest that common genetic variants explain only a modest fraction of heritable risk for common diseases, raising the question of whether rare variants account for a significant fraction of unexplained heritability¹,². Although DNA sequencing costs have fallen markedly³, they remain far from what is necessary for rare and novel variants to be routinely identified at a genome-wide scale in large cohorts. We have therefore sought to develop second-generation methods for targeted sequencing of all protein-coding regions (‘exomes’), to reduce costs while enriching for discovery of highly penetrant variants. Here we report on the targeted capture and massively parallel sequencing of the exomes of 12 humans. These include eight HapMap individuals representing three populations⁴, and four unrelated individuals with a rare dominantly inherited disorder, Freeman–Sheldon syndrome (FSS)⁵. We demonstrate the sensitive and specific identification of rare and common variants in over 300 megabases of coding sequence. Using FSS as a proof-of-concept, we show that candidate genes for Mendelian disorders can be identified by exome sequencing of a small number of unrelated, affected individuals. This strategy may be extendable to diseases with more complex genetics through larger sample sizes and appropriate weighting of non-synonymous variants by predicted functional impact.
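
The abstract does not spell out how candidate genes are narrowed down, so the following is only a minimal sketch of one plausible reading: keep, for each affected individual, the genes that carry at least one non-synonymous variant absent from a set of previously known variants, then intersect those gene sets across individuals. The function name, the record fields (`id`, `gene`, `nonsynonymous`), the gene labels and the toy data are all hypothetical and are not taken from the paper.

```python
def shared_candidate_genes(per_individual_variants, known_variant_ids):
    """Return genes with a novel non-synonymous variant in every individual.

    per_individual_variants: dict mapping an individual's label to a list of
        variant records (dicts with hypothetical keys 'id', 'gene',
        'nonsynonymous').
    known_variant_ids: set of variant ids treated as already known (for
        example, common variants seen in unaffected controls); this filter
        is an assumption of the sketch.
    """
    gene_sets = []
    for variants in per_individual_variants.values():
        # Genes in which this individual has a novel, non-synonymous variant.
        genes = {
            v["gene"]
            for v in variants
            if v["nonsynonymous"] and v["id"] not in known_variant_ids
        }
        gene_sets.append(genes)
    if not gene_sets:
        return set()
    # A candidate gene must be hit in every affected individual.
    return set.intersection(*gene_sets)


# Toy, made-up data for three affected individuals.
toy_cases = {
    "case1": [
        {"id": "v1", "gene": "GENE_A", "nonsynonymous": True},
        {"id": "v2", "gene": "GENE_B", "nonsynonymous": True},
        {"id": "v3", "gene": "GENE_C", "nonsynonymous": False},
    ],
    "case2": [
        {"id": "v4", "gene": "GENE_A", "nonsynonymous": True},
        {"id": "v2", "gene": "GENE_B", "nonsynonymous": True},
    ],
    "case3": [
        {"id": "v5", "gene": "GENE_A", "nonsynonymous": True},
        {"id": "v6", "gene": "GENE_C", "nonsynonymous": True},
    ],
}
toy_known = {"v2"}  # variant ids assumed to be already catalogued

candidates = shared_candidate_genes(toy_cases, toy_known)
```

Note that the intersection is taken at the gene level rather than the variant level, so unrelated individuals carrying different variants in the same gene still point to that gene. Weighting variants by predicted functional impact, as the abstract suggests for more complex diseases, would replace the hard set intersection with a score and is not attempted here.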

References

  1. Cohen, J. C. et al. Multiple rare alleles contribute to low plasma levels of HDL cholesterol. Science 305, 869–872 (2004)
  2. Frazer, K. A., Murray, S. S., Schork, N. J. & Topol, E. J. Human genetic variation and its contribution to complex traits. Nature Rev. Genet. 10, 241–251 (2009)
  3. Shendure, J. & Ji, H. Next-generation DNA sequencing. Nature Biotechnol. 26, 1135–1145 (2008)
  4. The International HapMap Consortium. A haplotype map of the human genome. Nature 437, 1299–1320 (2005)
  5. Toydemir, R. M. et al. Mutations in embryonic myosin heavy chain (MYH3) cause Freeman-Sheldon syndrome and Sheldon-Hall syndrome. Nature Genet. 38, 561–565 (2006)
  6. Sjoblom, T. et al. The consensus coding sequences of human breast and colorectal cancers. Science 314, 268–274 (2006)
  7. Olson, M. Enrichment of super-sized resequencing targets from the human genome. Nature Methods 4, 891–892 (2007)
  8. Hodges, E. et al. Genome-wide in situ exon capture for selective resequencing. Nature Genet. 39, 1522–1527 (2007)
  9. National Center for Biotechnology Information. Consensus CDS protein set <http://www.ncbi.nlm.nih.gov/projects/CCDS> (2009)
  10. Ng, P. C. et al. Genetic variation in an individual human exome. PLoS Genet. 4, e1000160 (2008)
  11. Kidd, J. M. et al. Mapping and sequencing of structural variation from eight human genomes. Nature 453, 56–64 (2008)
  12. Bentley, D. R. et al. Accurate whole human genome sequencing using reversible terminator chemistry. Nature 456, 53–59 (2008)
  13. Li, H., Ruan, J. & Durbin, R. Mapping short DNA sequencing reads and calling variants using mapping quality scores. Genome Res. 18, 1851–1858 (2008)
  14. Campbell, P. J. et al. Identification of somatically acquired rearrangements in cancer using genome-wide massively parallel paired-end sequencing. Nature Genet. 40, 722–729 (2008)
  15. Ewing, B. & Green, P. Base-calling of automated sequencer traces using Phred. II. Error probabilities. Genome Res. 8, 186–194 (1998)
  16. Turner, E. H., Lee, C., Ng, S. B. & Shendure, J. Massively parallel exon capture and library-free resequencing across 16 individuals. Nature Methods 6, 315–316 (2009)
  17. Kidd, J. M. et al. Haplotype sorting using human fosmid clone end-sequence pairs. Genome Res. 18, 2016–2023 (2008)
  18. Albert, T. J. et al. Direct selection of human genomic loci by microarray hybridization. Nature Methods 4, 903–905 (2007)
  19. Wheeler, D. A. et al. The complete genome of an individual by massively parallel DNA sequencing. Nature 452, 872–876 (2008)
  20. Wang, J. et al. The diploid genome sequence of an Asian individual. Nature 456, 60–65 (2008)
  21. Levy, S. et al. The diploid genome sequence of an individual human. PLoS Biol. 5, e254 (2007)
  22. Ley, T. J. et al. DNA sequencing of a cytogenetically normal acute myeloid leukaemia genome. Nature 456, 66–72 (2008)
  23. Boyko, A. R. et al. Assessing the evolutionary impact of amino acid mutations in the human genome. PLoS Genet. 4, e1000083 (2008)
  24. Sunyaev, S. et al. Prediction of deleterious human alleles. Hum. Mol. Genet. 10, 591–597 (2001)
  25. Yngvadottir, B. et al. A genome-wide survey of the prevalence and evolutionary forces acting on human nonsense SNPs. Am. J. Hum. Genet. 84, 224–234 (2009)
  26. Olson, M. V. When less is more: gene loss as an engine of evolutionary change. Am. J. Hum. Genet. 64, 18–23 (1999)
  27. Cohen, J. et al. Low LDL cholesterol in individuals of African descent resulting from frequent nonsense mutations in PCSK9. Nature Genet. 37, 161–165 (2005)
  28. Jones, S. et al. Exomic sequencing identifies PALB2 as a pancreatic cancer susceptibility gene. Science 324, 217 (2009)
  29. Siva, N. 1000 Genomes project. Nature Biotechnol. 26, 256 (2008)
  30. Kryukov, G. V., Shpunt, A., Stamatoyannopoulos, J. A. & Sunyaev, S. R. Power of deep, all-exon resequencing for discovery of human trait genes. Proc. Natl Acad. Sci. USA 106, 3871–3876 (2009)

Acknowledgements

For discussions or assistance with genotyping data, we thank P. Green, J. Akey, R. Patwardhan, G. Cooper, J. Kidd, D. Gordon, J. Smith, I. Stanaway and M. Rieder. For assistance with project management, computation, data management and submission, we thank E. Torskey, S. Thompson, T. Amburg, B. McNally, S. Hearsey, M. Shumway and L. Hillier. For Human1M-Duo genotype data on HapMap samples, we thank Illumina. Our work was supported in part by grants from the National Institutes of Health/National Heart Lung and Blood Institute, the National Institutes of Health/National Human Genome Research Institute, National Institutes of Health/National Institute of Child Health and Human Development, and the Washington Research Foundation. S.B.N. is supported by the Agency for Science, Technology and Research, Singapore. E.H.T. and A.W.B. are supported by a training fellowship from the National Institutes of Health/National Human Genome Research Institute. E.E.E. is an investigator of the Howard Hughes Medical Institute.

Author Contributions

The project was conceived and experiments planned by S.B.N., E.H.T., A.B., E.E.E., M.B., D.A.N. and J.S. Experiments were performed by S.B.N., E.H.T., C.L. and M.W. Algorithm development and data analysis were performed by S.B.N., P.D.R., S.D.F., A.W.B., T.S., M.B., D.A.N. and J.S. The manuscript was written by S.B.N. and J.S. All aspects of the study were supervised by J.S.

Author information

Authors and Affiliations

  1. Department of Genome Sciences
    Sarah B. Ng, Emily H. Turner, Peggy D. Robertson, Steven D. Flygare, Choli Lee, Tristan Shaffer, Michelle Wong, Evan E. Eichler, Deborah A. Nickerson & Jay Shendure
  2. Department of Pediatrics, University of Washington
    Abigail W. Bigham & Michael Bamshad
  3. Howard Hughes Medical Institute, Seattle, Washington 98195, USA
    Evan E. Eichler
  4. Agilent Technologies, Santa Clara, California 95051, USA
    Arindam Bhattacharjee

Authors

  1. Sarah B. Ng
  2. Emily H. Turner
  3. Peggy D. Robertson
  4. Steven D. Flygare
  5. Abigail W. Bigham
  6. Choli Lee
  7. Tristan Shaffer
  8. Michelle Wong
  9. Arindam Bhattacharjee
  10. Evan E. Eichler
  11. Michael Bamshad
  12. Deborah A. Nickerson
  13. Jay Shendure

Corresponding authors

Correspondence to Sarah B. Ng or Jay Shendure.

Ethics declarations

Competing interests

A.B. is an employee of Agilent Technologies. Agilent supplies arrays that can be used for exome capture as described.

Additional information

The authors declare competing financial interests: details accompany the full-text HTML version of the paper at www.nature.com/nature.

Supplementary information

Supplementary Information

This file contains Supplementary Figures 1-6 with Legends and Supplementary Tables 1-5. (PDF 161 kb)

Supplementary Data 1

This file lists intervals within the targeted exome that were excluded from consideration based on poor anticipated mappability with 76 bp single-end reads. (TXT 211 kb)

Supplementary Data 2

This file lists the fraction of targeted coding bases in each gene that were covered in each of 12 individuals (either with >=1x coverage or with sufficient coverage to variant call). (TXT 2828 kb)

Cite this article

Ng, S., Turner, E., Robertson, P. et al. Targeted capture and massively parallel sequencing of 12 human exomes. Nature 461, 272–276 (2009). https://doi.org/10.1038/nature08250
