Accurate de novo and transmitted indel detection in exome-capture data using microassembly

Nature Methods volume 11, pages 1033–1036 (2014)

Abstract

We present an open-source algorithm, Scalpel (http://scalpel.sourceforge.net/), which combines mapping and assembly for sensitive and specific discovery of insertions and deletions (indels) in exome-capture data. A detailed repeat analysis coupled with a self-tuning _k_-mer strategy allows Scalpel to outperform other state-of-the-art approaches for indel discovery, particularly in regions containing near-perfect repeats. We analyzed 593 families from the Simons Simplex Collection and demonstrated Scalpel's power to detect long (≥30 bp) transmitted events and enrichment for de novo likely gene-disrupting indels in autistic children.
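To make the idea of a self-tuning _k_-mer strategy concrete, the following is a minimal sketch and not Scalpel's actual implementation. It assumes one plausible reading of the abstract: for a given sequence window, _k_ is increased until no _k_-mer occurs more than once, so that repeats shorter than _k_ no longer create ambiguity. The function names (`has_repeated_kmer`, `choose_k`) and the parameters (`k_min`, `k_max`, `step`) are hypothetical and chosen only for illustration.

```python
from typing import Optional


def has_repeated_kmer(seq: str, k: int) -> bool:
    """Return True if any k-mer occurs more than once in seq."""
    seen = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in seen:
            return True
        seen.add(kmer)
    return False


def choose_k(window: str, k_min: int = 3, k_max: int = 15,
             step: int = 2) -> Optional[int]:
    """Return the smallest k in [k_min, k_max], stepping by `step`,
    for which the window contains no repeated k-mer.

    Returns None if every tested k still leaves a repeat.
    All default values are illustrative assumptions.
    """
    for k in range(k_min, k_max + 1, step):
        if k > len(window):
            break  # a k-mer cannot be longer than the window
        if not has_repeated_kmer(window, k):
            return k
    return None
```

For example, `choose_k("ACGTACGTTT")` skips `k = 3` because `ACG` occurs twice and settles on `k = 5`, whereas a homopolymer run such as `"AAAAAAAA"` with `k_max=7` yields `None`, which signals that the window cannot be resolved within the allowed range of _k_.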

Acknowledgements

The project was supported in part by the US National Institutes of Health (R01-HG006677) and US National Science Foundation (DBI-1350041) to M.C.S. and by the Cold Spring Harbor Laboratory (CSHL) Cancer Center Support Grant (5P30CA045508), the Stanley Institute for Cognitive Genomics and the Simons Foundation (SF51 and SF235988) to M.W. The DNA samples used in this work are included within SSC release 13. Approved researchers can obtain the SSC population data set described in this study by applying at https://base.sfari.org/. We thank S. Eskipehlivan for the technical assistance with the MiSeq validation experiments. We thank M. Bekritsky, S. Neuburgerand, M. Ronemus, D. Levy, B. Yamron and B. Mishra for helpful discussions and comments on the paper. We thank R. Aboukhalil for testing the software.

Author information

Authors and Affiliations

  1. Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, USA
    Giuseppe Narzisi, Ivan Iossifov, Yoon-ha Lee, Zihua Wang, Michael Wigler & Michael C Schatz
  2. New York Genome Center, New York, USA
    Giuseppe Narzisi
  3. Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, USA
    Jason A O'Rawe, Han Fang, Yiyang Wu & Gholson J Lyon
  4. Stony Brook University, Stony Brook, New York, USA
    Jason A O'Rawe, Han Fang, Yiyang Wu & Gholson J Lyon

Authors

  1. Giuseppe Narzisi
  2. Jason A O'Rawe
  3. Ivan Iossifov
  4. Han Fang
  5. Yoon-ha Lee
  6. Zihua Wang
  7. Yiyang Wu
  8. Gholson J Lyon
  9. Michael Wigler
  10. Michael C Schatz

Contributions

G.N. developed the software and conducted the computational experiments. G.N. and M.C.S. designed and analyzed the experiments. Y.W. assisted in designing the primers and performed the MiSeq validation experiments. J.A.O. designed the primers and analyzed the MiSeq data. H.F. and J.A.O. assisted with the computational experiments for the comparative analysis between different variant-detection pipelines. G.J.L. planned and supervised the experimental design for indel validation. Z.W. designed the primers and performed experiments for the validation of de novo and transmitted indels in the SSC. I.I., Y.-h.L. and M.W. assisted with the analysis of the SSC. G.N. and M.C.S. wrote the manuscript with input from all authors. All of the authors have read and approved the final manuscript.

Corresponding author

Correspondence to Giuseppe Narzisi.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Cite this article

Narzisi, G., O'Rawe, J., Iossifov, I. et al. Accurate de novo and transmitted indel detection in exome-capture data using microassembly. Nat Methods 11, 1033–1036 (2014). https://doi.org/10.1038/nmeth.3069
