Predicting the molecular complexity of sequencing libraries

Nature Methods volume 10, pages 325–327 (2013)

Abstract

Predicting the molecular complexity of a genomic sequencing library is a critical but difficult problem in modern sequencing applications. Methods to determine how deeply to sequence to achieve complete coverage or to predict the benefits of additional sequencing are lacking. We introduce an empirical Bayesian method to accurately characterize the molecular complexity of a DNA sample for almost any sequencing application on the basis of limited preliminary sequencing.

Acknowledgements

We thank S. Tavaré, M. Waterman, P. Calabrese, G. Hannon, and members of the Hannon lab and the Smith lab for their help, advice and input. This work was supported by US National Institutes of Health National Human Genome Research Institute grants (R01 HG005238 and P50 HG002790).

Author information

Authors and Affiliations

  1. Department of Mathematics, University of Southern California, Los Angeles, California, USA
    Timothy Daley
  2. Molecular and Computational Biology, University of Southern California, Los Angeles, California, USA
    Andrew D Smith

Authors

  1. Timothy Daley
  2. Andrew D Smith

Contributions

T.D. and A.D.S. designed the method, implemented the software, performed the analysis and wrote the manuscript.

Corresponding author

Correspondence to Andrew D Smith.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Cite this article

Daley, T., Smith, A. Predicting the molecular complexity of sequencing libraries. Nat Methods 10, 325–327 (2013). https://doi.org/10.1038/nmeth.2375
