Cloud computing and the DNA data race

Nature Biotechnology volume 28, pages 691–693 (2010)

Given the accumulation of DNA sequence data sets at ever-faster rates, what are the key factors you should consider when using distributed and multicore computing systems for analysis?

Figure 1: Map-shuffle-scan framework used by Crossbow.


Acknowledgements

The authors were supported in part by US National Science Foundation grant IIS-0844494 and by US National Institutes of Health grant R01-LM006845.

Author information

Authors and Affiliations

  1. Michael C. Schatz and Steven L. Salzberg are at the Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland, USA.
  2. Ben Langmead is at the Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA.

Corresponding author

Correspondence to Michael C. Schatz.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

About this article

Cite this article

Schatz, M., Langmead, B. & Salzberg, S. Cloud computing and the DNA data race. Nat Biotechnol 28, 691–693 (2010). https://doi.org/10.1038/nbt0710-691
