The case for cloud computing in genome informatics
Lincoln D Stein. Genome Biol. 2010.
Abstract
With DNA sequencing now getting cheaper more quickly than data storage or computation, the time may have come for genome informatics to migrate to the cloud.
Figures
Figure 1
The old genome informatics ecosystem. Under the traditional flow of genome information, sequencing laboratories transmit raw and interpreted sequencing information across the internet to one of several sequencing archives. This information is accessed either directly by casual users or indirectly via a website run by one of the value-added genome integrators. Power users typically download large datasets from the archives onto their local compute clusters for computationally intensive number crunching. Under this model, the sequencing archives, value-added integrators and power users all maintain their own compute and storage clusters and keep local copies of the sequencing datasets.
Figure 2
Historical trends in storage prices versus DNA sequencing costs. The blue squares show the historical price of disk storage, expressed in megabytes per US dollar. The long-term trend (blue line, which is straight because the plot is logarithmic) shows exponential growth in storage per dollar with a doubling time of roughly 1.5 years. The cost of DNA sequencing, expressed in base pairs per dollar, is shown by the red triangles. It follows an exponential curve (yellow line) with a doubling time slightly longer than that of disk storage until 2004, when next-generation sequencing (NGS) causes an inflection in the curve to a doubling time of less than 6 months (red line). These curves are not corrected for inflation or for the 'fully loaded' cost of sequencing and disk storage, which would include personnel costs, depreciation and overhead.
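The doubling times quoted in this caption can be turned into rough growth factors. The following is a minimal sketch, not taken from the paper: the function names are illustrative, and the sequencing doubling time is set to exactly 0.5 years even though the caption only says "less than 6 months", so the resulting gap is a lower bound.

```python
import math

# Doubling times read off the Figure 2 caption (in years).
STORAGE_DOUBLING_YEARS = 1.5      # "roughly 1.5 years"
SEQUENCING_DOUBLING_YEARS = 0.5   # assumption: caption says "less than 6 months"


def growth_factor(years: float, doubling_time_years: float) -> float:
    """Factor by which a quantity grows over `years` if it doubles
    every `doubling_time_years`."""
    return 2.0 ** (years / doubling_time_years)


def implied_doubling_time(start: float, end: float, years: float) -> float:
    """Doubling time implied by two observations `years` apart,
    assuming purely exponential growth between them."""
    return years / math.log2(end / start)


def sequencing_to_storage_gap(years: float) -> float:
    """How much faster base pairs per dollar grow than megabytes
    per dollar over `years`, under the two doubling times above."""
    return (growth_factor(years, SEQUENCING_DOUBLING_YEARS)
            / growth_factor(years, STORAGE_DOUBLING_YEARS))


if __name__ == "__main__":
    for years in (1, 3, 5):
        print(years, round(sequencing_to_storage_gap(years), 1))
```

Under these assumed values, three years of growth gives a fourfold gain in storage per dollar against a 64-fold gain in sequencing per dollar, so the same budget yields 16 times more sequence than it can store at the original ratio.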
Figure 3
The 'new' genome informatics ecosystem based on cloud computing. In this model, the community's storage and compute resources are co-located in a 'cloud' maintained by a large service provider. The sequence archives and value-added integrators maintain servers and storage systems within the cloud, and use more or less capacity as needed for daily and seasonal fluctuations in usage. Casual users continue to access the data via the websites of the archives and integrators, but power users now have the option of creating virtual on-demand compute clusters within the cloud, which have direct access to the sequencing datasets.
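The caption's point about using "more or less capacity as needed" can be illustrated with a toy comparison of peak provisioning against on-demand provisioning. This is a sketch under invented assumptions: the demand profile is hypothetical, the function names are illustrative, and prices are ignored entirely (an on-demand node-hour may well cost more than an owned one).

```python
from typing import Sequence


def fixed_node_hours(demand: Sequence[int]) -> int:
    """Node-hours consumed by a local cluster sized for the peak:
    the peak node count is kept running in every period."""
    return max(demand) * len(demand)


def elastic_node_hours(demand: Sequence[int]) -> int:
    """Node-hours consumed when capacity follows demand exactly,
    as an idealized on-demand cloud cluster would."""
    return sum(demand)


def fixed_cluster_utilization(demand: Sequence[int]) -> float:
    """Fraction of the peak-sized cluster's node-hours actually used."""
    return elastic_node_hours(demand) / fixed_node_hours(demand)


if __name__ == "__main__":
    # Hypothetical nodes needed in six consecutive periods.
    demand = [10, 10, 20, 80, 40, 20]
    print(fixed_node_hours(demand))                     # peak-sized cluster
    print(elastic_node_hours(demand))                   # capacity tracks demand
    print(round(fixed_cluster_utilization(demand), 3))  # share of owned capacity in use
```

For this made-up profile the peak-sized cluster uses 480 node-hours against 180 for the elastic one, so the owned hardware sits idle for most of its capacity. Whether that translates into a saving depends on the relative price of owned and rented node-hours, which the sketch does not model.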