The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2016 update
Nucleic Acids Res. 2016 Jul 8;44(W1):W3-W10.
doi: 10.1093/nar/gkw343. Epub 2016 May 2.
Dannon Baker 1, Marius van den Beek 2, Daniel Blankenberg 3, Dave Bouvier 3, Martin Čech 3, John Chilton 3, Dave Clements 1, Nate Coraor 3, Carl Eberhard 1, Björn Grüning 4, Aysam Guerler 1, Jennifer Hillman-Jackson 3, Greg Von Kuster 5, Eric Rasche 6, Nicola Soranzo 7, Nitesh Turaga 1, James Taylor 8, Anton Nekrutenko 9, Jeremy Goecks 10
- PMID: 27137889
- PMCID: PMC4987906
- DOI: 10.1093/nar/gkw343
Enis Afgan et al. Nucleic Acids Res. 2016.
Abstract
High-throughput data production technologies, particularly 'next-generation' DNA sequencing, have ushered in widespread and disruptive changes to biomedical research. Making sense of the large datasets produced by these technologies requires sophisticated statistical and computational methods, as well as substantial computational power. This has led to an acute crisis in life sciences, as researchers without informatics training attempt to perform computation-dependent analyses. Since 2005, the Galaxy project has worked to address this problem by providing a framework that makes advanced computational tools usable by non-experts. Galaxy seeks to make data-intensive research more accessible, transparent and reproducible by providing a Web-based environment in which users can perform computational analyses and have all of the details automatically tracked for later inspection, publication, or reuse. In this report, we highlight recently added features enabling biomedical analyses on a large scale.
© The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Figures
Figure 1.
Galaxy analysis interface consisting of tool menu (left pane), tool interface (center pane), history (right pane).
Figure 2.
Galaxy's graphical workflow editor, showing part of a sample workflow.
Figure 3.
Multi-history viewer. Datasets can be copied among histories by dragging. Here one can also see the results of dynamic search functionality: in history PNAS 2014 the search bar contains a partial keyword, "Mark", causing the history to refresh and to show only datasets produced by the tool MarkDuplicates. As this particular history is very large (thousands of items), this functionality greatly simplifies analyses.
Figure 4.
(A) Dataset collections simplify analysis of large numbers of files. A Galaxy history with a paired-end DNA re-sequencing dataset from 28 individuals contains 56 files (each green box is a file). It is difficult to understand this history because there are so many files and because forward (R1) and reverse (R2) reads are unordered. As these files are analyzed and more outputs/files are created, it becomes very difficult to navigate around the history and understand how files are connected as inputs and outputs of particular tools or analyses. Dataset collections make analysis of this mix of files straightforward by grouping all files into a collection that can be analyzed as a single unit. This example demonstrates using collections with paired-end data, but collections can be created for any set of files. (B) Creation of a paired collection from the history shown in panel A. Because dataset names use a uniform nomenclature for forward and reverse reads, the collection creation form can automatically determine pairings. (C) Pairing these datasets generates a single item (a Collection) in Galaxy's history. (D) Clicking on this newly created Collection expands it and shows its content (only the first three datasets are shown). (E) Galaxy's BWA interface takes the entire dataset collection as a single input.
Figure 5.
Selection panel for Galaxy numerical visualizations showing the variety of plots that can be created.
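The automatic pairing described in the Figure 4 caption (panel B) can be illustrated with a minimal sketch. This is not Galaxy's implementation: the function name `pair_reads`, the regular expression, and the `<sample>_R1` / `<sample>_R2` naming scheme are all assumptions chosen for illustration. It only shows how a uniform nomenclature lets forward and reverse reads be matched without manual input.

```python
import re
from collections import defaultdict

# Assumed (hypothetical) naming scheme: <sample>_R1<ext> and <sample>_R2<ext>,
# e.g. "ind01_R1.fastq" / "ind01_R2.fastq". Real datasets may differ.
READ_TAG = re.compile(r"^(?P<sample>.+)_(?P<read>R[12])(?P<ext>\..+)?$")


def pair_reads(names):
    """Group dataset names into (forward, reverse) pairs keyed by sample.

    Returns a dict of complete pairs and a list of names that could not
    be paired (no R1/R2 tag, or the mate is missing).
    """
    by_sample = defaultdict(dict)
    unpaired = []
    for name in names:
        match = READ_TAG.match(name)
        if match is None:
            unpaired.append(name)  # name does not follow the scheme
            continue
        by_sample[match["sample"]][match["read"]] = name

    pairs = {}
    for sample, reads in sorted(by_sample.items()):
        if "R1" in reads and "R2" in reads:
            pairs[sample] = (reads["R1"], reads["R2"])
        else:
            unpaired.extend(reads.values())  # mate is missing
    return pairs, unpaired


if __name__ == "__main__":
    # Unordered input, as in the history shown in panel A.
    history = ["ind02_R2.fastq", "ind01_R1.fastq", "ind01_R2.fastq", "ind02_R1.fastq"]
    pairs, unpaired = pair_reads(history)
    for sample, (forward, reverse) in pairs.items():
        print(sample, forward, reverse)
```

Under these assumptions, the order of the input names does not matter; each pair is recovered from the shared sample prefix, which is what allows the whole set to be treated as a single collection.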
Similar articles
- The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2018 update. Afgan E, Baker D, Batut B, van den Beek M, Bouvier D, Cech M, Chilton J, Clements D, Coraor N, Grüning BA, Guerler A, Hillman-Jackson J, Hiltemann S, Jalili V, Rasche H, Soranzo N, Goecks J, Taylor J, Nekrutenko A, Blankenberg D. Nucleic Acids Res. 2018 Jul 2;46(W1):W537-W544. doi: 10.1093/nar/gky379. PMID: 29790989. Free PMC article.
- qPortal: A platform for data-driven biomedical research. Mohr C, Friedrich A, Wojnar D, Kenar E, Polatkan AC, Codrea MC, Czemmel S, Kohlbacher O, Nahnsen S. PLoS One. 2018 Jan 19;13(1):e0191603. doi: 10.1371/journal.pone.0191603. eCollection 2018. PMID: 29352322. Free PMC article.
- The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2020 update. Jalili V, Afgan E, Gu Q, Clements D, Blankenberg D, Goecks J, Taylor J, Nekrutenko A. Nucleic Acids Res. 2020 Jul 2;48(W1):W395-W402. doi: 10.1093/nar/gkaa434. PMID: 32479607. Free PMC article.
- Web tools for predictive toxicology model building. Jeliazkova N. Expert Opin Drug Metab Toxicol. 2012 Jul;8(7):791-801. doi: 10.1517/17425255.2012.685158. Epub 2012 May 12. PMID: 22577953. Review.
- Informatics in radiology: GridCAD: grid-based computer-aided detection system. Pan TC, Gurcan MN, Langella SA, Oster SW, Hastings SL, Sharma A, Rutt BG, Ervin DW, Kurc TM, Siddiqui KM, Saltz JH, Siegel EL. Radiographics. 2007 May-Jun;27(3):889-97. doi: 10.1148/rg.273065153. PMID: 17495299. Review.
Cited by
- Complete Genome Sequence of Chlamydia abortus MRI-10/19, Isolated from a Sheep Vaccinated with the Commercial Live C. abortus 1B Vaccine Strain. Livingstone M, Caspe SG, Longbottom D. Microbiol Resour Announc. 2021 May 6;10(18):e00203-21. doi: 10.1128/MRA.00203-21. PMID: 33958416. Free PMC article.
- Year-Long Microbial Succession on Microplastics in Wastewater: Chaotic Dynamics Outweigh Preferential Growth. Tagg AS, Sperlea T, Labrenz M, Harrison JP, Ojeda JJ, Sapp M. Microorganisms. 2022 Sep 2;10(9):1775. doi: 10.3390/microorganisms10091775. PMID: 36144377. Free PMC article.
- Diversity and evolution of cytochrome P450s of Jacobaea vulgaris and Jacobaea aquatica. Chen Y, Klinkhamer PGL, Memelink J, Vrieling K. BMC Plant Biol. 2020 Jul 20;20(1):342. doi: 10.1186/s12870-020-02532-y. PMID: 32689941. Free PMC article.
- BioFlow-Insight: facilitating reuse of Nextflow workflows with structure reconstruction and visualization. Marchment G, Brancotte B, Schmit M, Lemoine F, Cohen-Boulakia S. NAR Genom Bioinform. 2024 Aug 6;6(3):lqae092. doi: 10.1093/nargab/lqae092. eCollection 2024 Sep. PMID: 39108637. Free PMC article.
- A reference genome, mitochondrial genome and associated transcriptomes for the critically endangered swift parrot (Lathamus discolor). Silver LW, Stojanovic D, Farquharson KA, Alexander L, Peel E, Belov K, Hogg CJ. F1000Res. 2024 Aug 27;13:251. doi: 10.12688/f1000research.144352.2. eCollection 2024. PMID: 39301273. Free PMC article.
Grants and funding
- R01 HG004909/HG/NHGRI NIH HHS/United States
- R21 HG005133/HG/NHGRI NIH HHS/United States
- RC2 HG005542/HG/NHGRI NIH HHS/United States
- U41 HG006620/HG/NHGRI NIH HHS/United States