Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences
Jeremy Goecks et al. Genome Biol. 2010.
Abstract
Increased reliance on computational approaches in the life sciences has revealed grave concerns about how accessible and reproducible computation-reliant results truly are. Galaxy (http://usegalaxy.org), an open web-based platform for genomic research, addresses these problems. Galaxy automatically tracks and manages data provenance and provides support for capturing the context and intent of computational methods. Galaxy Pages are interactive, web-based documents that provide users with a medium to communicate a complete computational analysis.
Figures
Figure 1
Galaxy analysis workspace. The Galaxy analysis workspace is where users perform genomic analyses. The workspace has four areas: the navigation bar, tool panel (left column), detail panel (middle column), and history panel (right column). The navigation bar provides links to Galaxy's major components, including the analysis workspace, workflows, data libraries, and user repositories (histories, workflows, Pages). The tool panel lists the analysis tools and data sources available to the user. The detail panel displays interfaces for tools selected by the user. The history panel shows data and the results of analyses performed by the user, as well as automatically tracked metadata and user-generated annotations. Every action by the user generates a new history item, which can then be used in subsequent analyses, downloaded, or visualized. Galaxy's history panel helps to facilitate reproducibility by showing provenance of data and by enabling users to extract a workflow from a history, rerun analysis steps, visualize output datasets, tag datasets for searching and grouping, and annotate steps with information about their purpose or importance. Here, step 12 is being rerun.
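The history behavior described in this caption (each action adds an item, items carry provenance, and a workflow can be extracted from a history) can be illustrated with a minimal Python sketch. This is a toy model, not Galaxy's implementation or API: the `HistoryItem` and `History` classes, their methods, and the tool names `filter_by_quality` and `count_reads` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class HistoryItem:
    """One dataset in a history, with the provenance needed to recreate it."""
    step: int                      # position in the history (1-based)
    tool: str                      # hypothetical tool name, or "upload"
    params: dict = field(default_factory=dict)
    inputs: tuple = ()             # step numbers of the datasets consumed
    annotation: str = ""           # free-text note added by the user


class History:
    """Append-only record: every action adds a new item (toy model)."""

    def __init__(self):
        self.items = []

    def run(self, tool, inputs=(), annotation="", **params):
        """Record one action and return the step number of the new item."""
        item = HistoryItem(len(self.items) + 1, tool, params,
                           tuple(inputs), annotation)
        self.items.append(item)
        return item.step

    def provenance(self, step):
        """Return every step that `step` depends on, itself included."""
        seen = set()
        stack = [step]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            stack.extend(self.items[current - 1].inputs)
        return sorted(seen)

    def extract_workflow(self, step):
        """List the (tool, params) pairs needed to regenerate `step`,
        skipping raw uploads, which become workflow inputs."""
        return [
            (self.items[s - 1].tool, self.items[s - 1].params)
            for s in self.provenance(step)
            if self.items[s - 1].tool != "upload"
        ]


history = History()
reads = history.run("upload", annotation="raw reads")
other = history.run("upload", annotation="unrelated dataset")
filtered = history.run("filter_by_quality", inputs=[reads], min_score=20)
counts = history.run("count_reads", inputs=[filtered])

print(history.provenance(counts))
print(history.extract_workflow(counts))
```

In this sketch the unrelated upload is left out of both the provenance and the extracted workflow, because only the recorded input links are followed. Storing tool names and parameters alongside each item is what makes rerunning a step or rebuilding the chain possible.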
Figure 2
Galaxy workflow editor. Galaxy's workflow editor provides a graphical user interface for creating and modifying workflows. The editor has four areas: navigation bar, tool panel (left column), editor panel (middle column), and details panel. A user adds tools from the tool panel to the editor panel and configures each step in the workflow using the details panel. The details panel also enables a user to add tags to a workflow and annotate a workflow and workflow steps. Workflows are run in Galaxy's analysis workspace; as with any tool executed in Galaxy, history items and provenance information are generated automatically for each tool executed via a workflow.
Figure 3
Galaxy public repositories and published items. (a) Galaxy's public repository for Pages; there are also public repositories for histories and workflows. Repositories can be searched by name, annotation, owner, and community tags. (b) A published Galaxy workflow. Each shared or published item is displayed in a webpage with its metadata (for example, execution details, user annotations), a link for copying the item into a user's workspace, and links for viewing related items.
Figure 4
Galaxy Pages. A Galaxy Page that serves as an online, interactive supplement for a metagenomic study performed in Galaxy [21]. The Page communicates all facets of the experiment via increasing levels of detail, starting with supplementary text, two embedded histories, and an embedded workflow. Readers can open the embedded items and view details for each step, including provenance information, parameter settings, and annotations. For history steps, readers can view corresponding datasets (red arrow). Readers can also copy histories (green arrow) or the workflow (blue arrow) into their analysis workspace and both reproduce and extend the experiment's analyses without leaving Galaxy or their web browser.
Similar articles
- The missing graphical user interface for genomics.
Schatz MC. Genome Biol. 2010;11(8):128. doi: 10.1186/gb-2010-11-8-128. Epub 2010 Aug 25. PMID: 20804568. Free PMC article.
- The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2016 update.
Afgan E, Baker D, van den Beek M, Blankenberg D, Bouvier D, Čech M, Chilton J, Clements D, Coraor N, Eberhard C, Grüning B, Guerler A, Hillman-Jackson J, Von Kuster G, Rasche E, Soranzo N, Turaga N, Taylor J, Nekrutenko A, Goecks J. Nucleic Acids Res. 2016 Jul 8;44(W1):W3-W10. doi: 10.1093/nar/gkw343. Epub 2016 May 2. PMID: 27137889. Free PMC article.
- Galaxy HiCExplorer: a web server for reproducible Hi-C data analysis, quality control and visualization.
Wolff J, Bhardwaj V, Nothjunge S, Richard G, Renschler G, Gilsbach R, Manke T, Backofen R, Ramírez F, Grüning BA. Nucleic Acids Res. 2018 Jul 2;46(W1):W11-W16. doi: 10.1093/nar/gky504. PMID: 29901812. Free PMC article.
- The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2018 update.
Afgan E, Baker D, Batut B, van den Beek M, Bouvier D, Cech M, Chilton J, Clements D, Coraor N, Grüning BA, Guerler A, Hillman-Jackson J, Hiltemann S, Jalili V, Rasche H, Soranzo N, Goecks J, Taylor J, Nekrutenko A, Blankenberg D. Nucleic Acids Res. 2018 Jul 2;46(W1):W537-W544. doi: 10.1093/nar/gky379. PMID: 29790989. Free PMC article.
- Integrating diverse databases into an unified analysis framework: a Galaxy approach.
Blankenberg D, Coraor N, Von Kuster G, Taylor J, Nekrutenko A; Galaxy Team. Database (Oxford). 2011 Apr 29;2011:bar011. doi: 10.1093/database/bar011. Print 2011. PMID: 21531983. Free PMC article.
Cited by
- Workflow4Metabolomics: a collaborative research infrastructure for computational metabolomics.
Giacomoni F, Le Corguillé G, Monsoor M, Landi M, Pericard P, Pétéra M, Duperier C, Tremblay-Franco M, Martin JF, Jacob D, Goulitquer S, Thévenot EA, Caron C. Bioinformatics. 2015 May 1;31(9):1493-5. doi: 10.1093/bioinformatics/btu813. Epub 2014 Dec 19. PMID: 25527831. Free PMC article.
- MOCAT: a metagenomics assembly and gene prediction toolkit.
Kultima JR, Sunagawa S, Li J, Chen W, Chen H, Mende DR, Arumugam M, Pan Q, Liu B, Qin J, Wang J, Bork P. PLoS One. 2012;7(10):e47656. doi: 10.1371/journal.pone.0047656. Epub 2012 Oct 17. PMID: 23082188. Free PMC article.
- Pseudomonas aeruginosa Biofilm Response and Resistance to Cold Atmospheric Pressure Plasma Is Linked to the Redox-Active Molecule Phenazine.
Mai-Prochnow A, Bradbury M, Ostrikov K, Murphy AB. PLoS One. 2015 Jun 26;10(6):e0130373. doi: 10.1371/journal.pone.0130373. eCollection 2015. PMID: 26114428. Free PMC article.
- Bioinformatics for personal genome interpretation.
Capriotti E, Nehrt NL, Kann MG, Bromberg Y. Brief Bioinform. 2012 Jul;13(4):495-512. doi: 10.1093/bib/bbr070. Epub 2012 Jan 13. PMID: 22247263. Free PMC article. Review.
- Hidden treasures in unspliced EST data.
Engelhardt J, Stadler PF. Theory Biosci. 2012 May;131(1):49-57. doi: 10.1007/s12064-012-0151-6. Epub 2012 Apr 8. PMID: 22485013.
References
- Robertson G, Hirst M, Bainbridge M, Bilenky M, Zhao Y, Zeng T, Euskirchen G, Bernier B, Varhol R, Delaney A, Thiessen N, Griffith OL, He A, Marra M, Snyder M, Jones S. Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing. Nat Methods. 2007;4:651–657. doi: 10.1038/nmeth1068.
- Statistics Using R with Biological Examples. http://cran.r-project.org/doc/contrib/Seefeld_StatsRBio.pdf
- Introduction to Sequence Analysis using EMBOSS. http://emboss.sourceforge.net/docs/emboss_tutorial/emboss_tutorial.html
Grants and funding
- HG005133/HG/NHGRI NIH HHS/United States
- R01 HG004909/HG/NHGRI NIH HHS/United States
- HG004909/HG/NHGRI NIH HHS/United States
- R01 DK065806/DK/NIDDK NIH HHS/United States
- U41 HG006620/HG/NHGRI NIH HHS/United States
- HG005542/HG/NHGRI NIH HHS/United States