Web-based visual analysis for high-throughput genomics

Abstract

Background: Visualization plays an essential role in genomics research by making it possible to observe correlations and trends in large datasets as well as communicate findings to others. Visual analysis, which combines visualization with analysis tools to enable seamless use of both approaches for scientific investigation, offers a powerful method for performing complex genomic analyses. However, numerous challenges arise when creating rich, interactive Web-based visualization and visual analysis applications for high-throughput genomics. These challenges include managing data flow from Web server to Web browser, integrating analysis tools and visualizations, and sharing visualizations with colleagues.

Results: We have created a platform that simplifies the creation of Web-based visualization and visual analysis applications for high-throughput genomics. This platform provides components that make it simple to efficiently query very large datasets, draw common representations of genomic data, integrate with analysis tools, and share or publish fully interactive visualizations. Using this platform, we have created a Circos-style genome-wide viewer, a generic scatter plot for correlation analysis, an interactive phylogenetic tree, a scalable genome browser for next-generation sequencing data, and an application for systematically exploring tool parameter spaces to find good parameter values. All visualizations are interactive and fully customizable. The platform is integrated with the Galaxy (http://galaxyproject.org) genomics workbench, making it easy to integrate new visual applications into Galaxy.

Conclusions: Visualization and visual analysis play an important role in high-throughput genomics experiments, and approaches are needed to make it easier to create applications for these activities. Our framework provides a foundation for creating Web-based visualizations and integrating them into Galaxy. Finally, the visualizations we have created using the framework are useful tools for high-throughput genomics experiments.

doi:10.1186/1471-2164-14-397
Cite this article as: Goecks et al.: Web-based visual analysis for high-throughput genomics. BMC Genomics 2013 14:397.