Sequenceserver: a modern graphical user interface for custom BLAST databases
Related papers
BMC Bioinformatics, 2014
Background: Advances in sequencing efficiency have vastly increased the sizes of biological sequence databases, including many thousands of genome-sequenced species. The BLAST algorithm remains the main search engine for retrieving sequence information, and must consequently handle data on an unprecedented scale. This has been possible due to high-performance computers and parallel processing. However, the raw BLAST output from contemporary searches involving thousands of queries is ill-suited for direct human processing. Few programs attempt to directly visualize and interpret BLAST output; those that do often provide only a basic structuring of BLAST data. Results: Here we present a bioinformatics application named BLASTGrabber suitable for high-throughput sequencing analysis. BLASTGrabber, being implemented as a Java application, is OS-independent and includes a user-friendly graphical user interface. Text or XML-formatted BLAST output files can be directly imported, displayed and categorized based on BLAST statistics. Query names and FASTA headers can be analysed by text-mining. In addition to visualizing sequence alignments, BLAST data can be ordered as an interactive taxonomy tree. All modes of analysis support selection, export and storage of data. A Java interface-based plugin structure facilitates the addition of customized third-party functionality. Conclusion: The BLASTGrabber application introduces new ways of visualizing and analysing massive BLAST output data by integrating taxonomy identification, text mining capabilities and generic multi-dimensional rendering of BLAST hits. The program is aimed at an audience without expert computer skills; the combination of new functionalities makes the program flexible and useful for a broad range of operations.
BLAST output visualization in the new sequencing era
Briefings in Bioinformatics, 2013
The Basic Local Alignment Search Tool (BLAST) algorithm remains one of the most widely used bioinformatic programs. For many projects, new sequencing technologies and increased database sizes will increase the BLAST output significantly. Frequently, this output is so large that it can no longer be processed manually. As BLAST users are increasingly recruited from mainstream biology without any bioinformatic background, user-friendly programs capable of BLAST output visualization, analysis and post-processing are in demand. In this review, freely available BLAST output processing programs are categorized as BLAST output interpreters, BLAST environments, BLAST output parsers or specialized tools. They are evaluated according to their user-friendliness, analysis features and high-throughput data processing capabilities.
NASQAR: a web-based platform for high-throughput sequencing data analysis and visualization
BMC Bioinformatics
Background: As high-throughput sequencing applications continue to evolve, the rapid growth in quantity and variety of sequence-based data calls for the development of new software libraries and tools for data analysis and visualization. Often, effective use of these tools requires computational skills beyond those of many researchers. To ease this computational barrier, we have created a dynamic web-based platform, NASQAR (Nucleic Acid SeQuence Analysis Resource). Results: NASQAR offers a collection of custom and publicly available open-source web applications that make extensive use of a variety of R packages to provide interactive data analysis and visualization. The platform is publicly accessible at http://nasqar.abudhabi.nyu.edu/. Open-source code is on GitHub at https://github.com/nasqar/NASQAR, and the system is also available as a Docker image at https://hub.docker.com/r/aymanm/nasqarall. NASQAR is a collaboration between the core bioinformatics teams of the NYU Abu Dhabi and...
BOV – a web-based BLAST output visualization tool
BMC Genomics, 2008
The BLAST program is one of the most widely used sequence similarity search tools for genomic research, even by biologists lacking extensive bioinformatics training. As the availability of sequence data increases, more researchers are downloading the BLAST program for local installation and performing larger and more complex tasks, including batch queries. In order to manage and interpret the results of batch queries, a host of software packages have been developed to assist with data management and post-processing. Among these programs, there is an almost complete lack of visualization tools that provide a graphic representation of complex BLAST pairwise alignments. We have developed a web-based program, BLAST Output Visualization Tool (BOV), that allows users to interactively visualize the matching regions of query and database hit sequences, thereby allowing the user to quickly and easily dissect complex matching patterns.
Next generation tools for genomic data generation, distribution, and visualization
2010
Background: With the rapidly falling cost and availability of high-throughput sequencing and microarray technologies, the bottleneck for effectively using genomic analysis in the laboratory and clinic is shifting to one of effectively managing, analyzing, and sharing genomic data. Results: Here we present three open-source, platform-independent software tools for generating, analyzing, distributing, and visualizing genomic data. These include a next generation sequencing/microarray LIMS and analysis project center (GNomEx); an application for annotating and programmatically distributing genomic data using the community-vetted DAS/2 data exchange protocol (GenoPub); and a standalone Java Swing application (GWrap) that makes cutting-edge command-line analysis tools available to those who prefer graphical user interfaces. Both GNomEx and GenoPub use the rich client Flex/Flash web browser interface to interact with Java classes and a relational database on a remote server. Both employ a pub...
2016
Continued advancements in sequencing technologies have fueled the development of new sequencing applications and promise to flood current databases with raw data. A number of factors prevent the seamless and easy use of these data, including the breadth of project goals, the wide array of tools that individually perform fractions of any given analysis, the large number of associated software/hardware dependencies, and the detailed expertise required to perform these analyses. To address these issues, we have developed an intuitive web-based environment with a wide assortment of integrated and cutting-edge bioinformatics tools. These preconfigured workflows provide even novice next-generation sequencing users with the ability to perform many complex analyses with only a few mouse clicks, and, within the context of the same environment, to visualize and further interrogate their results. This bioinformatics platform is an initial attempt at Empowering the Development of Genomics Exper...
SisGen: A CORBA–Based Data Management Program for DNA Sequencing Projects
Lecture Notes in Computer Science
The biological data deluge has challenged researchers over the last decade. Expressed sequence tag (EST) analyses provide a rapid and economical means to identify candidate genes, gene expression profiles in different cell conditions, as well as functional annotation of putative gene products. Although EST analysis tools are publicly available, there is still a lack of comprehensive data analysis and management programs. This work presents SisGen, an integrated software system capable of efficiently managing multiuser genomic projects. SisGen is a Java client-server application that uses CORBA as middleware in a multi-layer architecture. The software integrates data management and an annotation pipeline in a rich graphical visualization environment. The architectural design is presented and highlights the advantages in terms of portability, interconnectivity, modularity and user interface that can be achieved with this concept.
VisRseq: R-based visual framework for analysis of sequencing data
BMC Bioinformatics, 2015
Several tools have been developed to enable biologists to perform initial browsing and exploration of sequencing data. However, the computational tool set for further analyses often requires significant computational expertise to use, and many of the biologists with the knowledge needed to interpret these data must rely on programming experts. We present VisRseq, a framework for analysis of sequencing datasets that provides a computationally rich and accessible framework for integrative and interactive analyses without requiring programming expertise. We achieve this aim by providing R apps, which offer a semi-auto generated and unified graphical user interface for computational packages in R and repositories such as Bioconductor. To address the interactivity limitation inherent in R libraries, our framework includes several native apps that provide exploration and brushing operations as well as an integrated genome browser. The apps can be chained together to create more powerful ana...
BMC Bioinformatics, 2007
BLAST searches are widely used for sequence alignment. The search results are commonly adopted for various functional and comparative genomics tasks such as annotating unknown sequences, investigating gene models and comparing two sequence sets. Advances in sequencing technologies pose challenges for high-throughput analysis of large-scale sequence data. A number of programs and hardware solutions exist for efficient BLAST searching, but there is a lack of generic software solutions for mining and personalized management of the results. Systematically reviewing the results and identifying information of interest remains tedious and time-consuming. Personal BLAST Navigator (PLAN) is a versatile web platform that helps users to carry out various personalized pre- and post-BLAST tasks, including: (1) query and target sequence database management, (2) automated high-throughput BLAST searching, (3) indexing and searching of results, (4) filtering results online, (5) managing results of p...
VSQual: a visual system to assist DNA sequencing quality control
A lack of flexible software tools that support small- to medium-scale DNA sequencing efforts is a major hindrance to recording and using laboratory workflow information to monitor the overall quality of data production. Here we describe VSQual, a set of Perl programs intended to provide simple and powerful tools to check several quality features of the sequencing data generated by automated DNA sequencing machines. The core program of VSQual is a flexible Perl-based pipeline, designed to be accessible and useful for both programmers and non-programmers. This pipeline directs the processing steps and can be easily customized for laboratory needs. The raw DNA sequencing trace files are processed by Phred and Cross_match; the outputs are then parsed, reformatted into Web-based graphical reports, and added to a Web site structure. The result is a set of real-time sequencing reports that are easily accessible to and understood by general laboratory staff. These reports facilitate the monitoring of DNA sequencing as well as the management of laboratory workflow, significantly reducing operational costs and ensuring high-quality and scientifically reliable results.