Taverna: a tool for the composition and enactment of bioinformatics workflows

Biowep: a workflow enactment portal for bioinformatics applications

BMC Bioinformatics, 2007

Background: The huge amount of biological information, its distribution over the Internet and the heterogeneity of available software tools make the adoption of new data integration and analysis network tools a necessity in bioinformatics. ICT standards and tools, like Web Services and Workflow Management Systems (WMS), can support the creation and deployment of such systems. Many Web Services are already available and some WMS have been proposed. They assume that researchers know which bioinformatics resources can be reached through a programmatic interface and that they are skilled in programming and building workflows. Therefore, they are not viable for the majority of researchers, who lack these skills. A portal enabling these researchers to benefit from new technologies is still missing. Results: We designed biowep, a web-based client application that allows for the selection and execution of a set of predefined workflows. The system is available on-line. The biowep architecture includes a Workflow Manager, a User Interface and a Workflow Executor. The task of the Workflow Manager is the creation and annotation of workflows. These can be created by using either the Taverna Workbench or BioWMS. Enactment of workflows is carried out by FreeFluo for Taverna workflows and by BioAgent/Hermes, a mobile agent-based middleware, for BioWMS ones. The main processing steps of workflows are annotated on the basis of their input and output, elaboration type and application domain by using a classification of bioinformatics data and tasks. The interface supports user authentication and profiling. Workflows can be selected on the basis of users' profiles and can be searched through their annotations. Results can be saved. Conclusion: We developed a web system that supports the selection and execution of predefined workflows, thus simplifying access for all researchers. The implementation of Web Services allowing specialized software to interact with an exhaustive set of biomedical databases and analysis software and the creation of effective workflows can significantly improve automation of in-silico analysis. Biowep is available for interested researchers as a reference portal. They are invited to submit their workflows to the workflow repository. Biowep is being further developed within the Laboratory of Interdisciplinary Technologies in Bioinformatics (LITBIO).
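The annotation-based selection described in the Biowep abstract can be illustrated with a minimal Python sketch. This is not biowep's actual code or data model: the `Workflow` class, the `search` function, the field names and the two sample entries are hypothetical, and annotations are attached to whole workflows here for brevity, whereas the abstract annotates the main processing steps. Only the four annotation categories and the pairing of authoring tools with enactors come from the abstract.

```python
from dataclasses import dataclass

# Enactor used for each authoring tool, as stated in the abstract.
ENACTORS = {"Taverna": "FreeFluo", "BioWMS": "BioAgent/Hermes"}


@dataclass(frozen=True)
class Workflow:
    """Hypothetical record for one predefined, annotated workflow."""
    name: str               # illustrative workflow name
    created_with: str       # "Taverna" or "BioWMS"
    input_type: str         # annotation: kind of input data
    output_type: str        # annotation: kind of output data
    elaboration_type: str   # annotation: kind of processing
    domain: str             # annotation: application domain

    @property
    def enactor(self) -> str:
        # Pick the enactment engine from the tool that created the workflow.
        return ENACTORS[self.created_with]


def search(workflows, **criteria):
    """Return the workflows whose annotations equal every given criterion."""
    allowed = {"input_type", "output_type", "elaboration_type", "domain"}
    unknown = set(criteria) - allowed
    if unknown:
        raise ValueError(f"unknown annotation(s): {sorted(unknown)}")
    return [
        wf for wf in workflows
        if all(getattr(wf, key) == value for key, value in criteria.items())
    ]


# Two made-up repository entries, for illustration only.
REPOSITORY = [
    Workflow("similarity_search", "Taverna", "protein sequence",
             "alignment", "similarity search", "proteomics"),
    Workflow("gene_annotation", "BioWMS", "gene identifier",
             "annotation report", "retrieval", "genomics"),
]

hits = search(REPOSITORY, domain="genomics")
print([(wf.name, wf.enactor) for wf in hits])
```

Exact-match filtering is the simplest possible choice; a real portal would more plausibly match against a hierarchical classification of data and tasks, so that a query for a general category also returns workflows annotated with its subcategories.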

Workflows in bioinformatics: meta-analysis and prototype implementation of a workflow generator

BMC bioinformatics, 2005

Computational methods for problem solving need to interleave information access and algorithm execution in a problem-specific workflow. The structures of these workflows are defined by a scaffold of syntactic, semantic and algebraic objects capable of representing them. Despite the proliferation of GUIs (Graphical User Interfaces) in bioinformatics, only some of them provide workflow capabilities; surprisingly, no meta-analysis of workflow operators and components in bioinformatics has been reported.

BioFlow: A Web-based Declarative Workflow Language for Life Sciences

Scientific workflows in Life Sciences are usually complex, and use many online databases, analysis tools, publication repositories and customized computation-intensive desktop software in a coherent manner to respond to investigative queries. These investigative queries are generally ad hoc, ill-formed, and often used only once to test a single hypothesis. In such cases, developing customized workflows becomes a major undertaking, rendering the effort expensive, prohibitive and resource intensive. Such high development costs often act as deterrents to many interesting queries and promising, timely scientific discoveries. In this paper, we introduce BioFlow, a new query language with workflow features for scientific applications that exploits many recent developments in internet communication, databases, wrapper and mediator technologies, ontology, and data integration. BioFlow is a declarative language that abstracts these features to help hide most procedural aspects of mediation, data integration, communication protocols, data extraction and workflow details. We will demonstrate that fairly complex workflows can be effortlessly and declaratively expressed in BioFlow in an ad hoc fashion at minimal cost. We also report a prototype implementation of BioFlow in Windows VB.NET that includes most of its powerful and representative features as proof of feasibility of our proposal.

e-BioFlow: Improving Practical Use of Workflow Systems in Bioinformatics

2010

Workflow management systems (WfMSs) are useful tools for bioinformaticians. As experiences with using WfMSs accumulate, shortcomings of current systems become apparent. In this paper, we focus on practical issues that hinder WfMS users and that arise in the design and execution of workflows, and in accessing web services. We present e-BioFlow, a workflow engine that demonstrates how a number of these problems can be solved. e-BioFlow offers an improved user interface, can deal with large data volumes, stores all provenance, and has a powerful provenance browser. e-BioFlow also offers the possibility to design and run workflows step by step, allowing its users an explorative research style.

BioWMS: a web-based Workflow Management System for bioinformatics

BMC Bioinformatics, 2007

Background: An in-silico experiment can be naturally specified as a workflow of activities implementing, in a standardized environment, the process of data and control analysis. A workflow has the advantage of being reproducible, traceable and compositional by reusing other workflows. In order to support the daily work of a bioscientist, several Workflow Management Systems (WMSs) have been proposed in bioinformatics. Generally, these systems centralize the workflow enactment and do not exploit standard process definition languages to describe workflows in a reusable way. While almost all WMSs require heavy stand-alone applications to specify new workflows, only a few of them provide a web-based process definition tool. Results: We have developed BioWMS, a Workflow Management System that supports, through a web-based interface, the definition, the execution and the results management of an in-silico experiment. BioWMS has been implemented over an agent-based middleware. It dynamically generates, from a user workflow specification, a domain-specific, agent-based workflow engine. Our approach exploits the proactiveness and mobility of the agent-based technology to embed, inside the agents' behaviour, the application domain features. Agents are workflow executors and the resulting workflow engine is a multiagent system (a distributed, concurrent system) that is typically open, flexible, and adaptive. A demo is available at http://litbio.unicam.it:8080/biowms. Conclusion: BioWMS, supported by the Hermes mobile computing middleware, guarantees the flexibility, scalability and fault tolerance required for workflow enactment over a distributed and heterogeneous environment. BioWMS is funded by the FIRB project LITBIO (Laboratory for Interdisciplinary Technologies in Bioinformatics).

Experiences with workflows for automating data-intensive bioinformatics

Biology Direct, 2015

High-throughput technologies, such as next-generation sequencing, have turned molecular biology into a data-intensive discipline, requiring bioinformaticians to use high-performance computing resources and carry out data management and analysis tasks on a large scale. Workflow systems can be useful to simplify construction of analysis pipelines that automate tasks, support reproducibility and provide measures for fault-tolerance. However, workflow systems can incur significant development and administration overhead, so bioinformatics pipelines are often still built without them. We present the experiences with workflows and workflow systems within the bioinformatics community participating in a series of hackathons and workshops of the EU COST action SeqAhead. The organizations are working on similar problems, but we have addressed them with different strategies and solutions. This fragmentation of efforts is inefficient and leads to redundant and incompatible solutions. Based on our experiences we define a set of recommendations for future systems to enable efficient yet simple bioinformatics workflow construction and execution.

A Novel Approach for Bioinformatics Workflow Discovery

International Journal of Advanced Computer Science and Applications, 2014

Workflow systems are a typical fit for the explorative research of bioinformaticians. These systems can help bioinformaticians to design and run their experiments and to automatically capture and store the data generated at runtime. On the other hand, Web services are increasingly used as the preferred method for accessing and processing the information coming from the diverse life science sources. In this work we provide an efficient approach for creating bioinformatics workflows for all-service architecture systems (i.e., all system components are services). This architecture style simplifies the user interaction with workflow systems and facilitates both the change of individual components and the addition of new components to adapt to other workflow tasks if required. We finally present a case study for the bioinformatics domain to elaborate the applicability of our proposed approach. Index Terms: Web services; In-silico Workflows; Quality of Service (QoS); Web services for bioinformatics; Bioinformatics services.

A declarative language and toolkit for scientific workflow implementation and execution

International Journal of Business Process Integration and Management, 2010

Scientific workflow design is usually complex and demands access to and integration of numerous non-conventional resources. Geographical distribution and semantic heterogeneity of these resources add to this complexity. The cost effectiveness of such workflow design, thus, depends upon the lifespan of the application and its anticipated use. A shorter application lifespan usually makes development costs prohibitive. In this paper, we present an alternative platform for declarative workflow design using BioFlow in such environments. BioFlow is being developed as the query language for a scientific data management system called LifeDB that aims to support on-the-fly data integration and workflow support for life sciences applications. We argue that a declarative and ad hoc workflow design using BioFlow is more efficient and cost effective compared to traditional approaches using systems such as Taverna or Kepler. To demonstrate the advantages of BioFlow, we compare a canonical microarray data analysis workflow application design approach using BioFlow with Taverna and a gene regulation application using BioFlow and Kepler. We show that BioFlow supports ad hoc and modular application design at a throwaway cost and produces a superior, maintainable application that can adapt to changes in the source without significant effort compared to both Taverna and Kepler.

Well Constructed Workflows in Bioinformatics

2005

We demonstrate and discuss the advantages of bioinformatics workflows which have no side-effects over those which do have them. In particular, we describe a method to give a formal, mathematical semantics to the side-effect-free workflows defined in SCUFL, the workflow definition language of Taverna. This is achieved by translating them into a natural extension of the Nested Relational Calculus NRC.

Perspectives on automated composition of workflows in the life sciences

F1000Research

Scientific data analyses often combine several computational tools in automated pipelines, or workflows. Thousands of such workflows have been used in the life sciences, though their composition has remained a cumbersome manual process due to a lack of standards for annotation, assembly, and implementation. Recent technological advances have returned the long-standing vision of automated workflow composition into focus. This article summarizes a recent Lorentz Center workshop dedicated to automated composition of workflows in the life sciences. We survey previous initiatives to automate the composition process, and discuss the current state of the art and future perspectives. We start by drawing the “big picture” of the scientific workflow development life cycle, before surveying and discussing current methods, technologies and practices for semantic domain modelling, automation in workflow development, and workflow assessment. Finally, we derive a roadmap of individual and communit...