Design and implementation of a parallel architecture for biological sequence comparison

On the parallelisation of bioinformatics applications

Briefings in Bioinformatics, 2001

This paper surveys the computational strategies followed to parallelise the most widely used software in the bioinformatics arena. The studied algorithms are computationally expensive, and their computational patterns range from regular, such as database-searching applications, to very irregularly structured patterns (phylogenetic trees). Fine- and coarse-grained parallel strategies are discussed for these very diverse sets of applications. This overview outlines computational issues related to parallelism, physical machine models, parallel programming approaches and scheduling strategies for a broad range of computer architectures. In particular, it deals with shared, distributed and shared/distributed memory architectures.

Exploiting Different Levels of Parallelism In the Biological Sequence Comparison Problem

sarc-ip.org

In recent years, the fast growth of the bioinformatics field has attracted the attention of computer scientists. At the same time, the exponential growth of databases that contain biological information (such as protein and DNA data) demands great efforts to improve the performance of computational platforms. In this work, we investigate how bioinformatics applications benefit from parallel architectures that combine different alternatives to exploit coarse- and fine-grain parallelism. As a case study, we analyze the performance behavior of the Ssearch application, which implements the Smith-Waterman algorithm (SW), a dynamic programming approach that explores the similarity between a pair of sequences. The inherent large parallelism of the application makes it ideal for architectures supporting multiple dimensions of parallelism (thread-level parallelism, TLP; data-level parallelism, DLP; instruction-level parallelism, ILP). We study how this algorithm can take advantage of different parallel machines such as the SGI Altix, IBM Power6, IBM Cell BE and MareNostrum machines. Our study includes a qualitative analysis of the parallelization opportunities and a quantification of the performance in terms of speedup and execution time. These measures are collected taking into account the specific characteristics of each architecture. As an example, our results show that a shared-memory multiprocessor architecture (SMP) like the PowerPC 970MP of the MareNostrum machine can surpass a heterogeneous multiprocessor machine like the current IBM Cell BE.

Parallel Biological Sequence Comparison on Heterogeneous High Performance Computing Platforms with BSP++

2011 23rd International Symposium on Computer Architecture and High Performance Computing, 2011

Biological sequence comparison is an important operation in bioinformatics that is often used to relate organisms. Smith and Waterman proposed an exact algorithm (SW) that compares two sequences in quadratic time and space. Due to its high computing and memory requirements, SW is usually executed on HPC platforms such as multicore clusters and CellBEs. Since HPC architectures exhibit very different hardware characteristics, porting an application between them is an error-prone, time-consuming task. BSP++ is an implementation of BSP that aims to reduce the effort needed to write parallel code. In this paper, we propose and evaluate a parallel BSP++ strategy to execute SW on multiple platforms such as MPI, OpenMP, MPI/OpenMP, CellBE and MPI/CellBE. The results obtained with real DNA sequences show that the performance of our versions is comparable to that of the ones in the literature, evidencing the appropriateness and flexibility of our approach.
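To make the "quadratic time and space" claim concrete, here is a minimal Python sketch of the Smith-Waterman score computation. It is not the BSP++ implementation from the paper: the function name `smith_waterman_score`, the linear gap penalty and the scoring values (`match=2`, `mismatch=-1`, `gap=-1`) are assumptions chosen only for illustration.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score of strings a and b.

    Scoring values and the linear gap model are illustrative assumptions.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1] and b[j-1].
    # The full (len(a)+1) x (len(b)+1) matrix is the quadratic space cost.
    H = [[0] * cols for _ in range(rows)]
    best = 0
    # Two nested loops over both sequences: the quadratic time cost.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = H[i - 1][j] + gap      # gap in b
            left = H[i][j - 1] + gap    # gap in a
            # Clamping at 0 lets a local alignment restart anywhere.
            H[i][j] = max(0, diag, up, left)
            best = max(best, H[i][j])
    return best
```

Each cell depends only on its upper, left and upper-left neighbours, which is the dependency pattern that parallel versions of SW have to respect.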

A Multithreaded Parallel Implementation of a Dynamic Programming Algorithm for Sequence Comparison

2001

This paper discusses the issues involved in implementing a dynamic programming algorithm for biological sequence comparison on a general-purpose parallel computing platform based on a fine-grain event-driven multithreaded program execution model. Fine-grain multithreading permits efficient parallelism exploitation in this application both by taking advantage of asynchronous point-to-point synchronizations and communication with low overheads and by effectively tolerating latency through the overlapping of computation and communication. We have implemented our scheme on EARTH, a fine-grain event-driven multithreaded execution and architecture model which has been ported to a number of parallel machines with off-the-shelf processors. Our experimental results show that the dynamic programming algorithm can be efficiently implemented on EARTH systems with high performance (e.g., speedup of 90 on 120 nodes), good programmability and reasonable cost.

A Comparison of Computation Techniques for Dna Sequence Comparison

This project presents a comparative survey of DNA sequence comparison techniques. The techniques implemented are sequential comparison, multithreading on a single computer, and multithreading using parallel processing. The project examines the issues involved in implementing a dynamic programming algorithm for biological sequence comparison on a general-purpose parallel computing platform. Tiling is an important technique for extracting parallelism. Informally, tiling consists of partitioning the iteration space into several chunks of computation called tiles (blocks) such that sequential traversal of the tiles covers the entire iteration space. The idea behind tiling is to increase the granularity of computation and decrease the amount of communication incurred between processors. This makes tiling more suitable for distributed memory architectures, where communication startup costs are very high and frequent communication is therefore undesirable. Our work to develop a sequence comparison mechanism and software supports the identification of DNA sequences.
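The tiling idea described above can be sketched in Python as follows. This is a sequential illustration under stated assumptions, not the project's code: the helper names `tile_ranges` and `sw_score_tiled`, the tile size and the scoring values are all hypothetical. The dynamic programming matrix is split into square tiles, and the tiles are visited in row-major order so that every tile finds its upper and left neighbours already computed.

```python
def tile_ranges(n, tile):
    """Yield (start, stop) index ranges covering 1..n in chunks of `tile`."""
    for start in range(1, n + 1, tile):
        yield start, min(start + tile, n + 1)


def sw_score_tiled(a, b, tile=4, match=2, mismatch=-1, gap=-1):
    """Local alignment score computed tile by tile (illustrative sketch)."""
    H = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    best = 0
    # Row-major traversal of tiles: tiles above and to the left are done
    # before the current tile starts, so all dependencies are satisfied.
    for i0, i1 in tile_ranges(len(a), tile):
        for j0, j1 in tile_ranges(len(b), tile):
            # Inside a tile, cells are filled in the usual row-by-row order.
            for i in range(i0, i1):
                for j in range(j0, j1):
                    diag = H[i - 1][j - 1] + (
                        match if a[i - 1] == b[j - 1] else mismatch
                    )
                    H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
                    best = max(best, H[i][j])
    return best
```

In a distributed setting, a tile would only need the boundary row and column of its neighbours, so a larger tile means fewer but larger messages. That is the granularity trade-off the abstract refers to; this sketch keeps the whole matrix in one address space and does not model communication.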

Efficient Parallelization of a Protein Sequence Comparison Algorithm on Manycore Architecture

2008 Ninth International Conference on Parallel and Distributed Computing, Applications and Technologies, 2008

This paper introduces the Godson-T manycore architecture and demonstrates the efficiency of its synchronization mechanism through a computation-intensive bioinformatics application: the comparison of protein banks. The parallel part of the protein sequence comparison algorithm achieves a nearly linear speed-up thanks to fine tuning of the synchronization mechanism provided by the Godson-T chip.

Scalable multicore architectures for long DNA sequence comparison

Concurrency and Computation: Practice and Experience, 2011

Biological sequence comparison is one of the most important tasks in bioinformatics. Due to the growth of biological databases, sequence comparison is becoming an important challenge for high performance computing, especially when very long sequences are compared. The Smith-Waterman (SW) algorithm is an exact method based on dynamic programming to quantify local similarity between sequences. The inherent large parallelism of the algorithm makes it ideal for architectures supporting multiple dimensions of parallelism (TLP, DLP and ILP). In this work, we show how the comparison of long sequences takes advantage of current and future multicore architectures. We analyze two different SW implementations on the CellBE and use simulation tools to study the performance scalability in a multicore architecture. We study the memory organization that delivers the maximum bandwidth with the minimum cost. Our results show that a heterogeneous architecture is a valid alternative for executing challenging bioinformatics workloads.
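One common way to expose the parallelism mentioned above is to traverse the dynamic programming matrix by anti-diagonals, since all cells with the same i + j depend only on the two previous anti-diagonals and are independent of each other. The Python sketch below illustrates that traversal order sequentially. It is an assumption-laden illustration, not one of the CellBE implementations analyzed in the paper: the name `sw_score_wavefront` and the scoring values are hypothetical.

```python
def sw_score_wavefront(a, b, match=2, mismatch=-1, gap=-1):
    """Local alignment score computed one anti-diagonal at a time."""
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best = 0
    # d = i + j identifies an anti-diagonal; the first interior cell is (1, 1).
    for d in range(2, n + m + 1):
        i_lo = max(1, d - m)   # keeps j = d - i at most m
        i_hi = min(n, d - 1)   # keeps j = d - i at least 1
        # Cells in this loop read only anti-diagonals d-1 and d-2, so the
        # iterations are mutually independent and could run in parallel
        # (across threads, or across SIMD lanes).
        for i in range(i_lo, i_hi + 1):
            j = d - i
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best
```

The amount of independent work per step grows and then shrinks with the anti-diagonal length, which is one reason why very long sequences offer more parallelism to exploit than short ones.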

Whole Genome Comparison using Commodity Hardware

Whole genome comparison consists of comparing or aligning two genome sequences in the hope that analogous functional or physical characteristics may be observed. Sequence comparison is done via a number of slow rigorous algorithms, or faster heuristic approaches. However, due to the large size of genomic sequences, the capacity of current software is limited.

Parallel Computation of Gene Sequence Matching

One of the main challenges in bioinformatics today is to create a framework for efficiently comparing new DNA sequence information against large existing sequence and structure databases. Optimal methods, such as the Smith-Waterman algorithm, provide more sensitive results than heuristic algorithms such as the Dot matrix plot, FASTA and BLAST, with the drawback of increased computational complexity. FPGA implementations of Smith-Waterman exploit the intrinsic parallelism of the algorithm and achieve reductions in computation time of several orders of magnitude. In this paper we propose an implementation of the Smith-Waterman algorithm based on a linear systolic array that doubles the speed of current approaches with a minimum increase of area. The design was performed taking into account the bus I/O bottleneck (i.e. PCI), so the processing speed improvement is still available even when the systolic array is connected to a bus. The implementation results on Xilinx Virtex and Virtex2 FPGA ...