Whole genome comparison using commodity workstations

Whole Genome Comparison using Commodity Hardware

Whole genome comparison consists of comparing or aligning two genome sequences in the hope that analogous functional or physical characteristics may be observed. Sequence comparison is done either with slow, rigorous algorithms or with faster heuristic approaches. However, due to the large size of genomic sequences, the capacity of current software is limited.

Whole Genome Comparison on a Network of Workstations

2007

Whole genome comparison consists of comparing or aligning genome sequences with the goal of finding similarities between them. Previously, we showed how the SIMD extensions of Intel processors can be used to efficiently implement the Smith-Waterman algorithm for genome comparison. Here we present a distributed version of that algorithm. We show that on somewhat outdated hardware we can achieve speeds upwards of 8000 MCUPS, making it one of the fastest implementations of the Smith-Waterman algorithm.
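For readers unfamiliar with the unit, MCUPS (millions of cell updates per second) is the usual throughput measure for dynamic-programming alignment: the number of matrix cells filled, divided by the run time. The sketch below shows the arithmetic only. The function name `mcups` and the sequence lengths and timing are hypothetical values chosen for illustration, not figures from the paper.

```python
def mcups(query_len: int, target_len: int, seconds: float) -> float:
    """Millions of cell updates per second for one pairwise comparison.

    A full dynamic-programming comparison fills query_len * target_len cells.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (query_len * target_len) / (seconds * 1e6)


# Hypothetical example: a 1 Mbp sequence against a 4 Mbp sequence in 500 s.
rate = mcups(1_000_000, 4_000_000, 500.0)
print(f"{rate:.0f} MCUPS")
```

Under these assumed inputs, the comparison fills 4 x 10^12 cells, so sustaining 8000 MCUPS would correspond to a run time of 500 seconds.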

Applying SIMD approach to whole genome comparison on commodity hardware

2008

Whole genome comparison compares (aligns) two genome sequences on the assumption that analogous characteristics may be found. In this paper, we present a SIMD version of the Smith-Waterman algorithm that uses Streaming SIMD Extensions (SSE) and runs on Intel Pentium processors. We compare two approaches, one requiring explicit data dependency handling and one built to handle dependencies automatically, and establish their optimal performance conditions.
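The data dependencies mentioned here come from the Smith-Waterman recurrence: each cell depends on its left, upper, and upper-left neighbours. One well-known way to expose independent work is to traverse the matrix by anti-diagonals, since all cells with the same i + j can be computed without reference to one another. The sketch below illustrates that dependency pattern in plain Python. It is not the SSE implementation described in the abstract, and the function name and scoring parameters (`match`, `mismatch`, `gap`) are assumptions chosen for illustration.

```python
def sw_score_antidiagonal(a: str, b: str, match: int = 2,
                          mismatch: int = -1, gap: int = -1) -> int:
    """Best local alignment score, filling the matrix one anti-diagonal at a time."""
    m, n = len(a), len(b)
    H = [[0] * (n + 1) for _ in range(m + 1)]
    best = 0
    for d in range(2, m + n + 1):            # d = i + j indexes the anti-diagonal
        i_lo = max(1, d - n)
        i_hi = min(m, d - 1)
        # Cells on this anti-diagonal read only from diagonals d-1 and d-2,
        # so this inner loop is the part a SIMD unit could process in lanes.
        for i in range(i_lo, i_hi + 1):
            j = d - i
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,   # align a[i-1] with b[j-1]
                          H[i - 1][j] + gap,     # gap in b
                          H[i][j - 1] + gap)     # gap in a
            best = max(best, H[i][j])
    return best


print(sw_score_antidiagonal("ACGT", "ACGT"))
```

The result is the same as a row-by-row fill; only the order of evaluation changes, which is what makes the inner loop free of dependencies.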

A distributed scheme for efficient pair-wise comparison of complete genomes

iciis, 1999

The comparisons of newly sequenced genomes against a genome with known functionality of genes provide important clues to the structure and function of genes and identification of metabolic pathways in newly sequenced organisms. New and more complex organisms are being added to biological databases at an increasing rate. Time-efficient, automated computational methods are needed to analyze the increasing amount of data in realistic time. This paper describes a distributed technique and a CORBA-based implementation to compare and align gene sequences in large complete genomes, using multiple heterogeneous distributed processors on a distributed network. The performance evaluation suggests that the distributed technique can significantly reduce the computational time.

Scalable multicore architectures for long DNA sequence comparison

Concurrency and Computation: Practice and Experience, 2011

Biological sequence comparison is one of the most important tasks in bioinformatics. Due to the growth of biological databases, sequence comparison is becoming an important challenge for high performance computing, especially when very long sequences are compared. The Smith-Waterman (SW) algorithm is an exact method based on dynamic programming that quantifies local similarity between sequences. The large inherent parallelism of the algorithm makes it ideal for architectures supporting multiple dimensions of parallelism (TLP, DLP and ILP). In this work, we show how long sequence comparison can take advantage of current and future multicore architectures. We analyze two different SW implementations on the CellBE and use simulation tools to study performance scalability in a multicore architecture. We study the memory organization that delivers the maximum bandwidth at the minimum cost. Our results show that a heterogeneous architecture is a valid alternative for executing challenging bioinformatics workloads.
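As a point of reference for the dynamic-programming method named above, here is a minimal sketch of Smith-Waterman scoring with a linear gap penalty. It is a textbook formulation, not either of the CellBE implementations analyzed in the paper. The function name and the scoring values (`match`, `mismatch`, `gap`) are assumptions chosen for illustration. It keeps only two rows of the matrix, which is enough to obtain the best score but not the alignment itself.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -1) -> int:
    """Best local alignment score between a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)      # row i-1 of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)  # row i; column 0 stays 0
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(0,                  # start a new local alignment
                          prev[j - 1] + s,    # extend diagonally
                          prev[j] + gap,      # gap in b
                          curr[j - 1] + gap)  # gap in a
            best = max(best, curr[j])
        prev = curr
    return best


print(smith_waterman_score("GGGACGTGGG", "ACGT"))
```

The clamp at zero is what makes the alignment local: a poorly matching prefix never drags down the score of a well-matching region that follows it.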

Exploiting Different Levels of Parallelism In the Biological Sequence Comparison Problem

sarc-ip.org

In recent years, the fast growth of the bioinformatics field has attracted the attention of computer scientists. At the same time, the exponential growth of databases that contain biological information (such as protein and DNA data) demands great efforts to improve the performance of computational platforms. In this work, we investigate how bioinformatics applications benefit from parallel architectures that combine different alternatives to exploit coarse- and fine-grain parallelism. As a case study, we examine the performance behavior of the Ssearch application, which implements the Smith-Waterman algorithm (SW), a dynamic programming approach that explores the similarity between a pair of sequences. The large inherent parallelism of the application makes it ideal for architectures supporting multiple dimensions of parallelism (thread-level parallelism, TLP; data-level parallelism, DLP; instruction-level parallelism, ILP). We study how this algorithm can take advantage of different parallel machines such as the SGI Altix, IBM Power6, IBM Cell BE and MareNostrum machines. Our study includes a qualitative analysis of the parallelization opportunities and a quantification of performance in terms of speedup and execution time. These measures are collected taking into account the specific characteristics of each architecture. As an example, our results show that a shared-memory multiprocessor architecture (SMP) like the PowerPC 970MP of the MareNostrum machine can surpass a heterogeneous multiprocessor machine like the current IBM Cell BE.

Long DNA Sequence Comparison on Multicore Architectures

Lecture Notes in Computer Science, 2010

Biological sequence comparison is one of the most important tasks in bioinformatics. Due to the growth of biological databases, sequence comparison is becoming an important challenge for high performance computing, especially when very long sequences are compared. The Smith-Waterman (SW) algorithm is an exact method based on dynamic programming that quantifies local similarity between sequences. The large inherent parallelism of the algorithm makes it ideal for architectures supporting multiple dimensions of parallelism (TLP, DLP and ILP). In this work, we show how long sequence comparison can take advantage of current and future multicore architectures. We analyze two different SW implementations on the CellBE and use simulation tools to study performance scalability in a multicore architecture. We study the memory organization that delivers the maximum bandwidth at the minimum cost. Our results show that a heterogeneous architecture is a valid alternative for executing challenging bioinformatics workloads.

Design and implementation of a parallel architecture for biological sequence comparison

Lecture Notes in Computer Science, 1996

New generations of scientific codes tend to mix different types of parallelism. Algorithms are defined as a set of modules, with data parallelism inside modules and task parallelism between them. With high speed networks, tasks running in a heterogeneous computing environment can exchange data with reasonable delay. Therefore, data-parallel tasks distributed on different parallel computers can interact efficiently by reading or writing Data Parallel Objects. These objects are distributed on the physical nodes according to the mapping directives. Migrations of data parallel objects from one parallel computer to another lead us to define efficient algorithms for runtime array redistribution. In this work, we have paid particular attention to the ability to handle distinct source and target processor sets while performing redistribution and the ability to overlap communications and computations. Performance results on a farm of ALPHA processors are discussed.

Parallel Biological Sequence Comparison on Heterogeneous High Performance Computing Platforms with BSP++

2011 23rd International Symposium on Computer Architecture and High Performance Computing, 2011

Biological sequence comparison is an important operation in bioinformatics that is often used to relate organisms. Smith and Waterman proposed an exact algorithm (SW) that compares two sequences in quadratic time and space. Due to its high computing and memory requirements, SW is usually executed on HPC platforms such as multicore clusters and CellBEs. Since HPC architectures exhibit very different hardware characteristics, porting an application between them is an error-prone, time-consuming task. BSP++ is an implementation of BSP that aims to reduce the effort needed to write parallel code. In this paper, we propose and evaluate a parallel BSP++ strategy to execute SW on multiple platforms such as MPI, OpenMP, MPI/OpenMP, CellBE and MPI/CellBE. The results obtained with real DNA sequences show that the performance of our versions is comparable to that of versions in the literature, demonstrating the appropriateness and flexibility of our approach.