Large-Scale Pairwise Sequence Alignments on a Large-Scale GPU Cluster

A distributed CPU-GPU framework for pairwise alignments on large-scale sequence datasets

2013 IEEE 24th International Conference on Application-Specific Systems, Architectures and Processors, 2013

Several problems in computational biology require all-against-all pairwise comparisons of tens of thousands of individual biological sequences. Each such comparison can be performed with the well-known Needleman-Wunsch alignment algorithm. However, with the rapid growth of biological databases, performing all possible comparisons serially with this algorithm becomes extremely time-consuming. The massive computational power of graphics processing units (GPUs) makes them an appealing choice for accelerating these computations. As such, CPU-GPU clusters can enable all-against-all comparisons on large datasets.
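
As a rough illustration of the workload described above, the following Python sketch scores every pair in a small set with a textbook Needleman-Wunsch recurrence. The scoring values (match, mismatch, linear gap penalty) and the function names are assumptions chosen for the example; they are not taken from the paper, and the sketch runs serially on the CPU rather than on a GPU cluster.

```python
from itertools import combinations


def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score (Needleman-Wunsch, linear gap penalty).

    Only two rows of the DP matrix are kept, since the score alone
    does not require backtracking.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def all_against_all(seqs):
    """Score every unordered pair: n * (n - 1) / 2 independent alignments."""
    return {
        (i, j): nw_score(seqs[i], seqs[j])
        for i, j in combinations(range(len(seqs)), 2)
    }


if __name__ == "__main__":
    scores = all_against_all(["GATTACA", "GATTACA", "GCTTACA"])
    for pair, score in sorted(scores.items()):
        print(pair, score)
```

Because each pair is scored independently, the number of alignments grows quadratically with the number of sequences, which is what makes the problem a natural fit for distribution across many devices.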

G-MSA: A GPU-based, fast and accurate algorithm for multiple sequence alignment

Journal of Parallel and Distributed Computing, 2013

Multiple sequence alignment (MSA) methods are essential in biological analysis. Several MSA algorithms have been proposed in recent years. The quality of the results produced by those methods is reasonable, but no single method consistently outperforms the others. Additionally, the increasing number of sequences in biological databases is perceived as one of the upcoming challenges for alignment methods in the near future. The lack of performance concerns not only alignment problems but may be observed in many areas of biologically related research. To overcome this problem in the field of pairwise alignment, several GPU (Graphics Processing Unit) computing approaches have been proposed lately. These solutions show the great potential of the GPU platform. Therefore, our main idea was to design and implement an MSA method that can take advantage of modern graphics cards. Our solution is based on T-Coffee, an MSA algorithm well known for its high accuracy. Its computational time, however, is often unacceptable. Tests show that our method, named G-MSA, is highly efficient, achieving up to a 193-fold speedup on a single GPU, while the quality of its results remains very good. Owing to effective memory usage, the method can align huge sets of sequences that previously could only be aligned on computer clusters. Moreover, multi-GPU support with load balancing makes the application very scalable.

Grabfast: A CUDA based GPU accelerated fast short sequence alignment algorithm

2012

Next Generation Sequencing (NGS) platforms typically produce short reads of 50-150 base pairs (bp). The number of such short reads can be up to 6 billion per run. Aligning these short reads to a large genome is a computationally challenging problem. In this paper, we address this problem by considering the design and optimization of parallel sequence alignment on GPU-based hybrid architectures. Even though the sequence alignment algorithm is inherently data-parallel, issues such as (a) space-time trade-offs in the indexing schema, (b) the need for fast candidate location search (CAL) on the GPU, and (c) maintaining low divergence along with low space for the dynamic-programming-based local alignment make this a very challenging problem. We present the design of our novel parallel algorithm, Graphics processor Accelerated BFAST (GrABFAST), for large-scale read alignment, which overcomes these challenges and demonstrates superior performance compared to Intel multicore architectures. Using 5 large genomes, including those of human, maize, horse, dog and bacteria, we demonstrate a speedup of around 6× using Fermi Tesla C2070 GPUs versus the BFAST algorithm on a 16-core Intel Xeon 5570 architecture.
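
To make the indexing and candidate-location steps mentioned above more concrete, here is a minimal Python sketch of a generic k-mer hash index. It is not the BFAST or GrABFAST index layout, which the abstract does not describe; the function names, the choice of k, and the voting scheme are all assumptions for illustration.

```python
from collections import Counter, defaultdict


def build_index(genome, k):
    """Map every k-mer in the genome to the list of positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(genome) - k + 1):
        index[genome[pos:pos + k]].append(pos)
    return index


def candidate_locations(read, index, k):
    """Vote for read start positions implied by each k-mer hit.

    A hit of the read's k-mer at read offset `off` and genome position `pos`
    implies the read would start at `pos - off`. Locations with many votes
    are the ones worth passing on to a dynamic programming aligner.
    """
    votes = Counter()
    for off in range(len(read) - k + 1):
        for pos in index.get(read[off:off + k], []):
            start = pos - off
            if start >= 0:
                votes[start] += 1
    return votes


if __name__ == "__main__":
    genome = "ACGTACGTTAGC"
    idx = build_index(genome, k=4)
    print(candidate_locations("ACGTT", idx, k=4).most_common())
```

A larger k gives fewer, more specific hits at the cost of tolerating fewer mismatches per seed, which is one simple form of the space-time trade-off the abstract refers to.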

Accurate Sequence Alignment using Distributed Filtering on GPU Clusters

2011

The advent of next-generation gene sequencing machines has led to computationally intensive alignment problems that can take many hours on a modern computer. Given the rapidly increasing rate at which new short sequences are produced, the large number of existing sequences, and inaccuracies in the sequencing machines, short sequence alignment has become a major challenge in high-performance computing.

Accelerating Smith-Waterman Local Sequence Alignment on GPU Cluster

Proceedings of the Annual International Conference on Advances in Distributed and Parallel Computing (ADPC 2010), 2010

The Smith-Waterman local sequence alignment algorithm is highly accurate but requires a very large amount of memory and computation, which makes implementations on common computing systems less practical. In this paper, we present swGPUCluster, an implementation of the Smith-Waterman algorithm on a cluster equipped with NVIDIA GPU graphics cards (called a GPU cluster). Our test was performed on a cluster of two nodes: one node is equipped with a dual-GPU NVIDIA GeForce GTX 295 graphics card and a Tesla C1060 card, and the other node is equipped with two dual-GPU NVIDIA GeForce GTX 295 graphics cards. Results show that performance increased significantly compared with the previous best implementations such as SWPS3 or CUDASW++. swGPUCluster's performance increased with the length of the query sequences, from 37.328 GCUPS to 46.706 GCUPS. These results demonstrate the great computing power of graphics cards and their high applicability in solving bioinformatics problems.
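
For readers unfamiliar with the GCUPS metric quoted above, the sketch below shows a plain serial Smith-Waterman score computation in Python together with the usual way GCUPS is derived: the number of dynamic programming cells (query length times total database residues) divided by the runtime in seconds and by 10^9. The scoring parameters and function names are assumptions for the example and do not reflect the swGPUCluster implementation.

```python
def sw_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score (Smith-Waterman, linear gap penalty)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Clamping at zero is what makes the alignment local.
            cell = max(0, prev[j - 1] + s, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


def gcups(query_len, db_residues, seconds):
    """Giga cell updates per second for one query scanned against a database."""
    return query_len * db_residues / (seconds * 1e9)


if __name__ == "__main__":
    print(sw_score("ACGT", "TTACGTTT"))
    # Hypothetical numbers: a 1000-residue query, 46,706,000 database residues, 1 s.
    print(gcups(1000, 46_706_000, 1.0))
```

Longer queries tend to amortize fixed costs such as kernel launches and data transfers over more cells, which is consistent with the reported rise in GCUPS with query length.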

QuickProbs—A Fast Multiple Sequence Alignment Algorithm Designed for Graphics Processors

PLoS ONE, 2014

Multiple sequence alignment is a crucial task in a number of biological analyses like secondary structure prediction, domain searching, phylogeny, etc. MSAProbs is currently the most accurate alignment algorithm, but its effectiveness is obtained at the expense of computational time. In this paper we present QuickProbs, a variant of MSAProbs customised for graphics processors. We selected the two most time-consuming stages of MSAProbs to be redesigned for GPU execution: the posterior matrices calculation and the consistency transformation. Experiments on three popular benchmarks (BAliBASE, PREFAB, OXBench-X) on a quad-core PC equipped with a high-end graphics card show QuickProbs to be 5.7 to 9.7 times faster than the original CPU-parallel MSAProbs. Additional tests performed on several protein families from the Pfam database give an overall speed-up of 6.7. Compared to other algorithms like MAFFT, MUSCLE, or ClustalW, QuickProbs proved to be much more accurate at similar speed. Additionally, we introduce a tuned variant of QuickProbs which is significantly more accurate than MSAProbs on sets of distantly related sequences without exceeding its computation time. The GPU part of QuickProbs was implemented in OpenCL, so the package is suitable for graphics processors produced by all major vendors.

Accelerating pairwise DNA Sequence Alignment using the CUDA compatible GPU

International Journal of Computer Applications, 2013

We present a novel approach to the pairwise DNA sequence alignment problem that differs from the dynamic programming solution of the Smith-Waterman algorithm. The proposed implementation uses CUDA, the parallel computing platform and programming model invented by NVIDIA. The main idea is to assign different nucleotide weights and then merge the matching sub-sequences on the GPU according to predefined rules to obtain the optimum local alignment. We parallelize the whole solution for pairwise DNA sequence alignment using CUDA and compare the results against a similar semi-parallelized solution and a traditional Smith-Waterman implementation on traditional processors. Experimental results demonstrate a considerable reduction in running time.

iPuma: High-Performance Sequence Alignment on the Graphcore IPU

2024

String alignment algorithms are an essential tool for understanding DNA and protein sequences. They demand substantial computation in real-world applications, and are thus a prime target for hardware acceleration. However, GPUs struggle to provide sufficient acceleration. Meanwhile, the recent MIMD-capable AI accelerators such as the Graphcore Intelligence Processing Unit (IPU) have become technologically viable. In this paper we present iPuma, a new implementation of Smith-Waterman sequence alignment for the IPU, which offers generalized short and medium length, one-to-one, and many-to-many high-throughput alignments for both DNA and protein sequences. iPuma is integrated into two bioinformatics pipelines, MetaHipMer2 and PASTIS. On protein datasets, iPuma shows speedups of 2.7× and 1.6× over state-of-the-art GPU and CPU implementations, respectively. We test the scalability on up to 64 IPUs, attaining a peak scoring performance of 1763 GCUPS for protein and 1168 GCUPS for DNA sequences.

Protein alignment algorithms with an efficient backtracking routine on multiple GPUs

BMC Bioinformatics, 2011

Background: Pairwise sequence alignment methods are widely used in biological research. The increasing number of sequences is perceived as one of the upcoming challenges for sequence alignment methods in the near future. To overcome this challenge, several GPU (Graphics Processing Unit) computing approaches have been proposed lately. These solutions show the great potential of the GPU platform, but in most cases they address the problem of sequence database scanning and compute only the alignment score, whereas the alignment itself is omitted. Thus, the need arose to implement the global and semiglobal Needleman-Wunsch and the Smith-Waterman algorithms with a backtracking procedure, which is needed to construct the alignment. Results: In this paper we present a solution that performs the alignment of every given sequence pair, which is a required step for progressive multiple sequence alignment methods, as well as for DNA recognition at the DNA assembly stage. Performed tests show that the imp...
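
The distinction drawn above between computing only a score and constructing the alignment can be illustrated with a short Python sketch. It fills the full Needleman-Wunsch matrix and then backtracks from the bottom-right cell to recover the aligned strings. This is the textbook serial procedure under assumed scoring values, not the multi-GPU backtracking routine from the paper.

```python
def nw_align(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment with backtracking; returns (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    # The full matrix must be kept because backtracking revisits earlier cells.
    H = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        H[i][0] = i * gap
    for j in range(1, m + 1):
        H[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(H[i - 1][j - 1] + s, H[i - 1][j] + gap, H[i][j - 1] + gap)

    # Walk back from the bottom-right corner, re-deriving which move produced each cell.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if a[i - 1] == b[j - 1] else mismatch
            if H[i][j] == H[i - 1][j - 1] + s:
                top.append(a[i - 1])
                bottom.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and H[i][j] == H[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom)), H[n][m]


if __name__ == "__main__":
    for line in nw_align("ACGT", "AGT"):
        print(line)
```

Keeping the whole matrix costs memory proportional to the product of the two sequence lengths, which is one reason score-only implementations are simpler to fit on a GPU.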

Accelerated GPU Based Protein Sequence Alignment – An optimized database sequences approach

The Smith-Waterman (S-W) algorithm is an exact sequence alignment method for biological databases, but in practice it is slow because of its high computational complexity. FASTA, BLAST and other heuristic approaches are faster but less accurate. Variation in the volume and length of sequences requires restructuring the database. Accelerating the Smith-Waterman algorithm on suitable modern hardware retains its accuracy while improving speed. This paper presents a high-performance sequence alignment algorithm implemented on a Kepler-architecture graphics processing unit. The new implementation is an improved version that reduces memory accesses to eliminate bandwidth congestion. On the Kepler GPU, performance reached 51 giga cell updates per second (GCUPS), a 138.3% increase over the previous implementation on a GTX 275 GPU. In this implementation, the protein database is converted into sets of equal-length sequences, so that the workload is distributed among the GPU's threads. The result is an improvement over previous implementations.
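
One plausible way to read the "equal-length sequence sets" idea is sketched below in Python: sort the database by length, cut it into fixed-size batches, and pad each batch to its longest member so that every thread in a batch processes the same number of cells. The abstract does not specify the actual restructuring, so the function name, the batch size, and the padding symbol are all assumptions.

```python
def equal_length_batches(seqs, batch_size, pad="*"):
    """Group sequences of similar length and pad each group to a common width.

    Sorting first keeps the amount of padding small, because neighbours
    in the sorted order differ little in length.
    """
    ordered = sorted(seqs, key=len)
    batches = []
    for start in range(0, len(ordered), batch_size):
        chunk = ordered[start:start + batch_size]
        width = len(chunk[-1])  # longest sequence in this chunk
        batches.append([s.ljust(width, pad) for s in chunk])
    return batches


if __name__ == "__main__":
    for batch in equal_length_batches(["MKV", "M", "MKVLAA", "MK"], batch_size=2):
        print(batch)
```

In a real aligner the padding symbol would need a substitution score that can never improve an alignment, so that padded cells do not change the result.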