Distributed and parallel algorithms and systems for inference of huge phylogenetic trees based on the maximum likelihood method

ACM/IEEE SC 2000 Conference (SC'00), 2000

Recent advances in DNA sequencing technology have created large data sets upon which phylogenetic inference can be performed. However, current research is limited by the prohibitive time necessary to perform tree search on even a reasonably sized data set. Some parallel algorithms have been developed, but the biological research community does not use them because it does not trust the results from newly developed parallel software. This paper presents a new phylogenetic algorithm that allows existing, trusted phylogenetic software packages to be executed in parallel using the DOGMA parallel processing system. The results presented here indicate that data sets that take as much as 11 months to search using current algorithms can be searched in as little as 2 hours using as few as 8 processors. This reduction in the time necessary to complete a phylogenetic search allows new research questions to be explored in many of the biological sciences.

Parallel algorithms for Bayesian phylogenetic inference

Journal of Parallel and Distributed Computing, 2003

The combination of a Markov chain Monte Carlo (MCMC) method with likelihood-based assessment of phylogenies is becoming a popular alternative to direct likelihood optimization. However, MCMC, like maximum likelihood, is a computationally expensive method. To approximate the posterior distribution of phylogenies, a Markov chain is constructed, using the Metropolis algorithm, such that the chain has the posterior distribution of the parameters of phylogenies as its stationary distribution.
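The Metropolis construction mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the three-state "tree space", the weights 1:2:7, and the names `metropolis`, `toy_log_post` and `toy_propose` are all hypothetical, chosen only to show that a chain built with a symmetric proposal and the Metropolis acceptance rule visits states in proportion to their posterior weight.

```python
import math
import random


def metropolis(log_post, propose, start, n_steps, rng):
    """Run a Metropolis chain with a symmetric proposal; return visited states."""
    state = start
    current = log_post(state)
    samples = []
    for _ in range(n_steps):
        candidate = propose(state, rng)
        cand = log_post(candidate)
        # Accept with probability min(1, p(candidate) / p(state)).
        if cand >= current or rng.random() < math.exp(cand - current):
            state, current = candidate, cand
        samples.append(state)
    return samples


# Hypothetical toy target: three "topologies" with unnormalised weights 1:2:7.
WEIGHTS = {"A": 1.0, "B": 2.0, "C": 7.0}


def toy_log_post(state):
    return math.log(WEIGHTS[state])


def toy_propose(state, rng):
    # Symmetric proposal: pick one of the other two states uniformly.
    return rng.choice([s for s in WEIGHTS if s != state])


rng = random.Random(0)
samples = metropolis(toy_log_post, toy_propose, "A", 20000, rng)
freqs = {s: samples.count(s) / len(samples) for s in WEIGHTS}
```

In a real phylogenetic sampler the state would be a tree with branch lengths and model parameters, and `log_post` would evaluate the likelihood and prior; the acceptance rule itself is unchanged as long as the proposal stays symmetric.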

Genetic Algorithms and Parallel Processing in Maximum-Likelihood Phylogeny Inference

Molecular Biology and Evolution, 2002

We investigated the usefulness of a parallel genetic algorithm for phylogenetic inference under the maximum-likelihood (ML) optimality criterion. Parallelization was accomplished by assigning each "individual" in the genetic algorithm "population" to a separate processor so that the number of processors used was equal to the size of the evolving population (plus one additional processor for the control of operations). The genetic algorithm incorporated branch-length and topological mutation, recombination, selection on the ML score, and (in some cases) migration and recombination among subpopulations. We tested this parallel genetic algorithm with large (228 taxa) data sets of both empirically observed DNA sequence data (for angiosperms) and simulated DNA sequence data. For both observed and simulated data, search-time improvement was nearly linear with respect to the number of processors, so the parallelization strategy appears to be highly effective at improving computation time for large phylogenetic problems using the genetic algorithm. We also explored various ways of optimizing and tuning the parameters of the genetic algorithm. Under the conditions of our analyses, we did not find the best-known solution using the genetic algorithm approach before terminating each run. We discuss some possible limitations of the current implementation of this genetic algorithm as well as avenues for its future improvement.

Building large phylogenetic trees on coarse-grained parallel machines

Algorithmica, 2006

Phylogenetic analysis is an area of computational biology concerned with the reconstruction of evolutionary relationships between organisms, genes, and gene families. Maximum likelihood evaluation has proven to be one of the most reliable methods for constructing phylogenetic trees. The huge computational requirements associated with maximum likelihood analysis mean that it is not feasible to produce large phylogenetic trees using a single processor. We have completed a fully cross-platform, coarse-grained distributed application, DPRml, which overcomes many of the limitations imposed by the current set of parallel phylogenetic programs. We have completed a set of efficiency tests that show how to maximise efficiency while using the program to build large phylogenetic trees. The software is publicly available under the terms of the GNU general public licence from the system webpage at

DRAxML@home: a distributed program for computation of large phylogenetic trees

Future Generation Computer Systems, 2005

Inference of large phylogenetic trees using statistical methods is computationally extremely expensive. Thus, progress is primarily achieved via algorithmic innovation rather than by brute-force allocation of available computational resources. We describe simple heuristics which yield accurate trees for synthetic (simulated) as well as real data and significantly improve execution time. The heuristics are implemented in a sequential program (RAxML) and a novel non-deterministic distributed algorithm (DRAxML@home). We implemented an MPI-based and an HTTP-based distributed prototype of this algorithm and used DRAxML@home to infer trees comprising 1000 and 2025 organisms on LINUX PC clusters.

Parallel computation of phylogenetic consensus trees

2010

The field of bioinformatics is witnessing a rapid and overwhelming accumulation of molecular sequence data, predominantly driven by novel wet-lab sequencing techniques. This trend poses scalability challenges for tool developers. In the field of phylogenetic inference (reconstruction of evolutionary trees from molecular sequence data), scalability is becoming an increasingly important issue for operations other than the tree reconstruction itself. In this paper we focus on a post-analysis task in reconstructing very large trees, specifically the step of building (extended) majority rules consensus trees from a collection of equally plausible trees or a collection of bootstrap replicate trees. To this end, we present sequential optimizations that establish our implementation as the current fastest exact implementation in phylogenetics, and our novel parallelized routines are the first of their kind. Our sequential optimizations achieve a performance improvement of a factor of 50 compared to the previous version of our code, and we achieve a maximum speedup of 5.5 on an 8-core Nehalem node for building consensus trees on trees comprising up to 55,000 organisms. The methods developed here are integrated into the widely used open-source tool RAxML for phylogenetic tree reconstruction.
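The majority-rule step described in this abstract can be sketched in a few lines. This is a deliberately simplified illustration, not the RAxML routine: it works on rooted clades rather than unrooted bipartitions, it covers only the plain majority rule (not the extended variant), and the nested-tuple tree encoding and the names `clades` and `majority_clades` are assumptions made for the example.

```python
from collections import Counter


def clades(tree):
    """Return (leaf set, set of clades) for a tree given as nested tuples."""
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)  # the clade rooted at this internal node
    return leaves, found


def majority_clades(trees, threshold=0.5):
    """Keep every clade that occurs in more than `threshold` of the trees."""
    counts = Counter()
    for tree in trees:
        all_leaves, found = clades(tree)
        found.discard(all_leaves)  # the root clade is trivial
        counts.update(found)
    return {c for c, n in counts.items() if n / len(trees) > threshold}


# Three hypothetical input trees over the taxa A..E.
trees = [
    (("A", "B"), ("C", ("D", "E"))),
    (("A", "B"), (("C", "D"), "E")),
    ((("A", "B"), "C"), ("D", "E")),
]
consensus = majority_clades(trees)
```

Here `consensus` contains the clades {A, B}, {D, E} and {C, D, E}, each of which appears in at least two of the three input trees. Counting clades across trees is the part that dominates the cost for large collections, which is why it is a natural target for the sequential and parallel optimizations the abstract mentions.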

Accelerating parallel maximum likelihood-based phylogenetic tree calculations using subtree equality vectors

2002

Heuristics for calculating phylogenetic trees for large sets of aligned rRNA sequences based on the maximum likelihood method are computationally expensive. The core of most parallel algorithms, which accounts for the greatest part of computation time, is the tree evaluation function, which calculates the likelihood value for each tree topology. This paper describes and uses Subtree Equality Vectors (SEVs) to reduce the number of required floating point operations during topology evaluation.

Reconstruction of large phylogenetic trees: A parallel approach

Computational Biology and Chemistry, 2005

Reconstruction of phylogenetic trees for very large datasets is a known example of a computationally hard problem. In this paper, we present a parallel computing model for the widely used Multiple Instruction Multiple Data (MIMD) architecture. Following the idea of divide-and-conquer, our model adapts the Recursive-DCM3 decomposition method to divide datasets into smaller subproblems. It distributes the computational load over multiple processors so that each processor constructs subtrees on each subproblem within a batch in parallel. It finally collects the resulting trees and merges them into a supertree. The proposed model is flexible as far as methods for dividing and merging datasets are concerned. We show that our method greatly reduces the computational time of the sequential version of the program. As a case study, our parallel approach takes only 22.1 hours on four processors to outperform the best score to date (found at 123.7 hours by the sequential Rec-I-DCM3 program) on one dataset. Developed with the standard message-passing library, MPI, the program can be recompiled and run on any MIMD system.

PBPI: a High Performance Implementation of Bayesian Phylogenetic Inference

ACM/IEEE SC 2006 Conference (SC'06), 2006

This paper describes the implementation and performance of PBPI, a parallel implementation of a Bayesian phylogenetic inference method for DNA sequence data. By combining the Markov Chain Monte Carlo (MCMC) method with likelihood-based assessment of phylogenies, Bayesian phylogenetic inferences can incorporate complex statistical models into the process of phylogenetic tree estimation. However, Bayesian analyses are extremely computationally expensive. PBPI uses algorithmic improvements and parallel processing to achieve significant performance improvement over comparable Bayesian phylogenetic inference programs. We evaluated the performance and accuracy of PBPI using a simulated dataset on System X, a terascale supercomputer at Virginia Tech. Our results show that PBPI identifies equivalent tree estimates 1424 times faster on 256 processors than a widely used, best-available (albeit sequential) Bayesian phylogenetic inference program. PBPI also achieves linear speedup with the number of processors for large problem sizes. Most importantly, the PBPI framework enables Bayesian phylogenetic analysis of large datasets that was previously impracticable.
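As a quick reading aid for the figures quoted in this abstract, the ratio of speedup to processor count can be computed directly. The function name `parallel_efficiency` is hypothetical, and the calculation treats the reported 1424-fold improvement as if it were a plain parallel speedup, which it is not: the baseline is a different, sequential program.

```python
def parallel_efficiency(speedup, processors):
    """Speedup divided by processor count; 1.0 corresponds to ideal linear scaling."""
    return speedup / processors


# Figures taken from the abstract: 1424 times faster on 256 processors.
pbpi_ratio = parallel_efficiency(1424, 256)
```

The ratio is well above 1, which parallelism alone cannot produce against the same code. This is consistent with the abstract's statement that PBPI combines algorithmic improvements with parallel processing.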

Parallel implementation of a quartet-based algorithm for phylogenetic analysis

2006

This paper describes a parallel implementation of our recently developed algorithm for phylogenetic analysis on the IBM BlueGene/L cluster. This algorithm constructs evolutionary trees for a given set of DNA or protein sequences based on the topological information of every possible quartet tree. Our experimental results showed that it has several advantages over many popular algorithms.

Performance Evaluation of Computational Phylogeny Software in Parallel Computing Environment

Advances in Intelligent Systems and Computing, 2013

Computational phylogeny is a challenging application even for the most powerful supercomputers. One significant application in this area is Randomized Axelerated Maximum Likelihood (RAxML), which is used for sequential and parallel maximum likelihood-based inference of large phylogenetic trees. This paper presents scalability test results on high-performance computers with up to 256 cores, for coarse- and fine-grained parallelization using the MPI, Pthreads and hybrid versions, and compares the results of the traditional and SSE3 versions of RAxML.

Parallel Implementation and Performance of FastDNAml: a Program for Maximum Likelihood Phylogenetic Inference

Proceedings of the …, 2001

This paper describes the parallel implementation of fastDNAml, a program for the maximum likelihood inference of phylogenetic trees from DNA sequence data. Mathematical means of inferring phylogenetic trees have been made possible by the wealth of DNA data now available. Maximum likelihood analysis of phylogenetic trees is extremely computationally intensive. Availability of computer resources is a key factor limiting use of such analyses. fastDNAml is implemented in serial, PVM, and MPI versions, and may be modified to use other message passing libraries in the future. We have developed a viewer for comparing phylogenies. We tested the scaling behavior of fastDNAml on an IBM RS/6000 SP up to 64 processors. The parallel version of fastDNAml is one of very few computational phylogenetics codes that scale well. fastDNAml is available for download as source code or compiled for Linux or AIX.

Exploiting Fine-Grained Parallelism in the Phylogenetic Likelihood Function with MPI, Pthreads, and OpenMP: A Performance Study

Lecture Notes in Computer Science, 2008

Emerging multi- and many-core computer architectures pose new challenges with respect to efficient exploitation of parallelism. In addition, it is currently not clear which parallel programming paradigm might be the most appropriate for exploiting such architectures, from both the efficiency and the software engineering point of view. Beyond that, the application of high performance computing techniques and the use of supercomputers will be essential to deal with the explosive accumulation of sequence data. We address these issues via a thorough performance study using the example of RAxML, which is a widely used bioinformatics application for large-scale phylogenetic inference under the Maximum Likelihood criterion. We provide an overview of the respective parallelization strategies with MPI, Pthreads, and OpenMP and assess performance for these approaches on a large variety of parallel architectures. Results indicate that there is no universally best-suited paradigm with respect to efficiency and portability of the ML function. Therefore, we suggest that the ML function should be parallelized with MPI and Pthreads based on software engineering criteria as well as to enforce data locality.

Computing large phylogenies with statistical methods: Problems and solutions

The computation of ever larger as well as more accurate phylogenetic trees, with the ultimate goal of computing the "tree of life", represents a major challenge in bioinformatics. Statistical methods for phylogenetic analysis, such as maximum likelihood or Bayesian inference, have been shown to be the most accurate methods for tree reconstruction. Unfortunately, the size of trees which can be computed in reasonable time is limited by the severe computational complexity induced by these statistical methods. However, the field has witnessed great algorithmic advances over the last 3 years which enable inference of large phylogenetic trees containing 500-1000 sequences on a single CPU within a couple of hours using maximum likelihood programs such as RAxML and PHYML. An additional order of magnitude in terms of computable tree sizes can be obtained by parallelizing these new programs. In this paper we briefly present the MPI-based parallel implementation of RAxML (Randomized Axelerated Maximum Likelihood) as a solution for computing large phylogenies. Within this context, we describe how parallel RAxML has been used to compute what is, to the best of our knowledge, the first maximum likelihood-based phylogenetic tree containing 10,000 taxa on an inexpensive LINUX PC cluster. In addition, we address unresolved problems which arise when computing large phylogenies for real-world sequence data consisting of more than 1,000 organisms with maximum likelihood, based on our experience with RAxML. Finally, we discuss potential algorithmic and technical enhancements of RAxML within the context of future work. Availability: wwwbode.in.tum.de/~stamatak