Phylogeny-aware gap placement prevents errors in sequence alignment and evolutionary analysis
Science. 2008 Jun 20;320(5883):1632-5.
doi: 10.1126/science.1158395.
- PMID: 18566285
- DOI: 10.1126/science.1158395
Ari Löytynoja et al. Science. 2008.
Abstract
Genetic sequence alignment is the basis of many evolutionary and comparative studies, and errors in alignments lead to errors in the interpretation of evolutionary information in genomes. Traditional multiple sequence alignment methods disregard the phylogenetic implications of gap patterns that they create and infer systematically biased alignments with excess deletions and substitutions, too few insertions, and implausible insertion-deletion-event histories. We present a method that prevents these systematic errors by recognizing insertions and deletions as distinct evolutionary events. We show theoretically and practically that this improves the quality of sequence alignments and downstream analyses over a wide range of realistic alignment problems. These results suggest that insertions and sequence turnover are more common than is currently thought and challenge the conventional picture of sequence evolution and mechanisms of functional and structural changes.
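The distinction the abstract draws between insertions and deletions can be illustrated with a toy event count for a single alignment column. The sketch below is not the authors' method: the five-taxon tree, the leaf names, and the parsimony-style counting are all assumptions made for illustration. It compares two readings of the same gap pattern: one in which the residue is assumed ancestral and every gap must be a deletion, and one in which a single insertion on the branch above the smallest clade carrying the residue is allowed.

```python
# Toy illustration only: NOT the algorithm from the paper.
# A tree is either a leaf name (str) or a pair (left, right) of subtrees.
# `present` maps each leaf name to True if that sequence has a residue
# in the column and False if it has a gap.


def leaves(tree):
    """Return the leaf names under a tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    left, right = tree
    return leaves(left) + leaves(right)


def count_deletions(tree, present):
    """Count maximal subtrees in which no leaf carries the residue.

    Each such subtree can be explained by one deletion on the branch above it.
    """
    if not any(present[name] for name in leaves(tree)):
        return 1
    if isinstance(tree, str):
        return 0
    left, right = tree
    return count_deletions(left, present) + count_deletions(right, present)


def smallest_clade_with_residue(tree, present):
    """Return the smallest subtree containing every leaf that has the residue."""
    needed = {name for name, has in present.items() if has}
    if not needed:
        raise ValueError("at least one leaf must carry the residue")
    while not isinstance(tree, str):
        for child in tree:
            if needed <= set(leaves(child)):
                tree = child  # descend into the child that holds all of them
                break
        else:
            return tree  # residues are split across both children
    return tree


def events_deletions_only(tree, present):
    """Events needed if the residue is ancestral and gaps are all deletions."""
    return count_deletions(tree, present)


def events_one_insertion(tree, present):
    """Events needed if one insertion is placed above the smallest clade."""
    clade = smallest_clade_with_residue(tree, present)
    return 1 + count_deletions(clade, present)


# Hypothetical example: only sequence D has a residue in this column.
tree = (("A", "B"), ("C", ("D", "E")))
column = {"A": False, "B": False, "C": False, "D": True, "E": False}

print(events_deletions_only(tree, column))  # 3 independent deletions
print(events_one_insertion(tree, column))   # 1 insertion on the branch to D
```

In this toy case the deletion-only reading needs three separate events where a single insertion suffices, which is the kind of excess-deletion bias the abstract attributes to methods that ignore the phylogenetic implications of gap patterns.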