Who watches the watchmen? An appraisal of benchmarks for multiple sequence alignment
Review
Stefano Iantorno et al. Methods Mol Biol. 2014.
Abstract
Multiple sequence alignment (MSA) is a fundamental and ubiquitous technique in bioinformatics used to infer related residues among biological sequences. Alignment accuracy is therefore crucial to a vast range of analyses, often in ways that are difficult to assess within those analyses themselves. To compare the performance of different aligners and help detect systematic errors in alignments, a number of benchmarking strategies have been pursued. Here we present an overview of the main strategies (based on simulation, consistency, protein structure, and phylogeny) and discuss their respective advantages and associated risks. We outline a set of desirable characteristics for effective benchmarking, and evaluate each strategy in light of them. We conclude that there is currently no universally applicable means of benchmarking MSA, and that developers and users of alignment tools should choose a benchmark according to the context of application, with a keen awareness of the assumptions underlying each benchmarking strategy.
Similar articles
- OXBench: a benchmark for evaluation of protein multiple sequence alignment accuracy.
Raghava GP, Searle SM, Audley PC, Barber JD, Barton GJ. BMC Bioinformatics. 2003 Oct 10;4:47. doi: 10.1186/1471-2105-4-47. PMID: 14552658 Free PMC article.
- DNA reference alignment benchmarks based on tertiary structure of encoded proteins.
Carroll H, Beckstead W, O'Connor T, Ebbert M, Clement M, Snell Q, McClellan D. Bioinformatics. 2007 Oct 1;23(19):2648-9. doi: 10.1093/bioinformatics/btm389. Epub 2007 Aug 8. PMID: 17686799
- Protein multiple sequence alignment benchmarking through secondary structure prediction.
Le Q, Sievers F, Higgins DG. Bioinformatics. 2017 May 1;33(9):1331-1337. doi: 10.1093/bioinformatics/btw840. PMID: 28093407 Free PMC article.
- Alignment methods: strategies, challenges, benchmarking, and comparative overview.
Löytynoja A. Methods Mol Biol. 2012;855:203-35. doi: 10.1007/978-1-61779-582-4_7. PMID: 22407710 Review.
- Sequence alignment and phylogenetic analysis of Human Herpes Simplex Virus (HHV) using bioinformatics tool: a review.
Sohpal VK, Dey A, Singh A. Int J Comput Biol Drug Des. 2010;3(1):68-88. doi: 10.1504/IJCBDD.2010.034466. Epub 2010 Aug 5. PMID: 20693611 Review.
Cited by
- Partitional Classification: A Complement to Phylogeny.
Salomon M, Dassy B. Evol Bioinform Online. 2016 Jun 16;12:149-56. doi: 10.4137/EBO.S38288. eCollection 2016. PMID: 27346943 Free PMC article.
- Developments in Algorithms for Sequence Alignment: A Review.
Chao J, Tang F, Xu L. Biomolecules. 2022 Apr 6;12(4):546. doi: 10.3390/biom12040546. PMID: 35454135 Free PMC article. Review.
- PASTA for proteins.
Collins K, Warnow T. Bioinformatics. 2018 Nov 15;34(22):3939-3941. doi: 10.1093/bioinformatics/bty495. PMID: 29931282 Free PMC article.
- The Cumulative Indel Model: Fast and Accurate Statistical Evolutionary Alignment.
De Maio N. Syst Biol. 2021 Feb 10;70(2):236-257. doi: 10.1093/sysbio/syaa050. PMID: 32653921 Free PMC article.
- Evaluating Statistical Multiple Sequence Alignment in Comparison to Other Alignment Methods on Protein Data Sets.
Nute M, Saleh E, Warnow T. Syst Biol. 2019 May 1;68(3):396-411. doi: 10.1093/sysbio/syy068. PMID: 30329135 Free PMC article.