A novel heuristic for local multiple alignment of interspersed DNA repeats
IEEE/ACM Trans Comput Biol Bioinform. 2009 Apr-Jun;6(2):180-9.
doi: 10.1109/TCBB.2009.9.
- PMID: 19407343
Todd J Treangen et al. IEEE/ACM Trans Comput Biol Bioinform. 2009 Apr-Jun.
Abstract
Pairwise local sequence alignment methods have been the prevailing technique to identify homologous nucleotides between related species. However, existing methods that identify and align all homologous nucleotides in one or more genomes have suffered from poor scalability and limited accuracy. We propose a novel method that couples a gapped extension heuristic with an efficient filtration method for identifying interspersed repeats in genome sequences. During gapped extension, we use the MUSCLE implementation of progressive global multiple alignment with iterative refinement. The resulting gapped extensions potentially contain alignments of unrelated sequence. We detect and remove such undesirable alignments using a hidden Markov model (HMM) to predict the posterior probability of homology. The HMM emission frequencies for nucleotide substitutions can be derived from any time-reversible nucleotide substitution matrix. We evaluate the performance of our method and previous approaches on a hybrid data set of real genomic DNA with simulated interspersed repeats. Our method outperforms a related method in terms of sensitivity, positive predictive value, and localizing boundaries of homology. The described methods have been implemented in freely available software, Repeatoire, available from: http://wwwabi.snv.jussieu.fr/public/Repeatoire.
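The HMM filtering step described in the abstract can be illustrated with a minimal sketch. It is not the Repeatoire implementation: the two-state topology (homologous vs. unrelated), the Jukes-Cantor substitution model used as the time-reversible matrix, the distance `t`, the self-transition probability `stay`, and all function names are assumptions chosen for illustration. The sketch runs a scaled forward-backward pass over the columns of a pairwise alignment and returns, for each column, the posterior probability of the homologous state.

```python
import math


def jc69_joint(t):
    """Joint probability of an aligned nucleotide pair under Jukes-Cantor.

    Assumption: JC69 stands in for "any time-reversible substitution matrix".
    Returns (p_same, p_diff) for one specific identical or differing pair.
    """
    e = math.exp(-4.0 * t / 3.0)
    same = 0.25 * (0.25 + 0.75 * e)
    diff = 0.25 * (0.25 - 0.25 * e)
    return same, diff


def homology_posteriors(seq_a, seq_b, t=0.3, stay=0.95):
    """Posterior P(homologous) per column of a pairwise alignment.

    State 0 = homologous, state 1 = unrelated (hypothetical two-state model).
    Gap columns are treated as uninformative (same emission in both states).
    """
    same, diff = jc69_joint(t)

    def emit(a, b):
        if a == "-" or b == "-":
            return (1.0, 1.0)
        # Unrelated columns: two independent uniform draws, 1/4 * 1/4.
        return (same if a == b else diff, 1.0 / 16.0)

    cols = [emit(a, b) for a, b in zip(seq_a, seq_b)]
    n = len(cols)
    trans = [[stay, 1.0 - stay], [1.0 - stay, stay]]

    # Forward pass, rescaled at every column to avoid underflow.
    fwd = []
    prev = [0.5, 0.5]
    for i, em in enumerate(cols):
        if i == 0:
            cur = [prev[s] * em[s] for s in range(2)]
        else:
            cur = [em[s] * sum(prev[r] * trans[r][s] for r in range(2))
                   for s in range(2)]
        z = sum(cur)
        cur = [x / z for x in cur]
        fwd.append(cur)
        prev = cur

    # Backward pass with the same kind of rescaling.
    bwd = [[1.0, 1.0] for _ in range(n)]
    for i in range(n - 2, -1, -1):
        nxt = cols[i + 1]
        cur = [sum(trans[s][r] * nxt[r] * bwd[i + 1][r] for r in range(2))
               for s in range(2)]
        z = sum(cur)
        bwd[i] = [x / z for x in cur]

    # Combine and normalise per column.
    post = []
    for f, b in zip(fwd, bwd):
        h, u = f[0] * b[0], f[1] * b[1]
        post.append(h / (h + u))
    return post
```

Columns whose posterior falls below a chosen threshold could then be trimmed from the gapped extension; the threshold itself is another free parameter that this sketch does not fix.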