Aligning amino acid sequences: comparison of commonly used methods
Comparative Study
D F Feng et al. J Mol Evol. 1984.
Abstract
We examined two extensive families of protein sequences using four different alignment schemes that employ various degrees of "weighting" in order to determine which approach is most sensitive in establishing relationships. All alignments used a similarity approach based on a general algorithm devised by Needleman and Wunsch. The approaches included a simple program, UM (unitary matrix), in which only identities are scored; a scheme in which the genetic code is used as a basis for weighting (GC); another that employs a matrix based on structural similarity of amino acids taken together with the genetic basis of mutation (SG); and a fourth that uses the empirical log-odds matrix (LOM) developed by Dayhoff on the basis of observed amino acid replacements. The two sequence families examined were (a) nine different globins and (b) nine different tyrosine kinase-like proteins. It was assumed a priori that all members of a family share common ancestry. In cases where two sequences were more than 30% identical, alignments by all four methods were almost always the same. In cases where the percentage identity was less than 20%, however, there were often significant differences in the alignments. On average, the Dayhoff LOM approach was the most effective in verifying distant relationships, as judged by an empirical "jumbling test." This was not universally the case, however, and in some instances the simple UM was actually as good as or better. Trees constructed on the basis of the various alignments differed with regard to their limb lengths, but had essentially the same branching orders. We suggest some reasons for the differing effectiveness of the four approaches in the two different sequence settings, and offer some rules of thumb for assessing the significance of sequence relationships.
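As a rough illustration of the simplest scheme described above, the following Python sketch scores a global alignment with Needleman-Wunsch-style dynamic programming and a unitary matrix (identities score 1, everything else 0), then runs a jumbling test by shuffling one sequence and comparing the real score with the shuffled scores. The abstract does not give the gap penalty, the number of shuffles, or the exact statistic used, so the linear gap penalty of -1, the 100 trials, the standard-deviation units, and the function names `nw_score` and `jumbling_test` are all assumptions made for this example, not the authors' implementation. The two sequences are made-up toy strings, not data from the paper.

```python
import random
import statistics


def unitary(x, y):
    """Unitary matrix (UM): only identities score."""
    return 1 if x == y else 0


def nw_score(a, b, score=unitary, gap=-1):
    """Global alignment score by Needleman-Wunsch-style dynamic programming.

    The linear gap penalty is an assumption; the abstract does not state one.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            cur.append(max(
                prev[j - 1] + score(a[i - 1], b[j - 1]),  # pair two residues
                prev[j] + gap,                            # gap in b
                cur[j - 1] + gap,                         # gap in a
            ))
        prev = cur
    return prev[-1]


def jumbling_test(a, b, trials=100, seed=0, score=unitary, gap=-1):
    """Compare the real score with scores against shuffled copies of b.

    Returns (real, mean, sd, z), where z is the real score's distance from
    the shuffled mean in standard deviations (nan if sd is zero). The trial
    count and this form of the statistic are assumptions.
    """
    rng = random.Random(seed)
    real = nw_score(a, b, score, gap)
    shuffled_scores = []
    for _ in range(trials):
        residues = list(b)
        rng.shuffle(residues)  # same composition, scrambled order
        shuffled_scores.append(nw_score(a, "".join(residues), score, gap))
    mean = statistics.mean(shuffled_scores)
    sd = statistics.stdev(shuffled_scores)
    z = (real - mean) / sd if sd > 0 else float("nan")
    return real, mean, sd, z


if __name__ == "__main__":
    # Hypothetical toy sequences, for illustration only.
    seq1 = "MVLSPADKTNVKAAWGKVGAHAGEYGAEALERM"
    seq2 = "MVHLTPEEKSAVTALWGKVNVDEVGGEALGRLL"
    real, mean, sd, z = jumbling_test(seq1, seq2)
    print(f"real={real} shuffled mean={mean:.2f} sd={sd:.2f} z={z:.2f}")
```

Replacing `unitary` with a function that looks up a weighted substitution table would give the other schemes in the same framework. Because shuffling preserves amino acid composition, a real score that stands well above the shuffled distribution reflects residue order rather than composition alone.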