Amino acid substitution matrices from an information theoretic perspective
Comparative Study
S F Altschul. J Mol Biol. 1991.
Abstract
Protein sequence alignments have become an important tool for molecular biologists. Local alignments are frequently constructed with the aid of a "substitution score matrix" that specifies a score for aligning each pair of amino acid residues. Over the years, many different substitution matrices have been proposed, based on a wide variety of rationales. Statistical results, however, demonstrate that any such matrix is implicitly a "log-odds" matrix, with a specific target distribution for aligned pairs of amino acid residues. In the light of information theory, it is possible to express the scores of a substitution matrix in bits and to see that different matrices are better adapted to different purposes. The most widely used matrix for protein sequence comparison has been the PAM-250 matrix. It is argued that for database searches the PAM-120 matrix generally is more appropriate, while for comparing two specific proteins with suspected homology the PAM-200 matrix is indicated. Examples discussed include the lipocalins, human α1B-glycoprotein, the cystic fibrosis transmembrane conductance regulator and the globins.
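The log-odds reading of a substitution matrix described in the abstract can be sketched in a few lines. The example below is only an illustration, not the paper's procedure: it assumes a toy two-letter alphabet with made-up background frequencies p and target pair frequencies q, and the function names `log_odds_bits` and `relative_entropy_bits` are hypothetical. Each score is taken as s(a, b) = log2(q(a, b) / (p(a) p(b))), which expresses it in bits, and the average score per aligned pair under the target distribution is the relative entropy of the target distribution with respect to the background.

```python
import math


def log_odds_bits(target, background):
    """Return scores s[a][b] = log2(q_ab / (p_a * p_b)), in bits."""
    return {
        a: {
            b: math.log2(target[a][b] / (background[a] * background[b]))
            for b in background
        }
        for a in background
    }


def relative_entropy_bits(target, scores):
    """Average score per aligned pair under the target distribution, in bits."""
    return sum(target[a][b] * scores[a][b] for a in target for b in target[a])


def expected_background_score(background, scores):
    """Average score per pair when residues are paired at random."""
    return sum(
        background[a] * background[b] * scores[a][b]
        for a in background
        for b in background
    )


# Toy data (assumed for illustration only, not taken from any PAM matrix).
background = {"A": 0.5, "B": 0.5}
target = {
    "A": {"A": 0.4, "B": 0.1},
    "B": {"A": 0.1, "B": 0.4},
}

scores = log_odds_bits(target, background)
H = relative_entropy_bits(target, scores)
E = expected_background_score(background, scores)

print("scores (bits):", scores)
print("relative entropy (bits per pair):", H)
print("expected score for random pairs:", E)
```

In this toy case identical pairs are more frequent in the target distribution than chance predicts, so they score positively, mismatches score negatively, and the expected score for randomly paired residues is negative, which is the property a local alignment scoring system needs.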
Similar articles
- A protein alignment scoring system sensitive at all evolutionary distances. Altschul SF. J Mol Evol. 1993 Mar;36(3):290-300. doi: 10.1007/BF00160485. PMID: 8483166
- Optimizing substitution matrices by separating score distributions. Hourai Y, Akutsu T, Akiyama Y. Bioinformatics. 2004 Apr 12;20(6):863-73. doi: 10.1093/bioinformatics/btg494. Epub 2004 Jan 29. PMID: 14752003
- The construction of amino acid substitution matrices for the comparison of proteins with non-standard compositions. Yu YK, Altschul SF. Bioinformatics. 2005 Apr 1;21(7):902-11. doi: 10.1093/bioinformatics/bti070. Epub 2004 Oct 27. PMID: 15509610
- Protein database searches using compositionally adjusted substitution matrices. Altschul SF, Wootton JC, Gertz EM, Agarwala R, Morgulis A, Schäffer AA, Yu YK. FEBS J. 2005 Oct;272(20):5101-9. doi: 10.1111/j.1742-4658.2005.04945.x. PMID: 16218944 Free PMC article. Review.
- Substitution scoring matrices for proteins - An overview. Trivedi R, Nagarajaram HA. Protein Sci. 2020 Nov;29(11):2150-2163. doi: 10.1002/pro.3954. Epub 2020 Oct 12. PMID: 32954566 Free PMC article. Review.
Cited by
- IMGT/RobustpMHC: robust training for class-I MHC peptide binding prediction. Kushwaha A, Duroux P, Giudicelli V, Todorov K, Kossida S. Brief Bioinform. 2024 Sep 23;25(6):bbae552. doi: 10.1093/bib/bbae552. PMID: 39504482 Free PMC article.
- Commensal HPVs Have Evolved to Be More Immunogenic Compared with High-Risk α-HPVs. Guennoun R, Alyakin A, Higuchi H, Demehri S. Vaccines (Basel). 2024 Jul 7;12(7):749. doi: 10.3390/vaccines12070749. PMID: 39066387 Free PMC article.
- SignalP: The Evolution of a Web Server. Nielsen H, Teufel F, Brunak S, von Heijne G. Methods Mol Biol. 2024;2836:331-367. doi: 10.1007/978-1-0716-4007-4_17. PMID: 38995548
- Leveraging protein language models for accurate multiple sequence alignments. McWhite CD, Armour-Garb I, Singh M. Genome Res. 2023 Jul;33(7):1145-1153. doi: 10.1101/gr.277675.123. Epub 2023 Jul 6. PMID: 37414576 Free PMC article.
- Genomic Signature in Evolutionary Biology: A Review. de la Fuente R, Díaz-Villanueva W, Arnau V, Moya A. Biology (Basel). 2023 Feb 16;12(2):322. doi: 10.3390/biology12020322. PMID: 36829597 Free PMC article. Review.