Profile analysis: detection of distantly related proteins
Comparative Study
M Gribskov et al. Proc Natl Acad Sci U S A. 1987 Jul.
Abstract
Profile analysis is a method for detecting distantly related proteins by sequence comparison. The basis for comparison is not only the customary Dayhoff mutational-distance matrix but also the results of structural studies and information implicit in the alignments of the sequences of families of similar proteins. This information is expressed in a position-specific scoring table (profile), which is created from a group of sequences previously aligned by structural or sequence similarity. The similarity of any other sequence (target) to the group of aligned sequences (probe) can be tested by comparing the target to the profile using dynamic programming algorithms. The profile method differs in two major respects from methods of sequence comparison in common use: (i) Any number of known sequences can be used to construct the profile, allowing more information to be used in the testing of the target than is possible with pairwise alignment methods. (ii) The profile includes the penalties for insertion or deletion at each position, which allow one to include the probe secondary structure in the testing scheme. Tests with globin and immunoglobulin sequences show that profile analysis can distinguish all members of these families from all other sequences in a database containing 3800 protein sequences.
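The idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that are not in the original: a toy similarity function (`toy_similarity`, +2 for identity and -1 otherwise) stands in for the Dayhoff matrix, the gap penalty at each position is simply reduced where the aligned probe sequences themselves contain gaps, gaps are scored linearly, and the names `build_profile` and `score_target` are hypothetical. The profile is built by averaging, at each alignment column, the similarity of every candidate residue to the residues observed in that column. A target is then compared to the profile by local dynamic programming.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_similarity(a, b):
    """Stand-in for the Dayhoff matrix (assumption): +2 identity, -1 otherwise."""
    return 2.0 if a == b else -1.0


def build_profile(aligned, base_gap=3.0):
    """Build a position-specific scoring table from pre-aligned probe sequences.

    Returns one (scores, gap_penalty) pair per alignment column, where
    scores maps each amino acid to its average similarity to that column.
    """
    length = len(aligned[0])
    assert all(len(s) == length for s in aligned), "probe sequences must be aligned"
    profile = []
    for col in range(length):
        column = [s[col] for s in aligned]
        residues = [r for r in column if r != "-"]
        counts = Counter(residues)
        total = len(residues) or 1  # guard against an all-gap column
        scores = {
            a: sum(n * toy_similarity(a, b) for b, n in counts.items()) / total
            for a in AMINO_ACIDS
        }
        # Assumption: columns where the probe has gaps tolerate gaps more cheaply.
        gap_fraction = column.count("-") / len(column)
        gap_penalty = base_gap * (1.0 - 0.5 * gap_fraction)
        profile.append((scores, gap_penalty))
    return profile


def score_target(profile, target):
    """Best local alignment score of a target sequence against the profile."""
    rows, cols = len(profile), len(target)
    dp = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    best = 0.0
    for i in range(1, rows + 1):
        scores, gap = profile[i - 1]
        for j in range(1, cols + 1):
            # Align profile column i with target residue j.
            match = dp[i - 1][j - 1] + scores.get(target[j - 1], -1.0)
            # Leave profile column i unmatched (deletion in the target).
            skip_column = dp[i - 1][j] - gap
            # Leave target residue j unmatched (insertion in the target).
            skip_residue = dp[i][j - 1] - gap
            dp[i][j] = max(0.0, match, skip_column, skip_residue)
            best = max(best, dp[i][j])
    return best


probe = ["ACDEF", "ACDEY", "AC-EF"]  # toy aligned family, not real data
profile = build_profile(probe)
print(score_target(profile, "ACDEF"))  # family-like target
print(score_target(profile, "WWWWW"))  # unrelated target
```

Because the score at each position depends on the whole column rather than on a single sequence, adding more aligned probe sequences changes the table without changing the comparison step, which is the first of the two differences from pairwise methods noted in the abstract.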
Similar articles
- Profile scanning for three-dimensional structural patterns in protein sequences.
Gribskov M, Homyak M, Edenfield J, Eisenberg D. Comput Appl Biosci. 1988 Mar;4(1):61-6. doi: 10.1093/bioinformatics/4.1.61. PMID: 3383004
- Using CLUSTAL for multiple sequence alignments.
Higgins DG, Thompson JD, Gibson TJ. Methods Enzymol. 1996;266:383-402. doi: 10.1016/s0076-6879(96)66024-8. PMID: 8743695
- Finding homologs to nucleic acid or protein sequences using the framesearch program.
Healy M. Curr Protoc Bioinformatics. 2002 Aug;Chapter 3:Unit 3.2. doi: 10.1002/0471250953.bi0302s00. PMID: 18792937 Review.
- Nucleic acid and protein sequence databases.
Kneale GG, Bishop MJ. Comput Appl Biosci. 1985;1(1):11-7. doi: 10.1093/bioinformatics/1.1.11. PMID: 3916889 Review.
Cited by
- An Effective Computational Method for Predicting Self-Interacting Proteins Based on VGGNet Convolutional Neural Network and Gray-Level Co-occurrence Matrix.
Chu DH, An JY, Nie XM. Evol Bioinform Online. 2024 Oct 21;20:11769343241292224. doi: 10.1177/11769343241292224. eCollection 2024. PMID: 39464790 Free PMC article.
- Pcons: a neural-network-based consensus predictor that improves fold recognition.
Lundström J, Rychlewski L, Bujnicki J, Elofsson A. Protein Sci. 2001 Nov;10(11):2354-62. doi: 10.1110/ps.08501. PMID: 11604541 Free PMC article.
- The directional atomic solvation energy: an atom-based potential for the assignment of protein sequences to known folds.
Mallick P, Weiss R, Eisenberg D. Proc Natl Acad Sci U S A. 2002 Dec 10;99(25):16041-6. doi: 10.1073/pnas.252626399. Epub 2002 Dec 2. PMID: 12461172 Free PMC article.
- Improved coreceptor usage prediction and genotypic monitoring of R5-to-X4 transition by motif analysis of human immunodeficiency virus type 1 env V3 loop sequences.
Jensen MA, Li FS, van 't Wout AB, Nickle DC, Shriner D, He HX, McLaughlin S, Shankarappa R, Margolick JB, Mullins JI. J Virol. 2003 Dec;77(24):13376-88. doi: 10.1128/jvi.77.24.13376-13388.2003. PMID: 14645592 Free PMC article.
- Homology induction: the use of machine learning to improve sequence similarity searches.
Karwath A, King RD. BMC Bioinformatics. 2002 Apr 23;3:11. doi: 10.1186/1471-2105-3-11. PMID: 11972320 Free PMC article.