Detecting patterns in protein sequences
J Mol Biol. 1994 Jun 24;239(5):698-712.
doi: 10.1006/jmbi.1994.1407.
- PMID: 8014990
A F Neuwald et al. J Mol Biol. 1994.
Abstract
The detection of conserved sequence patterns (motifs) in related proteins often yields valuable structural and functional insights. We describe a method that utilizes rigorous statistics and a depth-first search procedure to efficiently and exhaustively search a set of proteins for significant patterns up to a specified length. Additional procedures classify related patterns into groups and identify protein segments most likely to share a common motif. The utility of the method was demonstrated on several difficult test problems: detection of motifs among 56 proteins in the acyltransferase family; detection of a dinucleotide-binding fold present within a small subset of a set of 91 distantly related and unrelated proteins; detection of the helix-turn-helix motif in 15 distantly related proteins; and detection of subtle internal repeats in a prenyltransferase. In a search of a large set of sequences for internal repeats, the method detected novel ankyrin-like repeats in an Escherichia coli protein.
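The general idea of a depth-first search for patterns up to a specified length can be illustrated with a minimal sketch. This is not the authors' program or their statistics: the function names (`support`, `chance_pvalue`, `find_patterns`), the `.` wildcard notation, the uniform amino-acid background, and the crude binomial score are all assumptions made here for illustration only.

```python
from math import comb

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
WILDCARD = "."  # assumed notation: matches any residue


def matches_at(pattern, seq, start):
    """True if pattern matches seq beginning at offset start."""
    return all(p == WILDCARD or p == seq[start + i] for i, p in enumerate(pattern))


def support(pattern, seqs):
    """Number of sequences that contain at least one match of pattern."""
    width = len(pattern)
    return sum(
        any(matches_at(pattern, s, i) for i in range(len(s) - width + 1))
        for s in seqs
    )


def chance_pvalue(pattern, seqs, hits):
    """Crude P(at least `hits` sequences match by chance).

    Assumes uniform residue frequencies and treats every sequence as having
    the mean number of start sites; a stand-in for a rigorous statistic.
    """
    fixed = sum(c != WILDCARD for c in pattern)
    p_site = (1 / len(AMINO_ACIDS)) ** fixed
    mean_sites = sum(max(len(s) - len(pattern) + 1, 0) for s in seqs) / len(seqs)
    p_seq = 1 - (1 - p_site) ** mean_sites
    n = len(seqs)
    return sum(comb(n, j) * p_seq**j * (1 - p_seq) ** (n - j) for j in range(hits, n + 1))


def find_patterns(seqs, max_len=6, min_support=2, min_fixed=3):
    """Depth-first enumeration of patterns, pruned by minimum support."""
    found = []

    def extend(pattern):
        hits = support(pattern, seqs)
        if hits < min_support:
            return  # extending a pattern can never raise its support, so prune
        fixed = sum(c != WILDCARD for c in pattern)
        if pattern[-1] != WILDCARD and fixed >= min_fixed:
            found.append((pattern, hits, chance_pvalue(pattern, seqs, hits)))
        if len(pattern) < max_len:
            for c in AMINO_ACIDS + WILDCARD:
                extend(pattern + c)

    for residue in AMINO_ACIDS:
        extend(residue)
    return sorted(found, key=lambda item: item[2])


if __name__ == "__main__":
    toy = ["AAGSGKTLL", "PPGTGKSVV", "GAGKTMMDD"]  # made-up sequences
    for pattern, hits, pval in find_patterns(toy, max_len=4, min_support=3):
        print(pattern, hits, pval)
```

The pruning step relies on the fact that appending a position to a pattern can only keep or reduce the number of sequences it matches, which is what makes an exhaustive search up to a fixed length tractable in this toy setting.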
Similar articles
- Gibbs motif sampling: detection of bacterial outer membrane protein repeats.
Neuwald AF, Liu JS, Lawrence CE. Protein Sci. 1995 Aug;4(8):1618-32. doi: 10.1002/pro.5560040820. PMID: 8520488. Free PMC article.
- Finding flexible patterns in unaligned protein sequences.
Jonassen I, Collins JF, Higgins DG. Protein Sci. 1995 Aug;4(8):1587-95. doi: 10.1002/pro.5560040817. PMID: 8520485. Free PMC article.
- Designing patterns for profile HMM search.
Sun Y, Buhler J. Bioinformatics. 2007 Jan 15;23(2):e36-43. doi: 10.1093/bioinformatics/btl323. PMID: 17237102.
- Comparison of ARM and HEAT protein repeats.
Andrade MA, Petosa C, O'Donoghue SI, Müller CW, Bork P. J Mol Biol. 2001 May 25;309(1):1-18. doi: 10.1006/jmbi.2001.4624. PMID: 11491282. Review.
Cited by
- SpeG polyamine acetyltransferase enzyme from Bacillus thuringiensis forms a dodecameric structure and exhibits high catalytic efficiency.
Tsimbalyuk S, Shornikov A, Thi Bich Le V, Kuhn ML, Forwood JK. J Struct Biol. 2020 Jun 1;210(3):107506. doi: 10.1016/j.jsb.2020.107506. Epub 2020 Apr 10. PMID: 32283314. Free PMC article.
- Fast and accurate discovery of degenerate linear motifs in protein sequences.
Kelil A, Dubreuil B, Levy ED, Michnick SW. PLoS One. 2014 Sep 10;9(9):e106081. doi: 10.1371/journal.pone.0106081. eCollection 2014. PMID: 25207816. Free PMC article.
- Evaluating, comparing, and interpreting protein domain hierarchies.
Neuwald AF. J Comput Biol. 2014 Apr;21(4):287-302. doi: 10.1089/cmb.2013.0098. Epub 2014 Feb 21. PMID: 24559108. Free PMC article.
- Automated hierarchical classification of protein domain subfamilies based on functionally-divergent residue signatures.
Neuwald AF, Lanczycki CJ, Marchler-Bauer A. BMC Bioinformatics. 2012 Jun 22;13:144. doi: 10.1186/1471-2105-13-144. PMID: 22726767. Free PMC article.
- WildSpan: mining structured motifs from protein sequences.
Hsu CM, Chen CY, Liu BJ. Algorithms Mol Biol. 2011 Mar 31;6(1):6. doi: 10.1186/1748-7188-6-6. PMID: 21453542. Free PMC article.