Automated assembly of protein blocks for database searching
S Henikoff et al. Nucleic Acids Res. 1991.
Abstract
A system is described for finding and assembling the most highly conserved regions of related proteins for database searching. First, an automated version of Smith's algorithm for finding motifs is used for sensitive detection of multiple local alignments. Next, the local alignments are converted to blocks and the best set of non-overlapping blocks is determined. When the automated system was applied successively to all 437 groups of related proteins in the PROSITE catalog, 1764 blocks resulted; these could be used for very sensitive searches of sequence databases. Each block was calibrated by searching the SWISS-PROT database to obtain a measure of the chance distribution of matches, and the calibrated blocks were concatenated into a database that could itself be searched. Examples are provided in which distant relationships are detected either using a set of blocks to search a sequence database or using sequences to search the database of blocks. The practical use of the blocks database is demonstrated by detecting previously unknown relationships between oxidoreductases and by evaluating a proposed relationship between HIV Vif protein and thiol proteases.
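The abstract does not say how the "best set of non-overlapping blocks" is chosen. As a rough illustration only, the selection step can be sketched as weighted interval scheduling, under the simplifying assumption that each candidate block is an interval on a single reference coordinate with one numeric score. The `Block` type, its fields, and `best_nonoverlapping` are hypothetical names introduced here, not part of the described system.

```python
from bisect import bisect_right
from typing import List, NamedTuple, Tuple


class Block(NamedTuple):
    """Hypothetical candidate block: an interval with a score."""
    start: int    # first position covered (inclusive)
    end: int      # last position covered (inclusive)
    score: float  # assumed measure of block quality


def best_nonoverlapping(blocks: List[Block]) -> Tuple[float, List[Block]]:
    """Return the highest total score and one set of non-overlapping blocks achieving it."""
    ordered = sorted(blocks, key=lambda b: b.end)
    ends = [b.end for b in ordered]

    # best[i] is the best total score using only the first i blocks.
    best = [0.0] * (len(ordered) + 1)
    take = [False] * len(ordered)
    prev = [0] * len(ordered)

    for i, b in enumerate(ordered):
        # Count earlier blocks that end strictly before this block starts.
        p = bisect_right(ends, b.start - 1, 0, i)
        prev[i] = p
        with_b = best[p] + b.score
        if with_b > best[i]:
            best[i + 1] = with_b
            take[i] = True
        else:
            best[i + 1] = best[i]

    # Walk back through the table to recover the chosen blocks.
    chosen: List[Block] = []
    i = len(ordered)
    while i > 0:
        if take[i - 1]:
            chosen.append(ordered[i - 1])
            i = prev[i - 1]
        else:
            i -= 1
    chosen.reverse()
    return best[-1], chosen


if __name__ == "__main__":
    candidates = [Block(0, 9, 5.0), Block(5, 14, 8.0),
                  Block(15, 24, 4.0), Block(10, 19, 3.0)]
    total, picked = best_nonoverlapping(candidates)
    print(total, picked)
```

Real blocks span many sequences at once, so an actual implementation would also have to check that the chosen blocks appear in a consistent order in every sequence; this sketch ignores that constraint.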