Genome analysis: Assigning protein coding regions to three-dimensional structures
Comparative Study
A A Salamov et al. Protein Sci. 1999 Apr.
Abstract
We describe the results of a procedure for maximizing the number of sequences that can be reliably linked to a protein of known three-dimensional structure. Unlike other methods, which try to increase sensitivity through the use of fold-recognition software, we use only conventional sequence alignment tools, but apply them in a manner that significantly increases the number of relationships detected. We analyzed 11 genomes and found that, depending on the genome, between 23% and 32% of the ORFs had significant matches to proteins of known structure. In all cases, the aligned region consisted of either >100 residues or >50% of the smaller sequence. Slightly higher percentages could be attained if smaller motifs were also included. This fraction is significantly higher than that reported for most previous methods, even those that have a fold-recognition component. We survey the biochemical and structural characteristics of the most frequently occurring proteins, and discuss the extent to which alignment methods can realistically assign function to gene products.
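The length criterion stated in the abstract (aligned region of >100 residues or >50% of the smaller sequence) can be illustrated with a minimal sketch. The function name, parameter names, and the choice to measure the aligned region as a simple residue count are assumptions made for illustration; the abstract does not describe how the authors implemented the check or how significance was scored.

```python
def passes_length_criterion(
    aligned_len: int,
    query_len: int,
    target_len: int,
    min_residues: int = 100,      # threshold taken from the abstract (>100 residues)
    min_fraction: float = 0.5,    # threshold taken from the abstract (>50% of smaller sequence)
) -> bool:
    """Hypothetical helper: True if an alignment meets the abstract's length criterion.

    Assumes aligned_len counts aligned residues; both comparisons are strict,
    matching the ">" signs in the abstract.
    """
    shorter = min(query_len, target_len)
    return aligned_len > min_residues or aligned_len > min_fraction * shorter


def fraction_matched(orf_has_match: list[bool]) -> float:
    """Hypothetical helper: fraction of ORFs in a genome with at least one accepted match."""
    if not orf_has_match:
        return 0.0
    return sum(orf_has_match) / len(orf_has_match)
```

Under these assumptions, a 60-residue alignment between a 100-residue ORF and a 300-residue structure would be accepted on the 50% rule, while a 100-residue alignment between two 300-residue sequences would not, because neither strict threshold is exceeded.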