Genome analysis: Assigning protein coding regions to three-dimensional structures

Comparative Study

A A Salamov et al. Protein Sci. 1999 Apr.

Abstract

We describe the results of a procedure for maximizing the number of sequences that can be reliably linked to a protein of known three-dimensional structure. Unlike other methods, which try to increase sensitivity through the use of fold recognition software, we use only conventional sequence alignment tools, but apply them in a manner that significantly increases the number of relationships detected. We analyzed 11 genomes and found that, depending on the genome, between 23% and 32% of the ORFs had significant matches to proteins of known structure. In all cases, the aligned region consisted of either >100 residues or >50% of the smaller sequence. Slightly higher percentages could be attained if smaller motifs were also included. These percentages are significantly higher than those reported for most previous methods, even those that have a fold-recognition component. We survey the biochemical and structural characteristics of the most frequently occurring proteins, and discuss the extent to which alignment methods can realistically assign function to gene products.
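
The length criterion stated in the abstract (aligned region of either >100 residues or >50% of the smaller sequence) can be sketched as a simple filter. This is a minimal illustration, not the authors' implementation: the function name, parameter names, and default thresholds as arguments are assumptions introduced here, and the sketch covers only the alignment-length condition, not how the statistical significance of a match is assessed.

```python
def passes_length_criterion(aligned_len, orf_len, structure_len,
                            min_residues=100, min_fraction=0.5):
    """Return True if an alignment satisfies the length criterion.

    Hypothetical helper: the alignment must span more than `min_residues`
    residues OR more than `min_fraction` of the shorter of the two sequences.
    """
    if aligned_len < 0 or orf_len <= 0 or structure_len <= 0:
        raise ValueError("lengths must be positive (aligned_len may be 0)")
    smaller = min(orf_len, structure_len)
    # Either condition alone is sufficient, per the abstract's wording.
    return aligned_len > min_residues or aligned_len > min_fraction * smaller


# Example: a 60-residue alignment covers more than half of a 100-residue ORF.
print(passes_length_criterion(60, 100, 500))
# Example: a 100-residue alignment between two long sequences meets neither bound.
print(passes_length_criterion(100, 300, 400))
```

Both comparisons are strict, mirroring the ">" signs in the abstract, so an alignment of exactly 100 residues or exactly half of the smaller sequence does not qualify on that condition alone.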
