Analysis of donor splice sites in different eukaryotic organisms

I B Rogozin et al. J Mol Evol. 1997 Jul.

Abstract

We present a new algorithm for functional site analysis. It is based on four main assumptions: each variation in nucleotide composition makes a different contribution to the overall binding free energy of the interaction between a functional site and another molecule; nonfunctional site-like regions (pseudosites) are absent or rare in genomes; the sample of sites may contain errors; and nucleotides at different site positions are considered to be mutually dependent. In this algorithm, the site set is divided into subsets, each described by a certain consensus. Donor splice sites of human protein-coding genes were analyzed. Comparison with other methods of donor splice site prediction showed that the consensus sequences AG/GU(A,G), G/GUnAG, /GU(A,G)AG, /GU(A,G)nGU, and G/GUA give more accurate prediction than is achieved by a weight matrix or by the consensus (A,C)AG/GU(A,G)AGU with mismatches. The probability of the first type error, E1, for the obtained consensus set was about 0.05, and the probability of the second type error, E2, was 0.15. The analysis demonstrated that the accuracy of functional site prediction can be improved by taking into account correlations between site positions. The accuracy of prediction using the human consensus sequences was tested on sequences from different organisms. Some differences in consensus sequences were revealed for the plant Arabidopsis sp., the invertebrate Caenorhabditis sp., and the fungus Aspergillus sp. For the yeast Saccharomyces sp., only one conserved consensus, /GUA(U,A,C)G(U,A,C), was revealed (E1 = 0.03, E2 = 0.03). Yeast is a very interesting model for analyzing the molecular mechanisms of splicing.
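
The consensus notation in the abstract can be made concrete with a small matcher. The following is a minimal sketch, not the authors' algorithm: it only checks a candidate site against the reported consensus set and does not reproduce how the subsets were derived. It assumes that "/" marks the exon/intron boundary, that "(A,G)" lists the bases allowed at one position, and that "n" stands for any nucleotide. All function names are hypothetical. The abstract does not define E1 and E2, so the two rates computed here (fraction of true sites missed, fraction of pseudosites accepted) are only one possible reading and are named neutrally.

```python
# Consensus sets copied from the abstract.
HUMAN_CONSENSUSES = ["AG/GU(A,G)", "G/GUnAG", "/GU(A,G)AG", "/GU(A,G)nGU", "G/GUA"]
YEAST_CONSENSUSES = ["/GUA(U,A,C)G(U,A,C)"]


def parse_consensus(consensus):
    """Return (allowed-base set per position, number of exon positions before '/').

    Assumed notation: '/' = exon/intron boundary, '(A,G)' = alternatives,
    'n' = any nucleotide.
    """
    positions = []
    exon_len = 0
    i = 0
    while i < len(consensus):
        ch = consensus[i]
        if ch == "/":
            exon_len = len(positions)
            i += 1
        elif ch == "(":
            close = consensus.index(")", i)
            positions.append(set(consensus[i + 1:close].split(",")))
            i = close + 1
        elif ch == "n":
            positions.append(set("ACGU"))
            i += 1
        else:
            positions.append({ch})
            i += 1
    return positions, exon_len


def matches_at(seq, boundary, consensus):
    """True if `consensus` fits `seq` with the exon/intron boundary just before index `boundary`."""
    positions, exon_len = parse_consensus(consensus)
    start = boundary - exon_len
    if start < 0 or start + len(positions) > len(seq):
        return False
    return all(seq[start + k] in allowed for k, allowed in enumerate(positions))


def is_predicted_site(seq, boundary, consensuses=HUMAN_CONSENSUSES):
    """True if any consensus in the set matches; DNA input (T) is converted to RNA (U)."""
    rna = seq.upper().replace("T", "U")
    return any(matches_at(rna, boundary, c) for c in consensuses)


def error_rates(true_sites, pseudo_sites, consensuses=HUMAN_CONSENSUSES):
    """Return (miss_rate, false_hit_rate) for lists of (sequence, boundary) pairs.

    Mapping these onto the abstract's E1 and E2 is an assumption.
    """
    missed = sum(not is_predicted_site(s, b, consensuses) for s, b in true_sites)
    false_hits = sum(is_predicted_site(s, b, consensuses) for s, b in pseudo_sites)
    return missed / len(true_sites), false_hits / len(pseudo_sites)
```

For example, `is_predicted_site("CAGGTAAGT", 3)` tests the boundary between `CAG` and `GTAAGT`. Parsing each consensus into per-position sets, rather than scoring positions independently as a weight matrix does, keeps the co-occurrence of bases within one consensus intact, which is the kind of positional dependence the abstract emphasizes.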
