RNA splice junctions of different classes of eukaryotes: sequence statistics and functional implications in gene expression

Abstract

A systematic analysis of the RNA splice junction sequences of eukaryotic protein-coding genes was carried out using the GenBank databank. Nucleotide frequencies obtained for the highly conserved regions around the splice sites agree closely across different categories of organisms. A striking similarity among the rare splice junctions that lack AG at the 3' splice site or GT at the 5' splice site indicates that special mechanisms exist to recognize them, and that these unique signals may be involved in crucial gene-regulation events and in differentiation. A method was developed to predict potential exons in a bare sequence, using a scoring and ranking scheme based on nucleotide weight tables. The method found a majority of the exons in selected known genes and also predicted potential new exons that may be used in alternative splicing.
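The scoring and ranking idea mentioned in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration only: the function names, the toy training sites, the 8-base window, and the min-max rescaling to a 0-100 score are hypothetical and are not taken from the paper's published tables or formula. The sketch builds a per-position nucleotide weight table from aligned example sites, scores every window of a bare sequence against it, and ranks the windows.

```python
from collections import Counter

BASES = "ACGT"


def build_weight_table(aligned_sites):
    """Per-position base frequencies (percent) from equal-length example sites."""
    length = len(aligned_sites[0])
    if any(len(site) != length for site in aligned_sites):
        raise ValueError("all training sites must have the same length")
    table = []
    for pos in range(length):
        counts = Counter(site[pos] for site in aligned_sites)
        total = sum(counts[b] for b in BASES)
        table.append({b: 100.0 * counts[b] / total for b in BASES})
    return table


def score_window(window, table):
    """Sum the percentages of the observed bases, rescaled to 0-100.

    The min-max rescaling is an assumed normalization, chosen so that
    a window matching the consensus at every position scores 100.
    """
    observed = sum(col[base] for base, col in zip(window, table))
    best = sum(max(col.values()) for col in table)
    worst = sum(min(col.values()) for col in table)
    if best == worst:
        return 100.0
    return 100.0 * (observed - worst) / (best - worst)


def rank_candidate_sites(sequence, table, top_n=3):
    """Score every window of the sequence and return the top (score, offset) pairs."""
    width = len(table)
    scored = [
        (score_window(sequence[i:i + width], table), i)
        for i in range(len(sequence) - width + 1)
    ]
    # Highest score first; ties broken by leftmost offset.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:top_n]


# Hypothetical toy 5' (donor-like) sites, NOT data from the paper.
toy_sites = ["AGGTAAGT", "AGGTGAGT", "CGGTAAGT", "AGGTAAGA"]
table = build_weight_table(toy_sites)
ranked = rank_candidate_sites("TTCAGGTAAGTCC", table)
```

In this toy run the top-ranked window starts at offset 3, where the sequence matches the consensus of the example sites. A real exon predictor would need separate tables for 5' and 3' sites derived from a curated database, and a rule for pairing ranked sites into candidate exons; neither is attempted here.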

