Neural network detects errors in the assignment of mRNA splice sites
S Brunak et al. Nucleic Acids Res. 1990.
Free PMC article
Abstract
The use of databanks in genetic research assumes that the information they contain is reliable. Currently, error detection for the manually or electronically entered data in the nucleotide sequence databanks at EMBL, Heidelberg, and GenBank at Los Alamos is limited. We have used a subset of sequences from these databanks to train neural networks to recognize pre-mRNA splicing signals in human genes. During training on 33 human genes from the EMBL databank, seven genes appeared to disturb the learning process. Subsequent investigation revealed discrepancies from the original published papers for three genes. In four genes, we found wrongly assigned splicing frames of introns. We believe this reflects the fact that splicing frames cannot always be unambiguously assigned on the basis of experimental data. Thus, incorrect assignments appear to arise both from mere typographical misprints and from erroneous interpretation of experiments. Training on 241 human sequences from GenBank revealed nine new errors. We propose that such errors could be detected by computer algorithms designed to check the consistency of data prior to their incorporation in databanks.
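The idea that a mislabeled entry "disturbs the learning process" can be illustrated with a minimal sketch. It is not the network used in the paper: it assumes a single logistic unit trained by batch gradient descent on one-hot encoded windows, and the six-base windows, the labels, and the function name `find_disturbing_examples` are all invented for illustration. One window is deliberately annotated as a non-site although three identical windows are annotated as splice sites, so the unit can never fit it and it stays misclassified throughout training.

```python
import math

BASES = "ACGT"

def one_hot(window):
    """Encode a nucleotide window as a flat 0/1 vector, four slots per base."""
    return [1.0 if base == b else 0.0 for base in window for b in BASES]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def find_disturbing_examples(windows, labels, epochs=200, lr=0.1, min_error_rate=0.5):
    """Train one logistic unit and return the indices of examples that are
    misclassified in more than `min_error_rate` of the training epochs."""
    xs = [one_hot(w) for w in windows]
    weights = [0.0] * len(xs[0])
    bias = 0.0
    wrong_counts = [0] * len(xs)

    def predict(x):
        return sigmoid(bias + sum(wi * xi for wi, xi in zip(weights, x)))

    for _ in range(epochs):
        # Residuals (prediction minus label) under the current weights.
        residuals = [predict(x) - y for x, y in zip(xs, labels)]
        # Batch gradient step on the cross-entropy loss.
        for k in range(len(weights)):
            weights[k] -= lr * sum(r * x[k] for r, x in zip(residuals, xs))
        bias -= lr * sum(residuals)
        # Record which examples the updated unit still gets wrong.
        for i, (x, y) in enumerate(zip(xs, labels)):
            if (predict(x) > 0.5) != (y == 1):
                wrong_counts[i] += 1

    return [i for i, c in enumerate(wrong_counts) if c / epochs > min_error_rate]

# Toy data (hypothetical, not taken from EMBL or GenBank).
windows = [
    "AGGTAA",  # annotated as a splice site
    "AGGTAA",  # annotated as a splice site
    "AGGTAA",  # annotated as a splice site
    "AGGTAA",  # same window, but annotated as a non-site (the planted error)
    "CCACGT",
    "TTCAGC",
    "GACGTG",
    "CTTCCT",
]
labels = [1, 1, 1, 0, 0, 0, 0, 0]

flagged = find_disturbing_examples(windows, labels)
print(flagged)
```

On this toy set only the planted entry at index 3 is flagged. Counting errors across epochs, rather than looking only at the final epoch, is one simple way to single out entries that persistently resist learning; with real data the flagged entries would still need to be checked by hand against the original publications, as the authors did.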
Similar articles
- Cleaning the GenBank Arabidopsis thaliana data set.
Korning PG, Hebsgaard SM, Rouze P, Brunak S. Nucleic Acids Res. 1996 Jan 15;24(2):316-20. doi: 10.1093/nar/24.2.316. PMID: 8628656 Free PMC article.
- Identification of sites of pre-MRNA/spliceosome association.
Rymond BC. SAAS Bull Biochem Biotechnol. 1991 Jan;4:76-80. PMID: 1369323
- Regulation of splicing: the importance of being translatable.
Miriami E, Sperling R, Sperling J, Motro U. RNA. 2004 Jan;10(1):1-4. doi: 10.1261/rna.5112704. PMID: 14681577 Free PMC article. Review.
- Modification of pre-mRNA splicing by antisense oligonucleotides.
Kole R. Acta Biochim Pol. 1997;44(2):231-7. PMID: 9360712 Review.
Cited by
- SignalP: The Evolution of a Web Server.
Nielsen H, Teufel F, Brunak S, von Heijne G. Methods Mol Biol. 2024;2836:331-367. doi: 10.1007/978-1-0716-4007-4_17. PMID: 38995548
- ACDC, a global database of amphibian cytochrome-b sequences using reproducible curation for GenBank records.
van den Burg MP, Herrando-Pérez S, Vieites DR. Sci Data. 2020 Aug 13;7(1):268. doi: 10.1038/s41597-020-00598-9. PMID: 32792559 Free PMC article.
- Method of predicting splice sites based on signal interactions.
Churbanov A, Rogozin IB, Deogun JS, Ali H. Biol Direct. 2006 Apr 3;1:10. doi: 10.1186/1745-6150-1-10. PMID: 16584568 Free PMC article.
- Analysis of missense variants in the PKHD1-gene in patients with autosomal recessive polycystic kidney disease (ARPKD).
Losekoot M, Haarloo C, Ruivenkamp C, White SJ, Breuning MH, Peters DJ. Hum Genet. 2005 Nov;118(2):185-206. doi: 10.1007/s00439-005-0027-7. Epub 2005 Nov 15. PMID: 16133180
- Analysis of donor splice sites in different eukaryotic organisms.
Rogozin IB, Milanesi L. J Mol Evol. 1997 Jul;45(1):50-9. doi: 10.1007/pl00006200. PMID: 9211734