Neural network detects errors in the assignment of mRNA splice sites

S Brunak et al. Nucleic Acids Res. 1990.

Free PMC article

Abstract

The use of databanks in genetic research assumes that the information they contain is reliable. Currently, error detection in the manually or electronically entered data held in the nucleotide sequence databanks at EMBL, Heidelberg, and GenBank at Los Alamos is limited. We have used a subset of sequences from these databanks to train neural networks to recognize pre-mRNA splicing signals in human genes. During training on 33 human genes from the EMBL databank, seven genes appeared to disturb the learning process. Subsequent investigation revealed discrepancies from the original published papers for three genes. In four genes, we found wrongly assigned splicing frames of introns. We believe this reflects the fact that splicing frames cannot always be unambiguously assigned on the basis of experimental data. Thus incorrect assignments arise both from mere typographical misprints and from erroneous interpretation of experiments. Training on 241 human sequences from GenBank revealed nine new errors. We propose that such errors could be detected by computer algorithms designed to check the consistency of data prior to their incorporation in databanks.
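
As a rough illustration of the kind of automated consistency check the abstract proposes, the sketch below flags annotated introns whose boundaries do not carry the canonical GT...AG dinucleotides. This is a much simpler rule-based stand-in, not the neural network approach used by the authors. The function name `flag_noncanonical_introns`, the 0-based end-exclusive coordinate convention, and the toy sequence are all assumptions made for the example.

```python
from typing import List, Tuple


def flag_noncanonical_introns(
    sequence: str, introns: List[Tuple[int, int]]
) -> List[Tuple[int, int]]:
    """Return annotated introns that do not start with GT and end with AG.

    `introns` holds hypothetical 0-based, end-exclusive (start, end)
    coordinates on the coding strand. This convention is an assumption
    of the sketch, not a databank format.
    """
    seq = sequence.upper()
    flagged = []
    for start, end in introns:
        # An intron must lie inside the sequence and be long enough
        # to hold both boundary dinucleotides.
        in_bounds = 0 <= start and end <= len(seq) and end - start >= 4
        if (
            not in_bounds
            or seq[start:start + 2] != "GT"
            or seq[end - 2:end] != "AG"
        ):
            flagged.append((start, end))
    return flagged


# Toy example: exon (6 nt), intron (13 nt), exon (6 nt).
toy_sequence = "ATGGCC" + "GTAAGTTTTTCAG" + "GGATAA"
correct_annotation = [(6, 19)]
shifted_annotation = [(7, 20)]  # same intron, entered one base too far right

print(flag_noncanonical_introns(toy_sequence, correct_annotation))
print(flag_noncanonical_introns(toy_sequence, shifted_annotation))
```

A check like this would only produce candidates for manual review: a minority of genuine introns use other boundary dinucleotides (for example GC...AG), so a flag is not proof of an annotation error.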
