Using programmatic motifs and genetic programming to classify protein sequences as to cellular location

Abstract

As newly sequenced proteins are deposited into the world's ever-growing archives, they are typically tested immediately by various algorithms for clues as to their biological structure and function. One question about a new protein concerns its cellular location, that is, where the protein resides in a living organism (e.g., extracellular, membrane, nuclear). A human-created five-way classification algorithm for cellular location, based on statistical techniques and achieving 76% accuracy, was recently reported. This paper describes a two-way algorithm, evolved using genetic programming, that achieves 83% accuracy in determining whether a protein is an extracellular protein, 84% for nuclear proteins, 89% for membrane proteins, and 83% for anchored membrane proteins. Unlike the statistical calculation, the genetically evolved programs employ a large and varied arsenal of computational capabilities, including arithmetic functions, conditional operations, subroutines, iterations, named memory, indexed memory, set-creating operations, and look-ahead. The genetically evolved classification program can be viewed as an extension (which we call a programmatic motif) of the conventional notion of a protein motif.
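
To make the idea of a programmatic motif more concrete, the following is a minimal hand-written sketch, not one of the evolved programs from the paper. It only illustrates how a two-way classifier can combine iteration over the residues, a set-membership test, conditional operations, and named memory. The residue set `HYDROPHOBIC`, the threshold of 8.0, the function names, and the scoring rule are all assumptions chosen for illustration, and accuracy is assumed here to mean the fraction of correctly classified sequences.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical residue set, chosen for illustration only (not taken from the paper).
HYDROPHOBIC = set("AVLIMFWC")


def toy_programmatic_motif(sequence: str) -> bool:
    """Toy two-way classifier: True if the sequence contains a long hydrophobic run.

    This is a hand-written stand-in for an evolved program, not the authors' result.
    """
    m0 = 0.0    # named memory: running score of the current hydrophobic stretch
    best = 0.0  # named memory: highest running score seen so far
    for residue in sequence:            # iteration over the protein sequence
        if residue in HYDROPHOBIC:      # set membership plus a conditional
            m0 += 1.0
        else:
            m0 = max(0.0, m0 - 0.5)     # decay the score, never below zero
        best = max(best, m0)
    return best >= 8.0                  # assumed threshold for a positive call


def accuracy(classifier: Callable[[str], bool],
             labelled: Iterable[Tuple[str, bool]]) -> float:
    """Fraction of (sequence, label) pairs that the classifier gets right."""
    pairs = list(labelled)
    if not pairs:
        raise ValueError("need at least one labelled sequence")
    correct = sum(1 for seq, label in pairs if classifier(seq) == label)
    return correct / len(pairs)
```

Under these assumptions, one such two-way program would be needed per location class (extracellular, nuclear, membrane, anchored membrane), each scored separately against labelled sequences.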

Author information

Authors and Affiliations

  1. Computer Science Department, Stanford University, Stanford, California 94305
    John R. Koza
  2. Genetic Programming Inc., Los Altos, California 94023
    Forrest H. Bennett III (Chief Scientist)
  3. Computer Science Division, University of California, Berkeley, California
    David Andre

Editor information

V. W. Porto, N. Saravanan, D. Waagen, A. E. Eiben

Rights and permissions

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Koza, J.R., Bennett, F.H., Andre, D. (1998). Using programmatic motifs and genetic programming to classify protein sequences as to cellular location. In: Porto, V.W., Saravanan, N., Waagen, D., Eiben, A.E. (eds) Evolutionary Programming VII. EP 1998. Lecture Notes in Computer Science, vol 1447. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0040796
