Gene Ranking from Microarray Data for Cancer Classification–A Machine Learning Approach
Abstract
Traditional gene selection methods often select the top-ranked genes according to their individual discriminative power. We propose applying feature evaluation measures that are widely used in machine learning but less common in the DNA microarray field, and we also include sequential gene subset selection approaches. In our study, we use several well-known criteria (filters and wrappers) to rank attributes, and a greedy search procedure combined with three subset evaluation measures. Two completely different machine learning classifiers are applied to perform the class prediction. The comparison is performed on two well-known DNA microarray data sets. We observe that most of the top-ranked genes appear in the lists of relevant, informative genes detected by previous studies on these data sets.
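The two steps named in the abstract, ranking genes individually and then running a greedy search over the ranking with a subset evaluation measure, can be sketched as follows. This is only an illustration under stated assumptions, not the authors' procedure: the separation score, the leave-one-out nearest-centroid evaluator, the function names, and the toy expression matrix are all hypothetical stand-ins for the filters, wrappers, and classifiers the paper evaluates.

```python
from statistics import mean, pstdev


def rank_genes(X, y):
    """Rank gene indices by a simple two-class separation score.

    Assumed filter criterion (not from the paper): absolute difference of the
    class means divided by the sum of the class standard deviations.
    """
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, label in zip(X, y) if label == 0]
        b = [row[g] for row, label in zip(X, y) if label == 1]
        scores.append(abs(mean(a) - mean(b)) / (pstdev(a) + pstdev(b) + 1e-12))
    return sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)


def loo_accuracy(X, y, genes):
    """Leave-one-out accuracy of a nearest-centroid classifier on `genes`.

    Assumed wrapper-style subset evaluation measure, used here only because it
    is short; every class needs at least two samples.
    """
    hits = 0
    for i, sample in enumerate(X):
        distances = {}
        for label in set(y):
            rows = [X[j] for j in range(len(X)) if j != i and y[j] == label]
            centroid = [mean(row[g] for row in rows) for g in genes]
            distances[label] = sum(
                (sample[g] - c) ** 2 for g, c in zip(genes, centroid)
            )
        if min(distances, key=distances.get) == y[i]:
            hits += 1
    return hits / len(X)


def greedy_select(X, y, ranking):
    """Walk the ranking once, keeping a gene only if it improves the measure."""
    selected, best = [], 0.0
    for g in ranking:
        acc = loo_accuracy(X, y, selected + [g])
        if acc > best:
            selected.append(g)
            best = acc
    return selected, best


# Toy data (hypothetical): 6 samples x 3 genes, two classes.
X = [
    [1.0, 5.0, 2.0],
    [1.2, 3.0, 2.5],
    [0.9, 4.0, 1.8],
    [3.0, 4.5, 2.2],
    [3.1, 3.5, 2.9],
    [2.9, 5.5, 2.4],
]
y = [0, 0, 0, 1, 1, 1]

ranking = rank_genes(X, y)
selected, accuracy = greedy_select(X, y, ranking)
print(ranking, selected, accuracy)
```

On this toy matrix gene 0 separates the classes on its own, so it is ranked first and the greedy pass keeps only that gene; lower-ranked genes are skipped because they do not raise the evaluation measure any further.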
This research was supported by the Spanish Research Agency CICYT under grants TIN2004-00159 and TIN2004-06689C0303.
Author information
Authors and Affiliations
- Department of Computer Science, University of Seville, Avenida Reina Mercedes s/n, 41012, Sevilla, Spain
  Roberto Ruiz & Beatriz Pontes
- Area of Computer Science, University of Pablo de Olavide, Ctra. de Utrera, km. 1, 41013, Sevilla, Spain
  Raúl Giráldez & Jesús S. Aguilar–Ruiz
Editor information
Editors and Affiliations
- School of Design, Engineering and Computing, Bournemouth University, UK
  Bogdan Gabrys
- Centre for SMART Systems, School of Environment and Technology, University of Brighton, BN2 4GJ, Brighton, UK
  Robert J. Howlett
- School of Electrical and Information Engineering, Knowledge Based Intelligent Engineering Systems Centre, University of South Australia, SA, 5095, Mawson Lakes, Australia
  Lakhmi C. Jain
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ruiz, R., Pontes, B., Giráldez, R., Aguilar–Ruiz, J.S. (2006). Gene Ranking from Microarray Data for Cancer Classification–A Machine Learning Approach. In: Gabrys, B., Howlett, R.J., Jain, L.C. (eds) Knowledge-Based Intelligent Information and Engineering Systems. KES 2006. Lecture Notes in Computer Science, vol 4252. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11893004_161
- DOI: https://doi.org/10.1007/11893004_161
- Publisher Name: Springer, Berlin, Heidelberg
- Print ISBN: 978-3-540-46537-9
- Online ISBN: 978-3-540-46539-3
- eBook Packages: Computer Science, Computer Science (R0), Springer Nature Proceedings Computer Science