Feature Ranking Derived from Data Mining Process

Abstract

Most common feature ranking methods are based on a statistical approach. This paper compares several statistical methods with a new feature ranking (FR) method derived from the data mining process. The method ranks features according to the percentage of child units that survived the selection process. A child unit is a processing element that transforms the parent input features to the output. After training, units are interconnected in a feedforward hybrid neural network called GAME. The selection process is realized by means of a niching genetic algorithm, in which units connected to the least significant features starve and fade from the population. The parameters of the new FR algorithm are investigated, and a comparison among the different methods is presented on well-known real-world and artificial data sets.
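
The ranking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each surviving unit can be represented simply as a tuple of the input feature indices it is connected to, and it scores a feature by the share of surviving units that use it. The function name `rank_features`, the data layout, and the example population are hypothetical; how GAME actually counts units is not specified in the abstract.

```python
from collections import Counter


def rank_features(surviving_units, n_features):
    """Score each feature by the share of surviving units that use it as an input.

    Assumption: a unit is described only by the indices of its input features.
    """
    if not surviving_units:
        raise ValueError("need at least one surviving unit")
    counts = Counter()
    for inputs in surviving_units:
        # Count each feature once per unit, even if it is listed twice.
        counts.update(set(inputs))
    total = len(surviving_units)
    scores = [counts[f] / total for f in range(n_features)]
    # Sort feature indices by descending score; ties keep index order.
    ranking = sorted(range(n_features), key=lambda f: -scores[f])
    return ranking, scores


# Hypothetical population after selection: each tuple lists the input
# feature indices that one surviving unit is connected to.
survivors = [(0,), (0, 2), (2,), (0,), (1, 2)]
ranking, scores = rank_features(survivors, n_features=4)
```

In this toy population, feature 3 is connected to no surviving unit, so it receives a score of zero and lands at the bottom of the ranking, which mirrors the abstract's description of units on the least significant features fading from the population.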

Author information

Authors and Affiliations

Department of Computer Science and Engineering, FEE, Czech Technical University, Prague, Czech Republic
Aleš Pilný, Pavel Kordík & Miroslav Šnorek

Editor information

Véra Kůrková, Roman Neruda, Jan Koutník

About this paper

Cite this paper

Pilný, A., Kordík, P., Šnorek, M. (2008). Feature Ranking Derived from Data Mining Process. In: Kůrková, V., Neruda, R., Koutník, J. (eds) Artificial Neural Networks - ICANN 2008. ICANN 2008. Lecture Notes in Computer Science, vol 5164. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87559-8_92

DOI: https://doi.org/10.1007/978-3-540-87559-8_92
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87558-1
Online ISBN: 978-3-540-87559-8
eBook Packages: Computer Science, Computer Science (R0), Springer Nature Proceedings Computer Science
