Feature Ranking Derived from Data Mining Process
Abstract
Most common feature ranking (FR) methods are based on a statistical approach. This paper compares several statistical methods with a new feature ranking method derived from the data mining process. The method ranks features according to the percentage of child units that survive the selection process. A child unit is a processing element that transforms the parent input features into an output. After training, the units are interconnected in a feedforward hybrid neural network called GAME. The selection process is realized by a niching genetic algorithm, in which units connected to the least significant features starve and fade from the population. The parameters of the new FR algorithm are investigated, and the methods are compared on well-known real-world and artificial data sets.
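The ranking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one possible reading, in which a feature's score is the percentage of surviving units that take that feature as an input. The function name `rank_features_by_survival`, the representation of a unit as a tuple of input feature indices, and the toy data are all hypothetical.

```python
from collections import Counter


def rank_features_by_survival(surviving_units, n_features):
    """Rank features by the share of surviving units connected to each one.

    surviving_units: list of iterables of feature indices, one per unit that
        survived selection (a hypothetical representation, not from the paper).
    n_features: total number of input features.
    Returns a list of (feature_index, percentage), highest percentage first.
    """
    if not surviving_units:
        raise ValueError("at least one surviving unit is required")
    counts = Counter()
    for inputs in surviving_units:
        # Count a feature once per unit, even if the unit lists it twice.
        counts.update(set(inputs))
    total = len(surviving_units)
    scores = [(f, 100.0 * counts[f] / total) for f in range(n_features)]
    # Sort by descending percentage; ties are broken by feature index.
    return sorted(scores, key=lambda fs: (-fs[1], fs[0]))


# Toy population: four surviving units over four features (made-up data).
units = [(0, 2), (0,), (2, 3), (0, 2)]
ranking = rank_features_by_survival(units, n_features=4)
for feature, pct in ranking:
    print(f"feature {feature}: {pct:.1f}% of surviving units")
```

In this toy population no surviving unit uses feature 1, so it lands at the bottom of the ranking, mirroring the abstract's description of units on the least significant features fading from the population.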
Author information
Authors and Affiliations
- Department of Computer Science and Engineering, FEE, Czech Technical University, Prague, Czech Republic
Aleš Pilný, Pavel Kordík & Miroslav Šnorek
Editor information
Věra Kůrková, Roman Neruda, Jan Koutník
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pilný, A., Kordík, P., Šnorek, M. (2008). Feature Ranking Derived from Data Mining Process. In: Kůrková, V., Neruda, R., Koutník, J. (eds) Artificial Neural Networks - ICANN 2008. ICANN 2008. Lecture Notes in Computer Science, vol 5164. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87559-8_92
- DOI: https://doi.org/10.1007/978-3-540-87559-8_92
- Publisher Name: Springer, Berlin, Heidelberg
- Print ISBN: 978-3-540-87558-1
- Online ISBN: 978-3-540-87559-8
- eBook Packages: Computer Science, Computer Science (R0), Springer Nature Proceedings Computer Science