Selection of Decision Stumps in Bagging Ensembles (original) (raw)

2954 Accesses
7 Citations

Abstract

This article presents a comprehensive study of different ensemble pruning techniques applied to a bagging ensemble composed of decision stumps. Six different ensemble pruning methods are tested. Four of these are greedy strategies based on first reordering the elements of the ensemble according to some rule that takes into account the complementarity of the predictors with respect to the classification task. Subensembles of increasing size are then constructed by incorporating the ordered classifiers one by one. A halting criterion stops the aggregation process before the complete original ensemble is recovered. The other two approaches are selection techniques that attempt to identify optimal subensembles using either genetic algorithms or semidefinite programming. Experiments performed on 24 benchmark classification tasks show that the selection of a small subset (≈ 10 − 15%) of the original pool of stumps generated with bagging can significantly increase the accuracy and reduce the complexity of the ensemble.

This work has been supported by Consejería de Educació n de la Comunidad Autónoma de Madrid, European Social Fund, and the Dirección General de Investigación, grant TIN2004-07676-C02-02.

Preview

Unable to display preview. Download preview PDF.

References

Breiman, L.: Bagging predictors. Machine Learning 24(2), 123–140 (1996)
MATH Google Scholar
Efron, B., Tibshirani, R.J.: An Introduction to the Bootstrap. Chapman & Hall/CRC (1994)
Google Scholar
Margineantu, D.D., Dietterich, T.G.: Pruning adaptive boosting. In: Proc. 14th International Conference on Machine Learning, pp. 211–218. Morgan Kaufmann, San Francisco (1997)
Google Scholar
Zhou, Z.H., Tang, W.: Selective ensemble of decision trees. In: Wang, G., Liu, Q., Yao, Y., Skowron, A. (eds.) RSFDGrC 2003. LNCS (LNAI), vol. 2639, pp. 476–483. Springer, Heidelberg (2003)
Chapter Google Scholar
Martínez-Muñoz, G., Suárez, A.: Aggregation ordering in bagging. In: Proc. of the IASTED International Conference on Artificial Intelligence and Applications, pp. 258–263. Acta Press (2004)
Google Scholar
Zhang, Y., Burer, S., Street, W.N.: Ensemble pruning via semi-definite programming. Journal of Machine Learning Research 7, 1315–1338 (2006)
Google Scholar
Martínez-Muñoz, G., Suárez, A.: Pruning in ordered bagging ensembles. In: Proceedings of the 23rd International Conference on Machine Learning, pp. 609–616 (2006)
Google Scholar
Martínez-Muñoz, G., Suárez, A.: Using boosting to prune bagging ensembles. Pattern Recognition Letters 28(1), 156–165 (2007)
Article Google Scholar
Zhou, Z.H., Wu, J., Tang, W.: Ensembling neural networks: Many could be better than all. Artificial Intelligence 137(1-2), 239–263 (2002)
Article MATH Google Scholar
Freund, Y., Schapire, R.E.: A decision-theoretic generalization of on-line learning and an application to boosting. In: Proc. 2nd European Conference on Computational Learning Theory, pp. 23–37 (1995)
Google Scholar
Eiben, A.E., Smith, J.E.: Introduction to evolutionary computing. Springer, Heidelberg (2003)
MATH Google Scholar
Blake, C.L., Merz, C.J.: UCI repository of machine learning databases (1998)
Google Scholar
Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and Regression Trees. Chapman & Hall, New York (1984)
MATH Google Scholar

Download references

Author information

Authors and Affiliations

Computer Science Department, Universidad Autónoma de Madrid, C/ Francisco Tomás y Valiente, 11, Madrid 28049, Spain
Gonzalo Martínez-Muñoz, Daniel Hernández-Lobato & Alberto Suárez

Authors

Gonzalo Martínez-Muñoz
Daniel Hernández-Lobato
Alberto Suárez

Editor information

Joaquim Marques de Sá Luís A. Alexandre Włodzisław Duch Danilo Mandic

Rights and permissions

Copyright information

About this paper

Cite this paper

Martínez-Muñoz, G., Hernández-Lobato, D., Suárez, A. (2007). Selection of Decision Stumps in Bagging Ensembles. In: de Sá, J.M., Alexandre, L.A., Duch, W., Mandic, D. (eds) Artificial Neural Networks – ICANN 2007. ICANN 2007. Lecture Notes in Computer Science, vol 4668. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74690-4\_33

Download citation

.RIS
.ENW
.BIB
DOI: https://doi.org/10.1007/978-3-540-74690-4\_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74689-8
Online ISBN: 978-3-540-74690-4
eBook Packages: Computer Science Computer Science (R0)Springer Nature Proceedings Computer Science

Keywords

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.