Evolutionary Algorithms for Constructing an Ensemble of Decision Trees
Related papers
A Survey of Evolutionary Algorithms for Decision-Tree Induction
IEEE Transactions on Systems, Man, and Cybernetics, 2012
This paper presents a survey of evolutionary algorithms designed for decision-tree induction. Most of the paper focuses on approaches that evolve decision trees as an alternative heuristic to the traditional top-down divide-and-conquer approach. Additionally, we present some alternative methods that use evolutionary algorithms to improve particular components of decision-tree classifiers. The paper's original …
An Evolutionary Scheme for Decision Tree Construction
Knowledge-Based Systems, 2016
Classification is a central task in machine learning and data mining. The decision tree (DT) is one of the most popular learning models in data mining. The performance of a DT on a complex decision problem depends on the efficiency of its construction. However, obtaining the optimal DT is not a straightforward process. In this paper, we propose a new evolutionary meta-heuristic optimization approach for identifying the best settings during the construction of a DT. We design a genetic algorithm coupled with a multi-task objective function to extract the optimal DT with the best parameters. This objective function is based on three main factors: (1) precision over the test samples, (2) trust in the construction and validation of a DT using the smallest possible training set and the largest possible testing set, and (3) simplicity in terms of the size of the generated candidate DT and the set of attributes used. We extensively evaluate our approach on 13 benchmark datasets and a fault diagnosis dataset. The results show that …
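The abstract does not spell out how the three factors are combined, but a minimal sketch of a weighted multi-criterion fitness in this spirit might look as follows. The weights, the node-count cap, and the helper's signature are illustrative assumptions, not the authors' implementation:

```python
# Hedged sketch: score a *fitted* candidate decision tree by combining the
# abstract's three factors. Weights and the node-count cap are assumptions.
from sklearn.tree import DecisionTreeClassifier

def fitness(tree: DecisionTreeClassifier, X_test, y_test, n_total: int,
            w_acc=0.6, w_trust=0.2, w_size=0.2) -> float:
    # (1) Precision over the test samples (plain test accuracy here).
    accuracy = tree.score(X_test, y_test)
    # (2) Trust: reward candidates validated on a large test split, i.e.
    #     trained on as small a training set as possible.
    trust = len(X_test) / n_total
    # (3) Simplicity: penalize large trees, node count capped at 200.
    simplicity = 1.0 - min(tree.tree_.node_count / 200.0, 1.0)
    return w_acc * accuracy + w_trust * trust + w_size * simplicity
```

A genetic algorithm would then evolve construction settings (split parameters, attribute subsets, train/test partitions) and rank candidates by this score.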
Breeding decision trees using evolutionary techniques
Machine Learning: International Workshop then Conference, 2001
We explore the use of genetic algorithms to directly evolve classification decision trees. We argue for the suitability of such a concept learner due to its ability to efficiently search complex hypothesis spaces and to discover conditionally dependent as well as irrelevant attributes. The performance of the system is measured on a set of artificial and standard discretized concept-learning problems and compared with the performance of two well-known algorithms (C4.5, OneR). We demonstrate that the derived hypotheses of standard …
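To make "directly evolving" trees concrete, here is a minimal sketch of the two genetic operators such a system typically needs, subtree crossover and point mutation, over a toy node representation. The `Node` class and the operator details are assumptions for exposition, not the paper's encoding:

```python
import copy
import random
from dataclasses import dataclass

@dataclass
class Node:
    """Binary decision-tree node: a (feature, threshold) test or a leaf."""
    feature: int = -1            # -1 marks a leaf
    threshold: float = 0.0
    label: int = 0               # class predicted at a leaf
    left: "Node | None" = None
    right: "Node | None" = None

def collect(node, acc=None):
    """Gather all nodes so the operators can pick a random subtree root."""
    acc = [] if acc is None else acc
    acc.append(node)
    if node.feature >= 0:        # internal node: recurse into children
        collect(node.left, acc)
        collect(node.right, acc)
    return acc

def crossover(a: Node, b: Node) -> Node:
    """Graft a random subtree of `b` onto a random point in a copy of `a`."""
    child = copy.deepcopy(a)
    target = random.choice(collect(child))
    donor = copy.deepcopy(random.choice(collect(b)))
    target.__dict__.update(donor.__dict__)
    return child

def mutate(tree: Node, n_features: int) -> None:
    """Point mutation: perturb one internal node's test in place."""
    internal = [n for n in collect(tree) if n.feature >= 0]
    if internal:
        n = random.choice(internal)
        n.feature = random.randrange(n_features)
        n.threshold = random.gauss(n.threshold, 0.1)
```

A full system wraps these operators in a fitness function (e.g. training accuracy with a size penalty) and a selection scheme.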
Evolutionary design of decision trees
Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 2012
The decision tree (DT) is one of the most popular symbolic machine learning approaches to classification, with a wide range of applications. Decision trees are especially attractive in data mining: they have an intuitive representation and are therefore easy to understand and interpret, even by nontechnical experts. The most important and critical aspect of DTs is the process of their construction. Several induction algorithms exist that use the recursive top-down principle to divide training objects into subgroups based on different statistical measures in order to achieve homogeneous subgroups. Although robust, fast, and generally providing good results, their deterministic and heuristic nature can lead to suboptimal solutions. Therefore, alternative approaches have been developed that try to overcome the drawbacks of classical induction. One of the most viable seems to be the use of evolutionary algorithms, which can produce better DTs because they search for globally optimal solutions, evaluating potential solutions with regard to different criteria. We review the process of evolutionary design of DTs, describing the most common approaches as well as referring to recognized specializations. The overall process is first explained and later demonstrated in a step-by-step case study using a dataset from the University of California, Irvine (UCI) machine learning repository.
Quality Diversity Genetic Programming for Learning Decision Tree Ensembles
Lecture Notes in Computer Science
Quality Diversity (QD) algorithms are a class of population-based evolutionary algorithms designed to generate sets of solutions that are both fit and diverse. In this paper, we describe a strategy for applying QD concepts to the generation of decision tree ensembles by optimizing collections of trees for both individually accurate and collectively diverse predictive behavior. We compare three variants of this QD strategy with two existing ensemble generation strategies over several classification data sets. We then briefly highlight the effect of the evolutionary algorithm at the core of the strategy. The examined algorithms generate ensembles with distinct predictive behaviors as measured by classification accuracy and intrinsic diversity. The plotted behaviors hint at highly data-dependent relationships between these metrics. QD-based strategies are suggested as a means of optimizing classifier ensembles along this performance curve, along with other suggestions for future work.
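A minimal sketch of the QD idea as it might apply here, in the style of MAP-Elites: niche trees by a behavior descriptor derived from their predictions, and keep only the fittest tree per niche, so the surviving archive is both accurate and behaviorally diverse. All names below are illustrative assumptions, not the paper's algorithm:

```python
# Hedged sketch of a MAP-Elites-style archive for tree ensembles. Each niche
# is keyed by the tree's predictions on a small reference set; only the
# fittest tree per niche survives. `tree` is any fitted classifier with a
# scikit-learn-style predict() method.
def descriptor(tree, X_ref):
    """Coarse behavior signature: the tree's predictions on X_ref."""
    return tuple(tree.predict(X_ref))

def update_archive(archive, tree, fitness, X_ref):
    """Insert `tree` if its niche is empty or it beats the incumbent."""
    key = descriptor(tree, X_ref)
    if key not in archive or archive[key][1] < fitness:
        archive[key] = (tree, fitness)

def ensemble(archive):
    """The final ensemble: individually fit trees that, by construction,
    disagree somewhere on the reference set."""
    return [tree for tree, _ in archive.values()]
```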
Evolutionary and greedy exploration of the space of decision trees
2006
This paper addresses the problem of decision-tree induction. We define the space of all possible trees and try to find good trees by searching that space. We compare the performance of an evolutionary algorithm with that of standard, problem-specific algorithms (ID3, C4.5).
Construction of Near-Optimal Axis-Parallel Decision Trees Using a Differential-Evolution-Based Approach
IEEE Access, 2018
In this paper, a differential-evolution-based approach implementing a global search strategy to find a near-optimal axis-parallel decision tree is introduced. The internal nodes of a decision tree are encoded in a real-valued chromosome, and a population of them evolves using the training accuracy of each one as its fitness value. The height of a complete binary decision tree whose number of internal nodes is not less than the number of attributes in the training set is used to compute the chromosome size, and a procedure to map a feasible axis-parallel decision tree from one chromosome is applied, which uses both the smallest-position-value rule and the training instances. The best decision tree in the final population is refined by replacing some leaf nodes with subtrees to improve its accuracy. The differential evolution algorithm has been successfully applied in conjunction with several supervised learning methods to solve numerous classification problems, because it exhibits a good tradeoff between exploitation and exploration, but to the best of our knowledge it has not been used to build axis-parallel decision trees. To obtain reliable estimates of the predictive performance of this approach and to compare its results with those achieved by other methods, a repeated stratified tenfold cross-validation procedure is applied in the experimental study. A statistical analysis of these results suggests that our approach outperforms other supervised learning methods as a decision-tree induction method, and our results are comparable to those obtained with random forest and a multilayer-perceptron-based classifier. Index terms: decision trees, differential evolution, metaheuristics, smallest-position-value rule, supervised learning.
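The smallest-position-value (SPV) rule mentioned above is a standard device for decoding a real-valued chromosome into a discrete structure: gene indices are ranked by gene value, and the resulting order drives the assignment. A minimal sketch, where the mapping from ranks to attribute indices is an illustrative assumption rather than the paper's exact procedure:

```python
import numpy as np

def spv_decode(chromosome: np.ndarray, n_attributes: int) -> list[int]:
    """Smallest-position-value rule: rank gene indices by gene value, then
    map each rank to an attribute index, so the real-valued vector decides
    which attribute each internal tree node tests (an assumed mapping)."""
    order = np.argsort(chromosome)       # gene indices, smallest value first
    return [int(i) % n_attributes for i in order]

# Example: decode a 6-gene chromosome over 4 attributes.
rng = np.random.default_rng(0)
print(spv_decode(rng.uniform(-1.0, 1.0, size=6), n_attributes=4))
```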
Ensemble learning for free with evolutionary algorithms
Computing Research Repository, 2007
Evolutionary Learning proceeds by evolving a population of classifiers, from which it generally returns (with some notable exceptions) the single best-of-run classifier as the final result. Meanwhile, Ensemble Learning, one of the most effective approaches in supervised Machine Learning over the last decade, proceeds by building a population of diverse classifiers. Ensemble Learning with Evolutionary Computation is thus receiving increasing attention. The Evolutionary Ensemble Learning (EEL) approach presented in this paper features two contributions. First, a new fitness function, inspired by co-evolution and enforcing classifier diversity, is presented. Second, a new selection criterion based on the classification margin is proposed. This criterion is used to extract the classifier ensemble either from the final population only (Off-EEL) or incrementally along evolution (On-EEL). Experiments on a set of benchmark problems show that Off-EEL outperforms single-hypothesis evolutionary learning and state-of-the-art boosting, and generates smaller classifier ensembles.
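As a sketch of the margin-based extraction step in the Off-EEL flavor: greedily grow the ensemble from the final population, each time adding the classifier that most improves a validation-set margin. The margin definition and all names below are assumptions, not the paper's exact criterion:

```python
import numpy as np

def ensemble_margin(votes: np.ndarray, y: np.ndarray) -> float:
    """Mean margin of a majority vote with labels in {-1, +1}: average of
    y * (mean vote), i.e. support for the true class minus support against.
    `votes` has shape (n_classifiers, n_samples)."""
    return float(np.mean(y * votes.mean(axis=0)))

def extract_ensemble(population_votes: np.ndarray, y: np.ndarray, k: int):
    """Greedily pick k classifiers from the final population, each round
    adding the one that most increases the validation-set margin."""
    chosen, remaining = [], list(range(population_votes.shape[0]))
    for _ in range(k):
        best = max(remaining, key=lambda i: ensemble_margin(
            population_votes[chosen + [i]], y))
        chosen.append(best)
        remaining.remove(best)
    return chosen
```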