Managing Diversity In Regression Ensembles

Building Ensembles with Heterogeneous Models

2003

In the context of ensemble learning for regression problems, we study the effect of building ensembles from different model classes. Tests on real and simulated data sets show that this approach can improve model accuracy compared to ensembles from a single model class.
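The idea of combining different model classes can be sketched as simple prediction averaging. The toy models below (a global-mean baseline, an ordinary-least-squares line, and a 1-nearest-neighbour rule) and the data are illustrative assumptions, not from the paper:

```python
# Minimal sketch of a heterogeneous regression ensemble: members come from
# different model classes and their predictions are averaged.

def mean_model(xs, ys):
    # baseline: always predict the training mean
    m = sum(ys) / len(ys)
    return lambda x: m

def linear_model(xs, ys):
    # ordinary least squares for y = a*x + b
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return lambda x: a * x + b

def nn_model(xs, ys):
    # 1-nearest-neighbour: predict the y of the closest training x
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def ensemble_predict(models, x):
    return sum(m(x) for m in models) / len(models)

xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.1, 0.9, 2.1, 2.9]  # roughly y = x
models = [f(xs, ys) for f in (mean_model, linear_model, nn_model)]
```

Because the members make different kinds of errors, the averaged prediction can be more accurate than any single model class on its own.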

Optimising Diversity in Classifier Ensembles

SN Computer Science, 2022

Ensembles of predictors have generally been found to perform better than single predictors. Although diversity is widely thought to be an important factor in building successful ensembles, there have been contradictory results in the literature regarding the influence of diversity on the generalisation error. Fundamental to this may be the way diversity itself is defined. We present two new diversity measures, based on the idea of ambiguity, obtained from the bias-variance decomposition using the cross-entropy error or the hinge loss. If random sampling is used to select the patterns on which ensemble members are trained, we find that the generalisation error is negatively correlated with diversity at high sampling rates; conversely, the generalisation error is positively correlated with diversity when the sampling rate is low and the diversity high. We use evolutionary optimisers for small ensembles to select the subsets of patterns for predictor training by maximising these divers...
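The ambiguity idea behind these measures has a classical form for squared error (the Krogh-Vedelsby decomposition), where the ensemble error equals the average member error minus the average ambiguity. The sketch below illustrates that squared-error case only; the paper's own measures use the cross-entropy or hinge loss instead:

```python
# Classical ambiguity decomposition for squared error:
# ensemble_error = avg_member_error - avg_ambiguity

def ambiguity_decomposition(preds, target):
    """preds: member predictions for one input; target: true value.
    Returns (ensemble_error, avg_member_error, avg_ambiguity)."""
    f_bar = sum(preds) / len(preds)          # ensemble (mean) prediction
    ens_err = (f_bar - target) ** 2
    avg_err = sum((p - target) ** 2 for p in preds) / len(preds)
    avg_amb = sum((p - f_bar) ** 2 for p in preds) / len(preds)
    return ens_err, avg_err, avg_amb

e, a, d = ambiguity_decomposition([1.0, 2.0, 4.0], target=2.5)
```

The identity `e == a - d` shows why ambiguity acts as a diversity term: the more the members spread around the ensemble mean, the larger the reduction from average member error to ensemble error.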

Diversity in ensemble feature selection

2003

Ensembles of learnt models constitute one of the main current directions in machine learning and data mining. Ensembles allow us to achieve higher accuracy, which is often not achievable with single models. It was shown theoretically and experimentally that in order for an ensemble to be effective, it should consist of high-accuracy base classifiers that should have high diversity in their predictions.
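One generic way to quantify "high diversity in their predictions" is the average pairwise disagreement rate: the fraction of examples on which two classifiers differ, averaged over all pairs. This is an illustrative measure, not one taken from the paper:

```python
from itertools import combinations

def disagreement(preds_a, preds_b):
    # fraction of examples on which the two classifiers differ
    return sum(a != b for a, b in zip(preds_a, preds_b)) / len(preds_a)

def ensemble_diversity(all_preds):
    # mean disagreement over all classifier pairs
    pairs = list(combinations(all_preds, 2))
    return sum(disagreement(a, b) for a, b in pairs) / len(pairs)

# three classifiers' 0/1 predictions on five examples
p1 = [0, 1, 1, 0, 1]
p2 = [0, 1, 0, 0, 1]
p3 = [1, 1, 1, 0, 0]
```

A value near 0 means the base classifiers are nearly interchangeable; an effective ensemble wants each member accurate individually while keeping this disagreement non-trivial.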

Exploiting Diversity In Ensembles: Improving the Performance on Unbalanced Datasets

Multiple Classifier Systems, 2007

Ensembles are often capable of greater predictive performance than any of their individual classifiers. Despite the need for classifiers to make different kinds of errors, the majority voting scheme typically used treats each classifier as though it contributed equally to the group's performance. This can be particularly limiting on unbalanced datasets, as one is more interested in complementing classifiers that can assist in improving the true positive rate without significantly increasing the false positive rate. Therefore, we implement a genetic algorithm based framework to weight the contribution of each classifier by an appropriate fitness function, such that the classifiers that complement each other on the unbalanced dataset are preferred, resulting in significantly improved performance. The proposed framework can be built on top of any collection of classifiers with different fitness functions.
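The weighting idea can be sketched as a weighted vote optimised against a fitness that rewards true positives and penalises false positives. Here the fitness (TPR minus FPR) and the plain random search are simplified stand-ins for the paper's GA-based framework, and the predictions and labels are made up for illustration:

```python
import random

def weighted_vote(preds, weights):
    # preds: per-classifier 0/1 predictions for one example
    score = sum(w * (1 if p == 1 else -1) for p, w in zip(preds, weights))
    return 1 if score > 0 else 0

def fitness(weights, all_preds, labels):
    # reward true positive rate, penalise false positive rate
    votes = [weighted_vote(p, weights) for p in all_preds]
    tp = sum(v == 1 and y == 1 for v, y in zip(votes, labels))
    fp = sum(v == 1 and y == 0 for v, y in zip(votes, labels))
    pos = sum(labels)
    neg = len(labels) - pos
    tpr = tp / pos if pos else 0.0
    fpr = fp / neg if neg else 0.0
    return tpr - fpr

# per-example predictions of three classifiers, plus true labels
all_preds = [[1, 0, 1], [0, 0, 1], [1, 1, 0], [0, 0, 0], [1, 1, 1]]
labels    = [1, 1, 0, 0, 1]

random.seed(0)
best_w, best_f = None, float("-inf")
for _ in range(200):  # random search standing in for GA generations
    w = [random.random() for _ in range(3)]
    f = fitness(w, all_preds, labels)
    if f > best_f:
        best_w, best_f = w, f
```

Swapping the fitness function (e.g. for F-measure or cost-sensitive error) changes which complementary classifiers the search favours, which is the flexibility the framework relies on.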