A weighted voting framework for classifier ensembles

A weight-adjusted voting algorithm for ensembles of classifiers

Journal of the Korean Statistical Society, 2011

We present a new weighted voting classification ensemble method, called WAVE, that uses two weight vectors: a weight vector of classifiers and a weight vector of instances. The instance weight vector assigns higher weights to observations that are hard to classify. The weight vector of classifiers puts larger weights on classifiers that perform better on hard-to-classify instances. One weight vector is designed to be calculated in conjunction with the other through an iterative procedure. That is, the instances with higher weights play a more important role in determining the weights of classifiers, and vice versa. We prove that the iterated weight vectors converge to the optimal weights, which can be directly calculated from the performance matrix of classifiers in an ensemble. The final prediction of the ensemble is obtained by voting using the optimal weight vector of classifiers. To compare the performance of simple majority voting with the proposed weighted voting, we applied both voting methods to bootstrap aggregation and investigated the performance on 28 datasets. The results show that the proposed weighted voting generally performs significantly better than simple majority voting.
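The alternating scheme the abstract describes can be sketched directly from a 0/1 performance matrix. A minimal illustration, assuming a simple normalised alternation; the function name, initialisation, and normalisation choices here are assumptions, not the paper's exact formulas:

```python
import numpy as np

def wave_weights(P, n_iter=100, tol=1e-10):
    """Alternating weight sketch (assumed reading of the abstract).
    P: n x k 0/1 performance matrix, P[i, j] = 1 iff classifier j
    classifies instance i correctly. Returns classifier weights w
    and instance weights q."""
    n, k = P.shape
    q = np.full(n, 1.0 / n)            # start with uniform instance weights
    for _ in range(n_iter):
        # classifiers that do well on highly weighted (hard) instances
        w = P.T @ q
        w /= w.sum()
        # instances missed by highly weighted classifiers get more weight
        q_new = (1.0 - P) @ w
        s = q_new.sum()
        q_new = q_new / s if s > 0 else np.full(n, 1.0 / n)
        if np.max(np.abs(q_new - q)) < tol:
            q = q_new
            break
        q = q_new
    return w, q
```

In this sketch, hard instances (rows many classifiers miss) accumulate weight, which in turn boosts the classifiers that handle them; the fixed point plays the role of the optimal weights computed from the performance matrix.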

A Novel Naive Bayes Voting Strategy for Combining Classifiers

2012 International Conference on Frontiers in Handwriting Recognition, 2012

Classifier combination methods have proved to be an effective tool for increasing performance in pattern recognition applications. The rationale of this approach follows from the observation that appropriately diverse classifiers make uncorrelated errors. Unfortunately, this theoretical assumption is not easy to satisfy in practice, reducing the performance obtainable with any combination strategy. In this paper we propose a new weighted majority vote rule that tries to solve this problem by jointly analyzing the responses provided by all the experts, in order to capture their collective behavior when classifying a sample. Our rule associates a weight with each class rather than with each expert, and computes these weights by estimating the joint probability distribution of each class with the set of responses provided by all the experts in the combining pool. The probability distribution is computed using the naive Bayes probabilistic model. Despite its simplicity, this model has been successfully used in many practical applications, often competing with much more sophisticated techniques. Experimental results on three standard databases of handwritten digits confirm the effectiveness of the proposed method.
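The combination rule the abstract outlines, scoring each class by its joint probability with the full tuple of expert responses under a naive Bayes model, can be sketched as follows. The validation-set estimation and Laplace smoothing are assumptions made for a self-contained example:

```python
import numpy as np

def nb_vote(E, y, E_test, n_classes, alpha=1.0):
    """Naive-Bayes combination sketch: estimate P(expert j says e | class c)
    on labelled validation data, then score each class c by
    P(c) * prod_j P(e_j | c) for a new response tuple.
    E: n_val x k matrix of expert labels, y: true labels."""
    n_val, k = E.shape
    prior = np.array([(y == c).sum() + alpha for c in range(n_classes)], float)
    prior /= prior.sum()
    # cond[j, c, e] = P(expert j outputs e | class c), Laplace-smoothed
    cond = np.zeros((k, n_classes, n_classes))
    for j in range(k):
        for c in range(n_classes):
            mask = y == c
            for e in range(n_classes):
                cond[j, c, e] = ((E[mask, j] == e).sum() + alpha) / \
                                (mask.sum() + alpha * n_classes)
    preds = []
    for row in E_test:
        logp = np.log(prior).copy()
        for j, e in enumerate(row):
            logp += np.log(cond[j, :, e])   # class weight from joint model
        preds.append(int(np.argmax(logp)))
    return preds
```

Because the weight attaches to each class via the whole response tuple, a usually-wrong expert still contributes useful evidence, which is the collective behavior the rule aims to capture.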

Combining Heterogeneous Sets of Classifiers: Theoretical and Experimental Comparison of Methods

In recent years, the combination of classifiers has been proposed as a method to improve the accuracy achieved in isolation by a single classifier. We are interested in ensemble methods that allow the combination of heterogeneous sets of classifiers, which are classifiers built using differing learning paradigms. We focus on theoretical and experimental comparison of five such combination methods: majority vote, a method based on Bayes' rule, a method based on Dempster-Shafer evidence combination, behavior-knowledge space, and logistic regression. We develop an upper bound on the accuracy that can be obtained by any of the five methods of combination, and show that this estimate can be used to determine whether an ensemble may improve the performance of its members. We then report a series of experiments using standard data sets and learning methods, and compare experimental results to theoretical expectations.
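Of the five methods, plain majority vote is the simplest to state, and an oracle-style count (the fraction of instances at least one member classifies correctly) gives a quick sense of an ensemble's headroom. Both are sketched below as generic illustrations, not the paper's exact constructions:

```python
from collections import Counter

def majority_vote(predictions):
    """Plain majority vote: each classifier casts one vote and the modal
    label wins (ties broken by first-seen order, an implementation choice)."""
    return Counter(predictions).most_common(1)[0][0]

def oracle_coverage(P):
    """Fraction of instances at least one member gets right, from a 0/1
    correctness matrix P (rows = instances). An oracle-style ceiling for
    vote-based combiners; illustrative, not the paper's derived bound."""
    return sum(1 for row in P if any(row)) / len(P)
```

Comparing the combiner's accuracy against such a ceiling is one way to judge whether an ensemble can still improve on its members.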

A game-theoretic framework for classifier ensembles using weighted majority voting with local accuracy estimates

In this paper, a novel approach for the optimal combination of binary classifiers is proposed. The classifier combination problem is approached from a Game Theory perspective. The proposed framework of adapted weighted majority rules (WMR) is tested against common rank-based, Bayesian and simple majority models, as well as two soft-output averaging rules. Experiments with ensembles of Support Vector Machines (SVM), Ordinary Binary Tree Classifiers (OBTC) and weighted k-nearest-neighbor (w/k-NN) models on benchmark datasets indicate that this new adaptive WMR model, employing local accuracy estimators and the analytically computed optimal weights, outperforms all the other simple combination rules.
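For independent binary experts, the analytically optimal weights of a weighted majority rule are the log-odds of the individual accuracies, which is the kind of closed-form weighting such frameworks build on. A minimal sketch; plugging local accuracy estimates in as the p_j is an assumption of this example:

```python
import math

def wmr_predict(votes, accuracies):
    """Weighted majority rule for binary classifiers. Weights
    w_j = log(p_j / (1 - p_j)) are the analytically optimal choice for
    independent experts; votes are in {-1, +1} and accuracies are
    (possibly local) estimates p_j in (0, 1)."""
    score = sum(math.log(p / (1 - p)) * v for v, p in zip(votes, accuracies))
    return 1 if score >= 0 else -1
```

Note how a single highly accurate expert can outvote several mediocre ones, since its log-odds weight grows without bound as p_j approaches 1.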

A Unifying Framework for Learning the Linear Combiners for Classifier Ensembles

For classifier ensembles, an effective combination method is to combine the outputs of each classifier using a linearly weighted combination rule. There are multiple ways to linearly combine classifier outputs, and it is beneficial to analyze them as a whole. We present a unifying framework for multiple linear combination types in this paper. This unification enables using the same learning algorithms for different types of linear combiners. We present various ways to train the weights using regularized empirical loss minimization. We propose using the hinge loss for better performance as compared to the conventional least-squares loss. We analyze the effects of using hinge loss for various types of linear weight training by running experiments on three different databases. We show that, in certain problems, linear combiners with fewer parameters may perform as well as those with a much larger number of parameters, even in the presence of regularization.
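Training a linear combiner's weights by regularised hinge-loss minimisation can be sketched for the binary case with plain subgradient descent; the learning rate, regulariser, and epoch count below are illustrative choices, not values from the paper:

```python
import numpy as np

def train_linear_combiner(S, y, lam=0.01, lr=0.1, epochs=200):
    """Learn combiner weights by minimising the L2-regularised hinge loss
    (binary case). S: n x k matrix of classifier scores, y in {-1, +1}.
    The ensemble's decision is sign(S @ w)."""
    n, k = S.shape
    w = np.zeros(k)
    for _ in range(epochs):
        margins = y * (S @ w)
        active = margins < 1                  # points inside the hinge
        grad = lam * w - (S[active] * y[active][:, None]).sum(axis=0) / n
        w -= lr * grad
    return w
```

Unlike least squares, the hinge loss ignores examples already classified with sufficient margin, which is the source of the performance difference the paper investigates.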

Classifier selection for majority voting

Information fusion, 2005

Individual classification models have recently been challenged by combined pattern recognition systems, which often show better performance. In such systems the optimal set of classifiers is first selected and then combined by a specific fusion method. For a small number of classifiers, optimal ensembles can be found exhaustively, but the exponential complexity of such a search limits its practical applicability for larger systems. As a result, simpler search algorithms and/or selection criteria are needed to reduce the complexity. This work provides a revision of the classifier selection methodology and evaluates the practical applicability of diversity measures in the context of combining classifiers by majority voting. A number of search algorithms are proposed and adjusted to work properly with a number of selection criteria, including the majority voting error and various diversity measures. Extensive experiments carried out with 15 classifiers on 27 datasets indicate that diversity measures are inappropriate as selection criteria, favouring instead search based directly on the combiner error. Furthermore, the results prompted a novel design of multiple classifier systems in which selection and fusion are applied recurrently to a population of the best combinations of classifiers rather than to the individual best. The improvement in the generalisation performance of such a system is demonstrated experimentally.
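Selecting an ensemble by the direct majority-vote error, the criterion the experiments favour, is straightforward but exponential in the number of classifiers, which is exactly the burden the abstract notes. A small-scale sketch over a 0/1 correctness matrix:

```python
from itertools import combinations
import numpy as np

def best_majority_subset(P, min_size=1):
    """Exhaustive ensemble selection by majority-vote error. P is an
    n x k 0/1 correctness matrix (rows = instances). Only odd subset
    sizes are tried, to avoid ties; feasible only for small k."""
    n, k = P.shape
    best, best_err = None, 1.0
    for r in range(min_size, k + 1, 2):
        for subset in combinations(range(k), r):
            votes = P[:, subset].sum(axis=1)   # correct members per instance
            err = np.mean(votes <= r / 2)      # majority is wrong
            if err < best_err:
                best, best_err = subset, err
    return best, best_err
```

For larger k this enumeration is exactly what the proposed search algorithms are designed to avoid, by replacing exhaustive search with cheaper heuristics over the same criterion.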

Relationships between combination methods and measures of diversity in combining classifiers

Information Fusion, 2002

This study looks at the relationships between different methods of classifier combination and different measures of diversity. We considered ten combination methods and ten measures of diversity on two benchmark data sets. The relationship was sought on ensembles of 3 classifiers built on all possible partitions of the respective feature sets into subsets of pre-specified sizes. The only positive finding was that the Double-Fault measure of diversity and the measure of difficulty both showed reasonable correlation with Majority Vote and Naive-Bayes combinations. Since both these measures have an indirect connection to the ensemble accuracy, this result was not unexpected. However, our experiments did not detect a consistent relationship between the other measures of diversity and the ten combination methods.
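The Double-Fault measure singled out above is simple to compute for a classifier pair; a sketch, assuming 0/1 correctness vectors as the input format:

```python
def double_fault(p1, p2):
    """Double-Fault measure for a classifier pair: the fraction of
    instances that BOTH classifiers get wrong (lower = more diverse).
    p1, p2 are equal-length 0/1 correctness sequences."""
    return sum(1 for a, b in zip(p1, p2) if a == 0 and b == 0) / len(p1)
```

Its indirect link to ensemble accuracy is visible in the definition: coincident errors are precisely the cases no pairwise vote can repair.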

On evidential combination rules for ensemble classifiers

2008

Ensemble classifiers are known to generally perform better than each individual classifier of which they consist. One approach to classifier fusion is to apply Shafer's theory of evidence. While most approaches have adopted Dempster's rule of combination, a multitude of combination rules have been proposed. A number of combination rules as well as two voting rules are compared when used in conjunction with a specific kind of ensemble classifier, known as random forests, w.r.t. accuracy, area under ROC curve and Brier score on 27 datasets. The empirical evaluation shows that the choice of combination rule can have a significant impact on the performance for a single dataset, but in general the evidential combination rules do not perform better than the voting rules for this particular ensemble design. Furthermore, among the evidential rules, the associative ones appear to have better performance than the non-associative ones.
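Dempster's rule for two mass functions can be sketched in the simplified setting often used for classifier fusion: mass only on singleton classes plus the whole frame. The dictionary representation and the 'theta' key for the frame are illustrative simplifications:

```python
def dempster_combine(m1, m2):
    """Dempster's rule of combination for two mass functions over singleton
    classes plus the frame Theta (key 'theta'). Mass on conflicting
    intersections (disagreeing singletons) is renormalised away."""
    classes = [c for c in m1 if c != 'theta']
    combined = {}
    for c in classes:
        # {c} results from {c}&{c}, {c}&Theta, and Theta&{c}
        combined[c] = (m1[c] * m2[c]
                       + m1[c] * m2['theta']
                       + m1['theta'] * m2[c])
    combined['theta'] = m1['theta'] * m2['theta']
    norm = sum(combined.values())          # = 1 - conflict mass
    return {c: v / norm for c, v in combined.items()}
```

The rule is associative, so fusing a whole forest of tree outputs can proceed pairwise in any order, one reason the associative rules are convenient for this ensemble design.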

Classifier Ensemble Generation for the Majority Vote Rule

Lecture Notes in Computer Science, 2008

This paper addresses the problem of classifier ensemble generation. The goal is to obtain an ensemble that achieves maximum recognition gains with the lowest number of classifiers. The final decision is taken following a majority vote rule. If the classifiers make independent errors, the majority vote outperforms the best classifier. Therefore, the ensemble should be formed by classifiers exhibiting both individual accuracy and diversity. To account for the quality of the ensemble, this work uses a sigmoid function to measure the behavior of the ensemble in relation to the majority vote rule over a labelled test data set.
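One plausible reading of such a sigmoid-based quality measure, squashing the per-instance majority-vote margin and averaging over the labelled set, can be sketched as follows; the exact functional form and the steepness parameter are assumptions, not necessarily the paper's:

```python
import math

def sigmoid_ensemble_quality(P, steepness=5.0):
    """Illustrative sigmoid quality measure. P is a list of 0/1 correctness
    rows (one per test instance, one entry per classifier). The margin is
    the fraction of correct voters minus 0.5; a sigmoid rewards instances
    the majority gets right and softly penalises the rest."""
    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-steepness * x))
    return sum(sigmoid(sum(row) / len(row) - 0.5) for row in P) / len(P)
```

Compared with the raw majority-vote accuracy, the soft margin distinguishes narrow wins from unanimous ones, which is useful when ranking candidate ensembles of equal accuracy.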