COBRA: A combined regression strategy

COBRA: A Nonlinear Aggregation Strategy

A new method for combining several initial estimators of the regression function is introduced. Instead of building a linear or convex optimized combination over a collection of basic estimators r1, …, rM, we use them as a collective indicator of the proximity between the training data and a test observation. This local distance approach is model-free and very fast. More specifically, the resulting collective estimator is shown to perform asymptotically at least as well, in the L2 sense, as the best basic estimator in the collective. Moreover, it does so without having to declare which might be the best basic estimator for the given data set. A companion R package called COBRA (standing for COmBined Regression Alternative) is presented (downloadable at http://cran.r-project.org/web/packages/COBRA/index.html). Substantial numerical evidence on both synthetic and real data sets demonstrates the excellent performance and speed of the method across a large variety of prediction problems.
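
The following is a minimal sketch of a COBRA-style aggregation step, not the companion R package's implementation; the two basic machines, the unanimity rule, and the threshold eps are illustrative assumptions.

```python
# Minimal sketch of a COBRA-style nonlinear aggregation step (illustrative only).
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=400, n_features=5, noise=10.0, random_state=0)
# Split: one half trains the basic machines, the other half is kept for aggregation.
X_train, X_agg, y_train, y_agg = train_test_split(X, y, test_size=0.5, random_state=0)

machines = [Ridge().fit(X_train, y_train),
            DecisionTreeRegressor(max_depth=4, random_state=0).fit(X_train, y_train)]

def cobra_predict(x, eps=10.0):
    """Average the aggregation-set responses whose machine predictions all
    lie within eps of the machine predictions at the query point x."""
    preds_x = np.array([m.predict(x.reshape(1, -1))[0] for m in machines])
    preds_agg = np.column_stack([m.predict(X_agg) for m in machines])
    close = np.all(np.abs(preds_agg - preds_x) <= eps, axis=1)  # unanimity rule
    return y_agg[close].mean() if close.any() else y_agg.mean()

print(cobra_predict(X_agg[0]))
```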

Aggregating regression procedures to improve performance

Bernoulli, 2004

Methods have been proposed to linearly combine candidate regression procedures to improve estimation accuracy. Applications of these methods in many examples are very successful, pointing to the great potential of combining procedures. A fundamental question regarding combining procedures is: what is the potential gain, and how much does one need to pay for it?
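
As a hedged illustration of the linear-combination idea discussed above (not the paper's procedure), the sketch below fits two candidate regressors and estimates combination weights by least squares on a hold-out set; the candidates, the split, and the weighting scheme are all assumptions.

```python
# Linearly combining candidate regression procedures via hold-out least squares.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=300, n_features=4, noise=5.0, random_state=1)
X_fit, X_val, y_fit, y_val = train_test_split(X, y, test_size=0.4, random_state=1)

candidates = [LinearRegression().fit(X_fit, y_fit),
              KNeighborsRegressor(n_neighbors=7).fit(X_fit, y_fit)]

# Stack hold-out predictions and solve for combination weights.
P = np.column_stack([m.predict(X_val) for m in candidates])
weights, *_ = np.linalg.lstsq(P, y_val, rcond=None)
combined = P @ weights
print("combination weights:", weights)
```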

A principal components approach to combining regression estimates

1999

The goal of combining the predictions of multiple learned models is to form an improved estimator. A combining strategy must be able to robustly handle the inherent correlation, or multicollinearity, of the learned models while identifying the unique contributions of each. A progression of existing approaches and their limitations with respect to these two issues is discussed. A new approach, PCR*, based on principal components regression is proposed to address these limitations. An evaluation of the new approach on a collection of domains reveals that (1) PCR* was the most robust combining method, (2) correlation could be handled without eliminating any of the learned models, and (3) the principal components of the learned models provided a continuum of "regularized" weights from which PCR* could choose.
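
The sketch below is in the spirit of principal components regression over learned models, not the PCR* algorithm itself; the base models, the validation split, and the choice of two retained components are assumptions.

```python
# Combining correlated learned models via principal components regression.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge, LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=300, n_features=6, noise=8.0, random_state=2)
X_fit, X_val, y_fit, y_val = train_test_split(X, y, test_size=0.4, random_state=2)

models = [Ridge().fit(X_fit, y_fit),
          DecisionTreeRegressor(max_depth=5, random_state=2).fit(X_fit, y_fit),
          LinearRegression().fit(X_fit, y_fit)]

# Predictions of the learned models are typically highly correlated;
# regress y on their leading principal components instead of on them directly.
P = np.column_stack([m.predict(X_val) for m in models])
pca = PCA(n_components=2).fit(P)
scores = pca.transform(P)
combiner = LinearRegression().fit(scores, y_val)
print("R^2 of the PCA-based combination:", combiner.score(scores, y_val))
```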

Improving nonparametric regression methods by bagging and boosting

Computational Statistics & Data Analysis, 2002

Recently, many authors have proposed new algorithms to improve the accuracy of certain classifiers by assembling a collection of individual classifiers obtained by resampling the training sample. Bagging and boosting are well-known methods in the machine learning context and have proved successful in classification problems. In the regression context, the application of these techniques has received little investigation. Our aim is to analyse, by simulation studies, when boosting and bagging can reduce the training set error and the generalization error, using nonparametric regression methods as predictors. In this work, we consider three methods: projection pursuit regression (PPR), multivariate adaptive regression splines (MARS), and local learning based on recursive covering (DART).
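
Because PPR, MARS, and DART are not available in scikit-learn, the hedged sketch below uses a regression tree as the base learner simply to illustrate bagging's effect on generalization error in regression; the dataset and settings are assumptions.

```python
# Bagging applied to a nonparametric base regressor (a regression tree stands
# in for PPR/MARS/DART, which are not part of scikit-learn).
from sklearn.datasets import make_friedman1
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import cross_val_score

X, y = make_friedman1(n_samples=500, noise=1.0, random_state=3)
single = DecisionTreeRegressor(random_state=3)
bagged = BaggingRegressor(DecisionTreeRegressor(random_state=3),
                          n_estimators=50, random_state=3)

# Compare cross-validated MSE of the single learner vs. the bagged ensemble.
for name, model in [("single tree", single), ("bagged trees", bagged)]:
    mse = -cross_val_score(model, X, y, cv=5, scoring="neg_mean_squared_error").mean()
    print(name, round(mse, 2))
```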

Improving the aggregating algorithm for regression

Kernel Ridge Regression (KRR) and the recently developed Kernel Aggregating Algorithm for Regression (KAAR) are regression methods based on Least Squares. KAAR has theoretical advantages over KRR since a bound on its square loss for the worst case is known that does not hold for KRR. This bound does not make any assumptions about the underlying probability distribution of the data. In practice, however, KAAR performs better only when the data is heavily corrupted by noise or has severe outliers. This is due to the fact that KAAR is similar to KRR but with some fairly strong extra regularisation. In this paper we develop KAAR in such a way as to make it practical for use on real-world data. This is achieved by controlling the amount of extra regularisation. Empirical results (including results on the well-known Boston Housing dataset) suggest that in general our new methods perform as well as or better than KRR, KAAR and Support Vector Machines (SVM) in terms of the square loss they suffer.
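
KAAR itself is not implemented in scikit-learn; the sketch below only illustrates the underlying trade-off by sweeping the regularisation strength of ordinary Kernel Ridge Regression, which KAAR augments with extra regularisation. The dataset, kernel, and parameter values are assumptions.

```python
# Kernel Ridge Regression with varying regularisation strength; this sweeps the
# trade-off that KAAR's extra regularisation term controls (not KAAR itself).
from sklearn.datasets import make_friedman1
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import cross_val_score

X, y = make_friedman1(n_samples=500, noise=1.0, random_state=4)

for alpha in (0.1, 1.0, 10.0):  # larger alpha = heavier regularisation
    krr = KernelRidge(kernel="rbf", alpha=alpha, gamma=0.1)
    mse = -cross_val_score(krr, X, y, cv=5, scoring="neg_mean_squared_error").mean()
    print(f"alpha={alpha}: CV MSE = {mse:.3f}")
```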

An extensive experimental survey of regression methods

Neural Networks, 2019

Regression is a very relevant problem in machine learning, with many different available approaches. The current work presents a comparison of a large collection of 77 popular regression models belonging to 19 families: linear and generalized linear models, generalized additive models, least squares, projection methods, LASSO and ridge regression, Bayesian models, Gaussian processes, quantile regression, nearest neighbors, regression trees and rules, random forests, bagging and boosting, neural networks, deep learning and support vector regression. These methods are evaluated using all the regression datasets of the UCI machine learning repository (83 datasets), with some exceptions due to technical reasons. The experimental work identifies several outstanding regression models: the M5 rule-based model with corrections based on nearest neighbors (cubist), the gradient …
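
Below is a small-scale, hedged analogue of the survey's protocol: cross-validating a handful of regression families on a single synthetic dataset. The paper itself evaluates 77 models on the UCI repository; the models, dataset, and metric here are assumptions.

```python
# Cross-validated comparison of a few regression families on one dataset.
from sklearn.datasets import make_friedman1
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.svm import SVR

X, y = make_friedman1(n_samples=500, noise=1.0, random_state=5)
families = {
    "linear": LinearRegression(),
    "random forest": RandomForestRegressor(random_state=5),
    "gradient boosting": GradientBoostingRegressor(random_state=5),
    "support vector regression": SVR(),
}
for name, model in families.items():
    mse = -cross_val_score(model, X, y, cv=5, scoring="neg_mean_squared_error").mean()
    print(f"{name}: CV MSE = {mse:.3f}")
```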

Performance Evaluation and Comparison of a New Regression Algorithm

2023

In recent years, Machine Learning algorithms, in particular supervised learning techniques, have been shown to be very effective in solving regression problems. We compare the performance of a newly proposed regression algorithm against four conventional machine learning algorithms, namely Decision Trees, Random Forest, k-Nearest Neighbours and XGBoost. The proposed algorithm was presented in detail in a previous paper, but detailed comparisons were not included. We do an in-depth comparison, using the Mean Absolute Error (MAE) as the performance metric, on a diverse set of datasets to illustrate the great potential and robustness of the proposed approach. The reader is free to replicate our results, since we have provided the source code in a GitHub repository and the datasets are publicly available.
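
The proposed algorithm is not reproduced here; the sketch below only shows the kind of MAE-based comparison described above, with scikit-learn's GradientBoostingRegressor standing in for XGBoost so that no extra dependency is required. The dataset and settings are assumptions.

```python
# MAE comparison of the conventional baselines mentioned above.
from sklearn.datasets import make_friedman1
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.neighbors import KNeighborsRegressor

X, y = make_friedman1(n_samples=500, noise=1.0, random_state=6)
baselines = {
    "decision tree": DecisionTreeRegressor(random_state=6),
    "random forest": RandomForestRegressor(random_state=6),
    "k-nearest neighbours": KNeighborsRegressor(),
    "boosted trees": GradientBoostingRegressor(random_state=6),
}
for name, model in baselines.items():
    mae = -cross_val_score(model, X, y, cv=5, scoring="neg_mean_absolute_error").mean()
    print(f"{name}: CV MAE = {mae:.3f}")
```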

A comparison of model aggregation methods for regression

2003

Combining machine learning models is a means of improving overall accuracy. Various algorithms have been proposed to create aggregate models from other models, and two popular examples for classification are Bagging and AdaBoost. In this paper we examine their adaptation to regression, and benchmark them on synthetic and real-world data. Our experiments reveal that different types of AdaBoost algorithms require different complexities of base models.
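
A hedged illustration of AdaBoost adapted to regression (scikit-learn's AdaBoostRegressor implements the AdaBoost.R2 variant), varying the base tree depth to echo the observation that base-model complexity matters; the dataset and depths are assumptions.

```python
# AdaBoost.R2 with base trees of different complexity.
from sklearn.datasets import make_friedman1
from sklearn.ensemble import AdaBoostRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import cross_val_score

X, y = make_friedman1(n_samples=500, noise=1.0, random_state=7)
for depth in (1, 3, 6):
    boosted = AdaBoostRegressor(DecisionTreeRegressor(max_depth=depth, random_state=7),
                                n_estimators=100, random_state=7)
    mse = -cross_val_score(boosted, X, y, cv=5, scoring="neg_mean_squared_error").mean()
    print(f"base depth {depth}: CV MSE = {mse:.3f}")
```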

Robust Combination of Model Selection Methods for Prediction

Statistica Sinica, 2012

One important goal of regression analysis is prediction. In recent years, the idea of combining different statistical methods has attracted increasing attention. In this work, we propose a method, l1-ARM (adaptive regression by mixing), to robustly combine model selection methods; it performs well adaptively. In numerical work, we consider the LASSO, SCAD, and adaptive LASSO in representative scenarios, as well as in cases of randomly generated models. The l1-ARM automatically performs like the best among them and consequently provides better estimation/prediction in an overall sense, especially when outliers are likely to occur.
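
The sketch below is a generic ARM-style exponential-weighting scheme, not the authors' l1-ARM; Lasso variants stand in for LASSO, SCAD, and the adaptive LASSO (the latter two are not in scikit-learn), and the noise-scale estimate is an assumption.

```python
# ARM-style mixing: fit candidates on one half of the data, weight them by
# exponentiated negative squared prediction error on the other half.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LassoCV
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=400, n_features=20, n_informative=5, noise=5.0,
                       random_state=8)
X_fit, X_val, y_fit, y_val = train_test_split(X, y, test_size=0.5, random_state=8)

candidates = [Lasso(alpha=0.5).fit(X_fit, y_fit),
              Lasso(alpha=2.0).fit(X_fit, y_fit),
              LassoCV(cv=5).fit(X_fit, y_fit)]

errors = np.array([np.sum((y_val - m.predict(X_val)) ** 2) for m in candidates])
sigma2 = errors.min() / len(y_val)                    # rough noise-scale estimate
weights = np.exp(-(errors - errors.min()) / (2 * sigma2))
weights /= weights.sum()
print("mixing weights:", np.round(weights, 3))
```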