Forward Regression in R: From The Extreme Slow to the Extreme Fast
Related papers
Building Regression Models with the Forward Search
Journal of Computing and Information Technology, 2007
We give an example of the use of the forward search in building a regression model. The standard backwards elimination of variables is supplemented by forward plots of added variable t statistics that exhibit the effect of each observation on the process of model building. Attention is also paid to the effect of individual observations on selection of a transformation. Variable selection using AIC is mentioned, as is the analysis of multivariate data.
A Simple Introduction to Regression Modeling using R
2023
In statistical modeling, regression analysis is a group of statistical processes, used in R programming and statistics, for estimating the relationships between the variables of a dataset. It is a solid technique for identifying the factors that affect an issue of interest: having fit a regression, you can establish with confidence which factors matter most, which can be ignored, and how these factors interact. It can also be used to model the long-term relationship between variables and to gauge the strength of that relationship. Regression analysis is typically used to ascertain the relationship between the dataset's dependent and independent variables, and in particular to understand how the dependent variable changes when one independent variable varies while the other independent variables are held constant. This makes it easier to build a regression model and to forecast values in response to changes in one of the independent variables. Regression models are classified by the type of dependent variable, the number of independent variables, and the shape of the regression line. In this paper, we use the R programming language to present various empirical investigations in statistics and econometrics. We then consider problems involving modeling the relationship between response and explanatory variables for linear and non-linear regression models.
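The workflow the abstract describes can be shown in a few lines of base R. This is a minimal sketch using the built-in mtcars data set (an illustrative choice, not from the paper): fit a linear model, inspect the coefficients, and forecast the response for new predictor values.

```r
# Fit a linear regression of fuel economy (mpg) on weight (wt) and horsepower (hp)
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Estimates, standard errors, t statistics and p-values for each term
summary(fit)$coefficients

# Forecast mpg for a hypothetical car weighing 3000 lb with 110 hp
predict(fit, newdata = data.frame(wt = 3, hp = 110))
```

Holding hp fixed while varying wt in `newdata` shows exactly the "one variable changes while the others remain constant" interpretation the abstract refers to.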
Forward Selection Procedure for Linear Model Building Using Spearman’s Rank Correlation
Dhaka University Journal of Science, 2012
Forward selection (FS) is a step-by-step model-building algorithm for linear regression. The FS algorithm was expressed in terms of sample correlations, where Pearson's product-moment correlation was used. FS yields poor results when the data contain contamination. In this article, we propose the use of Spearman's rank correlation in FS. The proposed method is called FSr. We conduct an extensive simulation study to compare the performance of FSr with FS. The proposed FSr performs better than the FS algorithm on contaminated data. We also demonstrate a real data application of FSr.
DOI: http://dx.doi.org/10.3329/dujs.v60i2.11481 Dhaka Univ. J. Sci. 60(2): 141-145, 2012 (July)
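A single correlation-driven forward-selection step can be sketched as follows. This is our own illustrative implementation, not the paper's code: it assumes the candidate added at each step is the one whose absolute correlation with the current residuals is largest, and swapping `method = "spearman"` for the default Pearson correlation gives the robust FSr-style variant.

```r
# One forward-selection step driven by sample correlations.
# y: response vector; X: matrix of candidate predictors with column names;
# selected: character vector of predictors already in the model.
forward_step <- function(y, X, selected, method = c("pearson", "spearman")) {
  method <- match.arg(method)
  # Residuals of the current model (the raw response if nothing is selected yet)
  resid <- if (length(selected) == 0) y else
    residuals(lm(y ~ ., data = data.frame(X[, selected, drop = FALSE])))
  candidates <- setdiff(colnames(X), selected)
  # Correlate each remaining candidate with the residuals
  cors <- sapply(candidates,
                 function(v) abs(cor(resid, X[, v], method = method)))
  names(which.max(cors))  # candidate with the largest absolute correlation
}
```

Because Spearman's correlation depends only on ranks, a few gross outliers in `y` or `X` perturb it far less than the Pearson version, which is the motivation for FSr.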
Multi-step-ahead multivariate predictors: A comparative analysis
49th IEEE Conference on Decision and Control (CDC), 2010
The focus of this article is to undertake a comparative analysis of multi-step-ahead linear multivariate predictors. The approach considered for the estimation is based on geometrically reliable linear algebra tools, resorting to subspace identification methods. A crucial issue is quantification of both the bias error and the variance affecting the estimate of the prediction for increasing values of the look-ahead when only a small number of samples is available. No complete theory is available so far, nor sufficient numerical experience. Therefore, the analysis of this paper aims at shedding some light on the topic, providing insights and helping to develop intuition.
ERAF: A R Package for Regression and Forecasting
Biological and Artificial Intelligence Environments, 2005
We present a package for R language containing a set of tools for regression using ensembles of learning machines and for time series forecasting. The package contains implementations of Bagging and Adaboost for regression, and algorithms for computing mutual information, autocorrelation and false nearest neighbors.
COBRA: A combined regression strategy
Journal of Multivariate Analysis, 2015
A new method for combining several initial estimators of the regression function is introduced. Instead of building a linear or convex optimized combination over a collection of basic estimators r1, …, rM, we use them as a collective indicator of the proximity between the training data and a test observation. This local distance approach is model-free and very fast. More specifically, the resulting nonparametric/nonlinear combined estimator is shown to perform asymptotically at least as well, in the L2 sense, as the best combination of the basic estimators in the collective. A companion R package called COBRA (standing for COmBined Regression Alternative) is presented (downloadable at http://cran.r-project.org/web/packages/COBRA/index.html). Substantial numerical evidence is provided on both synthetic and real data sets to assess the excellent performance and speed of our method in a large variety of prediction problems.
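The "collective indicator of proximity" idea can be sketched in a few lines. This is a toy illustration of the combination rule, not the CRAN package's API: the prediction at a test point is the average of the training responses whose basic-estimator predictions all lie within a tolerance eps of the estimators' predictions at that test point.

```r
# Toy COBRA-style combination rule.
# pred_train: n x M matrix of M basic estimators evaluated on the n training points
# pred_test:  length-M vector of the same estimators' predictions at one test point
# y_train:    length-n vector of training responses
cobra_predict <- function(pred_train, pred_test, y_train, eps = 0.5) {
  # A training point "agrees" if every estimator predicts it within eps
  # of what that estimator predicts at the test point
  agree <- apply(abs(sweep(pred_train, 2, pred_test)) <= eps, 1, all)
  mean(y_train[agree])  # average the responses of agreeing training points
}
```

Note that the basic estimators are never recombined with weights; they only define which training observations are "close" to the test point, which is why the approach is model-free.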
Introducing Prior Information into the Forward Search for Regression
Topics on Methodological and Applied Statistical Inference, 2016
The forward search provides a flexible and informative form of robust regression. We describe the introduction of prior information into the regression model used in the search through the device of fictitious observations. The extension to the forward search is not entirely straightforward, requiring weighted regression. Forward plots are used to exhibit the effect of correct and incorrect prior information on inferences.
FWDselect: An R Package for Variable Selection in Regression Models
In multiple regression models, when there is a large number (p) of explanatory variables which may or may not be relevant for predicting the response, it is useful to be able to reduce the model. To this end, it is necessary to determine the best subset of q (q ≤ p) predictors which will establish the model with the best prediction capacity. The FWDselect package introduces a new forward stepwise-based selection procedure to select the best model in different regression frameworks (parametric or nonparametric). The developed methodology, which can be applied equally to linear models, generalized linear models or generalized additive models, aims to provide solutions to the following two topics: i) selection of the best combination of q variables by using a step-by-step method; and, perhaps most importantly, ii) search for the number of covariates to be included in the model based on bootstrap resampling techniques. The software is illustrated using real and simulated data.
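For readers who want to try forward stepwise selection without installing anything, base R's `step()` from the stats package (not FWDselect itself) performs AIC-driven forward selection on a linear model. A minimal sketch, again using the built-in mtcars data as an illustrative example:

```r
# Forward stepwise selection by AIC with base R's step().
# Start from the intercept-only model and allow four candidate predictors.
null_model <- lm(mpg ~ 1, data = mtcars)
best <- step(null_model,
             scope     = ~ wt + hp + disp + qsec,  # upper scope of candidates
             direction = "forward",
             trace     = 0)                        # suppress the step-by-step log
formula(best)  # the selected model
```

Unlike FWDselect, `step()` compares nested models by AIC rather than choosing the number of covariates via bootstrap resampling, so it addresses topic i) above but not topic ii).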
Multiple Regression: A Leisurely Primer
2001
Multiple regression is a useful statistical technique when the researcher is considering situations in which variables of interest are theorized to be multiply caused. It may also be useful in those situations in which the researcher is interested in studying the predictability of phenomena of interest. This paper provides an introduction to regression analysis, focusing on five major questions a novice user might ask. The presentation is set in the framework of the general linear model and builds on correlational theory. More advanced topics are introduced briefly, with suggested references for the reader who might wish to pursue the subject. (Contains 11 references.)