ERAF: An R Package for Regression and Forecasting
Related papers
Journal of Open Source Software
The metrica R package (Correndo et al., 2022) is open-source software designed to facilitate the quantitative and visual assessment of the prediction performance of point-forecast simulation models for continuous (regression) and categorical (classification) variables. The package bundles 80+ functions that account for multiple aspects of the agreement between predicted and observed values. Without requiring advanced programming skills, metrica enables users to automate the estimation of multiple prediction performance metrics, including goodness of fit, error metrics, error decomposition, model efficiency, and indices of agreement, and to produce stylish data visualization outputs. This article introduces metrica, an R package developed with the main objective of contributing to transparent and reproducible evaluation of point-forecast model performance.
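To illustrate the kinds of agreement metrics such a package computes (root mean square error, mean absolute error, and a model-efficiency index), here is a minimal Python sketch; the function name and the selection of metrics are our own illustration, not metrica's API:

```python
import math

def regression_metrics(obs, pred):
    """Compute a few common point-forecast agreement metrics.

    Illustrative sketch only; not the metrica package's implementation."""
    n = len(obs)
    mean_obs = sum(obs) / n
    errors = [p - o for o, p in zip(obs, pred)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    # Nash-Sutcliffe model efficiency: 1 is perfect agreement,
    # 0 means no better than predicting the observed mean.
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    nse = 1 - ss_res / ss_tot
    return {"RMSE": rmse, "MAE": mae, "NSE": nse}
```

For example, a forecast that overshoots every observation by exactly 1 has RMSE and MAE of 1, while its efficiency depends on how variable the observations are.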
Time series prediction with ensemble models
2004 IEEE International Joint Conference on Neural Networks (IEEE Cat. No.04CH37541)
We describe the use of ensemble methods to build proper models for time series prediction. Our approach extends the classical ensemble methods for neural networks by using several different model architectures. We further suggest an iterated prediction procedure to select the final ensemble members.
An evaluation of neural network ensembles and model selection for time series prediction
2010
Ensemble methods represent an approach to combining a set of models, each capable of solving a given task, into a composite global model whose accuracy and robustness exceed those of the individual models. Ensembles of neural networks have traditionally been applied to machine learning and pattern recognition but more recently have been applied to forecasting of time series data. Several methods have been developed to produce neural network ensembles, ranging from taking a simple average of individual model outputs to more complex methods such as bagging and boosting. Which ensemble method is best, what factors affect ensemble performance, under what data conditions ensembles are most useful, and when it is beneficial to use ensembles over model selection are questions which remain unanswered. In this paper we present some initial findings using neural network ensembles based on the mean and median, applied to forecasting synthetic time series data. We vary factors such as the number of models included in the ensemble and how the models are selected, whether randomly or based on performance. We compare the performance of different ensembles to model selection and present the results.
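The mean- and median-based combination studied in the paper can be sketched in a few lines; this is a simplified Python illustration of the combination step only, not the authors' neural-network code:

```python
import statistics

def combine_forecasts(member_forecasts, method="mean"):
    """Combine per-model forecasts point by point.

    member_forecasts: list of forecast sequences, one per ensemble member.
    method: "mean" or "median" combination, as in the paper's comparison.
    """
    combiner = statistics.mean if method == "mean" else statistics.median
    # Zip aligns the members' predictions at each forecast step.
    return [combiner(step) for step in zip(*member_forecasts)]
```

The median is more robust to a single wildly wrong member, while the mean uses information from every member equally; this trade-off is exactly what such comparisons probe.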
Ensemble Methods in Data Mining: Improving Accuracy Through Combining Predictions (book)
Ensemble methods have been called the most influential development in data mining and machine learning in the past decade. They combine multiple models into one that is usually more accurate than the best of its components. Ensembles can provide a critical boost to industrial challenges, from investment timing to drug discovery and from fraud detection to recommendation systems, where predictive accuracy is more vital than model interpretability.
An Extensible Ensemble Environment for Time Series Forecasting
Proceedings of the 12th International Conference on Enterprise Information Systems, 2010
Diverse works have demonstrated that ensembles can improve performance over any individual solution for time series forecasting. This work presents an extensible environment that can be used to create, experiment with, and analyse ensembles for time series forecasting. Usually, the analyst develops the individual solutions and the ensemble algorithms for each experiment; the proposed environment intends to provide a flexible tool for the analyst to include, configure, and experiment with individual solutions and to build and execute ensembles. In this paper, we describe the environment and its features and present a simple experiment illustrating its usage.
2020
Machine learning techniques aim to reduce the generalized prediction error. Ensemble methods offer a good approach to this goal, combining several models to achieve greater forecasting capacity. Random Machines have already been demonstrated to be a strong technique, with high predictive power, for classification tasks; in this article we propose a procedure applying the bagged-weighted support vector model to regression problems. Simulation studies were conducted on artificial datasets and on real-data benchmarks. The results show the good performance of Regression Random Machines, with lower generalization error and without the need to choose the best kernel function during the tuning process.
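The bagged-weighted idea for regression (bootstrap resampling plus performance-based model weights) can be sketched as follows. This is an illustration under our own assumptions, with a pluggable base learner; it is not the authors' exact Random Machines procedure, which uses support vector base learners with sampled kernels:

```python
import random

def bagged_weighted_regression(X, y, fit, n_models=10, seed=0):
    """Bagging with performance-based model weights (illustrative sketch).

    `fit(X, y)` must train a base learner and return a predict function.
    """
    rng = random.Random(seed)
    n = len(X)
    models, weights = [], []
    for _ in range(n_models):
        # Draw a bootstrap sample of the training data.
        idx = [rng.randrange(n) for _ in range(n)]
        model = fit([X[i] for i in idx], [y[i] for i in idx])
        # Weight each model by its inverse training MSE
        # (the small epsilon avoids division by zero).
        mse = sum((model(x) - t) ** 2 for x, t in zip(X, y)) / n
        models.append(model)
        weights.append(1.0 / (mse + 1e-9))
    total = sum(weights)
    weights = [w / total for w in weights]

    def predict(x):
        # Weighted average of the ensemble members' predictions.
        return sum(w * m(x) for w, m in zip(weights, models))

    return predict
```

Because better-fitting members receive larger weights, the combined predictor leans toward the bootstrap models that generalized best on the full training set.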
HELIX
The aim of this work is to develop a plug-in, named Time Series Analysis and Forecasting (TSAF), and incorporate it into the R language. The intent behind this plug-in is to establish a first-rate approach for producing in-advance extrapolations of time series data and for making accurate decisions methodically. The plug-in provides a computationally intelligent environment that accepts preprocessed time series datasets as input and estimates the direction the outputs will take over coming periods. The internal code structure and implementation details between the input and output stages are built on general machine learning, statistical calculation, and visualization packages. The approach is verified on time series datasets archived in UCI repositories. The results obtained from these datasets speak to the qualitative nature of forecasting, helping users to predict or foresee changing domain trends and thereby make strategic decisions and gain lasting advantages.
adabag: an R package for classification with boosting and bagging
Boosting and bagging are two widely used ensemble methods for classification. Their common goal is to improve the accuracy of a classifier by combining single classifiers that are only slightly better than random guessing. Among the family of boosting algorithms, AdaBoost (adaptive boosting) is the best known, although it is suitable only for dichotomous tasks. AdaBoost.M1 and SAMME (stagewise additive modeling using a multi-class exponential loss function) are two easy and natural extensions to the general case of two or more classes. In this paper, the adabag R package is introduced. This version implements the AdaBoost.M1, SAMME, and bagging algorithms with classification trees as base classifiers. Once the ensembles have been trained, they can be used to predict the class of new samples. The accuracy of these classifiers can be estimated on a separate data set or through cross-validation. Moreover, the evolution of the error as the ensemble grows can be analysed, and the ensemble can be pruned. In addition, the margin in the class prediction and the probability of each class for the observations can be calculated. Finally, several classic examples from the classification literature are shown to illustrate the use of this package.
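The core of AdaBoost.M1 is its observation-reweighting step: after each base classifier is trained, misclassified observations are upweighted so the next classifier focuses on them. A minimal Python sketch of one such round follows; the function name and interface are our own illustration, not adabag's implementation:

```python
import math

def adaboost_m1_round(weights, predictions, labels):
    """One AdaBoost.M1 reweighting round (illustrative sketch).

    weights: current observation weights, summing to 1.
    Returns (alpha, new_weights), where alpha is the trained
    classifier's vote weight. Requires weighted error in (0, 0.5),
    as AdaBoost.M1 assumes for each base classifier.
    """
    err = sum(w for w, p, t in zip(weights, predictions, labels) if p != t)
    alpha = math.log((1 - err) / err)  # vote weight of this classifier
    # Upweight misclassified observations, then renormalize to sum to 1.
    new = [w * math.exp(alpha) if p != t else w
           for w, p, t in zip(weights, predictions, labels)]
    total = sum(new)
    return alpha, [w / total for w in new]
```

With four equally weighted observations and one mistake, the weighted error is 0.25, the classifier's vote weight is log(3), and the misclassified observation's weight grows to half of the total.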