Marius Kloft - Academia.edu

Papers by Marius Kloft

Performance Analysis of Some Machine Learning Algorithms for Regression Under Varying Spatial Autocorrelation

Machine learning is a computational technology widely used in regression and classification tasks. One drawback of its use in the analysis of spatial variables is that machine learning algorithms are, in general, not designed to deal with spatially autocorrelated data. This often causes the residuals to exhibit clustering, in clear violation of the assumption of independent and identically distributed random variables. In this work we analyze the performance of some well-established machine learning algorithms and one spatial algorithm in regression tasks where the data present varying degrees of clustering. We define "performance" as the goodness of fit achieved by an algorithm in conjunction with the degree of spatial association of its residuals. We generated a set of synthetic datasets with varying degrees of clustering and built regression models with synthetic autocorrelated explanatory variables and regression coefficients. We then fitted these regression models with the chosen algorithms. We identified significant differences between the machine learning algorithms in their sensitivity to spatial autocorrelation and in the goodness of fit achieved. We also found that the machine learning algorithms outperformed generalized least squares in both goodness of fit and residual spatial autocorrelation. Our findings can be useful in choosing the best regression algorithm for the analysis of spatial variables.
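The evaluation idea in the abstract, goodness of fit paired with the spatial association of the residuals, can be illustrated with a minimal sketch. Everything concrete here is an assumption and not taken from the paper: the regular grid, the rook-contiguity weights, the simultaneous-autoregressive generator for the synthetic fields, Moran's I as the measure of spatial association, and ordinary least squares as a stand-in for the algorithms compared. The function names (`rook_weights`, `autocorrelated_field`, `morans_i`) are hypothetical.

```python
import numpy as np


def rook_weights(n):
    """Row-standardised rook-contiguity weights for an n x n grid (assumed layout)."""
    size = n * n
    W = np.zeros((size, size))
    for r in range(n):
        for c in range(n):
            i = r * n + c
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n and 0 <= cc < n:
                    W[i, rr * n + cc] = 1.0
    # Each row sums to 1 after standardisation.
    return W / W.sum(axis=1, keepdims=True)


def autocorrelated_field(W, rho, rng):
    """Draw a field z solving z = rho * W z + eps (assumed generator, |rho| < 1)."""
    size = W.shape[0]
    eps = rng.standard_normal(size)
    return np.linalg.solve(np.eye(size) - rho * W, eps)


def morans_i(values, W):
    """Moran's I of `values` under weights W; positive values indicate clustering."""
    z = values - values.mean()
    return (len(z) / W.sum()) * (z @ W @ z) / (z @ z)


rng = np.random.default_rng(0)
W = rook_weights(15)

# Synthetic autocorrelated explanatory variable and autocorrelated error term.
x = autocorrelated_field(W, 0.9, rng)
noise = autocorrelated_field(W, 0.9, rng)
y = 1.0 + 2.0 * x + noise

# Ordinary least squares as a stand-in for any regression algorithm.
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta

# The two quantities combined into "performance": fit and residual clustering.
r2 = 1.0 - (residuals @ residuals) / ((y - y.mean()) @ (y - y.mean()))
residual_i = morans_i(residuals, W)
print(f"R^2 = {r2:.3f}, Moran's I of residuals = {residual_i:.3f}")
```

A model that ignores the spatial structure of the error term can fit reasonably well and still leave clustered residuals, which is why the two numbers are read together. Varying `rho` changes the degree of clustering in the synthetic data.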
