Boosting for Mineral Prospectivity Modeling: A New GIS Toolbox
Natural Resources Research
With an increasing demand for raw materials, predictive models that support successful mineral exploration targeting are of great importance. We evaluated different machine learning techniques with an emphasis on boosting algorithms and implemented them in an ArcGIS toolbox. The algorithms were tested on an exploration dataset from the Iberian Pyrite Belt (IPB) with respect to accuracy, performance, stability, and robustness. Boosting algorithms are ensemble methods used in supervised learning for regression and classification. They combine weak classifiers, i.e., classifiers that perform only slightly better than random guessing, to obtain robust classifiers. Each time a weak learner is added, the learning set is reweighted to give more importance to misclassified samples. Our test area, the IPB, is one of the oldest mining districts in the world and hosts giant volcanic-hosted massive sulfide (VMS) deposits. The spatial density of ore deposits, as well as their size and tonnage, makes the area unique and, owing to the high data availability and the number of known deposits, well suited for testing machine learning algorithms. We combined several geophysical datasets, as well as layers derived from geological maps, as predictors of the presence or absence of VMS deposits. Boosting algorithms such as BrownBoost and AdaBoost were tested and compared to Logistic Regression (LR), Random Forests (RF), and Support Vector Machines (SVM) in several experiments. Performance was relatively similar across methods; BrownBoost slightly outperformed LR and SVM, with an accuracy of 0.96 compared to 0.89 and 0.93, respectively. Data augmentation by perturbing deposit locations led to a 7% improvement in results. Varying the split ratio between training and test data reduced prediction accuracy, with relative stability down to a critical point of around 26 training samples out of 130 total samples. 
When fewer training samples were used, accuracy dropped significantly. In comparison with other machine learning methods, AdaBoost is user-friendly owing to its relatively short training and prediction times, its low likelihood of overfitting, and its small number of hyperparameters to optimize. Boosting algorithms gave high predictive accuracies, making them a potential data-driven alternative for regional-scale and/or brownfields mineral exploration.
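
The reweighting idea described above (weak learners added one at a time, with misclassified samples gaining weight) can be illustrated with a minimal sketch of discrete AdaBoost using one-dimensional decision stumps. This is not the toolbox's implementation and does not cover BrownBoost; the function names (`train_stump`, `stump_predict`, `adaboost`, `predict`) and the ten-point toy dataset are assumptions made purely for illustration and have nothing to do with the IPB data.

```python
import math

def train_stump(xs, ys, weights):
    """Find the 1-D threshold stump with the lowest weighted error.
    A stump predicts `polarity` when x >= threshold, else -polarity."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            preds = [polarity if x >= threshold else -polarity for x in xs]
            err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
            if best is None or err < best[2]:
                best = (threshold, polarity, err)
    return best

def stump_predict(stump, x):
    threshold, polarity, _ = stump
    return polarity if x >= threshold else -polarity

def adaboost(xs, ys, n_rounds):
    """Discrete AdaBoost: returns a list of (alpha, stump) pairs."""
    n = len(xs)
    weights = [1.0 / n] * n  # start with uniform sample weights
    ensemble = []
    for _ in range(n_rounds):
        stump = train_stump(xs, ys, weights)
        # Clamp the error to avoid division by zero or log(0).
        err = min(max(stump[2], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)  # vote of this weak learner
        ensemble.append((alpha, stump))
        # Reweight: misclassified samples (y * h(x) = -1) gain weight.
        weights = [w * math.exp(-alpha * y * stump_predict(stump, x))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all weak learners."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical toy data: no single threshold separates the two classes.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

ensemble = adaboost(xs, ys, n_rounds=3)
first_stump_errors = sum(stump_predict(ensemble[0][1], x) != y
                         for x, y in zip(xs, ys))
ensemble_errors = sum(predict(ensemble, x) != y for x, y in zip(xs, ys))
print(first_stump_errors, ensemble_errors)
```

On this toy set the first stump alone misclassifies three of the ten points, while the weighted vote of three stumps classifies all ten correctly, which is the sense in which weak classifiers are combined into a stronger one. Training accuracy on a toy set says nothing about the accuracies reported for the IPB experiments.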