Machine learning algorithms for predicting air pollutants

An Application of Machine Learning Methods to PM10 Level Medium-Term Prediction

Lecture Notes in Computer Science, 2007

The study described in this paper analyzed the principal causes of urban and suburban air pollution and identified the best subset of features (meteorological data and air pollutant concentrations) for predicting the medium-term concentration of each air pollutant, PM10 in particular. An information-theoretic approach to feature selection was applied, using a backward selection algorithm to determine the best subset of features. The final aim of the research is to implement a prognostic tool that can reduce the risk of air pollutant concentrations exceeding the alarm thresholds set by law. The tool will be implemented using machine learning methods based on some of the most widespread statistical data-driven techniques (Artificial Neural Networks, ANN, and Support Vector Machines, SVM).
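
The abstract does not specify the selection algorithm, so the following is only a minimal sketch of one way backward selection with an information-theoretic score could work. It assumes features and target have already been discretised, scores a feature subset by its joint mutual information with the target, and greedily drops the feature whose removal costs the least information. The function names `mutual_information` and `backward_select` are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information I(X;Y) in bits for two equal-length sequences of discrete values."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # p(x,y) * log2(p(x,y) / (p(x) p(y))), written with raw counts
    return sum((c / n) * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def backward_select(rows, target, n_keep):
    """Greedy backward elimination (assumed variant, not the paper's exact method).

    rows: one list of discretised feature values per sample.
    Repeatedly drops the feature whose removal leaves the highest joint
    mutual information between the remaining features and the target.
    Returns the indices of the kept features.
    """
    kept = list(range(len(rows[0])))
    while len(kept) > n_keep:
        def score_without(feature):
            subset = [i for i in kept if i != feature]
            joint = [tuple(row[i] for i in subset) for row in rows]
            return mutual_information(joint, target)

        kept.remove(max(kept, key=score_without))
    return kept
```

On small samples the joint estimate is biased upward as the subset grows, so a practical version would need far more data per feature combination, or a different estimator, than this sketch suggests.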

Predictive Analysis of Air Pollution Using Machine Learning Techniques

Air pollution is a major concern for all living things. India has one of the world's highest levels of air pollution. Rising population, unplanned growth, increased automotive traffic, stubble burning, industrial waste, fossil fuel combustion, power plant emissions and a variety of other causes all contribute considerably to air pollution in developing countries. Particulate matter (PM) 2.5 is the most concerning of all air pollutants because it causes major health problems. Prediction and management of air quality have therefore become critical. Several machine learning algorithms were applied to the dataset in this work. The results suggest that logistic regression and autoregression can be used effectively to analyse and forecast PM2.5 levels. By reducing air pollution levels, countries can lower the prevalence of strokes, lung cancer, and chronic and acute respiratory illnesses such as asthma.
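
The abstract names autoregression without giving a model order or fitting method. As an illustration only, the sketch below assumes a first-order model, y[t] = c + phi * y[t-1], fitted by ordinary least squares; `fit_ar1` and `forecast` are hypothetical names, and real PM2.5 series would usually need more lags and a check for seasonality.

```python
def fit_ar1(series):
    """Least-squares fit of y[t] = c + phi * y[t-1]; returns (c, phi)."""
    x, y = series[:-1], series[1:]  # predictor is the series shifted by one step
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx
    return mean_y - phi * mean_x, phi


def forecast(series, c, phi, steps):
    """Iterate the fitted recursion forward from the last observation."""
    predictions, last = [], series[-1]
    for _ in range(steps):
        last = c + phi * last
        predictions.append(last)
    return predictions
```

Calling `fit_ar1` on a list of daily concentrations and passing the result to `forecast` gives multi-step predictions that decay toward the long-run mean c / (1 - phi) when |phi| < 1.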

Machine learning algorithms in air quality modeling

Global Journal of Environmental Science and Management, 2019

Modern studies in environmental science and engineering show that deterministic models struggle to capture the relationship between the concentration of atmospheric pollutants and their emission sources. Recent advances in statistical modeling based on machine learning have emerged as a solution to these issues. The type of input variable largely affects the performance of an algorithm; however, it is not yet known why one algorithm is preferred over another for a given task. This work aims to highlight the underlying principles of machine learning techniques and their role in enhancing prediction performance. The study reviews the 38 most relevant studies in environmental science and engineering that applied machine learning techniques during the last 6 years. The review explores several aspects of these studies, such as: 1) the role of input predictors in improving prediction accuracy; 2) geographically...

Air Quality Prediction with Machine Learning

2019

In recent years, air quality has become a significant environmental health issue due to rapid urbanization and industrialization. Because of the impact air quality has on people's everyday lives, predicting it precisely has become an urgent and essential problem. Air quality prediction is challenging because it involves several complicated factors with dependencies among them. We target our air quality prediction study at the city of Trondheim, Norway. The air quality in Trondheim is on average at a healthy level, but there are periods of high variation and severe pollution, especially in the winter months. The study demonstrates the benefits of machine learning for predicting the general pattern of air pollutants and for foreseeing sudden spikes in pollution level. This paper explores a multivariate time series approach to modeling and forecasting PM2.5, PM10, and NO2 pollution at three air quality stations. This study is concerned with combining data of pollutants, meteoro...
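
A common way to feed a multivariate time series to a standard regressor is to turn it into supervised samples with a sliding window. The sketch below shows that general idea only; the window length, forecast horizon and column layout are assumptions, not details taken from the study, and `make_lagged_samples` is a hypothetical helper.

```python
def make_lagged_samples(rows, target_index, n_lags, horizon=1):
    """Turn a multivariate series into (features, target) pairs.

    rows: one list of observations per time step, ordered in time,
          e.g. [pm25, pm10, no2, temperature] (column layout is assumed).
    Features are the flattened last n_lags rows; the target is column
    target_index observed `horizon` steps after the window ends.
    """
    features, targets = [], []
    for end in range(n_lags, len(rows) - horizon + 1):
        window = rows[end - n_lags:end]
        features.append([value for row in window for value in row])
        targets.append(rows[end + horizon - 1][target_index])
    return features, targets
```

When evaluating a model built on such samples, the train/test split should follow time order so that no window in the training set overlaps observations from the test period.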

A Novel Method for Improving Air Pollution Prediction Based on Machine Learning Approaches: A Case Study Applied to the Capital City of Tehran

ISPRS International Journal of Geo-Information

Environmental pollution has mainly been attributed to urbanization and industrial development across the globe. Air pollution has been marked as one of the major problems of metropolitan areas around the world, especially in Tehran, the capital of Iran, where administrators and residents have long been struggling with its damage, such as health problems among citizens. In the study area of this research, a considerable proportion of Tehran's air pollution is attributed to PM10 and PM2.5. Therefore, the present study was conducted to determine prediction models for air pollution based on PM10 and PM2.5 concentrations in Tehran. To predict air pollution, machine learning methods were used with the following input parameters: day of week, month of year, topography, meteorology, and the pollutant rate of the two nearest neighbors. These methods include a regression support vector machine, geographically weighte...

Forecasting Air Pollution Particulate Matter (PM2.5) Using Machine Learning Regression Models

Elsevier, 2020

Over the past few decades, urbanization and industrialization have expanded in developed nations, which now confront a severe air contamination problem. Citizens and governments have expressed increasing concern about the impact of air pollution on human health and have proposed sustainable development to overcome air pollution issues worldwide. Modern industrialization releases liquid droplets, solid particles and gas molecules that spread through the atmosphere. Heavy concentrations of particulate matter, PM10 and PM2.5, cause serious adverse health effects. Determining the particulate matter concentration in atmospheric air is therefore of primary importance for human well-being. In this paper, machine learning predictive models for forecasting particulate matter concentration in atmospheric air are investigated on Taiwan Air Quality Monitoring data sets obtained from 2012 to 2017. These models were compared with existing traditional models and showed better predictive performance. The performance of the models was evaluated with statistical measures: Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Mean Square Error (MSE), and Coefficient of Determination (R²).
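
For reference, the four measures named above can be computed from paired observations and predictions as in the sketch below. It uses the standard textbook definitions; the function name `regression_metrics` and the sample values in the usage note are illustrative, not taken from the paper.

```python
from math import sqrt


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R² for paired observed and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)           # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    mse = ss_res / n
    return {
        "MSE": mse,
        "RMSE": sqrt(mse),
        "MAE": sum(abs(e) for e in errors) / n,
        "R2": 1 - ss_res / ss_tot,
    }
```

For example, with made-up observations `[10, 20, 30, 40]` and predictions `[12, 18, 33, 39]`, the function gives MSE 4.5, MAE 2.0 and R² 0.964. RMSE and MAE are in the same units as the concentration, which makes them easier to compare against regulatory thresholds than MSE.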

Prediction and analysis of particulate matter (PM2.5 and PM10) concentrations using machine learning techniques

Journal of Ambient Intelligence and Humanized Computing, 2021

The National Capital Region (NCR) encircling the capital of India is one of the most polluted regions in the world. Poor air quality causes a number of diseases and reduces life span. Particulate matter (PM) is the most significant as well as the most hazardous air pollutant in this region. This work proposes to build models to analyze and forecast PM concentrations at a location in the NCR. The correlation between PM concentrations in different seasons and with meteorological parameters and other air pollutants is studied to determine the most suitable explanatory variables for building the forecast models. The performance of the proposed models is evaluated with the help of variable importance ranking (VIR), partial plots and measures such as mean error, absolute mean error and root mean square error.
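
The abstract does not say which correlation measure was used. Assuming the Pearson coefficient, a correlation-based screening of candidate explanatory variables might look like the sketch below; `pearson` and `rank_predictors` are hypothetical names, and the variable names in the usage note are invented for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def rank_predictors(candidates, target):
    """Sort candidate names by absolute correlation with the target, strongest first.

    candidates: dict mapping a variable name to its list of observations.
    """
    return sorted(
        candidates,
        key=lambda name: abs(pearson(candidates[name], target)),
        reverse=True,
    )
```

A call such as `rank_predictors({"wind_speed": [...], "humidity": [...]}, pm_values)` returns the names ordered by strength of linear association. Pearson correlation only captures linear relationships, so a variable with a low score may still be useful to a nonlinear model.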

Machine Learning-Based Prediction of Air Quality

Applied Sciences

Air, an essential natural resource, has been compromised in terms of quality by economic activities. Considerable research has been devoted to predicting instances of poor air quality, but most studies are limited by insufficient longitudinal data, making it difficult to account for seasonal and other factors. Several prediction models have been developed using an 11-year dataset collected by Taiwan’s Environmental Protection Administration (EPA). Machine learning methods, including adaptive boosting (AdaBoost), artificial neural network (ANN), random forest, stacking ensemble, and support vector machine (SVM), produce promising results for air quality index (AQI) level predictions. A series of experiments, using datasets for three different regions to obtain the best prediction performance from the stacking ensemble, AdaBoost, and random forest, found that the stacking ensemble delivers consistently superior performance for R² and RMSE, while AdaBoost provides the best results for MAE.
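
To illustrate the general idea behind a stacking ensemble, the sketch below trains two deliberately simple base learners, collects their out-of-fold predictions, and fits a meta-learner on those predictions. Everything here is an assumption made for illustration: the base learners (a one-variable linear fit and a mean predictor), the meta-learner (a single blend weight clipped to [0, 1]) and the fold scheme are not the configuration used in the study.

```python
def fit_line(xs, ys):
    """Base learner 1: ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_mean(xs, ys):
    """Base learner 2: always predicts the training mean."""
    mean = sum(ys) / len(ys)
    return lambda x: mean


def out_of_fold(fit, xs, ys, k):
    """Predict each sample with a model trained on the other k-1 folds."""
    preds = [0.0] * len(xs)
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in range(fold, len(xs), k):
            preds[i] = model(xs[i])
    return preds


def fit_stack(xs, ys, k=3):
    """Meta-learner: least-squares blend weight w fitted on out-of-fold predictions."""
    p_line = out_of_fold(fit_line, xs, ys, k)
    p_mean = out_of_fold(fit_mean, xs, ys, k)
    den = sum((a - b) ** 2 for a, b in zip(p_line, p_mean))
    w = sum((y - b) * (a - b) for y, a, b in zip(ys, p_line, p_mean)) / den if den else 0.5
    w = min(1.0, max(0.0, w))  # keep the blend convex
    line, mean = fit_line(xs, ys), fit_mean(xs, ys)  # refit base learners on all data
    return lambda x: w * line(x) + (1 - w) * mean(x)
```

Fitting the meta-learner on out-of-fold predictions, rather than on predictions for data the base learners have already seen, is what keeps it from simply rewarding the base learner that overfits most.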

Intelligent Forecasting of Air Quality and Pollution Prediction Using Machine Learning

Adsorption Science & Technology

Air pollution consists of harmful gases and fine particulate matter (PM2.5), which affect the quality of air. It has become not only a key issue in scientific research but also an important social issue in public life. Many experts and scholars at R&D organisations and universities, including abroad, are therefore engaged in research on PM2.5 pollutant prediction. In this scenario, the authors proposed various machine learning models, namely linear regression, random forest, KNN, ridge and lasso, XGBoost, and AdaBoost, to predict PM2.5 pollutants in polluted cities. The experiment was carried out using Jupyter Notebook in Python 3.7.3. With respect to the MAE, MAPE, and RMSE metrics, the XGBoost, AdaBoost, random forest, and KNN models (8.27, 0.40, and 13.85; 9.23, 0.45, and 10.59; 39.84, 1.94, and 54.59; and 49.13, 2.40, and 69.92, respectively) are observed to be the more reliable models. The PM2.5 pollutant concentration (PClow-PC...

Applicability of machine learning in modeling of atmospheric particle pollution in Bangladesh

Air Quality, Atmosphere & Health, 2020

Atmospheric particle pollution causes acute and chronic health effects. Predicting the concentrations of PM2.5 and PM10 is therefore a prerequisite for avoiding the consequences and mitigating the complications. This research used machine learning (ML) models, namely linear support vector machine (L-SVM), medium Gaussian support vector machine (M-SVM), Gaussian process regression (GPR), artificial neural network (ANN), and random forest regression (RFR), together with a time series model, PROPHET. Atmospheric NOx, SO2, CO, and O3, along with meteorological variables from Dhaka, Chattogram, Rajshahi, and Sylhet for the period 2013 to 2019, were used as exploratory variables. Results showed that GPR performed better overall, particularly for Dhaka, in predicting the concentration of both PM2.5 and PM10, while ANN performed best for Chattogram and Sylhet in predicting PM2.5. For predicting PM10, however, M-SVM and RFR were selected, respectively. This study therefore recommends "ensemble learning" models that combine several of the best models to advance the application of ML in predicting pollutant concentrations in Bangladesh.