Statistical Modeling Approaches for PM10 Prediction in Urban Areas; A Review of 21st-Century Studies
Related papers
An Application of Machine Learning Methods to PM10 Level Medium-Term Prediction
Lecture Notes in Computer Science, 2007
The study described in this paper analyzed the principal causes of urban and suburban air pollution and identified the best subset of features (meteorological data and air pollutant concentrations) for predicting the medium-term concentration of each air pollutant, PM10 in particular. An information-theoretic approach to feature selection was applied to determine the best subset of features by means of a backward selection algorithm. The final aim of the research is to implement a prognostic tool able to reduce the risk of air pollutant concentrations exceeding the alarm thresholds fixed by law. The tool will be implemented using machine learning methods based on some of the most widespread data-driven statistical techniques: Artificial Neural Networks (ANN) and Support Vector Machines (SVM).
Spatial prediction of PM10 concentration using machine learning algorithms in Ankara, Turkey
Environmental Pollution, 2020
With growing population and industrialization, air pollution has become a global problem. Air pollutant parameters should therefore be measured at regular intervals, and the necessary measures taken on the basis of those measurements. To prevent air pollution, pollutant parameters must be evaluated within the framework of a model. Recent studies have used machine learning algorithms to obtain more objective and more sensitive results. In this study, PM10 concentrations obtained from 7 stations in Ankara province, Turkey, were used to train machine learning algorithms (LASSO, SVR, RF, kNN, xGBoost, ANN). The PM10 concentrations of 6 stations for the years 2009–2017 were given as input, and the PM10 concentrations of the seventh station for the year 2018 were predicted. The model development stage was repeated for each station, and the performance and error rates of the algorithms were determined by comparing their outputs with the actual values. ANN gave the best results (R² = 0.58, RMSE = 20.8, MAE = 14.4). The spatial distribution of the estimated concentrations was mapped with a Geographic Information System (GIS), and spatial strategies for mitigating air pollution in relation to land use were established.
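The three scores quoted in this abstract (R², RMSE, MAE) can be illustrated with a minimal sketch. It assumes R² means the coefficient of determination; the abstract does not say whether the authors used that or the squared correlation coefficient. The function name and the sample values are hypothetical and are not taken from the study.

```python
import math


def regression_metrics(observed, predicted):
    """Return (r2, rmse, mae) for paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    # Mean absolute error and root mean squared error.
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    # Coefficient of determination: 1 - residual / total sum of squares.
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return r2, rmse, mae


# Hypothetical daily PM10 values, for illustration only.
observed = [40.0, 55.0, 70.0, 35.0]
predicted = [45.0, 50.0, 60.0, 40.0]
r2, rmse, mae = regression_metrics(observed, predicted)
print(f"R2={r2:.3f} RMSE={rmse:.3f} MAE={mae:.3f}")
```

RMSE and MAE are in the units of the concentration itself, so they are only comparable between studies that report PM10 in the same units and averaging period.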
Forecasting PM10 in metropolitan areas: Efficacy of neural networks
Environmental Pollution, 2012
Deterministic photochemical air quality models are commonly used for regulatory management and planning of urban airsheds. These models are complex, computer intensive, and hence are prohibitively expensive for routine air quality predictions. Stochastic methods are becoming increasingly popular as an alternative, which relegate decision making to artificial intelligence based on Neural Networks that are made of artificial neurons or 'nodes' capable of 'learning through training' via historic data. A Neural Network was used to predict particulate matter concentration at a regulatory monitoring site in Phoenix, Arizona; its development, efficacy as a predictive tool and performance vis-à-vis a commonly used regulatory photochemical model are described in this paper. It is concluded that Neural Networks are much easier, quicker and economical to implement without compromising the accuracy of predictions. Neural Networks can be used to develop rapid air quality warning systems based on a network of automated monitoring stations.
Environmental and Ecological Statistics, 2016
Over the past years, the health impact of airborne particulate matter PM10 has become a very topical subject. Consequently, much research effort in the environmental sciences goes towards modeling and predicting ambient PM10 concentrations. In this paper, we are interested in the statistical classification of the daily mean PM10 concentration in Tunisia according to the authority's regulation. We consider two monitoring stations: a large industrial station and a traffic station. The main goal of this work is to determine the pertinent predictors of PM10 concentration within a nonlinear multiclass framework. To do this, we used two popular statistical learning methods: support vector machines (SVM) and random forests (RF). The results obtained on the real datasets show that RF outperforms SVM for variable selection, even when the number of observations is small compared to the number of explanatory variables. It was also demonstrated that the PM10 concentration measured on the previous day is the most relevant predictor of its present-day value. Moreover, we found that more delayed values of PM10 concentration may be crucial for accurate prediction.
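The lagged predictors mentioned in this abstract (yesterday's PM10 and more delayed values) can be sketched as a simple feature table. This is an illustration only: the helper name `make_lagged_rows`, the choice of two lags, and the sample series are assumptions, not details from the paper.

```python
def make_lagged_rows(series, max_lag):
    """Pair each value with its previous `max_lag` values.

    Returns a list of (lags, target) tuples, where lags[0] is the value
    one step back, lags[1] two steps back, and so on.
    """
    if max_lag < 1:
        raise ValueError("max_lag must be at least 1")
    rows = []
    for t in range(max_lag, len(series)):
        lags = [series[t - k] for k in range(1, max_lag + 1)]
        rows.append((lags, series[t]))
    return rows


# Hypothetical daily mean PM10 series, for illustration only.
daily_pm10 = [10, 20, 30, 40, 50]
rows = make_lagged_rows(daily_pm10, max_lag=2)
for lags, target in rows:
    print(lags, "->", target)
```

Rows of this form could then be passed to any classifier or regressor; the first `max_lag` days are dropped because they lack a full history.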
Statistical PM2.5 Prediction in an Urban Area Using Vertical Meteorological Factors
Atmosphere
A key concern related to particulate air pollution is the development of an early warning system that can predict local PM2.5 levels and excessive PM2.5 concentration episodes using vertical meteorological factors. Machine learning (ML) algorithms, particularly those with recognition tasks, show great potential for this purpose. The objective of this study was to compare the performance of multiple linear regression (MLR) and multilayer perceptron (MLP) in predicting PM2.5 levels. The software was trained to predict PM2.5 levels up to 7 days in advance using data from long-term measurements of vertical meteorological factors taken at five heights above ground level (AGL)—10, 30, 50, 75, and 110 m—and PM2.5 concentrations measured 30 m AGL. The data used were collected between 2015 and 2020 at the Microclimate and Air Pollutants Monitoring Tower station at Kasetsart University, Bangkok, Thailand. The results showed that the correlation coefficients of PM2.5 predicted and observed usi...
Environmental Science and Pollution Research, 2011
In the present work, two types of artificial neural network (NN) models using the multilayer perceptron (MLP) and the radial basis function (RBF) techniques, as well as a model based on principal component regression analysis (PCRA), are employed to forecast hourly PM10 concentrations in four urban areas (Larnaca, Limassol, Nicosia and Paphos) in Cyprus. The model development is based on a variety of meteorological and pollutant parameters corresponding to the 2-year period between July 2006 and June 2008, and the model evaluation is achieved through the use of a series of well-established evaluation instruments and methodologies. The evaluation reveals that the MLP NN models display the best forecasting performance, with R² values ranging between 0.65 and 0.76, whereas the RBF NNs and the PCRA models reveal a rather weak performance, with R² values of 0.37–0.43 and 0.33–0.38, respectively. The derived MLP models are also used to forecast Saharan dust episodes with remarkable success (probability of detection ranging between 0.68 and 0.71). On the whole, the analysis shows that the models introduced here could provide local authorities with reliable and precise predictions and alarms about air quality if used on an operational basis.
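The probability of detection quoted for the dust episodes is commonly defined as hits divided by the sum of hits and misses. The sketch below assumes that standard definition; the abstract does not spell out the formula, and the event flags are hypothetical.

```python
def probability_of_detection(observed_events, forecast_events):
    """Fraction of observed events that were also forecast.

    Both arguments are equal-length sequences of booleans, one per period.
    """
    if len(observed_events) != len(forecast_events):
        raise ValueError("inputs must have equal length")
    pairs = list(zip(observed_events, forecast_events))
    hits = sum(1 for obs, fc in pairs if obs and fc)
    misses = sum(1 for obs, fc in pairs if obs and not fc)
    if hits + misses == 0:
        raise ValueError("no observed events, POD is undefined")
    return hits / (hits + misses)


# Hypothetical episode flags (True = episode), for illustration only.
observed_events = [True, True, False, True, False]
forecast_events = [True, False, False, True, True]
pod = probability_of_detection(observed_events, forecast_events)
print(f"POD = {pod:.2f}")
```

Note that this score ignores false alarms (the last period in the example), so it is usually read alongside a false alarm measure.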
2008 19th International Conference on Systems Engineering, 2008
The aim of the study was to examine the possibility of developing a prognostic instrument for air quality management in cities. The study focused on developing neural network models for predicting the air quality class in relation to the maximum daily PM10 dust concentration. The air quality class was predicted for the next day in relation to maximal daily concentrations. MLP and RBF models were tested. The tests were carried out in the city of Lodz in central Poland. The results of the modelling were satisfactory. In the optimally constructed models, false prognoses of the maximal daily concentration amounted to only 7.4% in the test series and 2.7% in the training series. The low prediction error confirms that neural network models are an effective instrument for air quality management in cities.
Atmospheric Environment, 2000
Hourly average concentrations of PM have been measured at a fixed point in the downtown area of Santiago, Chile. We have focused our attention on data for the months that register higher values, from May to September, in the years 1994 and 1995. We show that it is possible to predict concentrations at any hour of the day by fitting a function of the 24 hourly average concentrations measured on the previous day. We have compared the predictions produced by three different methods: multilayer neural networks, linear regression and persistence. Overall, the neural network gives the best results. Prediction errors go from 30% for early hours to 60% for late hours. In order to improve predictions, the effect of noise reduction, rearrangement of the data and explicit consideration of meteorological variables are discussed.
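The persistence method named in this abstract can be sketched as a baseline, assuming it means that each hour of the target day is predicted by the same hour of the previous day. The abstract does not define the method or its error measure, so both the function names and the mean relative error used here are assumptions, and the data are hypothetical.

```python
def persistence_forecast(previous_day):
    """Predict each of the 24 hours as the same hour on the previous day."""
    if len(previous_day) != 24:
        raise ValueError("expected 24 hourly values")
    return list(previous_day)


def mean_relative_error(observed, predicted):
    """Average of |observed - predicted| / observed over all hours."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) / o for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical hourly concentrations, for illustration only.
previous_day = [50.0] * 24
target_day = [40.0] * 12 + [50.0] * 12
forecast = persistence_forecast(previous_day)
error = mean_relative_error(target_day, forecast)
print(f"mean relative error = {error:.3f}")
```

A baseline of this kind gives the neural network and linear regression results a reference point: a model is only useful if it beats simply repeating the previous day.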
ISPRS International Journal of Geo-Information
Environmental pollution has mainly been attributed to urbanization and industrial developments across the globe. Air pollution has been marked as one of the major problems of metropolitan areas around the world, especially in Tehran, the capital of Iran, where administrators and residents have long been struggling with its damage, such as the health issues of citizens. In the study area of this research, a considerable proportion of Tehran's air pollution is attributed to PM10 and PM2.5 pollutants. Therefore, the present study was conducted to identify prediction models for air pollution based on PM10 and PM2.5 concentrations in Tehran. To predict air pollution, machine learning methods were used with the following input parameters: day of week, month of year, topography, meteorology, and the pollutant rate of the two nearest neighbors. These methods include a regression support vector machine, geographically weighte...
Atmosphere
The PM10 concentration is subject to significant changes brought on by both gaseous and meteorological variables. The aim of this research was to explore the performance of a hybrid model combining the support vector machine (SVM) and the boosted regression trees (BRT) technique in predicting the PM10 concentration for 3 consecutive days. The BRT model was trained by utilizing maximum daily data in the cities of Alor Setar, Klang, and Kuching from the years 2002 to 2017. The SVM–BRT model can optimize the number of predictors and predict PM10 concentration; it was shown to be capable of predicting air pollution based on the models’ performance with NAE (0.15–0.33), RMSE (10.46–32.60), R2 (0.33–0.70), IA (0.59–0.91), and PA (0.50–0.84). This was accomplished while saving training time by reducing the feature size given in the data representation and preventing learning from noise (overfitting) to improve accuracy. This knowledge establishes the foundation for the development of effic...