An efficient spatiotemporal data calibration approach for the low-cost PM2.5 sensing network: A case study in Taiwan

Low-processing data enrichment and calibration for PM2.5 low-cost sensors

Thermal Science, 2023

Particulate matter (PM) in air has been proven to be hazardous to human health. Here we analyze PM data obtained from the same measurement campaign that was presented in our previous study. Multivariate linear and random forest models were used for the calibration and analysis. In our linear regression model, the inputs were PM, temperature, and humidity measured with low-cost sensors, and the target was the reference PM measurements obtained from SEPA over the same timeframe.
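A minimal sketch of the kind of multivariate linear calibration described above, assuming an ordinary least-squares fit with an intercept. The function names, the synthetic readings, and the coefficients used to generate the reference values are all hypothetical and are not taken from the study.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_calibration(pm_raw, temp, rh, pm_ref):
    """Least-squares fit of pm_ref ~ b0 + b1*pm_raw + b2*temp + b3*rh."""
    rows = [[1.0, p, t, h] for p, t, h in zip(pm_raw, temp, rh)]
    k = 4
    # Normal equations: (X^T X) b = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, pm_ref)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def apply_calibration(coefs, pm_raw, temp, rh):
    """Apply fitted coefficients to one low-cost sensor reading."""
    b0, b1, b2, b3 = coefs
    return b0 + b1 * pm_raw + b2 * temp + b3 * rh


# Synthetic low-cost readings (hypothetical values, for illustration only).
pm_raw = [10.0, 25.0, 40.0, 18.0, 60.0, 33.0]
temp = [5.0, 12.0, 20.0, 8.0, 15.0, 25.0]
rh = [80.0, 60.0, 45.0, 90.0, 55.0, 70.0]
# Reference values generated from known coefficients so the fit can be checked.
pm_ref = [2.0 + 0.6 * p + 0.1 * t - 0.05 * h for p, t, h in zip(pm_raw, temp, rh)]

coefs = fit_calibration(pm_raw, temp, rh, pm_ref)
corrected = apply_calibration(coefs, 30.0, 10.0, 65.0)
```

With real collocation data the fit would not be exact, and the coefficients would need to be validated on held-out measurements.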

Low-Cost Air Quality Sensor Evaluation and Calibration in Contrasting Aerosol Environments

The use of low-cost sensors (LCS) in air quality monitoring has been gaining interest across all walks of society, including community and citizen scientists, academic research groups, environmental agencies, and the private sector. Traditional air monitoring, performed by regulatory agencies, involves expensive regulatory-grade equipment and requires ongoing maintenance and quality control checks. The low price tag, minimal operating cost, ease of use, and open data access are the primary driving factors behind the popularity of LCS. This study discusses the role and associated challenges of PM2.5 sensors in monitoring air quality. We present the results of evaluations of the PurpleAir (PA) PA-II LCS against regulatory-grade PM2.5 federal equivalent methods (FEM) and the development of sensor calibration algorithms. The LCS calibration was performed for 2 to 4 weeks during December 2019-January 2020 in Raleigh, NC, and Delhi, India, to evaluate the data quality under different aerosol loadings and environmental conditions. This exercise aims to develop a robust calibration model that uses PA-measured parameters (i.e., PM2.5, temperature, relative humidity) as input and provides bias-corrected PM2.5 output at an hourly scale. Thus, the calibration model relies on simultaneous measurements of PM2.5 by FEM as target output during the calibration model development process. We applied various statistical and machine learning methods to achieve a regional calibration model. The results from our study indicate that, with proper calibration, we can achieve bias-corrected PM2.5 data using PA sensors within 12% mean absolute bias at the hourly scale and within 6% for daily averages. Our study also suggests that pre-deployment calibrations developed at local or regional scales should be performed for the PA sensors to correct data from the field for scientific data analysis.

1. Introduction

Air quality monitoring is critical for managing and mitigating air pollution at varying spatiotemporal scales. However, air quality monitoring is limited in many parts of the world (Martin et al., 2019), in part due to the high cost and technical expertise required to operate regulatory-grade monitors (RGM). Regulatory-grade continuous air quality monitors have high measurement accuracy under varying operating conditions. The high cost of RGM, their associated infrastructure needs, and their regular maintenance also limit the extensive deployment of such monitors in a region and the spatial density of the network. This is particularly true in developing countries. The lack of data affects critical decision-making by the public about their day-to-day activities and by regulatory agencies for controlling and mitigating air pollution in many regions.
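The abstract above reports percentage mean absolute bias at hourly and daily scales. The sketch below shows one way such a figure could be computed. The definition used here (mean absolute difference divided by the mean reference value) is an assumption, since the text does not spell out the formula, and the data are synthetic.

```python
def percent_mean_absolute_bias(predicted, reference):
    """Mean absolute difference as a percentage of the mean reference value.

    This definition is an assumption made for illustration.
    """
    n = len(reference)
    mean_abs_diff = sum(abs(p - r) for p, r in zip(predicted, reference)) / n
    return 100.0 * mean_abs_diff / (sum(reference) / n)


def daily_means(hourly, hours_per_day=24):
    """Average consecutive blocks of hourly values; incomplete days are dropped."""
    n_days = len(hourly) // hours_per_day
    return [
        sum(hourly[d * hours_per_day:(d + 1) * hours_per_day]) / hours_per_day
        for d in range(n_days)
    ]


# Two synthetic days: the reference is constant, while the calibrated sensor
# alternates between reading 5 units low and 5 units high.
reference_hourly = [50.0] * 48
calibrated_hourly = [45.0 if h % 2 == 0 else 55.0 for h in range(48)]

hourly_bias = percent_mean_absolute_bias(calibrated_hourly, reference_hourly)
daily_bias = percent_mean_absolute_bias(
    daily_means(calibrated_hourly), daily_means(reference_hourly)
)
```

In this constructed case the hourly errors cancel within each day, so the daily figure is lower than the hourly one. Real data would show a smaller but similar effect whenever hourly errors are not all of the same sign.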

Evaluation of Crowd-Sourced PM2.5 Measurements from Low-Cost Sensors for Air Quality Mapping in Stuttgart City

iCity. Transformative Research for the Livable, Intelligent, and Sustainable City, 2022

Exposure to particulate matter (PM) pollution poses a major risk to the environment and human health. Monitoring PM pollution is thus crucial to understand particle distribution and mitigation. The rapid development of low-cost PM sensors and advances in the Internet of Things (IoT) have led to the deployment of sensors by technology-aware people in cities. In this study, we evaluate the stability and accuracy of PM measurements from low-cost sensors crowd-sourced from a citizen science project in Stuttgart. Long-term measurements from the sensors show a strong correlation with measurements from reference stations, with most of the selected sensors achieving Pearson correlation coefficients of r > 0.7. We investigate the stability of the sensors for reproducibility of measurements using five sensors installed at different height levels and horizontal distances. They exhibit minor variations, with low coefficient of variation (CV) values of between 10 and 14%. A CV of 10% is recommended for low-cost sensors. In a dense network, the sensors enable the extraction of pollution patterns and trends. We analyse PM measurements from 2 years using space-time pattern analysis and generate two clusters of sensors that have similar trends. The clustering shows the relationship between traffic and pollution, with most sensors near major roads being in the same cluster.
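The two statistics named above can be sketched as follows. The readings are invented, and the use of the sample standard deviation in the CV is an assumption, since the abstract does not say which variant the study used.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, as a percentage."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return 100.0 * math.sqrt(variance) / mean


# Hypothetical low-cost sensor readings and matching reference-station values.
sensor = [12.0, 18.0, 25.0, 30.0, 22.0]
reference = [10.0, 15.0, 20.0, 26.0, 19.0]
r = pearson_r(sensor, reference)

# Hypothetical simultaneous readings from five co-located sensors.
co_located = [18.0, 20.0, 22.0, 19.0, 21.0]
cv_percent = coefficient_of_variation(co_located)
```

In practice the CV would be computed per time step across the co-located sensors and then summarized over the measurement period.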

Ensemble learning of model hyperparameters and spatiotemporal data for calibration of low-cost PM2.5 sensors

Mathematical Biosciences and Engineering

The PM2.5 air quality index (AQI) measurements from government-built supersites are accurate but cannot provide dense coverage of monitoring areas. Low-cost PM2.5 sensors can be used to deploy a fine-grained Internet of Things (IoT) network as a complement to government facilities. Calibration of low-cost sensors by reference to high-accuracy supersites is thus essential. Moreover, the imputation of missing values in training data may affect the calibration result, the best performance of a calibration model requires hyperparameter optimization, and the factors affecting PM2.5 concentrations, such as climate, geographical landscapes, and anthropogenic activities, are uncertain in spatial and temporal dimensions. In this paper, an ensemble learning approach for imputation method selection, calibration model hyperparameterization, and spatiotemporal training data composition is proposed. Three government supersites in central Taiwan are chosen for the deployment of low-cost sensors, and hourly PM2.5 measurements are collected for 60 days for conducting experiments. Three optimizers, Sobol sequence, Nelder-Mead, and particle swarm optimization (PSO), are compared to evaluate their performance with various versions of ensembles. The best calibration results are obtained by using PSO, and the improvement ratios with respect to R², RMSE, and NME are 4.92%, 52.96%, and 56.85%, respectively.
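To illustrate how a particle swarm optimizer searches a hyperparameter space, here is a minimal, generic PSO sketch. It is not the paper's ensemble method: the objective `toy_validation_error` is a hypothetical stand-in for a calibration model's validation error, and the inertia and acceleration settings are common textbook values chosen for illustration.

```python
import random


def pso_minimize(objective, bounds, n_particles=30, n_iters=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective over box bounds with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # best position seen by each particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_validation_error(params):
    """Hypothetical error surface with its minimum at (0.3, 5.0)."""
    a, b = params
    return (a - 0.3) ** 2 + (b - 5.0) ** 2 + 1.0


best_params, best_error = pso_minimize(
    toy_validation_error, bounds=[(0.0, 1.0), (1.0, 10.0)]
)
```

In a real setting the objective would train the calibration model with the candidate hyperparameters and return a validation metric such as RMSE, which makes each evaluation far more expensive than in this toy case.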

Gaussian Process regression model for dynamically calibrating a wireless low-cost particulate matter sensor network in Delhi

Atmospheric Measurement Techniques Discussions

Wireless low-cost particulate matter sensor networks (WLPMSNs) are transforming air quality monitoring by providing PM information at finer spatial and temporal resolutions. However, large-scale WLPMSN calibration and maintenance remain a challenge for three reasons: the manual labor involved in initial calibration by collocation and routine recalibration is intensive; the transferability of calibration models determined from initial collocation to new deployment sites is questionable, as calibration factors typically vary with urban heterogeneity of operating conditions and aerosol optical properties; and low-cost sensors can drift or degrade over time. This study presents a simultaneous Gaussian Process regression (GPR) and simple linear regression pipeline to calibrate and monitor dense WLPMSNs on the fly by leveraging all available reference monitors across an area without resorting to pre-deployment collocation calibration. We evaluated our method for Delhi, where the PM2.5 measurements of all 22 regulatory reference and 10 low-cost nodes were available on 59 valid days from January 1, 2018 to March 31, 2018 (PM2.5 averaged 138 ± 31 µg m⁻³ among the 22 reference stations), using a leave-one-out cross-validation (CV) over the 22 reference nodes. We showed that our approach can achieve an overall 30% prediction error (RMSE: 33 µg m⁻³) at a 24 h scale and is robust, as underscored by the small variability in the GPR model parameters and in the model-produced calibration factors for the low-cost nodes among the 22-fold CV. We revealed that the accuracy of our calibrations depends on the degree of homogeneity of PM concentrations and decreases with increasing local source contributions.
As a by-product of dynamic calibration, our algorithm can be adapted for automated large-scale WLPMSN monitoring. Simulations proved its capability of differentiating malfunctioning or singular low-cost nodes within a network via model-generated calibration factors, with the aberrant nodes having slopes close to 0 and intercepts close to the global mean of true PM2.5, and of tracking the drift of low-cost nodes accurately within 4% error for all the simulation scenarios. The simulation results showed that ~20 reference stations are optimum for our solution in Delhi and confirmed that low-cost nodes can extend the spatial precision of a network by decreasing the extent of pure interpolation among only reference stations. Our solution has substantial implications in reducing the amount of manual labor for the calibration and surveillance of extensive WLPMSNs, improving the spatial comprehensiveness of PM evaluation, and enhancing the accuracy of WLPMSNs.
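The diagnostic described above, where aberrant nodes show slopes near 0 and intercepts near the global mean, can be illustrated with a plain linear fit. This sketch covers only the simple linear regression step. The GPR estimate of true PM2.5 at each node is replaced by hand-written numbers, and the thresholds in `looks_aberrant` are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y ~ slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def looks_aberrant(slope, intercept, global_mean, slope_tol=0.1, rel_tol=0.1):
    """Flag a node whose calibration factors carry no information.

    Both tolerances are illustrative assumptions, not values from the study.
    """
    return abs(slope) < slope_tol and abs(intercept - global_mean) < rel_tol * global_mean


# Stand-in for the model's estimate of true PM2.5 at the node location.
estimated_true = [100.0, 120.0, 140.0, 160.0, 180.0]
global_mean = sum(estimated_true) / len(estimated_true)

# A healthy node tracks the estimate; a faulty node fluctuates independently of it.
healthy_raw = [60.0, 70.0, 80.0, 90.0, 100.0]
faulty_raw = [82.0, 81.0, 80.0, 81.0, 82.0]

healthy_slope, healthy_intercept = fit_line(healthy_raw, estimated_true)
faulty_slope, faulty_intercept = fit_line(faulty_raw, estimated_true)

healthy_flag = looks_aberrant(healthy_slope, healthy_intercept, global_mean)
faulty_flag = looks_aberrant(faulty_slope, faulty_intercept, global_mean)
```

When a node's readings are uncorrelated with the estimated true concentration, the least-squares slope collapses toward 0 and the intercept toward the mean of the target, which is why these two factors can serve as a fault signature.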

Field Evaluation and Calibration of Low-Cost Air Pollution Sensors for Environmental Exposure Research

Sensors, 2022

This paper seeks to evaluate and calibrate data collected by low-cost particulate matter (PM) sensors in different environments and using different aggregated temporal units (i.e., 5-s, 1-min, 10-min, and 30-min intervals). We first collected PM concentration data (i.e., PM1, PM2.5, and PM10) in five different environments (i.e., inside and outside an office building, a train platform and lobby of a subway station, and a seaside location) in Hong Kong, using five AirBeam2 sensors as the low-cost sensors and a TSI DustTrak DRX Aerosol Monitor 8533 as the reference sensor. By comparing the collected PM concentrations, we found high linearity and correlation between the data reported by the AirBeam2 sensors in different environments. Furthermore, the results suggest that the accuracy and bias of the PM data reported by the AirBeam2 sensors are affected by rainy weather and environments with high humidity and a high level of hygroscopic salts (i.e., a seaside location). In addition, inc...
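Aggregating high-frequency readings into coarser temporal units, as described above, can be sketched by bucketing timestamps. The function name and the synthetic readings are hypothetical, and timestamps are assumed to be integer seconds.

```python
from collections import defaultdict


def aggregate_by_window(readings, window_seconds):
    """Average (timestamp_seconds, value) pairs into fixed windows.

    Returns a list of (window_start_seconds, mean_value), sorted by time.
    Windows with no readings are simply absent from the result.
    """
    buckets = defaultdict(list)
    for ts, value in readings:
        window_start = (ts // window_seconds) * window_seconds
        buckets[window_start].append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]


# Two minutes of synthetic 5-s readings: 10.0 in the first minute, 20.0 in the second.
readings = [(t, 10.0 if t < 60 else 20.0) for t in range(0, 120, 5)]

one_minute = aggregate_by_window(readings, 60)
ten_minute = aggregate_by_window(readings, 600)
```

The same bucketing would be applied to the reference instrument so that both series share window boundaries before they are compared.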

Concentration-Temporal Multilevel Calibration of Low-Cost PM2.5 Sensors

Sustainability

Ambient aerosols have a significant impact on plant species mortality, air pollution, and climate change. It is critical to monitor the concentrations of aerosols, especially particulate matter with an aerodynamic diameter ≤ 2.5 μm (PM2.5), which has a direct relationship with human respiratory diseases. Recently, low-cost PM2.5 sensors have been deployed to provide a denser monitoring coverage than that of government-built monitoring supersites, which only give a macro perspective of air quality. To increase the measurement accuracy, low-cost sensors need to be calibrated. In current practice, regression techniques are used to calibrate sensors. This paper proposes a concentration-temporal multilevel calibration method to cope with the varying regression relation in different concentration and temporal domains. The performance of our method is evaluated with real field data from a supersite sensor and a low-cost sensor deployed in Puli, Taiwan. The experimental results show that ou...
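The abstract is truncated, so the method's details are not available here. The sketch below only illustrates the general idea of fitting separate regressions for different concentration and temporal domains. The concentration threshold, the day/night split, and all data values are assumptions made for illustration.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y ~ slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def segment_key(pm_raw, hour, threshold=35.0):
    """Assign a reading to a (concentration level, time period) segment.

    The threshold and the 06:00-18:00 daytime window are hypothetical choices.
    """
    level = "high" if pm_raw >= threshold else "low"
    period = "day" if 6 <= hour < 18 else "night"
    return level, period


def fit_multilevel(samples):
    """Fit one line per segment from (pm_raw, hour, pm_ref) samples."""
    groups = {}
    for pm_raw, hour, pm_ref in samples:
        groups.setdefault(segment_key(pm_raw, hour), []).append((pm_raw, pm_ref))
    return {
        key: fit_line([p for p, _ in pairs], [r for _, r in pairs])
        for key, pairs in groups.items()
    }


def calibrate(models, pm_raw, hour):
    """Calibrate one reading with the model of the segment it falls into."""
    slope, intercept = models[segment_key(pm_raw, hour)]
    return slope * pm_raw + intercept


# Synthetic training samples with a different linear relation in each segment.
samples = (
    [(p, 10, 0.8 * p + 2.0) for p in (10.0, 20.0, 30.0)]     # low, day
    + [(p, 12, 0.5 * p + 12.0) for p in (40.0, 60.0, 80.0)]  # high, day
    + [(p, 22, 0.9 * p + 1.0) for p in (10.0, 20.0, 30.0)]   # low, night
    + [(p, 2, 0.6 * p + 10.0) for p in (40.0, 60.0, 80.0)]   # high, night
)

models = fit_multilevel(samples)
high_day = calibrate(models, 50.0, 12)
low_night = calibrate(models, 20.0, 22)
```

Each segment needs at least two distinct raw values to fit a line, and a reading that falls into a segment with no training data raises a `KeyError` in this sketch.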

Development of a PM2.5 Forecasting System Integrating Low-cost Sensors for Ho Chi Minh City, Vietnam

Aerosol and Air Quality Research, 2020

Air pollution is a serious concern in urban areas, especially cities such as Ho Chi Minh City (HCMC). Because the air quality directly affects people's health, air quality monitoring is urgently needed. In this study, the models of Weather Research and Forecasting (WRF), Sparse Matrix Operator Kernel Emission (SMOKE), and Community Multiscale Air Quality (CMAQ) were integrated to develop an air quality forecasting system. Drawing input data from transportation and industrial emission inventories, the forecasting system was calibrated and configured using local parameters to deliver hourly forecasts for HCMC. To increase the accuracy of WRF and the meteorological forecasting, the global DEM and land use data were replaced by Lidar data, and land use data were also retrieved from MODIS. Output from the MOZART model served as the boundary conditions for CMAQ, and AOD values reported by the MODIS Aerosol Product were assimilated to enhance the accuracy of the results. A low-cost PM2.5 sensor connected to a LinkIt ONE, a development board for Internet of Things (IoT) devices, was employed for calibration and verification. The strong correlation (R² = 0.8) between the measured and predicted concentrations indicates that the estimates delivered by the proposed forecasting system are consistent with the values obtained via monitoring.

Monitoring, Mapping, and Modeling Spatial–Temporal Patterns of PM2.5 for Improved Understanding of Air Pollution Dynamics Using Portable Sensing Technologies

International Journal of Environmental Research and Public Health

Fine particulate matter with an aerodynamic diameter of less than 2.5 µm (PM2.5) is highly variable in space and time. In this study, the dynamics of PM2.5 concentrations were mapped at high spatio-temporal resolutions using bicycle-based, mobile measures on a university campus. Significant diurnal and daily variations were revealed over the two-week survey, with the PM2.5 concentration peaking during the evening rush hours. A range of predictor variables that have been proven useful in estimating the pollution level was derived from Geographic Information System, high-resolution airborne images, and Light Detection and Ranging (LiDAR) datasets. Considering the complex interplay among landscape, wind, and air pollution, variables influencing the PM2.5 dynamics were quantified under a new wind wedge-based system that incorporates wind effects. Panel data analysis models identified eight natural and built environment variables as the most significant determinants of local-scale air qu...