A novel application of an adaptable modeling approach to the management of toxic microalgal bloom events in coastal areas
Related papers
2020
Ostreopsis cf. ovata, a benthic toxic marine dinoflagellate, has been recorded along Italian coasts since the 1990s, but large bloom events have been reported only in recent years. In 2005, a monitoring programme started along the Ligurian coast (North-western Mediterranean), where time series of cell abundances have been collected for several sites, together with a range of related environmental variables. Cell abundance data from 15 sites, together with environmental data provided by meteo-marine forecasting models used by the Regional Environmental Agency (ARPAL), have been used to implement a predictive modelling tool able to forecast exceedance of Ostreopsis cell concentration thresholds as a function of meteo-marine forecasts. Starting from the experience of the predictive model implemented in 2015, the Quantile Regression Forest (QRF) has been applied: the model has been trained on past data (from 2015 until 2017) and tested with data taken during the two last available yea...
A Remote Sensing and Machine Learning-Based Approach to Forecast the Onset of Harmful Algal Bloom
Remote Sensing
In the last few decades, harmful algal blooms (HABs, also known as “red tides”) have become one of the most detrimental natural phenomena in Florida’s coastal areas. Karenia brevis produces toxins that have harmful effects on humans, fisheries, and ecosystems. In this study, we developed and compared the efficiency of state-of-the-art machine learning models (e.g., XGBoost, Random Forest, and Support Vector Machine) in predicting the occurrence of HABs. In the proposed models the K. brevis abundance is used as the target, and 10 level-02 ocean color products extracted from daily archival MODIS satellite data are used as controlling factors. The adopted approach addresses two main shortcomings of earlier models: (1) the paucity of satellite data due to cloudy scenes and (2) the lag time between the period at which a variable reaches its highest correlation with the target and the time the bloom occurs. Eleven spatio-temporal models were generated, each from 3 consecutive day satellit...
Scientific Reports
Increasing occurrence of harmful algal blooms across the land–water interface poses significant risks to coastal ecosystem structure and human health. Defining significant drivers and their interactive impacts on blooms allows for more effective analysis and identification of specific conditions supporting phytoplankton growth. A novel iterative Random Forests (iRF) machine-learning model was developed and applied to two example cases along the California coast to identify key stable interactions: (1) phytoplankton abundance in response to various drivers due to coastal conditions and land-sea nutrient fluxes, (2) microbial community structure during algal blooms. In Example 1, watershed derived nutrients were identified as the least significant interacting variable associated with Monterey Bay phytoplankton abundance. In Example 2, through iRF analysis of field-based 16S OTU bacterial community and algae datasets, we independently found stable interactions of prokaryote abundance p...
Ecological Modelling, 2004
Under the European Commission project Harmful Algal Bloom Expert System (HABES), a rule-based model has been developed in the Dutch pilot study for predicting Phaeocystis globosa blooms in the Dutch coastal waters (Noordwijk 10) of the North Sea. The model uses decision trees to qualitatively predict bloom timing (bloom or no bloom on a given day) and uses nonlinear piecewise regression to quantitatively predict bloom intensity (cell concentrations). A multi-variable regression model was also set up to predict bloom duration if a bloom is forecast. The constructed model clearly indicates that the joint effects of mean water column irradiance of photic depth (I_m), temperature (T) and dissolved inorganic phosphorus (DIP) determine bloom timing and intensity. The bloom duration depends on the bloom timing (starting day), starting intensity and temperature. Irradiance acts only as one of the triggers of a P. globosa bloom: as long as it is higher than the threshold, extra irradiance plays little role in bloom intensity or duration. River discharge from the Rhine has no immediate effect on P. globosa blooms. The threshold values of I_m, T and DIP independently found by the model are in accordance with those discovered by other researchers through laboratory experiments. The model was tested with an independent dataset from the same area, and the model results agree well with the real observations both qualitatively and quantitatively. The developed rule-based model is ecologically interpretable and applicable in practice. Because decision trees and piecewise regression split the parameter space, the model is well suited to the common problem in algal blooms that the limiting factor changes. The research demonstrates that decision trees and nonlinear piecewise regression are promising alternative techniques for modelling harmful algal blooms.
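As a rough illustration of the two ingredients named in this abstract (threshold rules for bloom timing and piecewise regression for intensity), the following Python sketch may help. It is not the HABES model: the threshold values, the direction of each comparison, the single breakpoint, and all names (`Thresholds`, `bloom_expected`, `piecewise_intensity`) are assumptions made for illustration, and the intensity function is a simplified piecewise-linear stand-in for the nonlinear piecewise regression the authors describe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical split points; NOT the values reported in the paper."""
    irradiance: float   # mean water column irradiance, I_m
    temperature: float  # water temperature, T
    dip: float          # dissolved inorganic phosphorus, DIP


def bloom_expected(i_m: float, temp: float, dip: float, th: Thresholds) -> bool:
    """Decision-tree-style rule: predict a bloom only if every driver
    reaches its threshold (the comparison direction is an assumption)."""
    return (i_m >= th.irradiance
            and temp >= th.temperature
            and dip >= th.dip)


def piecewise_intensity(x: float, breakpoint: float,
                        slope_low: float, slope_high: float,
                        intercept: float = 0.0) -> float:
    """Continuous two-segment linear response with one breakpoint."""
    if x <= breakpoint:
        return intercept + slope_low * x
    # Above the breakpoint, continue from the value reached at the breakpoint.
    return intercept + slope_low * breakpoint + slope_high * (x - breakpoint)


# Placeholder thresholds chosen only to make the example runnable.
example = Thresholds(irradiance=10.0, temperature=5.0, dip=0.2)
print(bloom_expected(12.0, 7.0, 0.5, example))   # all drivers above threshold
print(bloom_expected(8.0, 7.0, 0.5, example))    # irradiance below threshold
print(piecewise_intensity(6.0, 4.0, 1.0, 0.0))   # flat response above breakpoint
```

Setting `slope_high` to zero gives a response that stops increasing above the breakpoint, which loosely mirrors the abstract's remark that irradiance above its threshold adds little to bloom intensity.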
Scientific Reports, 2019
Increased anthropic pressure on the coastal zones of the Mediterranean Sea caused an enrichment in nutrients, promoting microalgal proliferation. Among those organisms, some species, such as the dinoflagellate Alexandrium minutum, can produce neurotoxins. Toxic blooms can cause serious impacts to human health, marine environment and economic maritime activities at coastal sites. A mathematical model predicting the presence of A. minutum in coastal waters of the NW Adriatic Sea was developed using a Random Forest (RF), which is a Machine Learning technique, trained with molecular data of A. minutum occurrence obtained by molecular PCR assay. The model is able to correctly predict more than 80% of the instances in the test data set. Our results showed that predictive models may play a useful role in the study of Harmful Algal Blooms (HAB).
Environmental science & technology, 2018
Harmful algal blooms are a growing human and environmental health hazard globally. Eco-physiological diversity of the cyanobacteria genera that make up these blooms creates challenges for water managers tasked with controlling the intensity and frequency of blooms, particularly of harmful taxa (e.g., toxin producers, N fixers). Compounding these challenges is the ongoing debate over the efficacy of nutrient management strategies (phosphorus-only versus nitrogen and phosphorus), which increases decision-making uncertainty. To improve our understanding of how different cyanobacteria respond to nutrient levels and other biophysical factors, we analyzed a unique 17 year data set comprising monthly observations of cyanobacteria genera and zooplankton abundances, water quality, and flow in a bloom-impacted, subtropical, flow-through lake in Florida (United States). Using the Random Forests machine learning algorithm, an ensemble modeling approach, we characterized and quantified relations...
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2016
We developed and successfully applied data-driven models that heavily rely on readily available remote sensing datasets to investigate probabilities of algal bloom occurrences in Kuwait Bay. An artificial neural network (ANN) model, a multivariate regression (MR) model, and a spatiotemporal hybrid model were constructed, optimized, and validated. Temporal and spatial submodels were coupled in a hybrid modeling framework to improve on the predictive powers of conventional ANN and MR generic models. Sixteen variables (sea surface temperature [SST], chlorophyll a OC3M, chlorophyll a Generalized Inherent Optical Property (GIOP), chlorophyll a Garver-Siegel-Maritorena (GSM), precipitation, CDOM, turbidity index, PAR, euphotic depth, Secchi depth, wind direction, wind speed, bathymetry, distance to nearest river outlet, distance to shore, and distance to aquaculture) were used as inputs for the spatial submodel; all of these, with the exception of bathymetry, distance to nearest river outlet, distance to shore, and distance to aquaculture, were used for the temporal submodel as well. Findings include: 1) the ANN model performance exceeded that of the MR model; 2) the hybrid models improved the model performance significantly; 3) the temporal variables most indicative of the timing of bloom propagation are sea surface temperature, Secchi disk depth, wind direction, chlorophyll a (OC3M), and wind speed; and 4) the spatial variables most indicative of algal bloom distribution are the ocean chlorophyll from OC3M, GSM, and the GIOP products; distance to shore; and SST. The adopted methodologies are reliable, cost-effective and could be used to forecast algal bloom occurrences in data-scarce regions. Index Terms: coupled spatiotemporal algal bloom model, data mining, Kuwait Bay, neural networks, remote sensing.
Toxics
Many countries have attempted to mitigate and manage issues related to harmful algal blooms (HABs) by monitoring and predicting their occurrence. The infrequency and duration of HAB occurrences pose the challenge of data imbalance when constructing machine learning models for their prediction. Furthermore, the appropriate selection of input variables is a significant issue because of the complex relationships between the input and output variables. Therefore, the objective of this study was to improve the performance of HAB prediction using feature selection and data resampling. Data resampling was used to address the imbalance in the minority class data. Two machine learning models were constructed to predict algal alert levels using 10 years of meteorological, hydrodynamic, and water quality data. The improvement in model accuracy due to changes in resampling methods was more noticeable than the improvement in model accuracy due to changes in feature selection methods. Models constructed ...
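The abstract does not say which resampling methods were compared, so the sketch below shows only one common option, random oversampling of the minority class, as an assumption-laden illustration in plain Python. The function name `random_oversample`, the toy rows, and the fixed seed are all hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many samples as the largest class. Illustrative only; the
    study may have used different resampling methods."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Rows of this class in the ORIGINAL data are the sampling pool.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy data: four "no alert" samples (0) and one "alert" sample (1).
rows = [[0.1], [0.2], [0.3], [0.4], [0.9]]
labels = [0, 0, 0, 0, 1]
balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

Resampling of this kind is normally applied to the training split only, so that the test data keep the class proportions seen in practice.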
2021
Harmful algal blooms (HABs) intoxicate and asphyxiate marine life, causing devastating environmental and socio-economic impacts costing at least $8bn/yr globally. Accumulation of phycotoxins from HAB phytoplankton in filter-feeding shellfish can poison human consumers, prompting site harvesting closures if concentrations in shellfish exceed safe levels. To better quantify both long- and short-term HAB risks, we developed novel data-driven approaches to predict phycotoxin concentrations in bivalve shellfish associated with HAB forming Dinophysis species. Our spatiotemporal statistical modelling framework assesses long-term HAB risks for different shellfish species in both data-rich and data-poor locations. This can revolutionise mariculture management by more confidently informing optimal siting of new shellfish operations and safe harvesting periods for businesses. Meanwhile, our machine learning framework forecasts phycotoxin concentrations further into the future than previously p...
Journal of Hydroinformatics, 2017
There is a growing worldwide need for predicting algal blooms in lakes and reservoirs to better manage water quality. We applied the random forest model, a machine learning algorithm, with a sliding window strategy to forecast chlorophyll-a concentrations in the fresh water of the Urayama Reservoir and the saline water of Lake Shinji. Both water bodies are situated in Japan and have historical water records containing more than ten years of data. The Random Forest (RF) model allowed us to forecast trends in time series of chlorophyll-a in these two water bodies. In the case of the reservoir, we used the data separately from two sampling stations. We found that the best model parameters (the min-leaf value, and with or without pre-selection of predictors) varied between stations in the same reservoir. We also found that the best lead time and prediction accuracy varied between the two stations. In the case of the lake, we found the best c...
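A sliding window strategy of the kind mentioned here can be pictured as turning one time series into (past window, future value) training pairs. The sketch below is a minimal plain-Python version under assumed conventions: the function name `sliding_windows`, the `window` and `lead` parameters, and the toy series are hypothetical, and the paper's actual windowing scheme and predictors may differ.

```python
def sliding_windows(series, window, lead=1):
    """Build supervised pairs from a time series.

    Each feature row holds `window` consecutive observations; its target
    is the value `lead` steps after the last observation in that row.
    """
    features, targets = [], []
    last_start = len(series) - window - lead
    for start in range(last_start + 1):
        features.append(list(series[start:start + window]))
        targets.append(series[start + window + lead - 1])
    return features, targets


# Toy chlorophyll-a-like series (made-up numbers).
chl = [1, 2, 3, 4, 5, 6]
X, y = sliding_windows(chl, window=3, lead=1)
print(X)  # [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(y)  # [4, 5, 6]
```

The resulting rows could then be passed to any regressor, for example a random forest; increasing `lead` trades fewer training pairs for a longer forecast horizon.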