Algae-based Biomonitoring: Predicting Diatom Reference Communities in Unpolluted Streams using Classification Trees, Random Forests, and Artificial Neural Networks
Related papers
Ecological Indicators, 2009
Keywords: Freshwater; Diatoms; Indices; Predictive model. Abstract: Diatoms are widely used in stream bioassessment due to their broad distribution, extraordinary variability and the ability to integrate changes in water quality. The indices Specific Polluosensitivity Index (SPI), standardized Biological Diatom Index (BDI), European Economic Community Index (CEC) and Generic Diatom Index (GDI), originally developed in France, are often applied in Portugal to evaluate stream ecological quality based on diatom
Water quality assessment using diatom assemblages and advanced modelling techniques
Freshwater Biology, 2004
Summary. 1. Two types of artificial neural network procedures were used to define and predict diatom assemblage structures in Luxembourg streams using environmental data. 2. Self-organising maps (SOM) were used to classify samples according to their diatom composition, and a multilayer perceptron with a backpropagation learning algorithm (BPN) was used to predict these assemblages, using the environmental characteristics of each sample as input and the spatial coordinates (X and Y) of the SOM cell centres identified as diatom assemblages as output. Classical methods (correspondence analysis and clustering analysis) were then used to identify the relations between diatom assemblages and the SOM cell number. A canonical correspondence analysis was also used to define the relationship between these assemblages and the environmental conditions. 3. The diatom-SOM training set resulted in 12 representative assemblages (12 clusters) having different species compositions. Comparison of observe...
Canadian Journal of Fisheries and Aquatic Sciences, 2006
We developed a diatom-based index that integrates the effects of multiple stresses on streams and provides information related to the "distance" from the nonimpacted state. The Eastern Canadian Diatom Index (IDEC) was based on a correspondence analysis (CA) to develop a chemistry-free index where the position of the sites along the gradient of maximum variance (first axis) is strictly determined by diatom community structure and is therefore independent of measured environmental variables. The index value indicates the distance of each diatom community from its specific reference community. A high index value represents a non- or less-impacted site, while a low index value represents a more heavily impacted site. Two sub-indices were developed based on two sets of reference communities. The IDEC-circumneutral includes the sites that have reference communities characteristic of slightly acidic or neutral environments. The IDEC-alkaline includes the sites that have reference...
Canadian Journal of Fisheries and Aquatic Sciences, 2006
The identification of biological reference conditions specific to each type of water body is essential for the development of sound biological indicators and criteria. The purpose of the present study was to establish the reference conditions of each stream type sampled in southern Québec (Canada) using benthic diatoms and environmental variables characterizing streams and watersheds. First, stream reaches were classified as a function of their natural watershed and habitat characteristics. Second, diatom communities were classified based solely on taxa abundance data. Resulting groups were graphically presented on ordinations to interpret, a posteriori, the environmental gradients associated with diatom groups and to identify the diatom communities representing the reference conditions of each of the stream reach groups. A final classification based solely on diatom reference communities found pH and conductivity to be the main discriminating factors, regardless of ecoregion and st...
Ecological Modelling, 2007
Keywords: Aquatic ecosystem management; Artificial neural network. Abstract: Distinguishing natural variation from human-related change is crucial in the assessment, maintenance and restoration of aquatic ecosystems. The present work focuses on variation in the biogeographical frame of diatom assemblages in natural or near-natural conditions. First, 233 diatom samples collected from clean to less disturbed sites of the French hydrographic network were classified based on similarities of community assemblages using a self-organizing map. The results showed five different community types, which corresponded to specific environmental conditions and were consistent with the existing French hydro-ecoregions. Second, the community types were predicted from a set of environmental variables through a multilayer perceptron (MLP) with a backpropagation learning algorithm. The predictability was further compared with that of a discriminant function analysis and a regression tree. The best results were obtained with the MLP. The relative importance of environmental variables in predicting diatom community types was additionally evaluated through a sensitivity analysis of the MLP. In each French hydro-ecoregion it was possible to predict what the community should look like in the absence of anthropogenic pressure. In line with the Water Framework Directive requirements in Europe, this work gives an idea of the diatom assemblage representative of good ecological status, taking into account ecoregional particularities of the natural environment. ... to various environmental conditions. Diatoms have been widely used as an indicator group in water quality management, as an efficient means of early warning of aquatic ecosystem disturbances. Predictive approaches have been developed to infer environmental conditions from the structure of diatom assemblages. Models were first applied to paleolimnology: information...
Ecological Indicators, 2018
In this study we developed a predictive diatom-based model to assess the ecological status of streams and rivers of Northern Spain. Diatom samples were collected from stones with standard protocols at 676 sites distributed along existing environmental conditions across Northern Spain, over seven years between 2002 and 2008 (n = 1056 samples). This dataset included a network of 91 reference sites selected using criteria that confirmed the absence of relevant human pressures according to the WFD. A multinomial logistic regression was performed using the GAAC cluster-derived reference site group as the response variable. The independent variables included obligatory typology factors (WFD System A typology descriptors), and other optional System B typology descriptors were added to the model through a forward stepwise procedure. The Ecological Quality Ratios (EQRs) were obtained by dividing the observed similarity between the diatom composition in each sample by the expected median similarity of each type's reference diatom community. The model predictions (EQRs) responded significantly to eutrophication and intensive agriculture pressures, but were not related to sewage, hydromorphological alterations or extensive agriculture pressures. These results demonstrated the accuracy of the diatom model in predicting nutrient enrichment in Northern Spanish rivers and streams.
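The EQR calculation described in the abstract above (observed similarity divided by the expected median similarity of the type's reference community) can be illustrated with a minimal sketch. The abstract does not name the similarity measure or say how the observed similarity is aggregated, so the Bray-Curtis similarity, the use of the median on both sides of the ratio, and the function names below are all assumptions made for illustration, not the authors' implementation.

```python
from itertools import combinations
from statistics import median


def bray_curtis_similarity(a, b):
    """Bray-Curtis similarity between two {taxon: abundance} dicts (assumed measure)."""
    taxa = set(a) | set(b)
    shared = sum(min(a.get(t, 0.0), b.get(t, 0.0)) for t in taxa)
    total = sum(a.values()) + sum(b.values())
    return 2.0 * shared / total if total else 0.0


def ecological_quality_ratio(sample, reference_samples):
    """Hypothetical EQR: observed similarity / expected median similarity.

    Expected: median pairwise similarity among the reference samples of one type.
    Observed: median similarity of the test sample to those reference samples.
    """
    if len(reference_samples) < 2:
        raise ValueError("need at least two reference samples per type")
    expected = median(
        bray_curtis_similarity(r1, r2)
        for r1, r2 in combinations(reference_samples, 2)
    )
    if expected == 0:
        raise ValueError("reference samples share no taxa")
    observed = median(bray_curtis_similarity(sample, r) for r in reference_samples)
    return observed / expected


# Toy data: three identical reference samples and one partly different test sample.
refs = [{"A": 0.5, "B": 0.5}, {"A": 0.5, "B": 0.5}, {"A": 0.5, "B": 0.5}]
test_sample = {"A": 0.5, "C": 0.5}
eqr = ecological_quality_ratio(test_sample, refs)
```

Under these assumptions, a sample that resembles the reference community as closely as the reference samples resemble each other scores near 1, and the score falls as the composition diverges.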
2009
The purpose of this study was to first present different approaches used for developing diatom-based indices and second, to evaluate the ecological integrity of eastern Canadian streams (Québec) using these different approaches. Six indices from Europe (TDI, IPS, IBD, SLA) and North America (IBI, IDEC) were employed. The results from this study confirmed that, as a general trend, most of the common riverine taxa have similar ecological preferences throughout Europe and North America. The comparison of the six diatom-based indices illustrated the similarity of results and robustness of diatom-based monitoring no matter what indices were used. All indices effectively scored river health across most of the environmental gradient, although variations in scores were observed at both tails of the integrity spectrum with the greatest range in index scores at the heavily impacted end of the gradient. The use of sub-indices to account for ecoregion characteristics and natural pH variations improves regional index performance, especially at the extremes (i.e. in reference and heavily impacted rivers). Regionally derived indices are more sensitive and preferred, although we recognize the utility of using larger gradient indices excluding ecoregion or geological considerations for broader national and international applications.
Journal of Applied Phycology, 2014
Stream algal indices of biotic integrity (IBIs) are generally based entirely or largely on diatoms, because nondiatom ("soft") algae can be difficult to quantify and taxonomically challenging, thus calling into question their practicality and cost-effectiveness for use as bioindicators. Little has been published rigorously evaluating the strengths of diatom vs. soft algae-based indices, or how they compare to indices combining these assemblages. Using a set of ranked evaluation criteria, we compare IBIs (developed for southern California streams) that incorporate different combinations of algal assemblages. We split a large dataset into independent "calibration" and "validation" subsets, then used the calibration subset to screen candidate metrics with respect to degree of responsiveness to anthropogenic stress, metric score distributions, and signal-to-noise ratio. The highest-performing metrics were combined into a total of 25 IBIs comprising either single-assemblage metrics (based on either diatoms or soft algae, including cyanobacteria) or combinations of metrics representing the two assemblages (for "hybrid IBIs"). Performance of all IBIs was assessed on the validation data in terms of responsiveness to anthropogenic stress (surrounding land uses and a composite water-chemistry gradient), signal-to-noise ratio, metric redundancy, and degree of indifference to natural gradients. Hybrid IBIs performed best overall based on our evaluation. Single-assemblage IBIs ranked lower than hybrids on these performance attributes, but may be considered appropriate for routine monitoring applications. Tradeoffs inherent in the use of the different algal assemblages, and types of IBI, should be taken into consideration when designing an algae-based stream bioassessment program.
Journal of the North American Benthological Society, 2007
Diatom-based indicators can contribute significantly to comprehensive assessments of stream biological conditions. We used modeling to develop, evaluate, and compare 2 types of diatom-based indicators for Idaho streams: an observed/expected (O/E) ratio of taxon loss derived from a model similar to the River InVertebrate Prediction And Classification System (RIVPACS) and a multimetric index (MMI). Modeling the effects of natural environmental gradients on assemblage composition is a key component of RIVPACS, but modeling has seldom been used for MMI development. Diatom assemblage structure varied substantially among reference-site samples, but neither ecoregion nor bioregion accounted for a significant portion of that variation. Therefore, we used Classification and Regression Trees (CART) to model the variation of individual metrics with natural gradients. For both CART and RIVPACS modeling, we restricted predictors to natural variables unaffected by or resistant to human disturbances. On average, 46% of the total variance in 32 metrics could be explained by CART models, but the predictor variables differed among the metrics and often showed evidence of interacting with one another. The use of CART residuals (i.e., metric values adjusted for the effect of natural environmental gradients) affected whether or how strongly many metrics discriminated between reference and test sites. We used cluster analysis to examine redundancies among candidate metrics and then selected the metric with the highest discrimination efficiency from each cluster. This step was applied to both unadjusted and adjusted metrics and led to inclusion of 7 metrics in MMIs. Adjusted MMIs were more precise than unadjusted ones (coefficient of variation ~50% lower). Adjusted and unadjusted MMIs rated similar proportions of the test sites as being in nonreference condition but disagreed on the assessment of many individual test sites.
Use of unadjusted MMIs probably resulted in higher rates of both Type I and Type II errors than use of adjusted metrics, a logical consequence of the inability of unadjusted metrics to distinguish the confounding effects of natural environmental factors from those associated with human-caused stress. The RIVPACS-type model for diatom assemblages performed similarly to models developed for invertebrate assemblages. The O/E ratio was as precise as the adjusted MMI, but rated a lower proportion of test sites as being in nonreference condition, implying that taxon loss was less severe than changes in overall diatom assemblage structure. As previously demonstrated for O/E measures, modeling appears to be an effective means of developing more accurate and precise MMIs. Furthermore, modeling enabled us to develop a single MMI for use throughout an environmentally heterogeneous region.
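The O/E ratio of taxon loss mentioned in the abstract above can be sketched as follows. This is a generic RIVPACS-style calculation, not the Idaho model itself: the site-specific capture probabilities are taken as given (the predictive model that produces them is not shown), and the 0.5 probability cutoff, the function name, and the taxon labels are assumptions made for illustration.

```python
def observed_expected_ratio(capture_probs, observed_taxa, threshold=0.5):
    """Generic RIVPACS-style O/E ratio of taxon loss (illustrative sketch).

    capture_probs: {taxon: predicted probability of capture at the site under
                    reference conditions}, assumed to come from a separate model.
    observed_taxa: set of taxa actually found in the sample.
    threshold:     only taxa with probability >= threshold are counted
                   (0.5 is an assumed cutoff).
    """
    expected_taxa = {t: p for t, p in capture_probs.items() if p >= threshold}
    expected = sum(expected_taxa.values())  # E: expected number of taxa
    if expected == 0:
        raise ValueError("no taxa meet the probability threshold")
    observed = sum(1 for t in expected_taxa if t in observed_taxa)  # O
    return observed / expected


# Hypothetical site: three taxa pass the cutoff (E = 2.0), one of them is found.
probs = {"taxon_a": 0.9, "taxon_b": 0.6, "taxon_c": 0.5, "taxon_d": 0.2}
found = {"taxon_a", "taxon_d"}
oe = observed_expected_ratio(probs, found)
```

In this sketch, values near 1 indicate that the taxa expected under reference conditions were found, and lower values indicate taxon loss. Taxa below the cutoff, such as taxon_d here, do not count toward O even when present.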