Fish survival prediction in an aquatic environment using random forest model
Related papers
Predicting Water Quality Parameters in Mahseer Fish Farming Using Machine Learning Techniques
Seventh Sense Research Group®, 2024
Mahseer fish farming faces challenges in maintaining optimal water quality, which is essential for fish health and growth. Poor water quality can lead to stress, disease, and mortality, reducing productivity. This study compares Random Forest Regression (RF) and Support Vector Regression (SVR) models in predicting water quality parameters such as pH, dissolved oxygen (DO), and temperature. The RF model outperformed SVR, with lower Mean Squared Error (MSE) and Mean Absolute Error (MAE) and higher R-squared values (99% for DO, 98% for temperature, and 95% for pH). This performance makes RF a reliable tool for tracking water quality trends and fluctuations. Recommendations for enhanced monitoring include extending data collection to capture seasonal and long-term trends, integrating sensors for additional parameters such as ammonia and turbidity, and developing a user-friendly mobile app for real-time data and alerts. These improvements aim to support the sustainability and productivity of Mahseer fish farming.
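The error measures named in this abstract can be computed directly from paired observations and predictions. The following is a minimal Python sketch, not the study's code; the function name `regression_metrics` and the dissolved-oxygen values are hypothetical and chosen only for illustration.

```python
def regression_metrics(y_true, y_pred):
    """Return MSE, MAE and R-squared for paired observations and predictions."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n      # mean squared error
    mae = sum(abs(e) for e in errors) / n     # mean absolute error

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when y_true is constant")
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    r2 = 1.0 - ss_res / ss_tot

    return {"mse": mse, "mae": mae, "r2": r2}


# Hypothetical dissolved-oxygen readings (mg/L) and model predictions.
observed = [6.0, 7.0, 8.0, 9.0]
predicted = [6.0, 7.0, 8.0, 10.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

A lower MSE or MAE and a higher R-squared indicate a closer fit, which is the basis on which the abstract ranks RF above SVR.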
Random forests to evaluate biotic interactions in fish distribution models
Environmental Modelling & Software, 2015
Previous research indicated that high predictive performance in species distribution modelling can be obtained by combining biotic and abiotic habitat variables. However, models developed for fish often address only physical habitat characteristics, thus omitting potentially important biotic factors. Therefore, we assessed the impact of biotic variables on fish habitat preferences in four selected stretches of the upper Cabriel River (E Spain). The occurrence of Squalius pyrenaicus and Luciobarbus guiraonis was related to environmental variables describing biotic interactions (inferred from relationships among fish abundances) and channel hydro-morphological characteristics. Random Forests (RF) models were trained and then validated using independent datasets. To build the RF models, conditional variable importance was used together with the model improvement ratio technique. The procedure proved effective in identifying a parsimonious set of uncorrelated variables, which minimizes noise and improves model performance in both the training and validation phases. Water depth, channel width, fine substrate and water-surface gradient were selected as the most important habitat variables for both fish. Results showed clear habitat overlap between the fish species and suggest that competition is not a strong factor in the study area.
Sensors, 2018
The main aim of this study was to develop a new objective method for evaluating the impacts of different diets on live fish skin using image-based features. In total, one hundred and sixty rainbow trout (Oncorhynchus mykiss) were fed either a fish-meal based diet (80 fish) or a 100% plant-based diet (80 fish) and photographed using a consumer-grade digital camera. Twenty-three colour features and four texture features were extracted. Four classification methods were used to evaluate fish diets: Random forest (RF), Support vector machine (SVM), Logistic regression (LR) and k-Nearest neighbours (k-NN). The SVM with a radial basis kernel was the best classifier, with a correct classification rate (CCR) of 82% and a Kappa coefficient of 0.65. Although both the LR and RF methods were less accurate than SVM, they achieved good classification, with CCRs of 75% and 70% respectively. The k-NN was the least accurate classification model (40%). Overall, it can be concluded that consumer-grade digital cameras could be employed as a fast, accurate and non-invasive sensor for classifying rainbow trout based on their diets. Furthermore, there was a close association between image-based features and the diet received during cultivation. These procedures can be used as non-invasive, accurate and precise approaches for monitoring fish status during cultivation by evaluating the diet's effects on fish skin.
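Correct classification rate and Cohen's kappa, the two scores reported in this abstract, can both be derived from a confusion matrix. The sketch below is illustrative only: the function name `ccr_and_kappa` and the two-class matrix are hypothetical and are not the study's data.

```python
def ccr_and_kappa(confusion):
    """Return (CCR, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples of true class i
    that were predicted as class j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Observed agreement: the share of samples on the diagonal (the CCR).
    observed = sum(confusion[i][i] for i in range(k)) / total

    # Agreement expected by chance, from the row and column marginals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")

    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class result (e.g. fish-meal diet vs plant-based diet).
matrix = [[40, 10],
          [10, 40]]
ccr, kappa = ccr_and_kappa(matrix)
print(f"CCR = {ccr:.2f}, kappa = {kappa:.2f}")
```

Kappa discounts the agreement that would occur by chance, which is why it is lower than the CCR for the same classifier.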
Sustainability
This study aimed to evaluate classification algorithms to predict largemouth bass (Micropterus salmoides) occurrence in South Korea. Fish monitoring and environmental data (temperature, precipitation, flow rate, water quality, elevation, and slope) were collected from 581 locations throughout four major river basins for 5 years (2011–2015). Initially, 13 classification models built in the caret package were evaluated for predicting largemouth bass occurrence. Based on the accuracy (>0.8) and kappa (>0.5) criteria, the top three classification algorithms (i.e., random forest (rf), C5.0, and conditional inference random forest) were selected to develop ensemble models. However, combining the best individual models did not work better than the best individual model (rf) at predicting the frequency of largemouth bass occurrence. Additionally, annual mean temperature (12.1 °C) and fall mean temperature (13.6 °C) were the most important environmental variables to discriminate the pr...
2021
The physicochemical traits of a river influence the habitat of fish species in aquatic ecosystems. Fish show a complex relationship with the aquatic factors of a river. Machine learning (ML) modeling is a useful tool for establishing relationships within complex systems. This study identified the preferred habitat indicators of Chanda nama (a small indigenous fish) in the Krishna River of peninsular India using machine learning modeling. Data on Chanda nama distribution (presence/absence) and ten associated physical and chemical water parameters were collected at 22 sampling sites on the river during 2001-02. Machine learning models such as random forest (RF), artificial neural network (ANN), support vector machine (SVM) and k-nearest neighbors (KNN) were used to classify Chanda nama distribution in the river. The ML model efficiency was evaluated using classification accuracy (CCI), Cohen’s kappa coefficient (k), sensitivity, specificity and receiver-operating-characterist...
Developing an Ensembled Machine Learning Prediction Model for Marine Fish and Aquaculture Production
Sustainability, 2021
The fishing industry is identified as a strategic sector for raising domestic protein production and supply in Malaysia. Global changes in climatic variables have impacted, and continue to impact, marine fish and aquaculture production, yet machine learning (ML) methods have not been extensively used to study aquatic systems in Malaysia. ML-based algorithms can be paired with feature importance (i.e., the features that have the most predictive power) to achieve better prediction accuracy and provide new insights into fish production. This research aims to develop an ML-based prediction of marine fish and aquaculture production. Based on the feature importance scores, we select the group of climatic variables for three different ML models: linear, gradient boosting, and random forest regression. The past 20 years (2000–2019) of climatic variables and fish production data were used to train and test the ML models. Finally, an ensemble approach named voting regression combines those thre...
Machine learning (ML) techniques have become important in supporting decision making for the management and conservation of freshwater aquatic ecosystems. Given the large number of ML techniques, and to improve understanding of the utility of ML in ecology, it is necessary to perform comparative studies of these techniques as a preparatory analysis for future model applications. The objectives of this study were (i) to compare the reliability and ecological relevance of two predictive models for fish richness, based on artificial neural networks (ANN) and random forests (RF), and (ii) to evaluate the conformity of the two modelling approaches in terms of the important variables selected. The effectiveness of the models was evaluated using three performance metrics: the determination coefficient (R²), the mean squared error (MSE) and the adjusted determination coefficient (R²adj). Both models were developed using a k-fold cross-validation procedure. According to the results, both techniques had similar validation performance (R² = 68% for RF and R² = 66% for ANN). Although the two methods selected different subsets of input variables, both models demonstrated high ecological relevance for the conservation of native fish in the Mediterranean region. Moreover, this work shows how the use of different modelling methods can assist the critical analysis of predictions at a catchment scale. Copyright ONEMA, 2013.
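Two of the ingredients named in this abstract, k-fold cross-validation and the adjusted determination coefficient, can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the authors' procedure: the function names, the contiguous unshuffled fold layout, and the sample and predictor counts in the usage lines are all hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k contiguous (train, test) folds.

    No shuffling is done here; in practice the data are usually
    shuffled first so that each fold is representative.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")

    # Spread any remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]

    folds = []
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R-squared: penalises R-squared for the number of predictors."""
    if n_samples - n_predictors - 1 <= 0:
        raise ValueError("need n_samples > n_predictors + 1")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Hypothetical usage: 10 samples in 3 folds, and an arbitrary R-squared value.
for train_idx, test_idx in k_fold_indices(10, 3):
    print("train:", train_idx, "test:", test_idx)
print(adjusted_r2(0.5, n_samples=21, n_predictors=4))
```

Each sample appears in exactly one test fold, so every observation is used once for validation and k - 1 times for training.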
Machine learning for manually-measured water quality prediction in fish farming
PLOS ONE
Monitoring variables such as dissolved oxygen, pH, and pond temperature is a key aspect of high-quality fish farming. Machine learning (ML) techniques have been proposed to model the dynamics of such variables to improve the fish farmer’s decision-making. Most of the research on ML in aquaculture has focused on scenarios where devices for real-time data acquisition, storage, and remote monitoring are available, making it easy to develop accurate ML techniques. However, fish farmers do not necessarily have access to such devices. Many of them prefer to use equipment that measures these variables manually, which limits the amount of data available to process. In this work, we study the use of random forests, multivariate linear regression, and artificial neural networks in scenarios with a limited number of measurements to analyze data from water-quality variables that are commonly measured in fish farming. We propose a methodology to build models in two scenarios: i) estimation of unobserved v...
Smart Prediction of Water Quality System for Aquaculture using Machine Learning Algorithms
This article focuses on the importance of continuously collecting water parameter data from sensors and on predicting water quality using a range of machine learning algorithms: Logistic Regression, Random Forest, Support Vector Machine, Decision Tree, K-Nearest Neighbour, XGBoost, Gradient Boosting and Naive Bayes. These models are implemented and tested to achieve satisfactory water quality prediction in terms of attributes such as pH, hardness, solids, chloramines, sulfate, conductivity, organic carbon, trihalomethanes, turbidity and potability.