Using Data-Mining Techniques for the Prediction of the Severity of Road Crashes in Cartagena, Colombia
Related papers
Road traffic accidents are one of the leading causes of death and injury in Zambia. Research is one route to reducing the problem, and data mining is one of the research tools for discovering the causes of road traffic accidents. The main aim of this study was to identify and investigate driver-, road-, weather-, and motor vehicle-related factors that contribute to the severity of road traffic accidents in Zambia. In this research, road traffic accident severity was classified into three classes: fatal, seriously injured, and slightly injured. This research develops a road traffic accident prediction model and compares the performance of various prediction models in order to select the best-performing algorithm for predicting road traffic accident severity. The data used in this study was collected from the Zambia Police Service headquarters in Lusaka. It covers the years 2016 to 2020 and contains 159,698 road traffic accidents. The CRISP-DM 1.0 standard data mining methodology was adopted in this research. Using WEKA (Waikato Environment for Knowledge Analysis) data mining software, four well-known classification algorithms were used to model the severity of the accidents: Decision Tree (J48), Rule Induction (PART), Naive Bayes, and Random Forest. To build the models, the whole dataset was first used as a training set for each algorithm, and the same dataset was then used to build classifiers using 10-fold cross-validation. To establish the main causal features for road accident severity, rules produced by the Decision Tree (J48) and PART algorithms were further explored. The performance of the algorithms was evaluated by comparing the classification accuracy, the Receiver Operating Characteristic (ROC) curve, and the results shown in the confusion matrix.
The results showed that the Random Forest algorithm performed better in terms of classification accuracy and produced a better Receiver Operating Characteristic curve when evaluated on the training set, while the J48 algorithm outperformed the other three algorithms in terms of classification accuracy under 10-fold cross-validation. The rules produced by the PART algorithm show that year, province, tire condition, car braking condition, cause of the accident, driver's age, driver's license grade, time, and lighting condition are the most important features in classifying road traffic accident severity.
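The gap between training-set evaluation and 10-fold cross-validation described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the three severity classes are encoded as 0, 1, and 2, and a 1-nearest-neighbour classifier stands in for the WEKA algorithms named in the abstract. It is not the study's pipeline; it only shows why accuracy measured on the training set tends to be optimistic.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def predict_1nn(train_X, train_y, x):
    """Return the label of the training point closest to x (squared Euclidean distance)."""
    nearest = min(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    return train_y[nearest]

def accuracy(train_X, train_y, test_X, test_y):
    """Fraction of test points whose predicted label matches the true label."""
    hits = sum(predict_1nn(train_X, train_y, x) == y for x, y in zip(test_X, test_y))
    return hits / len(test_X)

def cross_val_accuracy(X, y, k=10, seed=0):
    """Mean accuracy over k folds; each fold is held out once (assumes len(X) >= k)."""
    scores = []
    for fold in k_fold_indices(len(X), k, seed):
        held_out = set(fold)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        scores.append(accuracy(
            [X[i] for i in train_idx], [y[i] for i in train_idx],
            [X[i] for i in fold], [y[i] for i in fold],
        ))
    return sum(scores) / len(scores)

# Synthetic, hypothetical data: two numeric features, three severity classes.
rng = random.Random(42)
X = [(rng.random(), rng.random()) for _ in range(150)]

def noisy_label(x):
    """Class depends on the first feature; about 30% of labels are re-drawn at random."""
    true_class = 0 if x[0] < 1 / 3 else 1 if x[0] < 2 / 3 else 2
    return true_class if rng.random() > 0.3 else rng.randrange(3)

y = [noisy_label(x) for x in X]

train_acc = accuracy(X, y, X, y)         # evaluated on the data the model memorised
cv_acc = cross_val_accuracy(X, y, k=10)  # evaluated on held-out folds
print(f"training-set accuracy: {train_acc:.3f}")
print(f"10-fold CV accuracy:   {cv_acc:.3f}")
```

Because a 1-nearest-neighbour model memorises its training data, its training-set accuracy is perfect here, while the cross-validated figure reflects the label noise. Different algorithms can therefore rank differently under the two protocols.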
Indonesian Journal of Electrical Engineering and Computer Science
This research was conducted to help traffic policy makers and the general public prevent road incidents, using a traffic accident dataset collected between 2016 and 2019. Classification algorithms were used to develop a predictive model for occurrences of traffic accidents. Decision tree, k-NN, Naïve Bayes, and neural network classifiers were compared to identify which classifies the stage of felony best. The neural network showed a very promising result in classifying road accidents, with a total accuracy of 87.63%. Nonetheless, k-NN and Naïve Bayes both achieved accuracy higher than 80%, which shows that these classification algorithms were also good at predicting road accidents. Moreover, public vehicles are more prone to accidents than private vehicles in both stages of felony, and accidents may occur between 3:00 pm and 6:00 pm.
The Prediction of Road-Accident Risk through Data Mining: A Case Study from Setubal, Portugal
Informatics, 2023
This work proposes a tool to predict the risk of road accidents. The developed system consists of three steps: data selection and collection, preprocessing, and the use of mining algorithms. The data were imported from the Portuguese National Guard database and relate to accidents that occurred from 2019 to 2021. The results allowed us to conclude that the highest concentration of accidents occurs during the time interval from 17:00 to 20:00, and that rain is the meteorological factor with the greatest effect on the probability of an accident occurring. Additionally, we concluded that Friday is the day of the week on which the most accidents occur. These results are of importance to the decision makers responsible for planning the most effective allocation of resources for traffic surveillance.
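A finding such as "the highest concentration of accidents occurs from 17:00 to 20:00" can be obtained by counting records per hour and scanning fixed-width windows. The sketch below is only one way to do that, under stated assumptions: the list `accident_hours` is invented sample data, and `busiest_window` is a hypothetical helper, not part of the tool described in the abstract.

```python
from collections import Counter

def busiest_window(hours, width=3):
    """Return (start_hour, count) for the width-hour window with the most accidents.

    Windows start on the hour and wrap around midnight. On ties, the
    earliest starting hour wins.
    """
    per_hour = Counter(h % 24 for h in hours)
    best_start, best_count = 0, -1
    for start in range(24):
        count = sum(per_hour[(start + offset) % 24] for offset in range(width))
        if count > best_count:
            best_start, best_count = start, count
    return best_start, best_count

# Hypothetical hours of day (0-23) at which accidents were recorded.
accident_hours = [8, 9, 17, 17, 18, 18, 18, 19, 19, 12, 23, 17, 7, 19]

start, count = busiest_window(accident_hours, width=3)
end = (start + 3) % 24
print(f"busiest window: {start:02d}:00 to {end:02d}:00 ({count} accidents)")
# prints "busiest window: 17:00 to 20:00 (9 accidents)"
```

The same counting approach applies to the day of the week by replacing hours with weekday numbers and using a window width of 1.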
Mathematical Problems in Engineering, 2015
With the ever-increasing number of vehicles on the road, traffic accidents have also increased, resulting in the loss of lives and properties, as well as immeasurable social costs. The environment, time, and region influence the occurrence of traffic accidents. The life and property loss is expected to be reduced by improving traffic engineering, education, and administration of law and advocacy. This study observed 2,471 traffic accidents which occurred in central Taiwan from January to December 2011 and used the Recursive Feature Elimination (RFE) of Feature Selection to screen the important factors affecting traffic accidents. It then established models to analyze traffic accidents with various methods, such as Fuzzy Robust Principal Component Analysis (FRPCA), Backpropagation Neural Network (BPNN), and Logistic Regression (LR). The proposed model aims to probe into the environments of traffic accidents, as well as the relationships between the variables of road designs, rule-vio...
Prediction of Road Accidents Using Machine Learning Algorithms
Middle East Journal of Applied Science & Technology (MEJAST), 2023
Today, one of the top concerns for governments is road safety. There are many safety features built into cars, yet traffic accidents still happen frequently and are unavoidable. To lessen the harm caused by traffic accidents, predicting their causes has become the primary goal. In this situation, it is beneficial to examine the frequency of accidents so that this information can help in developing strategies to reduce them. From this, we can deduce the connections between traffic accidents, road conditions, and the impact of environmental factors on accident occurrence. In order to construct an accident prediction model, I used machine learning techniques, including Decision Tree, Random Forest, and Logistic Regression. The development of safety measures and accident prediction will both benefit from these classification systems. Several elements, including weather, vehicle condition, road surface condition, and light condition, can be used to predict road accidents. Three data files (accidents, casualties, and vehicles) are loaded into this dataset. This allows us to forecast the severity of accidents.
The number of vehicles and the volume of road transportation increase rapidly every day, and the frequency of road accidents and crashes increases with them. Analysing traffic accidents is one of the essential concerns in the world. Due to the considerable number of casualties and fatalities caused by those accidents, taking necessary actions to reduce road accidents is a vital public safety concern and challenge worldwide. Various statistical methods and techniques are used to address this issue, with applications such as extracting cause and effect to predict accidents in real time. In this study, a United States (US) countrywide car accident data set of about 1.5 million accident records, with 45 other relevant measurements, was used. This work aims to develop classification models that predict the likelihood that an accident is severe. In addition, this study includes a descriptive analysis to recognise the key features affecting accident severity. Supervised machine learning methods such as Decision Tree, K-Nearest Neighbour, and Random Forest were used to create classification models. The predictive model results show that the Random Forest model performs with an accuracy of 83.95% for the train set and 80.69% for the test set, proving that the Random Forest model performs better in accurately detecting the most relevant factors describing road accident severity.
Nigerian Journal of Technology, 2022
Road Traffic Crash (RTC) is among the leading causes of death in the world and has a significant impact on the socioeconomic development of a society. Generally, RTC can be caused by one or a combination of the following factors: human, environment, and vehicle. This study utilized five data mining classifiers (Decision Tree (DT), K-Nearest Neighbor (KNN), J-Repeated Incremental Pruning to Produce Error Reduction (JRIP), Naïve Bayes (NB), and Multi-layer Perceptron (MLP)) to classify the severity of RTC and identify the significant causes of RTC in Kaduna State, Nigeria. The RTC data used in this study included 26 RTC attributes with 1580 instances from 2016 to 2018 that covered fatal, serious, and minor cases obtained from the Federal Road Safety Corps, Kaduna sector command. Two sets of experiments were performed on the classifiers (without and with feature selection). The study results showed that among the five data mining algorithms used, KNN had the best accuracy: 94.8% without feature selection and 96.1% with feature selection.
Road Accident Prediction Model Using Data Mining Techniques
2022
Road accidents are an all-inclusive disaster with a consistently rising trend. In India, according to the Indian road safety campaign, there is a road accident every minute and almost 17 people die per hour in road accidents. There are different categories of vehicle accidents, such as rear-end, head-on, and rollover accidents. State-recorded police reports, or FIRs, are the documents that contain the information about the accidents. The incident may be self-reported by the people or recorded by the state police. In this paper, the frequent patterns of road accidents are predicted using Apriori and Naïve Bayesian techniques. These patterns will help the government or NGOs to improve safety and take preventive measures on the roads that have major accident zones. Data mining (DM) techniques (artificial neural networks (ANNs) and support vector machines (SVM)) were used to model accident and incident data compiled from the historical data. Based on the R-Tools, results were compared with those from some classical statistical techniques (logistic regression (LR)), revealing the superiority of ANNs and SVM in predicting and identifying the factors underlying accidents on toll roads.
Road traffic accidents are among the leading causes of death and injury worldwide. In Abu Dhabi, in 2014, 971 traffic accidents were recorded, which contributed to 121 fatalities and 135 severe injuries. Several factors contribute to injury severity, including driver-related factors, road-related factors, and accident-related factors. In this article, data-mining techniques were employed to establish models (classifiers) to predict the injury severity of any new accident with reasonable accuracy, based on 5,973 traffic accident records in Abu Dhabi over a 6-year period from 2008 to 2013. Additionally, the research aimed to establish a set of rules that can be used by the United Arab Emirates (UAE) Traffic Agencies to identify the main factors that contribute to accident severity. Using Waikato Environment for Knowledge Analysis (WEKA) data mining software, four well-known classification algorithms were employed to model the severity of injury. These algorithms included: Decision Tree (DT) (J48), Rule Induction (PART), Naïve Bayes (NB), and Multilayer Perceptron (MLP). The effectiveness of each method in predicting accident severity was evaluated in three different ways. First, the entire data set was used as a training set for the algorithm. Second, accuracy was evaluated using 10-fold cross-validation. Third, to overcome the problems that resulted from the imbalanced distribution of accident severity in our data set, the data set was resampled to bias the accident severity distribution toward a uniform distribution, and then 10-fold cross-validation was used again to evaluate the performance. Furthermore, to establish the main contributing factors for road accident severity, rules generated by the DT J48 algorithm were further explored.
The results showed that the overall accuracy of the DT J48 classifier, the PART classifier, and the MLP classifier in predicting the severity of injury resulting from traffic accidents, using 10-fold cross-validation, was similar. The NB classifier exhibited lower accuracy. Additionally, the prediction accuracy of the classifiers was enhanced after resampling the training set. The results indicated that the most important factors associated with fatal severity were age, gender, nationality, year of accident, casualty status, and collision type. 18- to 30-year-olds were the age group most vulnerable to traffic accidents. There was a clear trend in accident reduction over the period of the study. Drivers were involved more frequently in traffic accidents than passengers and pedestrians. Male drivers were involved more frequently in traffic accidents than female drivers. UAE, Asian, and Arab nationalities had the highest traffic accident frequency; Gulf and other nationalities had lower traffic accident frequency. The highest number of traffic accidents occurred at right angles. Pedestrian-vehicle type collisions had the next highest number of traffic accidents, followed by rear-end collisions and sideswipe collisions.
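Resampling a data set so that the severity classes approach a uniform distribution can be sketched as plain random oversampling with replacement. This is an illustrative assumption, not necessarily the resampling filter the authors applied in WEKA. The class names, class counts, and the helper `oversample_to_uniform` are hypothetical.

```python
import random
from collections import Counter

def oversample_to_uniform(records, labels, seed=0):
    """Duplicate minority-class records at random until every class is as
    large as the largest class. Original records are all kept."""
    rng = random.Random(seed)
    by_class = {}
    for record, label in zip(records, labels):
        by_class.setdefault(label, []).append(record)

    target = max(len(group) for group in by_class.values())
    out_records, out_labels = [], []
    for label, group in by_class.items():
        # Draw extra copies with replacement to reach the target size.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for record in group + extra:
            out_records.append(record)
            out_labels.append(label)
    return out_records, out_labels

# Hypothetical imbalanced data: most accidents are slight, few are fatal.
labels = ["slight"] * 80 + ["serious"] * 15 + ["fatal"] * 5
records = [{"id": i} for i in range(len(labels))]

balanced_records, balanced_labels = oversample_to_uniform(records, labels)
counts = Counter(balanced_labels)
print(counts)
```

One design point to keep in mind: if duplicates are created before the data are split into folds, copies of the same record can land in both the training and the test portion, which tends to inflate cross-validated accuracy. Resampling only the training portion of each fold avoids this.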
Road Accidents Prediction and Classification
International Journal for Research in Applied Science & Engineering Technology (IJRASET), 2022
Road accidents lead to death, disability, and hospitalization of people across the world, which causes loss of potential income for individuals and also affects the economy of the country. Of every 10 people killed in road accidents worldwide, one is from India. In 2020, a total of 3,66,138 road accidents occurred, leading to the loss of 1,31,714 lives and injuring 3,48,279 people. The number of road accidents and the damage they cause can be reduced by identifying the factors leading to them. In this project, we apply the concepts of data mining and machine learning to identify the various factors that affect road accidents and their severity. The application will take a variety of inputs, such as age of vehicle, light condition, road surface condition, and speed limit, and will use the random forest machine learning algorithm to calculate the severity of a possible accident. The severity of a possible accident will be displayed on a scale of 1 to 3, with 1 being the most severe and 3 the least severe, so that drivers can drive safely and take precautions. This data can be used in the future to analyze inputs and improve the accuracy of the system output. In the case of severity 1, which indicates a possible fatal accident, an alert message will be sent to the police so that they can take preventive measures. This application can therefore prove very helpful in reducing accident fatality rates in the country.