Utilizing linear regression and random forest models for money laundering identification
2024, TELKOMNIKA Telecommunication Computing Electronics and Control
This paper investigates the effectiveness of traditional machine learning techniques, namely linear regression and random forest (RF), in enhancing the detection of money laundering (ML) activities within financial systems. As ML schemes evolve in complexity, traditional rule-based methods struggle with high false-positive rates and a lack of adaptability, prompting the need for more sophisticated analytical approaches. In contrast to the complexities of deep learning models, this study explores the potential of these more accessible machine learning methods in identifying and analyzing suspicious transactional patterns. We apply linear regression and RF models to transactional data to detect anomalous activities that could indicate ML. Our research compares these models on key performance metrics such as accuracy, precision, and recall. The findings suggest that, while less complex than deep learning frameworks, linear regression and RF models offer substantial benefits. They provide a more streamlined, interpretable, and efficient alternative to conventional rule-based systems in the context of ML detection. This study contributes to the ongoing discourse on the application of machine learning in financial crime detection, demonstrating the practicality and effectiveness of these methods in a critical area of financial security.
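The abstract compares models on accuracy, precision, and recall. As a minimal sketch of how those three metrics follow from predicted and true labels, the snippet below computes them for binary labels where 1 marks a transaction flagged as suspicious. The function name and the toy labels are illustrative assumptions, not data or code from the paper.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary labels (1 = suspicious)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed cases
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correctly cleared

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when nothing is flagged or nothing is positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical toy labels: 10 transactions, 2 of them truly suspicious.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]
accuracy, precision, recall = classification_metrics(y_true, y_pred)
```

On data this imbalanced, accuracy can stay high while precision and recall are low, which is one reason all three are usually reported together.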
Related papers
Deep learning based phishing website detection
TELKOMNIKA Telecommunication Computing Electronics and Control, 2024
Phishing attacks use fraudulent websites that trick people into disclosing sensitive information. More effective and precise methods are required to identify phishing websites so that people and organisations can be protected from the damaging effects of these online threats. The aim of this work is to develop a model that can identify phishing uniform resource locators (URLs) more accurately than current approaches while requiring less training time, testing time, and storage space. This work proposes a novel method that uses a long short-term memory (LSTM) gated recurrent unit (GRU) algorithm to detect phishing URLs. The accuracy of the suggested method is 98.89%, which is significantly better than the findings of earlier studies. The model also required less training and testing time and less storage space.
TELKOMNIKA Telecommunication Computing Electronics and Control, 2023
Classification is a predictive modelling task in machine learning (ML), where the class label is determined for a specific example of predefined features. In recognising handwritten characters, identifying spam, detecting disease, identifying signals, and so on, classification requires training data with many features and labelled instances. In medical informatics, high precision and recall are mandatory in addition to high accuracy of the ML classifiers. Most real-life datasets are imbalanced, which hampers the overall performance of the classifiers. Existing data balancing techniques operate on the whole dataset at once, which sometimes causes overfitting and underfitting. We propose a data balancing technique that follows the divide and conquer procedure to cluster the dataset into several segments, and both oversampling and undersampling operations are performed on each cluster. Finally, the clusters are joined together to build a balanced dataset. We chose the sample data of two heart disease datasets: Hungarian and Long Beach. Logistic regression and random forest classifiers are the representative ML algorithms. We compare our proposed technique with the existing SMOTE, NearMiss, and SMOTETomek data balancing techniques. Both algorithms perform better on the dataset balanced with the proposed technique. This technique can be the optimal solution for the imbalanced data handling strategy. This is an open access article under the CC BY-SA license.
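The divide and conquer idea described above (balance each cluster separately, then join the clusters) can be sketched roughly as follows. This is not the authors' algorithm: it assumes cluster assignments are already given, uses plain random duplication for oversampling and random dropping for undersampling, and picks the mean class count per cluster as the target size. The function name, row layout, and target rule are all hypothetical.

```python
import random
from collections import defaultdict


def balance_by_cluster(rows, seed=0):
    """rows: list of (cluster_id, label, features) tuples.

    Returns a new list in which classes are balanced inside each cluster.
    """
    rng = random.Random(seed)

    # Divide: group rows by cluster, then by class label.
    by_cluster = defaultdict(lambda: defaultdict(list))
    for row in rows:
        cluster_id, label, _features = row
        by_cluster[cluster_id][label].append(row)

    balanced = []
    for by_label in by_cluster.values():
        # Assumed rule: each class is resized to the mean class count in the cluster.
        target = round(sum(len(g) for g in by_label.values()) / len(by_label))
        for group in by_label.values():
            if len(group) > target:
                # Undersample the larger class by random selection.
                balanced.extend(rng.sample(group, target))
            else:
                # Oversample the smaller class by random duplication.
                balanced.extend(group)
                balanced.extend(rng.choices(group, k=target - len(group)))
    # Conquer: the per-cluster results are joined into one dataset.
    return balanced


# Hypothetical toy data: cluster 0 is imbalanced (8 vs 2), cluster 1 is not (3 vs 3).
rows = (
    [(0, 0, [float(i)]) for i in range(8)]
    + [(0, 1, [float(i)]) for i in range(2)]
    + [(1, 0, [float(i)]) for i in range(3)]
    + [(1, 1, [float(i)]) for i in range(3)]
)
balanced = balance_by_cluster(rows)
```

In this toy run, cluster 0 ends up with five rows per class and cluster 1 is left unchanged, so resampling only touches the segment that needs it.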
ADAPTIVE FRAUD DETECTION SYSTEMS: USING ML TO IDENTIFY AND RESPOND TO EVOLVING FINANCIAL THREATS
International Research Journal of Modernization in Engineering Technology and Science (e-ISSN: 2582-5208), 2024
The rise of digital transactions has increased the vulnerability of financial systems to fraud. Traditional fraud detection methods, often reliant on static rules and historical data, struggle to keep pace with evolving fraudulent tactics. This paper explores the development of adaptive fraud detection systems leveraging machine learning (ML) techniques to enhance the identification and response to financial threats. By analysing large datasets, these systems can detect anomalies in real-time, adjusting to new patterns of behaviour indicative of fraud. We discuss various ML algorithms, such as supervised and unsupervised learning, and their effectiveness in improving detection rates while minimizing false positives. Additionally, we emphasize the importance of continuous learning mechanisms that allow the system to evolve with emerging threats. Case studies illustrate successful implementations of adaptive systems in financial institutions, demonstrating significant improvements in fraud detection efficiency. Our findings indicate that integrating ML in fraud detection not only increases accuracy but also reduces response times, ultimately safeguarding financial assets and enhancing customer trust.
Detecting money laundering with Benford's law and machine learning
2019
The thesis develops a new tool that detects money laundering and can be used by financial institutions. It builds on Benford's Law and machine learning techniques, applied to banking data: transactions carried out by private customers of a mobile bank. The developed algorithm is shown to outperform the traditional rule-based approach.
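Benford's Law states that in many naturally occurring sets of numbers the leading digit d appears with probability log10(1 + 1/d). As a minimal sketch of how that can serve as a screening feature, the snippet below compares the first digits of a list of amounts against the Benford distribution using a chi-square statistic. The function names and sample amounts are illustrative assumptions and do not reproduce the thesis's tool.

```python
import math
from collections import Counter


def benford_expected():
    """Expected probability of each leading digit 1..9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(amount):
    """Leading non-zero digit of an amount, or None if the amount is zero."""
    for ch in str(abs(amount)):
        if ch in "123456789":
            return int(ch)
    return None


def benford_deviation(amounts):
    """Chi-square statistic of observed leading digits against Benford's Law."""
    digits = [d for d in map(first_digit, amounts) if d is not None]
    if not digits:
        raise ValueError("need at least one non-zero amount")
    n = len(digits)
    counts = Counter(digits)
    return sum(
        (counts.get(d, 0) - n * p) ** 2 / (n * p)
        for d, p in benford_expected().items()
    )


# Hypothetical samples: powers of 2 follow Benford's Law closely, while
# amounts that all start with 9 deviate strongly from it.
benford_like = [2 ** k for k in range(1, 101)]
all_nines = [9 * 10 ** k for k in range(100)]
```

A larger statistic means the leading digits deviate more from the Benford distribution. Such a score would only be one input to a detection model, not evidence of laundering on its own.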
Using decision tree classifier to detect Trojan Horse based on memory data
TELKOMNIKA Telecommunication Computing Electronics and Control, 2024
Trojan Horse is a major threat that has grown with the spread of the digital world. Data gathered through memory analysis can provide valuable insights into a Trojan Horse's behavior patterns, which makes memory analysis techniques a topic worth investigating for Trojan Horse detection. This study proposes the use of memory data in Trojan Horse detection, using a decision tree (DT) classifier. Experiments were performed on the Trojan Horse samples from the CIC-MalMem-2022 dataset. Binary classification was carried out using DT, gradient boosted tree, Naive Bayes (NB), linear support vector machine, K-nearest neighbors (KNN), and machine learning (ML) classifiers. The classification methods were compared using the accuracy, recall, precision, and F1-score metrics. The most successful Trojan Horse detection was obtained with the DT classifier, which achieved an accuracy of 99.96% using memory data. The NB classifier performed worst, with an accuracy of 98.41%. In addition, many of the classifiers used attained very high results. Based on these results, data from memory analysis is very valuable in detecting Trojan Horse.