Multilayer Stacked Ensemble Learning Model to Detect Phishing Websites

Phishing Websites Detection by Using Optimized Stacking Ensemble Model

Computer Systems Science and Engineering

Phishing attacks are security attacks that affect not only individuals' or organizations' websites but may also affect Internet of Things (IoT) devices and networks. The IoT environment is exposed to such attacks. Attackers may use thingbot software to distribute hidden junk emails that go unnoticed by users. Machine learning, deep learning, and other methods have been used to design detection methods for these attacks. However, detection accuracy still needs to be enhanced. This study proposes the optimization of an ensemble classification method for phishing website (PW) detection. A Genetic Algorithm (GA) was used to optimize the proposed method by tuning the parameters of several ensemble Machine Learning (ML) methods, including Random Forest (RF), AdaBoost (AB), XGBoost (XGB), Bagging (BA), GradientBoost (GB), and LightGBM (LGBM). The optimized classifiers were then ranked to pick out the best ones as the base for the proposed method. A PW dataset made up of 4898 PWs and 6157 legitimate websites (LWs) was used for this study's experiments. As a result, detection accuracy was enhanced and reached 97.16 percent.
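The GA-based tuning step described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the search space, the population settings, and the `fitness` function (a toy stand-in for a cross-validated accuracy score) are all hypothetical, since the abstract does not give them.

```python
import random

# Hypothetical search space; the abstract does not list the tuned parameters.
SEARCH_SPACE = {
    "n_estimators": [50, 100, 200, 400],
    "max_depth": [4, 8, 16, 32],
    "learning_rate": [0.01, 0.05, 0.1, 0.3],
}


def fitness(params):
    """Toy stand-in for cross-validated accuracy (higher is better).

    A real run would train the classifier with `params` and return
    its validation score instead.
    """
    return (
        1.0
        - abs(params["n_estimators"] - 200) / 1000
        - abs(params["max_depth"] - 16) / 100
        - abs(params["learning_rate"] - 0.1)
    )


def random_individual(rng):
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def crossover(a, b, rng):
    # Uniform crossover: each gene is taken from one parent at random.
    return {name: rng.choice([a[name], b[name]]) for name in SEARCH_SPACE}


def mutate(individual, rng, rate=0.2):
    # Resample each gene from the search space with probability `rate`.
    return {
        name: rng.choice(SEARCH_SPACE[name]) if rng.random() < rate else value
        for name, value in individual.items()
    }


def genetic_search(generations=15, population_size=12, elite=2, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: population_size // 2]  # truncation selection
        children = [
            mutate(crossover(*rng.sample(parents, 2), rng), rng)
            for _ in range(population_size - elite)
        ]
        population = population[:elite] + children  # elitism keeps the best
    return max(population, key=fitness)


best = genetic_search()
print(best, round(fitness(best), 4))
```

Because the top individuals are carried over unchanged in every generation, the best fitness never decreases from one generation to the next.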

An Optimized Stacking Ensemble Model for Phishing Websites Detection

Electronics

Security attacks on legitimate websites to steal users’ information, known as phishing attacks, have been increasing. This kind of attack does not just affect individuals’ or organisations’ websites. Although several detection methods for phishing websites have been proposed using machine learning, deep learning, and other approaches, their detection accuracy still needs to be enhanced. This paper proposes an optimized stacking ensemble method for phishing website detection. The optimisation was carried out using a genetic algorithm (GA) to tune the parameters of several ensemble machine learning methods, including random forests, AdaBoost, XGBoost, Bagging, GradientBoost, and LightGBM. The optimized classifiers were then ranked, and the best three models were chosen as base classifiers of a stacking ensemble method. The experiments were conducted on three phishing website datasets that consisted of both phishing websites and legitimate websites—the Phishing Websites Data Set from U...
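The "rank, then stack the best three" step described in this abstract might look like the sketch below. It is an illustration only: the data is synthetic, XGBoost and LightGBM are left out so that scikit-learn is the only dependency, no GA tuning is applied, and the logistic-regression meta-learner is an assumption (the abstract does not name one).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    BaggingClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for a phishing dataset (1 = phishing, 0 = legitimate).
X, y = make_classification(n_samples=600, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

candidates = {
    "rf": RandomForestClassifier(n_estimators=50, random_state=0),
    "ab": AdaBoostClassifier(n_estimators=50, random_state=0),
    "ba": BaggingClassifier(n_estimators=20, random_state=0),
    "gb": GradientBoostingClassifier(n_estimators=50, random_state=0),
}

# Rank candidates by mean cross-validated accuracy on the training split.
scores = {
    name: cross_val_score(model, X_train, y_train, cv=3).mean()
    for name, model in candidates.items()
}
top_three = sorted(scores, key=scores.get, reverse=True)[:3]

# Use the three best-ranked models as base classifiers of the stack.
stack = StackingClassifier(
    estimators=[(name, candidates[name]) for name in top_three],
    final_estimator=LogisticRegression(max_iter=1000),  # assumed meta-learner
    cv=3,
)
stack.fit(X_train, y_train)
test_accuracy = stack.score(X_test, y_test)
print(top_three, round(test_accuracy, 3))
```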

A Review of Ensemble Learning-Based Solutions for Phishing Website Detection

International Journal of Emerging Trends in Engineering Research , 2021

Phishing is deception by someone posing as a trustworthy party in an electronic communication in order to obtain confidential information from individuals or organisations; usernames, passwords, and credit card numbers are just a few examples. Phishers imitate legitimate websites by creating websites that are visually and semantically identical. As technology advances, phishing techniques have become more sophisticated, necessitating the use of anti-phishing measures to detect phishing attacks. To address the phishing problem, we obtained phishing website data from Kaggle, an online community of data scientists and machine learning experts owned by Google LLC. We use ensemble learning to detect phishing websites and analyze its accuracy. We compared the results of multiple machine learning methods for predicting phishing websites.

Detection of Phishing Websites Using Ensemble Machine Learning Approach

ITM Web of Conferences, 2021

In this paper, we propose the use of Ensemble Machine Learning Methods such as Random Forest Algorithm and Extreme Gradient Boosting (XGBOOST) Algorithm for efficient and accurate phishing website detection based on its Uniform Resource Locator. Phishing is one of the most widely executed cybercrimes in the modern digital sphere where an attacker imitates an existing - and often trusted - person or entity in an attempt to capture a victim’s login credentials, account information, and other sensitive data. Phishing websites are visually and semantically similar to real ones. The rise in online trading activities has resulted in a rise in the number of phishing scams. Cybersecurity jobs are the most difficult to fill, and the development of an automated system for phishing website detection is the need of the hour. Machine Learning is one of the most feasible methods to approach this situation, as it is capable of handling the dynamic nature of phishing techniques, in addition to prov...

Solving the Problem of Detecting Phishing Websites Using Ensemble Learning Models

Scientific Journal of Astana IT University

Because phishing is popular among attackers as the easiest way to obtain personal information, phishing detection is becoming a popular research area aimed at countering such attacks. Malicious website detection is essential to prevent the spread of malware and protect end users from becoming victims. Unfortunately, malicious URL detection still needs to be better understood due to a lack of features and inaccurate classification. Possible sources were examined in order to investigate the subject. Based on the information collected from previous studies, this study is devoted to solving the problem of detecting phishing websites using Ensemble Learning. The aim of the work is to choose the best-performing algorithm for classifying phishing websites using gradient boosting algorithms. AdaBoost, CatBoost, and Gradient Boosting Classifier were chosen as Ensemble Learning algorithms and were used to improve the efficiency of classifiers. Practical studies of the parameters of ea...

Phishing Website Detection Using Ensemble Learning

International Journal of Emerging Trends in Engineering Research, 2023

Phishing is the most common type of data breach. It is typically carried out by sending an email with links that lead to fraudulent websites. This technique especially targets large companies. Usually, the attackers send emails with work-related information. Machine learning is one of the most successful techniques for detecting phishing. This paper analyzes the results of various machine learning techniques for predicting phishing websites and describes the methods used to identify them, including the SVM classification method, the Random Forest method, and the AdaBoost method. An ensemble model that combines the SVM, Random Forest, and AdaBoost methods was able to classify phishing sites with an accuracy of 96%.
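The abstract does not say how the SVM, Random Forest, and AdaBoost outputs are combined. One common option is hard majority voting, sketched below under that assumption; the prediction lists are made-up values standing in for the outputs of three already-trained models.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per sample."""
    combined = []
    for votes in zip(*predictions_per_model):
        # most_common(1) returns the label with the highest vote count.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical predictions for five sites (1 = phishing, 0 = legitimate).
svm_pred = [1, 0, 1, 1, 0]
rf_pred = [1, 0, 0, 1, 0]
ab_pred = [0, 0, 1, 1, 1]

ensemble_pred = majority_vote([svm_pred, rf_pred, ab_pred])
print(ensemble_pred)  # [1, 0, 1, 1, 0]
```

With three voters and two classes a tie cannot occur, so no tie-breaking rule is needed in this setting.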

A Boosting-Based Hybrid Feature Selection and Multi-Layer Stacked Ensemble Learning Model to Detect Phishing Websites

IEEE Access

Phishing is a type of online scam where the attacker tries to trick you into giving away your personal information, such as passwords or credit card details, by posing as a trustworthy entity like a bank, email provider, or social media site. These attacks have been around for a long time and, unfortunately, they continue to be a common threat. In this paper, we propose a boosting-based multi-layer stacked ensemble learning model that uses a hybrid feature selection technique to select the relevant features for classification. The dataset with the selected features is sent to various classifiers at different layers, where the predictions of lower layers are fed as input to the upper layers for phishing detection. From the experimental analysis, it is observed that the proposed model achieved an accuracy ranging from 96.16 to 98.95% without feature selection across different datasets and an accuracy ranging from 96.18 to 98.80% with feature selection. The proposed model is compared with baseline models and outperforms the existing models by a significant margin.
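The layered idea in this abstract, where predictions of lower layers become the input of upper layers, can be sketched as below. This is not the paper's architecture: the number of layers, the classifiers in each layer, the use of out-of-fold probabilities, and the synthetic data are all assumptions made for illustration, and feature selection is omitted.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a phishing dataset (1 = phishing, 0 = legitimate).
X, y = make_classification(n_samples=600, n_features=20, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1, stratify=y
)


def run_layer(models, train_in, test_in, labels):
    """Fit one layer; return class-1 probabilities as next-layer features."""
    train_out, test_out = [], []
    for model in models:
        # Out-of-fold predictions keep training labels from leaking upward.
        oof = cross_val_predict(
            model, train_in, labels, cv=3, method="predict_proba"
        )[:, 1]
        train_out.append(oof)
        # Refit on the full training input to transform the test input.
        test_out.append(model.fit(train_in, labels).predict_proba(test_in)[:, 1])
    return np.column_stack(train_out), np.column_stack(test_out)


layer1 = [
    RandomForestClassifier(n_estimators=50, random_state=1),
    GradientBoostingClassifier(n_estimators=50, random_state=1),
]
layer2 = [
    DecisionTreeClassifier(max_depth=3, random_state=1),
    LogisticRegression(max_iter=1000),
]

l1_train, l1_test = run_layer(layer1, X_train, X_test, y_train)
l2_train, l2_test = run_layer(layer2, l1_train, l1_test, y_train)

# Final estimator on top of the second layer's outputs.
final = LogisticRegression(max_iter=1000).fit(l2_train, y_train)
accuracy = final.score(l2_test, y_test)
print(round(accuracy, 3))
```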

A Machine Learning Approach to Identifying Phishing Websites: A Comparative Study of Classification Models and Ensemble Learning Techniques

ICST Transactions on Scalable Information Systems

Phishing attacks are one of the more prevalent types of cybercrime in the world today. To steal information, attackers send users emails and messages, and also use websites. Phishing primarily targets corporate websites, such as those for e-commerce, finance, and governmental organizations. In order to obtain sensitive user information, attackers impersonate websites, a phenomenon known as phishing. In addition to exploring the use of machine learning algorithms to identify and stop web phishing attacks, this research suggests utilizing machine learning techniques to detect phishing URLs by analysing various aspects of the URLs. The study uses classification models such as Logistic Regression, Random Forest, Decision Trees, KNN, Naive Bayes, and SVM, along with other ensemble learning techniques such as Gradient Boosting, XGBoost, Histogram Gradient Boosting, Light Gradient Boosting, and AdaBoost, to detect phishing websites.
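As an illustration of "analysing various aspects of the URLs", the sketch below extracts a few lexical features that are commonly used in URL-based phishing detection. The feature set is an assumption, since the abstract does not list the features used in the study, and the example URL is made up.

```python
import re
from urllib.parse import urlparse


def url_features(url):
    """Return a few lexical features of a URL as a dict."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "uses_https": parsed.scheme == "https",
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "has_at_symbol": "@" in url,
        # Raw IPv4 address used in place of a domain name.
        "host_is_ipv4": re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host) is not None,
    }


# Made-up example URL, for illustration only.
features = url_features("http://192.168.10.5/secure-login/update.php?user=a@b")
print(features)
```

A dictionary of this kind can be turned into a numeric row and passed to any of the classifiers named in the abstract.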

Detecting Cloud Based Phishing Attacks Using Stacking Ensemble Machine Learning Technique

International Journal for Research in Applied Science & Engineering Technology (IJRASET), 2023

Cloud computing enables users to access computing services over the Internet, but this also presents a security risk due to the anonymous nature of the Internet. Social engineering attacks, in which attackers trick cloud users into revealing sensitive information, are among the most common security breaches in cloud computing. Detecting phishing attacks in cloud computing is challenging, and various solutions have been proposed, including rule-based and anomaly-based detection methods. Machine learning techniques have proven to be effective in detecting and classifying phishing attacks, particularly for distinguishing between legitimate and phishing websites. This paper proposes an ensemble approach utilizing four different machine learning classifiers to detect phishing websites. The study analyzes various features, such as address bar-based, domain-based, and HTML & JavaScript-based features, and the findings reveal that the proposed ensemble approach outperforms the base classifiers, achieving the highest accuracy of 98.8%.

Prediction of Phishing Websites Using Stacked Ensemble Method and Hybrid Features Selection Method

SN Computer Science, 2022

Phishing is considered a big concern in this age of data and digital technologies because of its significant influence on the banking and online retailing industries. Cybercriminals target all economic activity on the Internet; thus, it is critical to take security precautions to safeguard assets. One of the first steps in constructing a safe cyberspace is to prevent phishing attacks before they happen. Detection mechanisms for these attacks have been created using machine learning and other methods. However, there is still room for improvement in terms of detection accuracy. This paper proposes the optimization of an ensemble classification algorithm for phishing website (PW) detection. The suggested technique was optimised using a hybrid feature selection method (Chi-square, extra tree, and heatmap) and by tuning the parameters of several machine learning (ML) methods, including random forest, naive Bayes, J48, and KNN. The optimized classifiers were then ranked, and the top classifiers were selected to serve as the foundation for the suggested technique. The results obtained in all experiments show that the proposed optimized stacking ensemble approach outperforms previous ML-based detection methods. The level of precision attained was 99.7%.
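A hybrid feature selection step of the kind named in this abstract (Chi-square, extra trees, and a correlation heatmap) could be sketched as follows. The abstract does not explain how the three signals are combined, so the rule used here (union of the top-`k` features from the two rankings, minus features flagged as redundant by correlation) is an assumption, as are `k`, the 0.9 threshold, and the synthetic data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import chi2
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for a phishing dataset (1 = phishing, 0 = legitimate).
X, y = make_classification(
    n_samples=500, n_features=12, n_informative=4, random_state=2
)
X = MinMaxScaler().fit_transform(X)  # chi2 requires non-negative features

k = 6  # number of features each ranking keeps (assumed)

# Ranking 1: chi-square statistic between each feature and the label.
chi2_scores, _ = chi2(X, y)
chi2_top = set(np.argsort(chi2_scores)[-k:])

# Ranking 2: impurity-based importances from an extra-trees ensemble.
trees = ExtraTreesClassifier(n_estimators=100, random_state=2).fit(X, y)
tree_top = set(np.argsort(trees.feature_importances_)[-k:])

# Correlation screen (the numbers behind a heatmap): flag the later feature
# of each pair whose absolute Pearson correlation exceeds 0.9.
corr = np.abs(np.corrcoef(X, rowvar=False))
n_features = corr.shape[0]
redundant = {
    j
    for i in range(n_features)
    for j in range(i + 1, n_features)
    if corr[i, j] > 0.9
}

selected = sorted(int(i) for i in (chi2_top | tree_top) - redundant)
print(selected)
```

The selected column indices can then be used to slice the feature matrix before training the base classifiers.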