Anti-Phishing Technology Using Machine Learning Approach

IRJET- Anti-Phishing Technology using Machine Learning Approach

IRJET, 2020

Phishing is a social engineering attack that exploits weaknesses at the user's end of a system. Phishing attacks are the most common type of attack leveraging social engineering techniques. Attackers use emails and social media to trick victims into providing sensitive information or visiting a malicious URL (Uniform Resource Locator) in an attempt to compromise their systems. For individuals, the consequences include unauthorized purchases, theft of funds, and identity theft. An organization succumbing to such an attack typically sustains severe financial losses in addition to declining market share, reputation, and consumer trust. The phishing problem is vast, and no single solution mitigates all vulnerabilities effectively, so multiple techniques are implemented. In this paper, we discuss a Random Forest machine learning approach for detecting phishing websites. The first step is to extract various features of the URL, such as domain license, elements, and idiosyncrasies; the legitimacy of the website is then checked by predicting the result. We use machine learning techniques and algorithms to evaluate these different features of URLs and websites. In this paper, an overview of these approaches is presented.
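
The feature-extraction step described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the paper's implementation: the function name `extract_url_features` and the particular features chosen are assumptions made for the example, and the abstract does not list the features the authors actually used.

```python
import ipaddress
from urllib.parse import urlparse


def extract_url_features(url: str) -> dict:
    """Return a few illustrative lexical features of a URL (hypothetical set)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""

    # A raw IP address in place of a domain name is often treated as suspicious.
    try:
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False

    return {
        "url_length": len(url),
        "host_is_ip": host_is_ip,
        "has_at_symbol": "@" in url,
        "num_dots_in_host": host.count("."),
        "has_hyphen_in_host": "-" in host,
        "uses_https": parsed.scheme == "https",
    }


if __name__ == "__main__":
    for example in ("https://www.example.com/", "http://192.168.0.1/login"):
        print(example, extract_url_features(example))
```

A dictionary like this would then be turned into a numeric vector and passed to a classifier such as a random forest; that step is omitted here.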

Machine Learning-Based Phishing Attack Detection

International Journal of Advanced Computer Science and Applications, 2020

This paper explores machine learning techniques and evaluates their performance when trained on datasets consisting of features that can differentiate between a phishing website and a safe one. The ability to tell these sites apart is vital in modern-day internet surfing. As more and more of our resources shift online, a single vulnerability and a leak of sensitive information by one person could bring down everything in a connected network. The objective of this research is to highlight the best technique for identifying one of the most commonly occurring cyberattacks, and thus to allow faster identification and blacklisting of such sites, leading to a safer and more secure web surfing experience for everyone. To achieve this, we describe each of the techniques we examine in detail and use different evaluation techniques to portray their performance visually. After comparing all of these techniques against each other, we conclude that the Random Forest Classifier works best for phishing website detection.

Classification of Phishing Email Using Random Forest Machine Learning Technique

Journal of Applied Mathematics, 2014

Phishing is one of the major challenges faced by the world of e-commerce today. Because of phishing attacks, billions of dollars have been lost by many companies and individuals. In 2012, an online report put the loss due to phishing attacks at about $1.5 billion. The global impact of phishing attacks will continue to increase, and more efficient phishing detection techniques are therefore required to curb the menace. This paper investigates and reports the use of the random forest machine learning algorithm in the classification of phishing attacks, with the major objective of developing an improved phishing email classifier with better prediction accuracy and fewer features. From a dataset consisting of 2000 phishing and ham emails, a set of prominent phishing email features (identified from the literature) were extracted and used by the machine learning algorithm, with a resulting classification accuracy of 99.7% and low false negative (FN) and false positive (FP) rates.
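
For reference, the accuracy, false positive rate FP / (FP + TN), and false negative rate FN / (FN + TP) mentioned above can be computed from predictions as in the sketch below. The function names and the ten labels are made up for illustration (1 = phishing, 0 = ham); they are not the paper's data or code.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def error_rates(y_true, y_pred, positive=1):
    """Return accuracy, false positive rate, and false negative rate."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
    }


# Illustrative labels only: one phishing email missed, one ham email flagged.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

if __name__ == "__main__":
    print(error_rates(y_true, y_pred))
```

Reporting both rates alongside accuracy matters because a classifier can reach high accuracy on a skewed dataset while still missing many phishing emails.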

MACHINE LEARNING TECHNIQUES FOR IDENTIFYING AND MITIGATING PHISHING ATTACKS

IAEME PUBLICATION, 2024

One of the most prevalent forms of social engineering, phishing attempts to fraudulently obtain sensitive information from users' email accounts. Phishing attempts can also be integrated into larger-scale assaults aimed at penetrating government or business networks. To identify and lessen the impact of these attacks, several anti-phishing methods have been suggested within the past ten years. Nevertheless, they continue to be inaccurate and inefficient. Many different channels can be used for phishing, including email, phone, instant messaging, advertisements, website pop-up windows, and DNS poisoning. Significant damage, such as the disclosure of sensitive information, theft of personal or company identities, or even of state secrets, can be inflicted upon victims of phishing attempts. This paper aims to evaluate these attacks by looking at how phishing is done now and how it is currently perceived. It presents a new, comprehensive model of phishing that considers various aspects of attacks, including stages, threats, targets, media, and tactics. We use machine learning methods such as Logistic Regression, Random Forest, and XGBoost to classify websites as either legitimate or phishing. In addition to helping readers understand the lifecycle of a phishing assault, the proposed anatomy will make people more aware of these attacks, the tactics used, and how to build a thorough anti-phishing system.

IJERT-Detection of Phishing Websites using an Efficient Machine Learning Framework

International Journal of Engineering Research and Technology (IJERT), 2020

https://www.ijert.org/detection-of-phishing-websites-using-an-efficient-machine-learning-framework
https://www.ijert.org/research/detection-of-phishing-websites-using-an-efficient-machine-learning-framework-IJERTV9IS050888.pdf

A phishing attack is one of the most commonly known attacks, in which an intruder steals information from internet users. Users lose sensitive information such as protected passwords, personal information, and their transactions to the intruders. A phishing attack is normally carried out by manipulating and masking legitimate, frequently used websites in order to gather the personal information of the users. The intruders use this personal information and can manipulate the users' transactions and profit from them. The literature describes various anti-phishing techniques for websites, including blacklist or whitelist methods and heuristic and visual-similarity based methods. Even when these techniques are in use, most users are still attacked by intruders who use phishing to gather their sensitive information. This paper proposes a novel machine learning based classification algorithm that uses heuristic features, selected from attributes such as the Uniform Resource Locator, source code, session, type of security involved, protocol used, and type of website. The proposed model has been evaluated using five machine learning algorithms: Random Forest, K Nearest Neighbor, Decision Tree, Support Vector Machine, and Logistic Regression. Of these models, the Random Forest algorithm performs best, with an attack detection accuracy of 91.4%. Moreover, the Random Forest model uses orthogonal and oblique classifiers to select the best classifiers for accurate detection of phishing attacks in websites.
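
The distinction between orthogonal and oblique classifiers mentioned above can be illustrated with the generic definitions of the two kinds of tree split. In the usual terminology, an orthogonal (axis-parallel) split tests a single feature against a threshold, while an oblique split tests a weighted combination of features. The sketch below shows only that difference; the function names, weights, and sample values are hypothetical, and the abstract does not describe how the paper combines the two.

```python
from typing import Sequence


def orthogonal_split(x: Sequence[float], feature_index: int, threshold: float) -> bool:
    """Axis-parallel test: compare one feature with a threshold."""
    return x[feature_index] <= threshold


def oblique_split(x: Sequence[float], weights: Sequence[float], threshold: float) -> bool:
    """Oblique test: compare a weighted sum of features with a threshold."""
    return sum(w * v for w, v in zip(weights, x)) <= threshold


# Hypothetical two-feature sample, e.g. (number of dots in host, number of redirects).
sample = [3.0, 1.0]

if __name__ == "__main__":
    # Looks at feature 0 alone: 3.0 <= 2.5 is False.
    print(orthogonal_split(sample, feature_index=0, threshold=2.5))
    # Looks at both features: 0.5 * 3.0 + 1.0 * 1.0 = 2.5 <= 3.0 is True.
    print(oblique_split(sample, weights=[0.5, 1.0], threshold=3.0))
```

The same sample falls on different sides of the two tests, which is why a forest that mixes both kinds of split can draw decision boundaries that a purely axis-parallel forest would need many more splits to approximate.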

Feature based Phishing Website Detection using Random Forest Classifier

International Journal for Research in Applied Science and Engineering Technology, 2021

In today’s world, phishing is one of the most serious security threats facing internet users. Phishing is an attack intended to steal users' sensitive information, such as passwords, PINs, and card details. In a phishing attack, the attacker creates a fake website, lures users into visiting it, and steals their sensitive information. In this paper, we propose a feature-based phishing detection technique that uses uniform resource locator (URL) features. The paper focuses on extracting the features, which are then classified based on their effect within a website. The feature groups include address-bar related features, abnormal-based features, HTML and JavaScript based features, and domain-based features. We plan to use machine learning, implement several classification algorithms, and compare the performance of these algorithms on our dataset.
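
As an illustration of what an HTML-based feature from the groups above might look like, the following sketch counts iframes and forms that submit to a different host than the page itself. It uses only the Python standard library. The class name, function name, and the two features are assumptions made for this example; the abstract does not enumerate the individual features in each group.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class PageFeatureParser(HTMLParser):
    """Collect two illustrative HTML features while parsing a page."""

    def __init__(self, page_host: str):
        super().__init__()
        self.page_host = page_host
        self.iframe_count = 0
        self.external_form_actions = 0

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "iframe":
            self.iframe_count += 1
        elif tag == "form":
            action_host = urlparse(attributes.get("action") or "").hostname
            # A relative action has no host, so it stays on the same site.
            if action_host and action_host != self.page_host:
                self.external_form_actions += 1


def extract_html_features(html: str, page_url: str) -> dict:
    """Parse the HTML and return the collected counts as a dictionary."""
    parser = PageFeatureParser(urlparse(page_url).hostname or "")
    parser.feed(html)
    parser.close()
    return {
        "iframe_count": parser.iframe_count,
        "external_form_actions": parser.external_form_actions,
    }


if __name__ == "__main__":
    page = (
        '<form action="http://evil.example/steal"></form>'
        '<form action="/login"></form>'
        '<iframe src="banner.html"></iframe>'
    )
    print(extract_html_features(page, "https://bank.example/"))
```

Features from the other groups (address bar, abnormal, domain) would be computed separately and concatenated into one feature vector per website.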

Phish Catch: Machine Learning Way of Detecting Phishing Websites

2020

With the advent of 4G technology, the internet became available to the masses. Everyone started to use internet services in different spheres of life, making them vulnerable to diverse threats. One of the primary risks for internet users is phishing websites. Instead of breaching the security of systems, phishing websites try to fool users into giving away credentials that they are not supposed to share with anyone. In this study, we took 21 features and tried to predict the class of each website, i.e. legitimate or phishing, using a supervised learning algorithm.

Index Terms: Phishing, Machine Learning, SVM, Decision Tree, Random Forest, Internet, Security

Identification and Classification of Phishing Websites Using Machine Learning – Random Forest MSc Internship Cyber Security

2020

I hereby certify that the information contained in this (my submission) is information pertaining to research I conducted for this project. All information other than my own contribution will be fully referenced and listed in the relevant bibliography section at the rear of the project. ALL internet material must be referenced in the bibliography section. Students are required to use the Referencing Standard specified in the report template. To use another author's written or electronic work is illegal (plagiarism) and may result in disciplinary action.

A Comprehensive Review of Phishing Attack Detection Using Machine Learning Techniques

International Journal of Advanced Research in Science, Communication and Technology (IJARSCT), 2024

Phishing attacks have become a significant cybersecurity concern, affecting millions of users and organizations by stealing confidential information. The rise of machine learning (ML) techniques has provided innovative ways to detect and mitigate phishing attacks. This review paper explores various ML techniques, including Decision Trees (DT), Random Forest (RF), and Principal Component Analysis (PCA), in detecting phishing attacks. A review of recent studies shows that ML models such as RF can achieve high accuracy, up to 97%, in phishing detection. However, evolving phishing strategies, data imbalance, and feature extraction remain critical challenges. Future research should focus on deep learning models and real-time detection systems to enhance the robustness and effectiveness of phishing detection mechanisms.
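
One common way to address the data imbalance mentioned above is to weight each class inversely to its frequency during training. The review does not prescribe a specific remedy, so the sketch below is only an example of the idea. It uses the heuristic n_samples / (n_classes * class_count), the same formula behind scikit-learn's `class_weight="balanced"` option. The function name and the 90/10 label split are made up for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }


# Illustrative, made-up dataset: 90 legitimate sites and 10 phishing sites.
labels = ["legitimate"] * 90 + ["phishing"] * 10
weights = balanced_class_weights(labels)

if __name__ == "__main__":
    # The rare phishing class receives the larger weight.
    print(weights)
```

With these weights, each misclassified phishing sample contributes nine times as much to the training loss as a misclassified legitimate one, which offsets the 9:1 skew in the example data.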

Phishing Website Detection Using Machine Learning: A Review

Wasit Journal for Pure sciences

Phishing, a form of cyber attack in which perpetrators employ fraudulent websites or emails to deceive individuals into divulging sensitive information such as passwords or financial data, can be mitigated through various machine learning algorithms for website detection. These algorithms, including decision trees, support vector machines, and Random Forest, analyze multiple website features, such as URL structure, website content, and the presence of specific keywords or patterns, to estimate the likelihood that a website is a phishing site. This comprehensive review explains the concept of phishing website detection and the diverse techniques employed, while summarizing previous studies, their outcomes, and their contributions. Overall, machine learning algorithms serve as a potent tool for identifying phishing websites, thereby protecting users from such malicious attacks.