IRJET-Machine Learning Techniques to Seek Out Malicious Websites

IJERT-Detection of Phishing Websites using an Efficient Machine Learning Framework

International Journal of Engineering Research and Technology (IJERT), 2020

https://www.ijert.org/detection-of-phishing-websites-using-an-efficient-machine-learning-framework https://www.ijert.org/research/detection-of-phishing-websites-using-an-efficient-machine-learning-framework-IJERTV9IS050888.pdf

Phishing is one of the most commonly known attacks, in which an intruder steals information from internet users. Users lose sensitive information such as protected passwords, personal information and transaction details to the intruders. The attack is normally carried out by imitating and masking legitimate, frequently used websites in order to gather users' personal information. The intruders then use this information to manipulate transactions and profit from them. The literature describes various anti-phishing techniques proposed by different authors, including blacklist or whitelist, heuristic, and visual similarity based methods. Despite the use of these techniques, many users are still attacked through phishing and lose their sensitive information. This paper proposes a novel machine learning based classification algorithm that uses heuristic features, where the features are selected from attributes such as the Uniform Resource Locator, source code, session, type of security involved, protocol used, and type of website. The proposed model has been evaluated using five machine learning algorithms: random forest, K Nearest Neighbor, Decision Tree, Support Vector Machine, and Logistic Regression. Of these models, the random forest algorithm performs best, with an attack detection accuracy of 91.4%. Moreover, the Random Forest model uses orthogonal and oblique classifiers to select the best classifiers for accurate detection of phishing attacks on websites.
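The comparison step described in this abstract (evaluate several classifiers, then keep the most accurate) can be sketched as follows. This is a minimal illustration, not the paper's code: the labels, the predictions, and the names `model_a`, `model_b` and `model_c` are made-up toy values, and the scores they produce are not results from the paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy ground truth: 1 = phishing, 0 = legitimate (hypothetical data).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]

# Toy predictions from three hypothetical, already-trained models.
toy_predictions = {
    "model_a": [1, 0, 1, 1, 0, 0, 1, 1],
    "model_b": [1, 0, 0, 1, 0, 1, 1, 1],
    "model_c": [1, 1, 1, 1, 0, 0, 0, 0],
}

# Score every model on the same held-out labels, then pick the best one.
scores = {name: accuracy(y_true, preds) for name, preds in toy_predictions.items()}
best_model = max(scores, key=scores.get)

print(scores)
print("best:", best_model)
```

In practice the predictions would come from classifiers trained on extracted website features and evaluated on a held-out test split; accuracy alone can also be misleading when phishing and legitimate samples are imbalanced.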

IRJET- Machine Learning Method to Detect the Phishing Websites

IRJET, 2020

Phishing continues to be one of the most commonly committed crimes in the world, since people unknowingly give away their personal details or other confidential information online. Malicious websites take advantage of such people to obtain personal details such as usernames, passwords and bank details, and use them either to steal secret information or to exploit the victims financially. As technology grows rapidly, we try to safeguard people from such crimes and educate them about such events. There are now various tools to detect and analyze these attacks. We have used Machine Learning, which is a powerful tool to fight phishing and related crimes. Random forest and logistic regression algorithms are used to obtain better results in detecting illegitimate sites.

IRJET- Phishing Websites Detection using Machine Learning Techniques

IRJET, 2021

Websites are now used for almost everything, from e-commerce to entertainment. The focus of this project is to determine whether a given website is fraudulent or legitimate. Conventionally, the browser protection service detects whether a website is harmful: if a site redirects to unusual or malicious sites, it is marked as harmful with a symbol before the URL. Even when the browser's firewall is enabled, however, it does not detect a phishing website, because a phishing site is not malicious in this sense; it steals data without the user knowing it. To detect such sites, we train an ML model using different algorithms to identify phishing sites based on URL feature extraction. Using different features of the URL, such as domain length and character length, we train the model with one algorithm at a time, store the results, compare them to find the most accurate algorithm, and display the results using the selected algorithm.
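URL feature extraction of the kind mentioned in this abstract can be sketched with the Python standard library. The abstract names only domain length and character length; the other features, the function name `extract_url_features`, and the example URL are assumptions chosen for illustration.

```python
from urllib.parse import urlparse


def extract_url_features(url: str) -> dict:
    """Return a few simple lexical features of a URL (illustrative set only)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),                           # total character length
        "domain_length": len(host),                       # length of the host name
        "num_dots": host.count("."),                      # rough proxy for subdomain depth
        "num_digits": sum(ch.isdigit() for ch in url),    # digits anywhere in the URL
        "has_at_symbol": "@" in url,                      # '@' can hide the real host
        "uses_https": parsed.scheme == "https",
    }


# Hypothetical URL used only to show the output shape.
features = extract_url_features("http://secure-login.example.com/verify?id=123")
print(features)
```

Each dictionary can then be turned into one row of a feature matrix and passed to whichever classifier is being compared.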

IRJET- Anti-Phishing Technology using Machine Learning Approach

IRJET, 2020

Phishing is a social engineering attack that aims at exploiting weaknesses found in the system at the user's end. Phishing attacks are the most common type of attack leveraging social engineering techniques. Attackers use emails and social media to trick victims into providing sensitive information or visiting a malicious URL (Uniform Resource Locator) in an attempt to compromise their systems. For individuals, the consequences include unauthorized purchases, theft of funds, or identity theft. An organization succumbing to such an attack typically sustains severe financial losses in addition to declining market share, reputation and consumer trust. The phishing problem is huge and no single solution minimizes all vulnerabilities effectively, so multiple techniques are implemented. In this paper, we discuss a Random Forest machine learning approach for detecting phishing websites. The first step is to extract various features of the URL such as domain license, elements, and idiosyncrasies; the legitimacy of the website is then checked by predicting the result. We make use of machine learning techniques and algorithms to evaluate these different features of URLs and websites. In this paper, an overview of these approaches is presented.

A Machine Learning Approach to Identifying Phishing Websites: A Comparative Study of Classification Models and Ensemble Learning Techniques

ICST Transactions on Scalable Information Systems

Phishing assaults are one of the more prevalent types of cybercrime in the world today. To steal information, attackers send users emails and messages, and they also use websites. Phishing primarily targets corporate websites, such as those for e-commerce, finance, and governmental organizations. To obtain sensitive user information, attackers impersonate legitimate websites. In addition to exploring the use of machine learning algorithms to identify and stop web phishing assaults, this research suggests using machine learning techniques to detect phishing URLs by analysing various aspects of the URLs. The study covers classification models such as Logistic Regression, Random Forest, Decision Trees, KNN, Naive Bayes and SVM, as well as ensemble learning techniques such as Gradient Boosting, XGBoost, Histogram Gradient Boosting, Light Gradient Boosting and AdaBoost, all applied to detecting phishing websites.

Detection of Phishing Websites Using Ensemble Machine Learning Approach

ITM Web of Conferences, 2021

In this paper, we propose the use of Ensemble Machine Learning Methods such as Random Forest Algorithm and Extreme Gradient Boosting (XGBOOST) Algorithm for efficient and accurate phishing website detection based on its Uniform Resource Locator. Phishing is one of the most widely executed cybercrimes in the modern digital sphere where an attacker imitates an existing - and often trusted - person or entity in an attempt to capture a victim’s login credentials, account information, and other sensitive data. Phishing websites are visually and semantically similar to real ones. The rise in online trading activities has resulted in a rise in the number of phishing scams. Cybersecurity jobs are the most difficult to fill, and the development of an automated system for phishing website detection is the need of the hour. Machine Learning is one of the most feasible methods to approach this situation, as it is capable of handling the dynamic nature of phishing techniques, in addition to prov...

IRJET- Detection and Prevention of Phishing Websites using Machine Learning Approach

IRJET, 2020

Phishing costs Internet users large sums of money every year. It exploits weaknesses on the user side, which is vulnerable to such attacks. The phishing problem is huge and no single solution minimizes all vulnerabilities effectively, so multiple techniques are implemented. In this paper, we discuss three approaches for detecting phishing websites. The first analyzes various features of the URL; the second checks the legitimacy of the website by finding out where it is hosted and who manages it; the third uses visual appearance based analysis to check the genuineness of the website. We make use of machine learning techniques and algorithms to evaluate these different features of URLs and websites. In this paper, an overview of these approaches is presented.

Phishing Website Detection Using Ensemble Learning

International Journal of Emerging Trends in Engineering Research, 2023

Phishing is the most common type of data breach. It is typically carried out by sending an email with links that lead to fraudulent websites. This technique especially targets large companies, and the attackers usually send emails containing work-related information. Machine learning is one of the most successful techniques for detecting phishing. This paper analyzes the results of various machine learning techniques for predicting phishing websites and describes the methods used to identify them, including the SVM classification method, the Random Forest method, and the AdaBoost method. An ensemble model that combines the SVM, Random Forest, and AdaBoost methods was able to classify phishing sites with an accuracy of 96%.
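The abstract does not say how the three models are combined. One common option is hard (majority) voting, sketched below as an assumption rather than as the paper's method; the per-model predictions are made-up toy values.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists by taking the most common label per sample.

    `predictions` is a list of equal-length label lists, one per model.
    With an odd number of models and binary labels there are no ties.
    """
    combined = []
    for votes in zip(*predictions):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Toy predictions (1 = phishing, 0 = legitimate) from three hypothetical models.
svm_pred = [1, 0, 1, 1]
rf_pred = [1, 1, 0, 1]
ada_pred = [0, 0, 1, 1]

ensemble_pred = majority_vote([svm_pred, rf_pred, ada_pred])
print(ensemble_pred)  # [1, 0, 1, 1]
```

Other combination schemes, such as averaging predicted probabilities or stacking, are equally plausible readings of "ensemble model".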

Prediction of phishing websites using machine learning

Spatial Information Research, 2022

With the growing popularity of information science, more applications are being integrated with websites that can be accessed directly through the internet. This has increased the opportunity for malicious actors to steal personal information. Several strategies have been presented to identify phishing attacks; however, there is still room for progress in the fight against phishing. The objective of this research paper is to develop a more accurate prediction model using Decision Tree (DT), Random Forest (RF) and Gradient Boosting Classifiers (GBC) with three feature selection techniques: Extra Tree (ET), Chi-Square and Recursive Feature Elimination (RFE). Since the phishing websites dataset contains 89 features, we applied the extra tree and chi-square feature selection methods to identify a limited set of important features, and then used recursive feature elimination to reduce the dataset to an optimal set of important features. We compared the performance of the developed models and found the best prediction performance with GBC, followed by RF and DT. The models are also assessed using R-square, Root Mean Square Error (RMSE), and Mean Absolute Error (MAE) in each case.
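To illustrate the chi-square step of the feature selection described above, the sketch below computes the classical Pearson chi-square statistic between one categorical feature and the class label. It is a simplified stand-in, not the paper's procedure: the function name `chi_square_score` and the toy columns are assumptions, and library implementations of chi-square feature scoring may use a different formulation.

```python
def chi_square_score(feature, label):
    """Pearson chi-square statistic of independence between two categorical columns."""
    n = len(feature)
    if n == 0 or n != len(label):
        raise ValueError("feature and label must be non-empty and equally long")

    # Observed counts for every (feature value, label value) pair.
    observed = {}
    for f, l in zip(feature, label):
        observed[(f, l)] = observed.get((f, l), 0) + 1

    f_values, l_values = set(feature), set(label)
    f_totals = {f: sum(observed.get((f, l), 0) for l in l_values) for f in f_values}
    l_totals = {l: sum(observed.get((f, l), 0) for f in f_values) for l in l_values}

    # Sum (observed - expected)^2 / expected over all cells of the table.
    score = 0.0
    for f in f_values:
        for l in l_values:
            expected = f_totals[f] * l_totals[l] / n
            score += (observed.get((f, l), 0) - expected) ** 2 / expected
    return score


# Toy data: label 1 = phishing, 0 = legitimate (hypothetical values).
label = [1, 1, 1, 0, 0, 0, 0, 1]
informative = [1, 1, 1, 0, 0, 0, 1, 0]    # mostly agrees with the label
uninformative = [1, 0, 1, 0, 1, 0, 1, 0]  # independent of the label

informative_score = chi_square_score(informative, label)
uninformative_score = chi_square_score(uninformative, label)
print(informative_score, uninformative_score)  # 2.0 0.0
```

Features would be ranked by this score and the highest-scoring ones kept before a further reduction step such as recursive feature elimination.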