Comparative Analysis for Detecting DNS Tunneling Using Machine Learning Techniques

A Survey of DNS Tunnelling Detection Techniques Using Machine Learning

2018

The Domain Name System (DNS) is an essential network service that translates human-friendly host names into numerical IP addresses. Almost any network communication is, most likely, preceded by an exchange with a DNS server. For this reason, DNS cyber-attacks are now among the most challenging threats facing the information security community: DNS is widely available, and because it was not intended for data transfer, it is rarely monitored for security. In particular, DNS tunnelling, which embeds data in DNS queries and responses, has received considerable attention from researchers in recent years. Recent studies have focused on detecting DNS tunnelling with machine learning. The aim of this paper is to provide a comprehensive survey of techniques recently proposed in the literature for detecting DNS tunnels using machine learning, highlighting their main findings and comparing their reported results. Keywords: Domain Name System, Cyber-attacks, Tun...

Real-Time Detection System for Data Exfiltration over DNS Tunneling Using Machine Learning

Electronics

The domain name system (DNS) plays a vital role in network services for name resolution. By default, this service is seldom blocked by security solutions. It has therefore been exploited for security breaches through DNS covert channels (tunnels). One of the most prominent current data leakage techniques is DNS tunneling, which uses DNS packets to exfiltrate sensitive and confidential data. Protecting data against stealthy exfiltration attacks is critical for individuals and organizations. As a result, many security measures have been proposed to address exfiltration attacks, ranging from security policies to dedicated security solutions such as firewalls and intrusion detection or prevention systems. In this paper, a hybrid DNS tunneling detection system is proposed based on the packet length and selected features of the network traffic. The proposed system takes advantage of the outcome results conducted using the testbed and Tabu-PIO feature selection algo...

Unsupervised Learning and Rule Extraction for DNS Tunneling Detection

Internet Technology Letters

The paper deals with k-means clustering and the Logic Learning Machine (LLM) for the detection of DNS tunneling. As the LLM shows more versatility in rule generation and classification precision than traditional Decision Trees, the approach proves robust to a large set of system conditions. The detection algorithm is designed to be applied to streaming data, without accurate tuning of the algorithm's parameters. An extensive performance evaluation is provided across different tunnelling tools and applications; silent intruders are also considered. Results show robustness on a test set that behaves differently from the training set.
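As a rough illustration of the clustering step only, the sketch below runs plain k-means (Lloyd's algorithm) on a handful of traffic windows. It is not the paper's pipeline: the rule-extraction stage (LLM) is omitted, and the two features per window (mean query length, queries per second), the sample values, and the choice of initial centroids are all assumptions made for this example. Features are left unscaled for brevity; in practice they would usually be normalized first.

```python
import math

def kmeans(points, centroids, iterations=20):
    """Plain Lloyd's k-means on equal-length numeric tuples."""
    centroids = [tuple(c) for c in centroids]
    assignment = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(old)  # leave an empty cluster where it was
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, assignment

# Hypothetical per-window features: (mean query length in bytes, queries per second).
windows = [(28, 2), (31, 3), (25, 1), (30, 2), (180, 40), (175, 38), (190, 45)]
centroids, labels = kmeans(windows, centroids=[windows[0], windows[-1]])
```

With these made-up values the short, infrequent queries and the long, frequent ones fall into separate clusters; which cluster corresponds to tunnelling still has to be decided by a later step, such as the rule extraction the paper describes.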

A hybrid method of genetic algorithm and support vector machine for DNS tunneling detection

International Journal of Electrical and Computer Engineering (IJECE), 2021

With the expansion of business over the internet, corporations are investing large amounts of money in web applications. However, various threats can leave corporations vulnerable to attack. One such threat, known as DNS tunneling, abuses the domain name protocol to pass harmful information. As a result, confidential information can be exposed and violated. Several studies have investigated machine learning in order to propose detection approaches. These approaches use many different types of features, such as domain length, number of bytes, content, volume of DNS traffic, number of hostnames per domain, geographic location and domain history. There is therefore a clear need for feature selection to identify the best features. This paper proposes a hybrid method that combines genetic algorithm feature selection with a support vector machine classifier in order to identify the features that best optimize the detection of DNS tunneling. To evaluate the proposed method, a benchmark dataset of DNS tunneling was used. Results showed that the proposed method outperformed the conventional SVM, achieving an F-measure of 0.946.
1. INTRODUCTION
Nowadays, the web is becoming essential for large organizations and end users to perform a wide range of transactions. Examples include the search, explore and reach operations used to access valuable medical, financial and educational information. In particular, many businesses now depend heavily on the internet for their daily transactions. This dependency poses a wide range of challenges. A significant one is information security, since confidential data belonging to critical parties, such as medical and military organizations, could be exposed.
Security threats take different forms. One of them exploits the domain name system (DNS) protocol to pass dangerous and malicious procedures; this attack is known as DNS tunneling [1]. DNS is characterized by its simplicity: it offers a straightforward way to access a particular server through its domain name instead of its IP address [2-5]. Because of this simplicity, attackers attempt to use it to create a tunnel for executing malicious scripts intended to capture confidential information, gain privileged access, or harm the server [6].
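To make the kind of per-query feature mentioned in this abstract (domain length, content) more concrete, here is a minimal sketch that derives a few lexical features from a single query name. The exact feature set, the function names, and the use of character entropy as a stand-in for "content" are assumptions for illustration; the paper's own features and its genetic algorithm selection step are not reproduced here.

```python
import math
from collections import Counter

def shannon_entropy(text):
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def query_features(qname):
    """Lexical features of one DNS query name (illustrative feature set)."""
    name = qname.rstrip(".").lower()   # drop the trailing root dot, if any
    labels = name.split(".")
    chars = name.replace(".", "")      # characters without the separators
    return {
        "length": len(name),
        "label_count": len(labels),
        "longest_label": max(len(label) for label in labels),
        "digit_ratio": sum(c.isdigit() for c in chars) / len(chars) if chars else 0.0,
        "entropy": shannon_entropy(chars),
    }

benign = query_features("www.example.com.")
suspect = query_features("mzxw6ytboi2dgnrvgm3tiobz.t.example.com")
```

Encoded payloads tend to produce longer names with a more uniform character distribution, so the second (made-up) name scores higher on length, longest label, and entropy than the first. A feature selection step would then decide which of these columns are worth keeping.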

Classifying DNS Tunneling Tools For Malicious DoH Traffic

2021 IEEE Symposium Series on Computational Intelligence (SSCI), 2021

Cyber adversaries continuously seek new ways to penetrate security systems and infect computer infrastructure. The past decade has witnessed a sharp increase in attacks targeting Domain Name System (DNS) servers, which store information about domain names and their corresponding IP addresses (zone files). Preventing such attacks therefore requires new methods that account for the attacks and their strategies. Researchers suggest that appropriate remedial actions against cyber attacks can be identified through detailed investigation of the environment of digital systems. Although initially cited as a solution to attacks such as DNS spoofing and DNS tunneling, DNS over HTTPS (DoH) has introduced novel privacy challenges. This paper therefore investigates machine learning models as solutions to DNS tunneling and DoH security issues, focusing on how well classifiers built with models frequently used by other researchers can distinguish between DNS tunneling types. The CIRA-CIC-DoHBrw-2020 data set is used for the experiments. The results confirm that the resulting models are good choices for detecting DNS tunneling attacks in DoH traffic. Model performance was evaluated using precision, recall, F1-score, accuracy, and the confusion matrix.
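The evaluation metrics named above can be computed directly from the confusion-matrix counts. The sketch below does this for the binary case (tunnel versus benign); the paper distinguishes several tunneling types, so this is a simplification, and the function names and the label vectors are invented for the example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, fn, tn

def classification_metrics(y_true, y_pred, positive=1):
    """Precision, recall, F1 and accuracy derived from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(y_true) if y_true else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}

# Made-up labels: 1 = tunnel traffic, 0 = benign traffic.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```

For a multi-class setting, the same counts are typically computed once per class (treating that class as positive) and then averaged.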

DNS Tunneling: a Review on Features

International Journal of Engineering & Technology, 2018

One of the significant threats facing the web today is DNS tunneling, an attack that exploits the domain name protocol in order to bypass security gateways. This can lead to the loss of critical information, which is disastrous for many organizations. Recently, researchers have paid more attention to machine learning techniques for addressing DNS tunneling. The performance of machine learning depends heavily on the features used. However, because a standard benchmark dataset for DNS tunneling is lacking, researchers have captured DNS tunneling features using different techniques. This paper presents a review of the features used for DNS tunneling.

Analysis and Investigation of Malicious DNS Queries Using CIRA-CIC-DoHBrw-2020 Dataset

The Domain Name System (DNS) is one of the earliest vulnerable network protocols, with various security gaps that have been exploited repeatedly over the last decades. DNS abuse is one of the most challenging threats for cybersecurity specialists. However, providing secure DNS remains a major challenge, as attackers use sophisticated methods to inject malicious code into DNS queries. Many researchers have explored different machine learning (ML) techniques to counter this challenge. However, several challenges and barriers to utilizing ML remain. This paper introduces a systematic approach for identifying malicious and encrypted DNS queries by examining the network traffic and deriving statistical characteristics. Afterward, implementing several ML methods:

DNS Tunneling: A Deep Learning based Lexicographical Detection Approach

ArXiv, 2020

Domain Name Service is a trusted protocol made for name resolution, but in recent years some approaches have been developed to use it for data transfer. DNS Tunneling is a method in which data is encoded inside DNS queries, allowing information exchange through the DNS. This characteristic is attractive to hackers, who exploit DNS Tunneling to establish bidirectional communication with malware-infected machines in order to exfiltrate data or send instructions in an obfuscated way. To detect these threats quickly and accurately, the present work proposes a detection approach based on a Convolutional Neural Network (CNN) with a minimal architecture complexity. Due to the lack of quality datasets for evaluating DNS Tunneling connections, we also present a detailed construction and description of a novel dataset that contains DNS Tunneling domains generated with five well-known DNS tools. Despite its simple architecture, the resulting CNN model correctly detected m...
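The statement that data is "encoded inside DNS queries" can be illustrated offline with a small encoder and decoder. This is only a sketch of the general idea, not the scheme of any particular tool from the dataset: the choice of base32, the function names, and the zone `t.example.com` are assumptions, and nothing is sent over the network. The 63-character label limit comes from RFC 1035; 253 is the usual limit for a full name in text form.

```python
import base64

MAX_LABEL = 63   # RFC 1035 limit on the length of one label
MAX_NAME = 253   # usual limit on a full name in text form

def encode_to_qname(payload: bytes, zone: str) -> str:
    """Pack a payload into the labels of a query name under `zone`."""
    # Base32 stays within letters and digits, which survive DNS case-insensitivity.
    text = base64.b32encode(payload).decode("ascii").rstrip("=").lower()
    labels = [text[i:i + MAX_LABEL] for i in range(0, len(text), MAX_LABEL)]
    qname = ".".join(labels + [zone])
    if len(qname) > MAX_NAME:
        raise ValueError("payload too large for a single query name")
    return qname

def decode_from_qname(qname: str, zone: str) -> bytes:
    """Recover the payload from a name produced by encode_to_qname."""
    data = qname[: -(len(zone) + 1)].replace(".", "").upper()
    data += "=" * (-len(data) % 8)  # restore the padding base32 expects
    return base64.b32decode(data)

zone = "t.example.com"  # hypothetical zone used only for this example
qname = encode_to_qname(bytes(range(100)), zone)
recovered = decode_from_qname(qname, zone)
```

Names built this way are long and look random, which is the lexical signal that character-level detectors such as the CNN described here try to pick up.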

Botnet Detection Based On Machine Learning Techniques Using DNS Query Data

Future Internet, 2018

In recent years, botnets have become one of the major threats to information security because they have been constantly evolving in both size and sophistication. A number of botnet detection measures, such as honeynet-based and Intrusion Detection System (IDS)-based approaches, have been proposed. However, IDS-based solutions that use signatures seem to be ineffective because recent botnets are equipped with sophisticated code update and evasion techniques. A number of studies have shown that anomaly-based botnet detection methods are more effective than signature-based methods because they do not require pre-built botnet signatures and can therefore detect new or unknown botnets. In this direction, this paper proposes a botnet detection model based on machine learning using Domain Name Service query data and evaluates its effectiveness using popular machine learning techniques. Experimental results show that machine learning algorithms can be used effectively in botnet detection and that the random forest algorithm produces the best overall detection accuracy, at over 90%.