Cloud-based multiclass anomaly detection and categorization using ensemble learning

A Deep Learning Ensemble for Network Anomaly and Cyber-Attack Detection

Sensors

Currently, expert systems and applied machine learning algorithms are widely used to automate network intrusion detection. In critical infrastructure applications of communication technologies, the interaction among various industrial control systems and the Internet environment intrinsic to the IoT technology makes them susceptible to cyber-attacks. Given the enormous volume of network traffic in critical Cyber-Physical Systems (CPSs), traditional machine learning methods for network anomaly detection are inefficient. Therefore, recently developed machine learning techniques, with an emphasis on deep learning, are being successfully applied to the detection and classification of anomalies at both the network and host levels. This paper presents an ensemble method that leverages deep models such as the Deep Neural Network (DNN) and Long Short-Term Memory (LSTM) and a meta-classifier (i.e., logistic regression) following the principle of stacked ge...

Machine Learning for Anomaly Detection and Categorization in Multi-cloud Environments

CSCloud, 2017

Cloud computing has been widely adopted by application service providers (ASPs) and enterprises to reduce both capital expenditures (CAPEX) and operational expenditures (OPEX). Applications and services previously running on private data centers are now being migrated to private or public clouds. Since most ASPs and enterprises have globally distributed user bases, their services need to be distributed across multiple clouds spread across the globe, which can achieve better performance in terms of latency, scalability and load balancing. This shift has led the research community to study multi-cloud environments. However, the widespread acceptance of such environments has been hampered by major security concerns. Firewalls and traditional rule-based security protection techniques are not sufficient to protect user data in multi-cloud scenarios. Recently, advances in machine learning techniques have attracted the attention of the research community to build intrusion detection systems (IDS) that can detect anomalies in network traffic. Most research works, however, do not differentiate among different types of attacks, even though this is necessary for choosing appropriate countermeasures and defenses. In this paper, we investigate both detecting and categorizing anomalies, rather than only detecting them as is common in contemporary research. We have used a popular publicly available dataset to build and test learning models for both detection and categorization of different attacks. To be precise, we have used two supervised machine learning techniques, namely linear regression (LR) and random forest (RF). We show that even if detection is perfect, categorization can be less accurate due to similarities between attacks. Our results demonstrate more than 99% detection accuracy and a categorization accuracy of 93.6%, with the inability to categorize some attacks. Further, we argue that such categorization can be applied to multi-cloud environments using the same machine learning techniques.
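The gap between detection and categorization accuracy described in this abstract can be illustrated with a minimal sketch. The labels and predictions below are hypothetical and are not taken from the paper's dataset; the function names are illustrative only. The point shown is that confusing one attack class with another leaves binary detection untouched while lowering multiclass accuracy.

```python
from collections import Counter

def detection_accuracy(y_true, y_pred, normal_label="normal"):
    """Binary accuracy: every non-normal label is collapsed into 'attack'."""
    hits = sum((t == normal_label) == (p == normal_label)
               for t, p in zip(y_true, y_pred))
    return hits / len(y_true)

def categorization_accuracy(y_true, y_pred):
    """Multiclass accuracy: the predicted category must match exactly."""
    hits = sum(t == p for t, p in zip(y_true, y_pred))
    return hits / len(y_true)

def confused_pairs(y_true, y_pred):
    """Count (true, predicted) pairs for every misclassified sample."""
    return Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)

# Hypothetical ground truth and predictions (assumption, for illustration).
y_true = ["normal", "dos", "dos", "exploit", "exploit", "fuzzer", "normal", "dos"]
y_pred = ["normal", "dos", "exploit", "exploit", "dos", "fuzzer", "normal", "dos"]

det = detection_accuracy(y_true, y_pred)
cat = categorization_accuracy(y_true, y_pred)
pairs = confused_pairs(y_true, y_pred)

print(f"detection accuracy:      {det:.2f}")
print(f"categorization accuracy: {cat:.2f}")
print(f"confused pairs:          {dict(pairs)}")
```

In this toy data every attack is still flagged as an attack, so detection is perfect, but two samples are assigned to the wrong attack category.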

Anomaly Detection Using XGBoost Ensemble of Deep Neural Network Models

Cybernetics and Information Technologies, 2021

Intrusion Detection Systems (IDSs) utilise deep learning techniques to identify intrusions with maximum accuracy and reduce false alarm rates. Feature extraction is also automated in these techniques. In this paper, different Deep Neural Network (DNN) models such as MultiLayer Perceptron (MLP), BackPropagation Network (BPN) and Long Short Term Memory (LSTM) are stacked into an ensemble to build a robust anomaly detection model. The performance of the ensemble model is analysed on two datasets, namely UNSW-NB15 and a campus-generated dataset named VIT_SPARC20. Other types of traffic, namely unencrypted normal traffic, normal encrypted traffic, encrypted and unencrypted malicious traffic, are captured in the VIT_SPARC20 dataset. Encrypted normal and malicious traffic of VIT_SPARC20 is categorised by the deep learning models without decrypting its contents, thus preserving the confidentiality and integrity of the data transmitted. XGBoost integrates the results of each deep learni...

HELAD: A novel network anomaly detection model based on heterogeneous ensemble learning

Computer Networks, 2020

Network traffic anomaly detection is an important technique for ensuring network security. However, existing machine learning based anomaly detection algorithms usually have three problems. First, most of the models are built for stale data sets, making them less adaptable in real-world environments. Second, most of the anomaly detection algorithms cannot relearn their models as the attack environment changes. Third, from the perspective of data multi-dimensionality, a single detection algorithm has a peak value and cannot be well adapted to the needs of a complex network attack environment. Thus, we propose a new anomaly detection framework based on the organic integration of multiple deep learning techniques. First, we use the Damped Incremental Statistics algorithm to extract features from network traffic. Second, we train an Autoencoder with a small amount of labeled data. Third, we use the Autoencoder to mark the abnormal score of network traffic. Fourth, the data with the abnormal score label is used to train the LSTM. Finally, a weighted method is used to get the final abnormal score. The experimental results show that our HELAD algorithm has better adaptability and accuracy than other state-of-the-art algorithms.
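The first and last steps listed in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the HELAD implementation: the abstract does not give the exact feature formulas or weights, so the class below uses one common formulation of a damped incremental statistic (a decayed weight, linear sum, and squared sum with decay factor 2^(-lambda * dt)), and the weight `ae_weight` is a hypothetical parameter.

```python
import math

class DampedStat:
    """Damped incremental statistic over a stream of timestamped values.

    Assumed formulation: the tuple (w, ls, ss) is decayed by
    2 ** (-decay * dt) before each new value is added, so older
    observations contribute less to the mean and standard deviation.
    """

    def __init__(self, decay):
        self.decay = decay      # decay rate (lambda); 0 disables damping
        self.w = 0.0            # decayed count of observations
        self.ls = 0.0           # decayed linear sum
        self.ss = 0.0           # decayed squared sum
        self.last_t = None      # timestamp of the previous update

    def update(self, x, t):
        if self.last_t is not None:
            factor = 2.0 ** (-self.decay * (t - self.last_t))
            self.w *= factor
            self.ls *= factor
            self.ss *= factor
        self.last_t = t
        self.w += 1.0
        self.ls += x
        self.ss += x * x

    def mean(self):
        return self.ls / self.w

    def std(self):
        m = self.mean()
        return math.sqrt(abs(self.ss / self.w - m * m))


def combined_score(ae_score, lstm_score, ae_weight=0.5):
    """Weighted sum of two anomaly scores; ae_weight is a hypothetical weight."""
    return ae_weight * ae_score + (1.0 - ae_weight) * lstm_score


# Packet sizes arriving one second apart (illustrative values).
stat = DampedStat(decay=1.0)
stat.update(10.0, t=0.0)
stat.update(20.0, t=1.0)
recent_mean = stat.mean()   # pulled toward the newer value

final = combined_score(ae_score=0.8, lstm_score=0.4, ae_weight=0.25)
```

With `decay=0.0` the statistic reduces to an ordinary running mean and population standard deviation, which is a convenient sanity check.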

Ensemble and Deep-Learning Methods for Two-Class and Multi-Attack Anomaly Intrusion Detection: An Empirical Study

International Journal of Advanced Computer Science and Applications

Cyber-security, as an emerging field of research, involves the development and management of techniques and technologies for the protection of data, information and devices. The need to protect network devices from internal and external attacks, threats and vulnerabilities has led to continuous research into Network Intrusion Detection Systems (NIDS). Therefore, an empirical study was conducted on the effectiveness of deep learning and ensemble methods in NIDS, contributing to knowledge by developing a NIDS through the implementation of machine and deep-learning algorithms in various forms on a recent network dataset that contains more recent attack types and attackers' behaviours (the UNSW-NB15 dataset). This research involves the implementation of a deep-learning algorithm, Long Short-Term Memory (LSTM), and two ensemble methods: a homogeneous method using an optimised bagged Random-Forest algorithm, and a heterogeneous method using an Averaged Probability method of Voting ensemble. The heterogeneous ensemble was based on four (4) standard classifiers with different computational characteristics (Naïve Bayes, kNN, RIPPER and Decision Tree). The respective model implementations were applied to the UNSW-NB15 dataset in two forms: as a two-class attack dataset and as a multi-attack dataset. LSTM achieved a detection accuracy rate of 80% on the two-class attack dataset and 72% on the multi-attack dataset. The homogeneous method had an accuracy rate of 98% and 87.4% on the two-class attack dataset and the multi-attack dataset, respectively. Moreover, the heterogeneous model had 97% and 85.23% detection accuracy rates on the two-class attack dataset and the multi-attack dataset, respectively.
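The averaged-probability voting rule used by the heterogeneous ensemble can be sketched in a few lines. The probabilities below are hypothetical stand-ins for the outputs of the four base classifiers named in the abstract; no trained models are involved, and the function name is illustrative.

```python
def average_probability_vote(prob_rows):
    """Average per-class probabilities across classifiers and pick the top class.

    prob_rows: list of dicts mapping class label -> probability,
               one dict per base classifier, all with the same labels.
    Returns (winning_label, averaged_probabilities).
    """
    labels = sorted(prob_rows[0])
    n = len(prob_rows)
    avg = {label: sum(row[label] for row in prob_rows) / n for label in labels}
    winner = max(labels, key=lambda label: avg[label])
    return winner, avg


# Hypothetical class probabilities for one network flow (assumption).
naive_bayes   = {"normal": 0.5, "dos": 0.3, "exploit": 0.2}
knn           = {"normal": 0.2, "dos": 0.6, "exploit": 0.2}
ripper        = {"normal": 0.3, "dos": 0.4, "exploit": 0.3}
decision_tree = {"normal": 0.4, "dos": 0.4, "exploit": 0.2}

winner, avg = average_probability_vote(
    [naive_bayes, knn, ripper, decision_tree]
)
print(winner, avg)
```

Unlike a hard majority vote, this rule lets a confident classifier outweigh several uncertain ones, since the full probability vectors are averaged before the final decision.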

Toward A Holistic, Efficient, Stacking Ensemble Intrusion Detection System using a Real Cloud-based Dataset

International Journal of Advanced Computer Science and Applications

Network intrusion detection is a key step in securing today's constantly developing networks. Various experiments have been put forward to propose new methods for resisting harmful cyber behaviors. However, as cyber-attacks become more complex, present methodologies fail to adequately solve the problem. Thus, network intrusion detection is now a significant decision-making challenge that requires an effective and intelligent approach. Various machine learning algorithms such as decision trees, neural networks, K nearest neighbor, logistic regression, support vector machine, and Naive Bayes have been utilized to detect anomalies in network traffic. However, such algorithms require adequate datasets to train and evaluate anomaly-based network intrusion detection systems. This paper presents a testbed that could be a model for building real-world datasets, as well as a newly generated dataset, derived from real network traffic, for intrusion detection. To utilize this real dataset, the paper also presents an ensemble intrusion detection model using a meta-classification approach enabled by stacked generalization to address the issue of detection accuracy and false alarm rate in intrusion detection systems.

CloudShield: Real-time Anomaly Detection in the Cloud

ArXiv, 2021

In cloud computing, it is desirable if suspicious activities can be detected by automatic anomaly detection systems. Although anomaly detection has been investigated in the past, it remains unsolved in cloud computing. Challenges are: characterizing the normal behavior of a cloud server, distinguishing between benign and malicious anomalies (attacks), and preventing alert fatigue due to false alarms. We propose CloudShield, a practical and generalizable real-time anomaly and attack detection system for cloud computing. CloudShield uses a general, pretrained deep learning model with different cloud workloads to predict the normal behavior and provide real-time and continuous detection by examining the model reconstruction error distributions. Once an anomaly is detected, to reduce alert fatigue, CloudShield automatically distinguishes between benign programs, known attacks, and zero-day attacks, by examining the prediction error distributions. We evaluate the proposed CloudShield on ...
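The idea of detecting anomalies from reconstruction errors can be illustrated with a deliberately simple stand-in. The abstract does not specify how CloudShield compares error distributions, so the sketch below assumes a plain mean-plus-k-standard-deviations threshold fitted on errors from normal workloads; the error values, the choice of `k`, and the function names are all hypothetical.

```python
import statistics

def fit_threshold(normal_errors, k=3.0):
    """Threshold = mean + k * population std of errors seen on normal workloads.

    This is an assumed rule for illustration, not the paper's method.
    """
    mean = statistics.mean(normal_errors)
    std = statistics.pstdev(normal_errors)
    return mean + k * std

def flag_anomalies(errors, threshold):
    """Return True for every reconstruction error above the threshold."""
    return [e > threshold for e in errors]


# Hypothetical reconstruction errors from a model trained on normal behavior.
normal_errors = [0.10, 0.12, 0.11, 0.09, 0.13, 0.10, 0.11, 0.12]
threshold = fit_threshold(normal_errors, k=3.0)

# Hypothetical errors observed at run time.
live_errors = [0.11, 0.14, 0.35, 0.12]
flags = flag_anomalies(live_errors, threshold)
print(f"threshold={threshold:.4f}, flags={flags}")
```

A fixed threshold like this is prone to false alarms under workload shifts, which is one reason a system may prefer to compare whole error distributions instead.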

Detection of Anomaly using Machine Learning: A Comprehensive Survey

International Journal of Emerging Technology and Advanced Engineering

Anomaly detection is an important element in the domain of security. As a result, we undertook a literature review on ML algorithms that identify abnormalities. In this paper, we present a review of 101 research articles describing ML techniques for anomaly detection published between 2015 and 2022. The goal of this paper is to review research papers that have used machine learning to develop anomaly detection algorithms. The forms of anomaly detection examined in this study include system log anomaly detection, network anomaly detection, cloud-based anomaly detection, and anomaly detection in the medical profession. After assessing the selected research articles, we present more than 10 applications of anomaly detection. Also, we have shared a range of datasets used in anomaly detection research, in addition to revealing 30+ new ML models employed in anomaly detection. We have discovered 55 new datasets for anomaly detection. We've noticed that the majority of researchers ...

Classification Ensemble Based Anomaly Detection in Network Traffic

Review of Computer Engineering Research, 2019

Recently, the expansion of information technologies and the exponential increase of digital data have deepened the security and confidentiality issues in computer networks. In the Big Data era, information security has become a main direction of scientific research, and Big Data analytics is considered the main tool for solving information security issues. Anomaly detection is one of the main issues in data analysis and is widely used for detecting network threats. The potential sources of outliers can be noise and errors, events, and malicious attacks on the network. In this work, a short review of network anomaly detection methods is given and related works are examined. In the article, a more exact and simple multi-classifier model is proposed for anomaly detection in network traffic based on Big Data. Experiments have been performed on the NSL-KDD data set using Weka. The offered model has shown decent results in terms of anomaly detection accuracy. Contribution/Originality: This study proposed a multi-classifier model for increasing anomaly detection accuracy in network traffic. The model consists of the J48, LogitBoost, IBk, AdaBoost, and RandomTree classifiers. This work performed a comparative analysis of the classifiers used and their combination to see which one gives the best result. In the study, the classifiers and their combination were implemented on the NSL-KDD open source dataset using the WEKA tool. The results show that the ensemble classifiers provide better results than using these classifiers individually. Computer network traffic analysis employing our model can help network engineers and administrators to create a more reliable network, avoid possible discharges and take precautionary measures.

Ensemble based Effective Intrusion Detection System for Cloud Environment over UNSW-NB15 Dataset

Soft Computing Research Society eBooks, 2021

Advanced computing innovations are rapidly evolving, resulting in the advent of new organizational and operational strategies. Cloud computing has emerged as one of the pre-eminent innovations of recent years. Cloud computing enables its clients to access a flexible, distributed computing domain via the Internet. The cloud has manifested itself as a viable framework that facilitates the use of application domains, data and infrastructural facilities that mainly encompass workstations, network and storage infrastructure. Regardless of robust and comprehensive server processing capabilities in contrast to clients' processing capabilities and efficiency, there are numerous security risks to the cloud from both outside and within the cloud that might exploit security flaws to cause damage. Traditional security measures have some flaws when it comes to completely shielding networks and devices from increasingly advanced attacks. Consequently, it is essential to build an intrusion detection system to detect and prevent all kinds of intrusions in the cloud with high accuracy and low false alarm rates. In this study, we have suggested an anomaly-based intrusion detection system that employs ML algorithms for detection of unknown malicious attacks using an ensemble approach over the UNSW-NB15 dataset. The experimental output demonstrated an accuracy of 99.28% and 99.47% for the random forest and bagging algorithms, respectively.