Toward A Holistic, Efficient, Stacking Ensemble Intrusion Detection System using a Real Cloud-based Dataset

A Stacking Ensemble for Network Intrusion Detection Using Heterogeneous Datasets

Security and Communication Networks, 2020

The problem of network intrusion detection poses innumerable challenges to the research community, industry, and commercial sectors. Moreover, the persistent attacks occurring on the cyber-threat landscape compel researchers to devise robust approaches to address the recurring problem. Given the presence of massive network traffic, conventional machine learning algorithms are quite ineffective when applied to network intrusion detection. Instead, a hybrid multimodel solution improves performance, thereby producing reliable predictions. Therefore, this article presents an ensemble model using a metaclassification approach enabled by stacked generalization. Two contemporary and heterogeneous datasets were used for experimentation: UNSW NB-15, a packet-based dataset captured in an emulated network traffic environment, and UGR’16, a flow-based dataset captured in a real one. Empirical results indicate that the proposed stacki...

Ensemble Classifiers for Network Intrusion Detection Using a Novel Network Attack Dataset

Future Internet

Due to the extensive use of computer networks, new risks have arisen, and improving the speed and accuracy of security mechanisms has become a critical need. Although new security tools have been developed, the fast growth of malicious activities continues to be a pressing issue that creates severe threats to network security. Classical security tools such as firewalls are used as a first-line defense against security problems. However, firewalls do not entirely or perfectly eliminate intrusions. Thus, network administrators rely heavily on intrusion detection systems (IDSs) to detect such network intrusion activities. Machine learning (ML) is a practical approach to intrusion detection that, based on data, learns how to differentiate between abnormal and regular traffic. This paper provides a comprehensive analysis of some existing ML classifiers for identifying intrusions in network traffic. It also produces a new reliable dataset called GTCS (Game Theory and Cyber Security) that ...

A Stacked Generalization Ensemble Approach for Improved Intrusion Detection

IJCSIS Vol. 18 No. 5 May 2020 Issue, 2020

Classical machine learning techniques have been widely employed in intrusion detection. However, due to the rising number and sophistication of attacks, more advanced machine learning techniques, including ensemble-based methods, neural networks, and deep learning, have been applied. There is still a need for an improved machine learning approach that detects attacks more effectively and efficiently. The stacked generalization approach has been shown to be capable of learning from features and meta-features, but it has been limited by the deficiencies of base classifiers and a lack of optimization in the choice of meta-feature combination. This paper therefore proposes a stacked generalization ensemble approach based on a two-tier meta-learner, in which the outputs of a classical stacked ensemble are passed to a multi-feature-based stacked ensemble, which is optimized using a grid-search approach. Nine data features and four meta-features derived from Logistic Regression, Support Vector Machine, Naïve Bayes, and Multilayer Perceptron neural network are used for the machine learning classification task. By applying neural networks as the meta-learner for the classification of NSL-KDD data, improved accuracy, precision, recall, and F-measure of 0.97, 0.98, 0.98, and 0.98, respectively, are achieved.
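As a rough illustration of the stacked generalization idea summarized above, the sketch below trains base learners, collects their out-of-fold predictions as meta-features, and fits a meta-learner on those predictions. It is a toy in plain Python, not the paper's implementation: the one-feature threshold "stumps", the lookup-table meta-learner, the helper names (`fit_stump`, `fit_meta`, `fit_stack`), and the hand-built data are all assumptions made for illustration, standing in for the Logistic Regression, SVM, Naïve Bayes, and MLP base models and the neural network meta-learner named in the abstract.

```python
from collections import Counter, defaultdict
from itertools import product


def fit_stump(X, y, j):
    """Hypothetical base learner: threshold feature j midway between class means.

    Assumes the positive class has the larger mean on feature j.
    """
    pos = [x[j] for x, t in zip(X, y) if t == 1]
    neg = [x[j] for x, t in zip(X, y) if t == 0]
    thr = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x[j] > thr else 0


def fit_meta(Z, y):
    """Hypothetical meta-learner: majority label for each tuple of base predictions."""
    votes = defaultdict(Counter)
    for z, t in zip(Z, y):
        votes[z][t] += 1
    table = {z: c.most_common(1)[0][0] for z, c in votes.items()}
    default = Counter(y).most_common(1)[0][0]  # fallback for unseen tuples
    return lambda z: table.get(z, default)


def fit_stack(X, y, n_features, k=3):
    """Two-level stack: out-of-fold base predictions feed the meta-learner."""
    Z = [None] * len(X)
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        Xtr, ytr = [X[i] for i in train], [y[i] for i in train]
        bases = [fit_stump(Xtr, ytr, j) for j in range(n_features)]
        # Predict only on the held-out fold so the meta-learner never sees
        # predictions made on a base learner's own training rows.
        for i in range(fold, len(X), k):
            Z[i] = tuple(b(X[i]) for b in bases)
    meta = fit_meta(Z, y)
    # Refit the base learners on all data for use at prediction time.
    bases = [fit_stump(X, y, j) for j in range(n_features)]
    return lambda x: meta(tuple(b(x) for b in bases))


# Toy data (an assumption): label is 1 when at least two of three features are on.
X = list(product([0.0, 1.0], repeat=3)) * 3
y = [1 if sum(x) >= 2 else 0 for x in X]

stack = fit_stack(X, y, n_features=3)
single = fit_stump(X, y, 0)

# Accuracy on the training data itself, which is enough for a toy comparison.
stack_acc = sum(stack(x) == t for x, t in zip(X, y)) / len(X)
single_acc = sum(single(x) == t for x, t in zip(X, y)) / len(X)
```

On this toy data no single stump can represent the label, while the meta-learner can combine the three stump outputs, which is the intuition behind stacking. Real intrusion data would of course need a held-out test set.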

A Stacked Ensemble Intrusion Detection Approach for the Protection of Information System

International Journal for Information Security Research, 2020

Cyber attackers work around the clock to compromise the availability, confidentiality, and integrity of information systems, so protecting these systems has been a great challenge to network administrators. An intrusion detection system (IDS) analyses network traffic to detect and alert on any attempt to compromise computer systems and their resources, while a stacked ensemble builds synergy among two or more IDS models to improve intrusion detection accuracy. This research focuses on the application of stacked ensembles to the development of enhanced IDSs for the protection of information systems. Relevant features of the UNSW-NB15 intrusion detection dataset were selected to train three base machine learning algorithms, K Nearest Neighbor, Naïve Bayes, and Decision Tree, to build the base predictive models. The Decision Tree model, with features selected by the Information Gain feature selection technique, recorded the highest classification accuracy on evaluation with the test dataset. Three meta algorithms, Multi Response Linear Regression (MLR), Meta Decision Tree (MDT), and Multiple Model Trees (MMT), were trained with the predictions of the base predictive models to build the stacked ensemble. The Python programming language was used for the implementation of the ensemble models. The stacked ensemble recorded a classification accuracy improvement of 3.0% over the highest accuracy recorded by the base models and 5.11% over the lowest. False alarm improvements of 0.89% and 3.29% were recorded by the stacked ensemble over the highest and lowest false alarm rates recorded by the base models, respectively. The evaluation of this work shows a great improvement over the reviewed works in the literature.

Ensuring network security with a robust intrusion detection system using ensemble-based machine learning

ARRAY, 2023

Intrusion detection is a critical aspect of network security to protect computer systems from unauthorized access and attacks. The capacity of traditional intrusion detection systems (IDS) to identify unknown sophisticated threats is constrained by their reliance on signature-based detection. Approaches based on machine learning have shown promising results in identifying unknown malicious attacks. However, no model based on a single learning algorithm is able to accurately and consistently detect all kinds of attacks. In addition, existing models are typically tested on a single dataset. In this research, a novel ensemble-based machine learning technique for intrusion detection is presented. Numerous public datasets and multiple ensemble strategies, including Random Forest, Gradient Boosting, AdaBoost, Gradient XGBoost, Bagging, and Simple Stacking, are employed to evaluate the performance of the proposed approach. The most relevant features for the detection of intrusion are selected using correlation analysis, mutual information, and principal component analysis. Our research using different ensemble methods demonstrates that the proposed approach using the Random Forest technique outperforms existing approaches in terms of accuracy and FPR, with accuracy typically exceeding 99% and with better values for evaluation metrics such as Precision, Recall, F1-score, Balanced Accuracy, and Cohen's Kappa. This strategy may be a useful tool for strengthening the safety of computer systems and networks against emerging cyber threats.

Towards Effective Network Intrusion Detection: From Concept to Creation on Azure Cloud

IEEE Access, 2021

Network Intrusion Detection is one of the most researched topics in the field of computer security. Hacktivists use sophisticated tools to launch numerous attacks that hamper the confidentiality, integrity and availability of computer resources. There is an incessant need to safeguard these resources to avoid further damage. In the proposed study, we have presented a meta-classification approach using decision jungle to perform both binary and multiclass classification. We have established the robustness of our approach by configuring an optimal set of hyper-parameters coupled with relevant feature subsets using a production-ready environment namely Azure machine learning. We have validated the efficiency of the proposed design using three contemporary datasets namely UNSW NB-15, CICIDS 2017, and CICDDOS 2019. We could achieve an accuracy of 99.8% pertaining to UNSW NB-15 whereas the accuracy in the case of CICIDS 2017 and CICDDOS 2019 datasets has been 98% and 97% respectively. A distinctive ability of the proposed model lies in its finesse to detect thirty-three modern attack types considerably well. Unlike conventional stacking ensembles, the proposed solution relies on a train-test ratio of 40:60 to establish the legitimacy of predictions. We also conducted statistical significance tests to compare the performance of classifiers involved in the study. To extend the functionalities further, we have automated the proposed model that can be a reliable candidate for real-time network intrusion detection.

Intrusion Detection and Attack Classification using an Ensemble Approach

2020

The challenges of ensuring safe and trusted communication of information between organizations have increased manyfold in the recent past. Intrusion detection systems such as firewalls, message encryption, and other approaches are being employed with partial success; however, the risk of malicious intrusions still poses a threat. We propose to make use of recent advancements in the field of machine learning to develop an intrusion detection system. In our work, the machine learning classifiers random forest, decision table, multi-layer perceptron, and naive Bayes were used in an ensemble model, showing a significant improvement in overall accuracy. The proposed approach was implemented using a benchmark dataset from KDDCup.

An Ensemble of Classification Techniques for Intrusion Detection Systems

IJCSIS Vol 17 No 11 November Issue, 2019

Mitigating intrusions into a network has become a great concern for network security scholars, as intrusions pose a threat to the confidentiality, integrity, and availability of the stored data and degrade the services rendered by the network. Several researchers have proposed diverse techniques in order to curb intrusions into a network using various mechanisms. One of the mechanisms used is data mining. However, some of these systems have high false positive rates and relatively low detection rates, which signifies a flaw in the system. In order to drastically reduce the false positive rate and achieve a higher detection rate whilst maintaining computational efficiency, a stacking ensemble using random forest, naïve Bayes, and C4.5 classifiers as base learners and a support vector machine as the meta learner was proposed. The proposed stacking ensemble has a detection rate of 99.5% and a false positive rate of 0.6%. Compared to existing frameworks, the proposed ensemble performed better in detecting intrusions. Keywords: data mining, ensemble, false positive rate, intrusions, stacking
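For readers unfamiliar with the two headline metrics, the sketch below shows how detection rate and false positive rate are conventionally computed from binary labels, using the standard definitions DR = TP / (TP + FN) and FPR = FP / (FP + TN). The function name and the sample labels are assumptions for illustration and are not taken from the paper.

```python
def detection_metrics(y_true, y_pred):
    """Return (detection_rate, false_positive_rate) for labels 1=attack, 0=normal."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1          # attack correctly flagged
        elif t == 1 and p == 0:
            fn += 1          # attack missed
        elif t == 0 and p == 1:
            fp += 1          # normal traffic flagged (false alarm)
        else:
            tn += 1          # normal traffic passed
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_positive_rate


# Hypothetical labels: 4 attacks (3 caught) and 6 normal flows (1 false alarm).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
dr, fpr = detection_metrics(y_true, y_pred)
```

Reporting both numbers matters because a detector can raise its detection rate simply by flagging more traffic, at the cost of more false alarms.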

Statistical performance assessment of supervised machine learning algorithms for intrusion detection system

2024

Several studies have shown that an ensemble classifier's effectiveness is directly correlated with the diversity of its members. However, one of the issues encountered when using a stacking ensemble is the choice of algorithms used to build the base learners. Given the number of options, choosing the best ones might be challenging. In this study, we selected some of the most extensively applied supervised machine learning algorithms and performed a performance evaluation in terms of well-known metrics and validation methods using two internet of things (IoT) intrusion detection datasets, namely network-based anomaly internet of things (N-BaIoT) and internet of things intrusion detection dataset (IoTID20). Friedman and Dunn's tests are used to statistically examine the significant differences between the classifier groups. The goal of this study is to encourage security researchers to develop an intrusion detection system (IDS) using ensemble learning and to propose an appropriate method for selecting diverse base classifiers for a stacking-type ensemble. The performance results indicate that adaptive boosting, gradient boosting (GB), gradient boosting machines (GBM), light gradient boosting machines (LGBM), extreme gradient boosting (XGB), and deep neural network (DNN) classifiers exhibit a better trade-off between the performance parameters and classification time, making them ideal choices for developing anomaly-based IDSs.
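To make the statistical comparison step more concrete, the sketch below computes the Friedman chi-square statistic from a table of classifier scores, using the textbook form chi2 = 12 / (n k (k + 1)) * sum(R_j^2) - 3 n (k + 1), where R_j is the rank sum of classifier j over n datasets or folds. It omits the tie correction and the p-value lookup, and the helper names and score values are assumptions for illustration, not results from the study.

```python
def average_ranks(row):
    """Rank one row of scores, 1 = best (highest); tied scores share the mean rank."""
    order = sorted(range(len(row)), key=lambda j: -row[j])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of scores tied with position i.
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for m in range(i, j + 1):
            ranks[order[m]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(scores):
    """scores[i][j] is the score of classifier j on dataset (or fold) i."""
    n, k = len(scores), len(scores[0])
    rank_sums = [0.0] * k
    for row in scores:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    chi2 = 12.0 / (n * k * (k + 1)) * sum(R * R for R in rank_sums) - 3.0 * n * (k + 1)
    mean_ranks = [R / n for R in rank_sums]
    return chi2, mean_ranks


# Hypothetical accuracies of three classifiers on four folds.
scores = [
    [0.95, 0.90, 0.85],
    [0.96, 0.91, 0.80],
    [0.94, 0.92, 0.88],
    [0.97, 0.93, 0.86],
]
chi2, mean_ranks = friedman_statistic(scores)
```

The statistic is compared against a chi-square distribution with k - 1 degrees of freedom; if the null hypothesis of equal performance is rejected, a post-hoc procedure such as Dunn's test identifies which classifier groups differ.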

A Machine Learning Approach for Intrusion Detection using Ensemble Technique-A Survey

An intrusion detection system is a machine or software that monitors the traffic in a network and, on detecting a malicious packet, informs the user or a specific acting unit, which can take further action and prevent the malicious packet from entering the network. In a network intrusion, multiple computing nodes may be attacked by intruders. Evidence of intrusions has to be gathered from all such attacked nodes. An intruder may move between multiple nodes in the network to conceal the origin of the attack, or misuse some compromised hosts to launch the attack on other nodes. To detect such intrusion activities spread over the whole network, we present a new intrusion detection system (IDS) that classifies data with three different classifiers and an ensemble technique that takes the majority vote of the three classifiers to label each packet in the network as anomalous or normal. In this paper, we discuss different ways to implement intelligent IDS, which classifies the normal traffic...
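The majority-vote scheme described in this abstract can be sketched in a few lines. The three rule-based classifiers, their thresholds, and the packet fields below are hypothetical placeholders chosen for illustration; the survey does not specify them here.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label chosen by most classifiers (an odd count avoids ties)."""
    return Counter(labels).most_common(1)[0][0]


# Three hypothetical classifiers over a packet summary dict.
def by_size(pkt):
    return "anomaly" if pkt["bytes"] > 1500 else "normal"


def by_syn_rate(pkt):
    return "anomaly" if pkt["syn_rate"] > 0.5 else "normal"


def by_port(pkt):
    return "normal" if pkt["dst_port"] in {80, 443, 53} else "anomaly"


def classify(pkt, classifiers=(by_size, by_syn_rate, by_port)):
    """Label a packet by the majority decision of the given classifiers."""
    return majority_vote([clf(pkt) for clf in classifiers])


# Two of the three classifiers flag this packet, so the ensemble does too.
suspicious = {"bytes": 4000, "syn_rate": 0.9, "dst_port": 80}
benign = {"bytes": 400, "syn_rate": 0.1, "dst_port": 8081}
```

Unlike stacking, voting uses a fixed combination rule, so no second-level model has to be trained.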