Machine-Learning Based Approaches for Anomaly Detection and Classification in Cellular Networks

Abnormal Behavior Detection Based on Traffic Pattern Categorization in Mobile Networks

IEEE Transactions on Network and Service Management, 2021

Abnormal behavior in mobile cellular networks can cause network faults and consequent cell outages, a major source of increased operational cost and lost revenue for operators. Nonetheless, network faults and cell outages can be avoided by monitoring abnormal situations in the network and acting accordingly. Thus, anomaly detection is an important component of self-healing control and network management. Network operators may also use the detected abnormal behavior to quantify its intensity numerically. This quantification helps characterize regions that are candidates for infrastructure updates and supports the creation of public policies for local connectivity enhancements. We propose an unsupervised learning solution for anomaly detection in mobile networks using Call Detail Records (CDR) data. We evaluate our solution using a real CDR data set provided by an Italian operator and compare it against other state-of-the-art solutions, showing a performance improvement of around 35%. We also demonstrate the relevance of considering the distinct traffic patterns of diverging geographic areas for anomaly detection in mobile networks, an aspect often ignored in the literature.

On monitoring and predicting mobile network traffic abnormality

Simulation Modelling Practice and Theory, 2014

Traffic analysis and traffic abnormality detection have emerged in recent years as an efficient way of detecting network attacks. The existing approaches can be improved by introducing a new model and a new analysis method for network users' traffic behavior. Current approaches describe users' traffic behavior with a high number of dimensions, resulting in high processing complexity, high delay in separating an individual user's abnormal traffic behavior from massive network data, and a low detection rate. To improve the detection rate and efficiency, we develop a new method of establishing a user traffic behavior analysis system based on a new model of network traffic monitoring. First, we establish a more complete feature set, based on the characteristics of network traffic, to describe the behavior of massive numbers of network users. Then, we define a feature selection rule based on the relative deviation distance to select the optimized feature set. We use the selected feature set to locate the moment of abnormality and the users who produce the abnormal traffic behavior. Finally, a prediction-based traffic behavior analysis method is developed to improve the efficiency of the system. This new method is applied to evaluate mobile users on a mobile cloud. The experimental results show that the proposed method has a higher detection rate and lower delay in analyzing abnormal user traffic behavior than the existing approaches.

Call Detail Records Driven Anomaly Detection and Traffic Prediction in Mobile Cellular Networks

IEEE Access, 2018

Mobile networks possess information about the users as well as the network. Such information is useful for making the network end-to-end visible and intelligent. Big data analytics can efficiently analyze user and network information and unearth meaningful insights with the help of machine learning tools. Utilizing big data analytics and machine learning, this work contributes in three ways. First, we utilize the call detail records (CDR) data to detect anomalies in the network. For authentication and verification of anomalies, we use k-means clustering, an unsupervised machine learning algorithm. Through effective detection of anomalies, we can proceed to suitable design for resource distribution as well as fault detection and avoidance. Second, we prepare anomaly-free data by removing anomalous activities and train a neural network model. By passing anomalous and anomaly-free data through this model, we observe the effect of anomalous activities on the training of the model and also observe the mean square error of the anomalous and anomaly-free data. Lastly, we use an autoregressive integrated moving average (ARIMA) model to predict future traffic for a user. Through simple visualization, we show that anomaly-free data lets the learning models generalize better and perform better on the prediction task.
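
The k-means step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the hourly call counts are invented toy values, the function names are hypothetical, and the rule "the smallest cluster is the anomalous one" is an assumption that only holds when anomalies are rare.

```python
def kmeans_1d(values, k=2, iters=50):
    """Lloyd's algorithm on scalar values (k >= 2); returns (centroids, labels)."""
    ordered = sorted(values)
    # Deterministic initialisation: spread starting centroids across the sorted values.
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    labels = []
    for _ in range(iters):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


def flag_minority_cluster(values, k=2):
    """Return indices of the values in the smallest cluster (assumed anomalous)."""
    _, labels = kmeans_1d(values, k)
    sizes = [labels.count(c) for c in range(k)]
    rare = sizes.index(min(sizes))
    return [i for i, lab in enumerate(labels) if lab == rare]


# Toy data: calls per hour for one cell, with two invented spikes.
hourly_calls = [52, 48, 55, 50, 47, 53, 49, 51, 310, 50, 46, 295]
anomalous_hours = flag_minority_cluster(hourly_calls)
print(anomalous_hours)
```

Removing the flagged hours would give the kind of anomaly-free series the abstract describes feeding to the later prediction models.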

Deep RAN: A Scalable Data-driven platform to Detect Anomalies in Live Cellular Network Using Recurrent Convolutional Neural Network

2020 IEEE 18th World Symposium on Applied Machine Intelligence and Informatics (SAMI), 2020

In this paper, we propose a novel algorithm to detect anomalies in Key Parameter Indicators (KPIs) over live cellular networks, based on a combination of Recurrent Neural Networks (RNNs) and Convolutional Neural Networks (CNNs) known as Recurrent Convolutional Neural Networks (R-CNN). The CNN models the spatial correlations and information, whereas the RNN models the temporal correlations and information. Hence, adopting R-CNN provides a spatial-temporal analysis. The studied cellular network consists of 2G, 3G, 4G, and 4.5G technologies, and the KPIs include the voice and data traffic of the cells. Data and voice traffic are extremely important to the owner of a wireless network because they are directly related to revenue and to the quality of service that users experience. Traffic changes happen for several reasons: subscriber behavior changes due to special events, neighbor sites going on-air or down, or traffic shifting to other technologies, e.g. from 3G to 4G. Traditionally, to keep the network stable, the traffic must be observed layer by layer during each interval to detect major changes in KPIs; in large-scale telecommunication networks this is too time-consuming and yields low anomaly-detection accuracy. The proposed algorithm, however, is capable of detecting the abnormal KPIs for each element of the network in a time-efficient and accurate manner. It observes the traffic layer trends and classifies them into 8 traffic categories: Normal, Suddenly Increasing, Gradually Increasing, Suddenly Decreasing, Gradually Decreasing, Faulty Site, New Site, and Down Site. This classification task enables vendors and operators to detect anomalies in their live networks in order to keep the KPIs on a normal trend. The algorithm is trained and tested on the real dataset over a cellular network with more than 25000 thousand.

A Distribution-Based Approach to Anomaly Detection and Application to 3G Mobile Traffic

GLOBECOM 2009 - 2009 IEEE Global Telecommunications Conference, 2009

In this work we present a novel scheme for statistical-based anomaly detection in 3G cellular networks. The traffic data collected by a passive monitoring system are reduced to a set of per-mobile user counters, from which time-series of unidimensional feature distributions are derived. An example of feature is the number of TCP SYN packets seen in uplink for each mobile user in fixed-length time bins. We design a change-detection algorithm to identify deviations in each distribution time-series. Our algorithm is designed specifically to cope with the marked non-stationarities, daily/weekly seasonality and long-term trend that characterize the global traffic in a real network. The proposed scheme was applied to the analysis of a large dataset from an operational 3G network. Here we present the algorithm and report on our practical experience with the analysis of real data, highlighting the key lessons learned in the perspective of the possible adoption of our anomaly detection tool on a production basis.
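
The idea of tracking a time series of per-user feature distributions can be sketched as follows. This is a simplified illustration, not the paper's algorithm: the Jensen-Shannon divergence, the fixed threshold, the three-bin rolling baseline, the histogram edges and the SYN counts are all assumptions, and the sketch does not handle the seasonality or long-term trend that the abstract says the real scheme copes with.

```python
import math
from collections import deque


def histogram(counts, edges):
    """Empirical distribution of per-user counters over fixed bin edges."""
    hist = [0] * (len(edges) + 1)
    for c in counts:
        hist[sum(1 for e in edges if c >= e)] += 1  # index = number of edges at or below c
    return [h / len(counts) for h in hist]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits, bounded in [0, 1]."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def detect_changes(bins, edges, window=3, threshold=0.1):
    """Flag time bins whose distribution deviates from a rolling baseline."""
    reference = deque(maxlen=window)  # histograms of recent unflagged bins
    flagged = []
    for t, counts in enumerate(bins):
        hist = histogram(counts, edges)
        if len(reference) == window:
            baseline = [sum(col) / window for col in zip(*reference)]
            if js_divergence(hist, baseline) > threshold:
                flagged.append(t)
                continue  # keep flagged bins out of the baseline
        reference.append(hist)
    return flagged


# Toy data: uplink TCP SYN packets per mobile user, one list per time bin.
syn_bins = [
    [0, 1, 1, 2, 2, 3, 1, 0, 2, 1],
    [1, 0, 2, 1, 3, 1, 2, 0, 1, 6],
    [2, 1, 0, 1, 1, 2, 3, 1, 0, 0],
    [1, 2, 0, 1, 2, 1, 1, 3, 0, 2],
    [0, 1, 40, 55, 2, 60, 1, 45, 2, 50],  # invented burst of high-SYN users
    [1, 1, 0, 2, 3, 0, 1, 2, 1, 1],
]
changed_bins = detect_changes(syn_bins, edges=[1, 5, 20])
print(changed_bins)
```

Comparing whole distributions rather than a single aggregate lets a change confined to a subset of users stand out even when total traffic looks ordinary.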

Network Anomaly Detection in 5G Networks

On the telecommunications front, 5G is the fifth-generation technology standard for broadband cellular networks and the successor to the 4G networks used by most current phones. Hundreds of businesses, organizations, and governments suffer from cyberattacks that compromise sensitive information, and 5G networks are among the targets. Such data breaches would not occur if there were a way to detect strange behavior in a 5G network, and this is what this paper presents. Network Anomaly Detection (NAD) in 5G is a way to observe the network constantly in order to detect any unusual behavior. However, it is not straightforward; it is a complex process because network traffic patterns are huge, continuous, and stochastic. In the literature, several approaches and methods have been employed for anomaly detection as well as prediction. This paper illustrates state-of-the-art methods proposed to achieve NAD. For instance, pattern-based, machine-learning-based, ensemble-learning-based, user-intention-based, and some integrated methods have been surveyed and analyzed. The KNN and K-prototype algorithms were tested together on the dataset and compared with an integrated approach. The integrated approach outperformed the KNN and K-prototype methods. In conclusion, forecasting of analyst detection of cyber events is presented as a final method for future anomaly prediction.

Machine Learning based Anomaly Detection for 5G Networks

2020

Protecting the networks of tomorrow is set to be a challenging domain due to increasing cyber security threats and widening attack surfaces created by the Internet of Things (IoT), increased network heterogeneity, increased use of virtualisation technologies and distributed architectures. This paper proposes SDS (Software Defined Security) as a means to provide an automated, flexible and scalable network defence system. SDS will harness current advances in machine learning to design a CNN (Convolutional Neural Network) using NAS (Neural Architecture Search) to detect anomalous network traffic. SDS can be applied to an intrusion detection system to create a more proactive and end-to-end defence for a 5G network. To test this assumption, normal and anomalous network flows from a simulated environment have been collected and analyzed with a CNN. The results from this method are promising as the model has identified benign traffic with a 100% accuracy rate and anomalous traffic with a 96.4% detection rate. This demonstrates the effectiveness of network flow analysis for a variety of common malicious attacks and also provides a viable option for detection of encrypted malicious network traffic.

Artificial Intelligence-Powered Mobile Edge Computing-Based Anomaly Detection in Cellular Networks

IEEE Transactions on Industrial Informatics, 2019

Escalating cell outages and congestion, treated as anomalies, cost cellular operators substantial revenue and severely affect subscriber quality of experience. State-of-the-art literature applies a feed-forward deep neural network at the core network (CN) to detect these problems in a single cell; however, that solution is impractical, as it would overload a CN that monitors thousands of cells at a time. Inspired by mobile edge computing and by the breakthroughs of deep convolutional neural networks (CNNs) in computer vision research, we split the network into several 100-cell regions, each monitored by an edge server, and propose a framework that pre-processes raw call detail records containing user activities to create an image-like volume, which is fed to a CNN model. The framework outputs a multilabeled vector identifying the anomalous cell(s). Our results suggest that our solution can detect anomalies with up to 96% accuracy, and that it is scalable and expandable for industrial Internet of Things environments.
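
One way to picture the pre-processing step is the sketch below, which aggregates call detail records for one hour into a channels x rows x columns volume that a CNN could consume. The 10 x 10 layout of a 100-cell region, the three activity channels, the record tuple format and the function name are all assumptions made for illustration; the abstract does not specify them.

```python
GRID = 10  # assumption: a 100-cell region laid out as a 10 x 10 grid
CHANNELS = ("sms", "call", "internet")  # assumed activity types, one channel each


def build_volume(records, hour):
    """Aggregate records for one hour into a [channel][row][col] volume.

    Each record is a hypothetical (cell_id, hour, activity, amount) tuple,
    with cell_id in the range 0..99.
    """
    volume = [[[0.0] * GRID for _ in range(GRID)] for _ in CHANNELS]
    for cell_id, rec_hour, activity, amount in records:
        if rec_hour != hour:
            continue
        row, col = divmod(cell_id, GRID)  # map the cell index to a grid position
        volume[CHANNELS.index(activity)][row][col] += amount
    return volume


# Toy records; values are invented.
records = [
    (0, 9, "sms", 3.0),
    (0, 9, "sms", 2.0),
    (57, 9, "call", 1.5),
    (99, 10, "internet", 7.0),  # different hour, so it is ignored below
]
volume = build_volume(records, hour=9)
print(volume[0][0][0], volume[1][5][7])
```

Under these assumptions, a multilabel output for the region would be a 100-element vector with one entry per grid position.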

Comparison of Anomaly Detection Techniques Applied to Different Problems in the Telecom Industry

2021

Nowadays, with the growth of digital transformation in companies, a huge amount of data is generated every second as a result of various processes. Often this data contains important information which, when properly analyzed, can help a company gain a competitive advantage. One data processing task common to many different applications is the detection of anomalies, that is, data points or groups of data points that stand out from most of the others. Since the generally large volumes of data make it infeasible to have an operator constantly analyzing the data to find anomalous values, the focus of this dissertation is the exploration of a Data Mining area called anomaly detection. In this dissertation we first develop anomaly detection software in Python that applies 10 different anomaly detection algorithms, after automatically optimizing their parameters, to an arbitrary dataset. Before applying these algorithms, the software also performs data scaling and imputation of missing values. It outputs the performance metrics of each algorithm, the values of the optimized parameters, and plots for visualizing the results, generated using the t-SNE method. This software was then applied to three case studies to compare the performance of different anomaly detection approaches using real-world datasets. These datasets have an increasing level of difficulty associated with them: the amount of missing data and the uncertainty associated with the ground truth regarding the anomalies. In the first case study, we detected fraudulent bank transactions using a public dataset. In the second case, we identified clients of a telecommunications company who were likely to miss their payment, leading to contract termination. For this case we used a dataset from a telecommunications company. In the third case, we detected low quality of internet service, again using a large dataset with real measurements from a telecommunications company.
Finally, we implemented a state-of-the-art neural network model especially suited to identifying anomalies in time-series data. We optimized the parameters of the network and applied it to the problem of low quality of service.

Machine Learning for the Identification of Network Anomalies

Indian Scientific Journal Of Research In Engineering And Management, 2023

The most popular technique for identifying and blocking malicious network requests is the intrusion detection system, or IDS for short. Such systems are positioned carefully to keep an eye on network traffic going to and coming from every device. Most networking devices can employ an IDS with the use of virtual machines and sophisticated switches. While having good accuracy, the classic SIDS (Signature-Based Intrusion Detection System) cannot identify many modern incursions, such as zero-day attacks, as it relies on a pattern matching technique. Instead, the majority of recently launched attacks can be detected using machine learning, statistical, and knowledge-based methods. An anomaly is defined as any significant difference between the observed behavior and the model. The development of these models has two stages: the training phase and the testing phase. During the training phase, a model of typical behavior is learned using the average traffic profile. The system's ability to generalize to as-yet-undiscovered intrusions is then determined during the testing phase using a fresh data set. In order to identify network traffic anomalies, we have used an unsupervised machine-learning approach called Isolation Forest in this paper. Using the anomaly score, the algorithm finds the outliers. The KDD data set, a well-known benchmark in the study of intrusion detection methods, has been used for training and testing.
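
To make the anomaly score concrete, here is a minimal from-scratch sketch of the Isolation Forest idea: points that random splits isolate after few steps receive scores closer to 1. It is an illustration only, not the paper's implementation. The two-feature flow records are invented toy values rather than the KDD data set, all function names are hypothetical, and practical work would normally rely on a maintained library implementation.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split on a random feature at a random threshold."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {"dim": dim, "split": split,
            "left": build_tree(left, depth + 1, max_depth, rng),
            "right": build_tree(right, depth + 1, max_depth, rng)}


def path_length(point, node, depth=0):
    """Depth at which the point lands, adjusted for unsplit leaf contents."""
    if "size" in node:
        return depth + c_factor(node["size"])
    branch = "left" if point[node["dim"]] < node["split"] else "right"
    return path_length(point, node[branch], depth + 1)


def fit_forest(data, n_trees=100, sample_size=256, seed=0):
    """Build n_trees isolation trees, each on a random subsample of the data."""
    rng = random.Random(seed)
    size = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(size))
    trees = [build_tree(rng.sample(data, size), 0, max_depth, rng) for _ in range(n_trees)]
    return trees, size


def anomaly_score(point, trees, size):
    """Score in (0, 1]; values near 1 mean the point is isolated quickly."""
    mean_path = sum(path_length(point, t) for t in trees) / len(trees)
    return 2.0 ** (-mean_path / c_factor(size))


# Toy flows as (duration, bytes) in arbitrary units: a tight cluster plus one outlier.
flows = [(1 + 0.01 * i, 1 + 0.01 * ((i * 7) % 20)) for i in range(40)] + [(5.0, 5.0)]
trees, size = fit_forest(flows)
scores = [anomaly_score(f, trees, size) for f in flows]
most_anomalous = max(range(len(flows)), key=scores.__getitem__)
print(most_anomalous, round(scores[most_anomalous], 2))
```

A threshold on the score, or a fixed fraction of the highest-scoring points, would then decide which flows are reported as outliers.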