A Reinforcement Learning Based Solution for Self-Healing in LTE Networks

Optimized Power and Cell Individual Offset for Cellular Load Balancing via Reinforcement Learning

2021 IEEE Wireless Communications and Networking Conference (WCNC), 2021

We consider the problem of jointly optimizing the transmission power and cell individual offsets (CIOs) in the downlink of cellular networks using reinforcement learning. To that end, we reformulate the problem as a Markov decision process (MDP). We abstract the cellular network as a state comprising carefully selected key performance indicators (KPIs). We present a novel reward function, namely the penalized throughput, to reflect the tradeoff between the total throughput of the network and the number of covered users. We employ the twin delayed deep deterministic policy gradient (TD3) technique to learn how to maximize the proposed reward function through interaction with the cellular network. We assess the proposed technique by simulating an actual cellular network, whose parameters and base station placement are derived from a 4G network operator, using the NS-3 and SUMO simulators. Our results show the following: 1) optimizing only one of the controls is significantly inferior to jointly optimizing both; 2) the proposed technique achieves an 18.4% throughput gain over the baseline of fixed transmission power and zero CIOs; 3) there is a tradeoff between the total throughput of the network and the number of covered users.
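
As a rough illustration of the reward idea described above, the sketch below computes one possible penalized throughput. The abstract does not give the formula, so the linear form, the function name `penalized_throughput`, and the `penalty_mbps` parameter are all assumptions made here for illustration, not the authors' definition.

```python
def penalized_throughput(user_throughputs_mbps, covered, penalty_mbps=1.0):
    """Total throughput of covered users minus a penalty per uncovered user.

    ASSUMPTION: the linear penalty and all names are hypothetical; the
    source only says the reward trades off throughput against coverage.
    """
    # Sum throughput only over users that are currently covered.
    total = sum(t for t, c in zip(user_throughputs_mbps, covered) if c)
    # Count users that fell out of coverage.
    uncovered = sum(1 for c in covered if not c)
    return total - penalty_mbps * uncovered


if __name__ == "__main__":
    reward = penalized_throughput([2.0, 3.0, 1.0], [True, True, False], penalty_mbps=0.5)
    print(reward)
```

A larger `penalty_mbps` pushes a learning agent toward keeping users covered, while a smaller one favors raw throughput, which is one way to picture the tradeoff reported in result 3.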

Controlling self healing cellular networks using fuzzy logic

2012 IEEE Wireless Communications and Networking Conference (WCNC), 2012

Wireless cellular communication networks are undergoing a transition from being simply an optional means of voice communication to becoming a necessity in our everyday lives. To ensure an uninterrupted, high Quality of Experience for subscribers, network operators must ensure 100% reliability of their networks, without any discontinuity due to either planned maintenance or breakdown. This paper demonstrates a self-healing capability for the fault recovery process of each cell. It proposes compensating for failed cells by having neighboring cells optimize their coverage through antenna reconfiguration and power compensation, thereby filling the coverage gap and improving the QoS for users. The right choice of these reconfigured parameters is determined through a process involving fuzzy logic control and reinforcement learning. Results show an improvement in the network performance for the area under outage as perceived by each user in the system.

Self-Organizing Networks: A Packet Scheduling Approach for Coverage/Capacity Optimization in 4G Networks Using Reinforcement Learning

Elektronika ir Elektrotechnika, 2014

The next-generation mobile networks LTE and LTE-A are all-IP networks. In such IP-based networks, the issue of Quality of Service (QoS) is becoming more and more critical with the increase in network size and heterogeneity. In this paper, a Reinforcement Learning (RL) based framework for QoS enhancement is proposed. The framework achieves coverage/capacity optimization by adjusting the scheduling strategy. The proposed self-optimization algorithm uses the coverage/capacity compromise in Packet Scheduling (PS) to maximize the capacity of an eNB subject to the condition that the minimum coverage constraint is not violated. Each eNB has an associated agent that dynamically changes the scheduling parameter value of that eNB. The agent uses the RL technique of Fuzzy Q-Learning (FQL) to learn the optimal scheduling parameter. The learning framework is designed to operate in an environment with varying traffic, user positions, and propagation conditions. A comprehensive analysis of the obtained simulation results is presented, which shows that the proposed approach can significantly improve network coverage as well as capacity in terms of throughput.

Reinforcement learning for joint radio resource management in LTE-UMTS scenarios

Computer Networks, 2011

The limited availability of frequency bands and their capacity limitations, together with the constantly increasing demand for high-bit-rate services in wireless communication systems, require the use of smart radio resource management strategies to ensure that different services are provided with the required quality of service (QoS) and that the available radio resources are used efficiently. In addition, the evolution of technology toward higher spectral efficiency has led to the introduction of Orthogonal Frequency-Division Multiple Access (OFDMA) by 3GPP for use in future long-term evolution (LTE) systems. However, given the current penetration of legacy technologies such as Universal Mobile Telecommunications System (UMTS), operators will face some periods in which both Radio Access Technologies (RATs) coexist. In this context, Joint Radio Resource Management (JRRM) mechanisms are helpful because they enable complementarities between different RATs to be exploited and thus facilitate more efficient use of available radio resources. This paper proposes a novel dynamic JRRM algorithm for LTE-UMTS coexistence scenarios based on Reinforcement Learning (RL), which is considered to be a good candidate for achieving the desired degree of flexibility and adaptability in future reconfigurable networks. The proposed algorithm is evaluated in dynamic environments under different load conditions and is compared with various baseline solutions.

A Proactive Context-Aware Self-Healing Scheme for 5G Using Machine Learning

International Journal of Information and Communication Technology Research, 2018

Future mobile communication networks, particularly 5G networks, need to be efficient, reliable and agile to fulfill the targeted performance requirements. All layers of network management need to be more intelligent because of the density and complexity anticipated for 5G networks. In this regard, one of the enabling technologies for managing future mobile communication networks is the Self-Organizing Network (SON). Three common types of SON function are self-configuration, Self-Healing (SH) and self-optimization. In this paper, a framework is developed to analyze proactive SH by investigating the effect of recovery actions executed in sub-health states. The proposed framework considers both the detection and the compensation processes. In the detection process, a learning method is employed to classify the system into several sub-health (faulty) states. In the compensation process, the system is modeled as a Markov Decision Process (MDP), and the equivalent Linear Programming (LP) approach is used to find the action or policy that maximizes a given performance metric. Numerical results obtained in several scenarios with different goals demonstrate that the proposed optimized algorithm in the compensation process outperforms an algorithm with randomly selected actions.
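
For illustration, the LP route to solving an MDP can be written as below for the standard discounted-reward case. The abstract does not state which optimality criterion or notation the paper uses, so the discount factor \gamma, the state weights \mu(s) > 0, the reward r(s,a) and the transition probabilities P(s' \mid s,a) are all assumed symbols.

```latex
\begin{aligned}
\min_{v}\quad & \sum_{s \in \mathcal{S}} \mu(s)\, v(s) \\
\text{s.t.}\quad & v(s) \;\ge\; r(s,a) + \gamma \sum_{s' \in \mathcal{S}} P(s' \mid s,a)\, v(s'),
\qquad \forall\, s \in \mathcal{S},\ a \in \mathcal{A}
\end{aligned}
```

Under these assumptions the minimizer v is the optimal value function, and an optimal policy selects, in each state, an action whose constraint holds with equality.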

Self Organizing Networks: A Reinforcement Learning approach for self-optimization of LTE Mobility parameters

Automatika ‒ Journal for Control, Measurement, Electronics, Computing and Communications, 2014

With the evolution of broadband mobile networks towards LTE and beyond, support for the Internet and Internet-based services is growing. Self Organizing Network (SON) functionalities are intended to optimize network performance for an improved user experience while at the same time reducing the network operational cost. This paper proposes a Reinforcement Learning (RL) based framework to improve the throughput of mobile users. The problem of spectral efficiency maximization is modeled as a cooperative Multi-Agent control problem between neighbouring eNodeBs (eNBs). Each eNB has an associated agent that dynamically changes the outgoing Handover Margin (HM) to its neighbouring cells. The agent uses the RL technique of Fuzzy Q-Learning (FQL) to learn the optimal mobility parameter, i.e., the HM value. The learning framework is designed to operate in an environment with variations in traffic, user positions and propagation conditions. Simulation results show that the proposed approach improves network capacity and user experience in terms of throughput.

Deep Reinforcement Learning for Joint Spectrum and Power Allocation in Cellular Networks

2021 IEEE Globecom Workshops (GC Wkshps), 2021

A wireless network operator typically divides the radio spectrum it possesses into a number of subbands. In a cellular network those subbands are then reused in many cells. To mitigate co-channel interference, a joint spectrum and power allocation problem is often formulated to maximize a sum-rate objective. The best known algorithms for solving such problems generally require instantaneous global channel state information and a centralized optimizer. In fact those algorithms have not been implemented in practice in large networks with time-varying subbands. Deep reinforcement learning algorithms are promising tools for solving complex resource management problems. A major challenge here is that spectrum allocation involves discrete subband selection, whereas power allocation involves continuous variables. In this paper, a learning framework is proposed to optimize both discrete and continuous decision variables. Specifically, two separate deep reinforcement learning algorithms are designed to be executed and trained simultaneously to maximize a joint objective. Simulation results show that the proposed scheme outperforms both the state-of-the-art fractional programming algorithm and a previous solution based on deep reinforcement learning.

Using Reinforcement Learning to Allocate and Manage SFC in Cellular Networks

2020 16th International Conference on Network and Service Management (CNSM), 2020

In this paper, we propose the use of reinforcement learning to deploy a service function chain (SFC) of a cellular network service and to manage the operation of its VNFs. The SFC is deployed by the reinforcement learning agent in a scenario with distributed data centers, where the virtual network functions (VNFs) run in virtual machines on commodity servers. VNF management consists of creating, deleting, and restarting the VNFs. The main purpose is to reduce the number of lost packets while taking into account the energy consumption of the servers. We use the Proximal Policy Optimization (PPO2) algorithm to implement the agent, and preliminary results show that the agent is able to allocate the SFC and manage the VNFs, reducing the number of lost packets.

A Reinforcement Learning Framework for Autonomous Cell Activation and Customized Energy-Efficient Resource Allocation in C-RANs

KSII Transactions on Internet and Information Systems, 2019

Cloud radio access networks (C-RANs) have recently been regarded as a promising concept for future 5G technologies, in which all DSP processors are moved into a central baseband unit (BBU) pool in the cloud, and distributed remote radio heads (RRHs) compress and forward received radio signals from mobile users to the BBUs through radio links. In such a dynamic environment, automatic decision-making approaches, such as artificial intelligence based deep reinforcement learning (DRL), become imperative in designing new solutions. In this paper, we propose a generic framework of autonomous cell activation and customized physical resource allocation schemes for energy consumption and QoS optimization in wireless networks. We formulate the problem as two models, fractional power control with bandwidth adaptation and full power control with bandwidth allocation, and set up a Q-learning model to satisfy the QoS requirements of users and to achieve low energy consumption with the minimum number of active RRHs under varying traffic demand and network densities. Extensive simulations are conducted to show the effectiveness of our proposed solution compared to existing schemes.
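
To make the Q-learning step mentioned above concrete, here is a minimal sketch of the standard tabular Q-learning update. It is generic textbook Q-learning, not the paper's model: the state label `"low_load"`, the action names `"sleep"` and `"activate"`, and the values of `alpha` and `gamma` are hypothetical placeholders.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q(state, action).

    q maps (state, action) pairs to values; unseen pairs default to 0.0.
    """
    # Value of the best action available in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference target: immediate reward plus discounted future value.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


if __name__ == "__main__":
    q_table = defaultdict(float)
    rrh_actions = ["sleep", "activate"]  # hypothetical RRH actions
    print(q_update(q_table, "low_load", "sleep", 1.0, "low_load", rrh_actions))
```

In a cell-activation setting the reward would presumably combine energy use and QoS satisfaction, but the abstract does not specify how, so the reward is left as a plain number here.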

Reinforcement learning based radio resource scheduling in LTE-advanced

2011

In this paper, a novel radio resource scheduling policy for Long Term Evolution Advanced (LTE-A) radio access technology in downlink acceptance is proposed. The scheduling process works with a set of dispatching rules, each with a different behavior. In the literature, a scheduling discipline is applied for the entire transmission session, and the scheduler performance strongly depends on the chosen discipline. Our method provides a straightforward schedule within each transmission time interval (TTI) frame. Hence, a mixture of disciplines can be used, one per TTI, instead of a single one adopted across the whole transmission. The overall objective is to bring real improvements in terms of system throughput, system capacity and spectral efficiency (operator benefit) while at the same time assuring the best user fairness and Quality of Service (QoS) capabilities (user benefit). To meet this objective, each rule must be invoked under its best-matching conditions. Policy adoption and refinement are the best way to optimize the use of the mixture of rules. The Q-III reinforcement learning algorithm is proposed for policy adoption, in order to turn scheduling experience into lasting knowledge and so facilitate the decision on which rule to use for each TTI. The IQ-III reinforcement learning algorithm, which uses multiagent environments, refines the policy adoption by considering the agents' opinions in order to reduce the policy convergence time.
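
The idea of choosing a scheduling rule afresh in each TTI can be sketched with a plain epsilon-greedy selection over learned rule values. This is not the Q-III or IQ-III algorithm, whose details the abstract does not give; the rule names in `RULES`, the function `select_rule`, and the example values are hypothetical.

```python
import random

# Hypothetical dispatching rules; the abstract does not name the actual ones.
RULES = ["proportional_fair", "round_robin", "max_rate"]


def select_rule(q_values, epsilon, rng):
    """Return the index of the scheduling rule to apply in the current TTI.

    With probability epsilon a random rule is explored; otherwise the rule
    with the highest learned value is exploited.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])


if __name__ == "__main__":
    rng = random.Random(0)
    learned_values = [0.1, 0.7, 0.3]  # illustrative values for one scheduler state
    for tti in range(3):
        print(tti, RULES[select_rule(learned_values, 0.1, rng)])
```

Setting `epsilon` to zero always picks the currently best-valued rule, while a positive value keeps trying the other rules so their estimates stay up to date.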