Resource Management in Wireless Networks via Multi-Agent Deep Reinforcement Learning
Related papers
Multi-Agent Deep Reinforcement Learning for Dynamic Power Allocation in Wireless Networks
IEEE Journal on Selected Areas in Communications, 2019
This work demonstrates the potential of deep reinforcement learning techniques for transmit power control in wireless networks. Existing techniques typically find near-optimal power allocations by solving a challenging optimization problem. Most of these algorithms are not scalable to large networks in real-world scenarios because of their computational complexity and their requirement for instantaneous cross-cell channel state information (CSI). In this paper, a distributively executed dynamic power allocation scheme is developed based on model-free deep reinforcement learning. Each transmitter collects CSI and quality of service (QoS) information from several neighbors and adapts its own transmit power accordingly. The objective is to maximize a weighted sum-rate utility function, which can be particularized to achieve maximum sum-rate or proportionally fair scheduling. Both random variations and delays in the CSI are inherently addressed using deep Q-learning. For a typical network architecture, the proposed algorithm is shown to achieve near-optimal power allocation in real time based on delayed CSI measurements available to the agents. The proposed scheme is especially suitable for practical scenarios where the system model is inaccurate and the CSI delay is non-negligible.
Index Terms: Deep Q-learning, radio resource management, interference mitigation, power control, Jakes fading model.
Footnote 1: A dynamic power allocation problem with time-varying channels for a different system model and network setup was studied in [10], where the delay performance of the classical dynamic backpressure algorithm was improved by exploiting the stochastic Lyapunov optimization framework.
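As a concrete illustration of the distributed scheme described above, the sketch below shows a deep Q-network that maps an agent's local (possibly delayed) CSI/QoS observation to Q-values over a discrete set of transmit power levels. The observation size, number of power levels, and network width are assumptions for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class PowerControlDQN(nn.Module):
    """Maps a local observation (neighbor CSI + QoS terms) to Q-values
    over a discrete set of transmit power levels."""
    def __init__(self, obs_dim: int, n_power_levels: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_power_levels),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)

# Example: each agent picks the power level with the highest Q-value
# from its (possibly delayed) local measurements.
obs_dim, n_levels = 24, 10           # assumed sizes, not from the paper
agent = PowerControlDQN(obs_dim, n_levels)
local_obs = torch.randn(1, obs_dim)  # stand-in for delayed CSI/QoS features
power_index = agent(local_obs).argmax(dim=-1).item()
print(f"selected power level index: {power_index}")
```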
When Multiple Agents Learn to Schedule: A Distributed Radio Resource Management Framework
ArXiv, 2019
Interference among concurrent transmissions in a wireless network is a key factor limiting the system performance. One way to alleviate this problem is to manage the radio resources in order to maximize either the average or the worst-case performance. However, joint consideration of both metrics is often neglected as they are competing in nature. In this article, a mechanism for radio resource management using multi-agent deep reinforcement learning (RL) is proposed, which strikes the right trade-off between maximizing the average and the 5th percentile user throughput. Each transmitter in the network is equipped with a deep RL agent, receiving partial observations from the network (e.g., channel quality, interference level, etc.) and deciding whether to be active or inactive at each scheduling interval for given radio resources, a process referred to as link scheduling. Based on the actions of all agents, the network emits a reward to the agents, indicating how good their joi...
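A minimal sketch of a network-level reward that blends the two competing metrics named above: average throughput and 5th-percentile (cell-edge) throughput. The blending weight alpha and the synthetic throughput samples are assumptions for illustration, not taken from the paper.

```python
import numpy as np

def scheduling_reward(user_throughputs: np.ndarray, alpha: float = 0.5) -> float:
    """Blend of mean and 5th-percentile (cell-edge) user throughput.
    alpha trades off average performance against worst-case performance."""
    mean_tput = user_throughputs.mean()
    edge_tput = np.percentile(user_throughputs, 5)
    return alpha * mean_tput + (1.0 - alpha) * edge_tput

# Example: a network-level reward broadcast to all scheduling agents.
tputs = np.random.exponential(scale=10.0, size=50)  # Mbps, synthetic
print(scheduling_reward(tputs, alpha=0.5))
```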
Radio Resource Allocation for 5G Network Using Deep Reinforcement Learning
International Journal for Research in Applied Science & Engineering Technology (IJRASET), 2023
Resource allocation is a critical task in 5G networks that determines how network resources are assigned to different devices and services. Traditional methods rely on predefined rules or heuristics, which may not always be optimal. Deep reinforcement learning (DRL) is a promising approach for radio resource allocation in 5G networks as it can learn to optimize resource allocation based on feedback from the network. In DRL, an agent learns to make decisions based on rewards and penalties received from the environment.
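To make the reward-and-penalty feedback loop concrete, here is a minimal tabular Q-learning sketch over a small discretized state/action space. The state and action sizes and the hyperparameters are illustrative assumptions, not the DRL architecture the paper uses.

```python
import numpy as np

# Tabular Q-learning over a small, discretized state/action space,
# purely to illustrate the reward/penalty feedback loop described above.
n_states, n_actions = 16, 4          # e.g. coarse load levels x resource blocks
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.95, 0.1   # learning rate, discount, exploration

def choose_action(state: int) -> int:
    if np.random.rand() < eps:
        return np.random.randint(n_actions)     # explore
    return int(Q[state].argmax())               # exploit

def update(state: int, action: int, reward: float, next_state: int) -> None:
    # Standard Q-learning target: reward plus discounted best next value.
    td_target = reward + gamma * Q[next_state].max()
    Q[state, action] += alpha * (td_target - Q[state, action])
```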
Adaptive Wireless Network Management with Multi-Agent Reinforcement Learning
Sensors, 2022
Wireless networks are trending towards large scale systems, containing thousands of nodes, with multiple co-existing applications. Congestion is an inevitable consequence of this scale and complexity, which leads to inefficient use of the network capacity. This paper proposes an autonomous and adaptive wireless network management framework, utilising multi-agent deep reinforcement learning, to achieve efficient use of the network. Its novel reward function incorporates application awareness and fairness to address both node and network level objectives. Our experimental results demonstrate the proposed approach’s ability to be optimised for application-specific requirements, while optimising the fairness of the network. The results reveal significant performance benefits in terms of adaptive data rate and an increase in responsiveness compared to a single-agent approach. Some significant qualitative benefits of the multi-agent approach—network size independence, node-led priorities,...
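One possible shape for a reward that combines application awareness with network-level fairness, assuming Jain's fairness index as the fairness term. The paper's actual reward function may differ; the blending weight beta, the required rate, and the rate values below are hypothetical.

```python
import numpy as np

def jain_fairness(rates: np.ndarray) -> float:
    """Jain's fairness index in [1/n, 1]; 1 means perfectly equal rates."""
    return rates.sum() ** 2 / (len(rates) * (rates ** 2).sum())

def node_reward(rate: float, required_rate: float, all_rates: np.ndarray,
                beta: float = 0.5) -> float:
    """Blend application awareness (did this node meet its own requirement?)
    with a network-level fairness term."""
    app_term = min(rate / required_rate, 1.0)   # saturates once the need is met
    return beta * app_term + (1.0 - beta) * jain_fairness(all_rates)

rates = np.array([2.0, 1.5, 0.8, 3.2])          # Mbps, synthetic
print(node_reward(rates[0], required_rate=2.5, all_rates=rates))
```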
ArXiv, 2021
Next generation wireless networks are expected to be extremely complex due to their massive heterogeneity in terms of the types of network architectures they incorporate, the types and numbers of smart IoT devices they serve, and the types of emerging applications they support. In such large-scale and heterogeneous networks (HetNets), radio resource allocation and management (RRAM) becomes one of the major challenges encountered during system design and deployment. In this context, emerging Deep Reinforcement Learning (DRL) techniques are expected to be one of the main enabling technologies to address RRAM in future wireless HetNets. In this paper, we conduct a systematic, in-depth, and comprehensive survey of the applications of DRL techniques in RRAM for next generation wireless networks. Towards this, we first overview the existing traditional RRAM methods and identify their limitations that motivate the use of DRL techniques in RRAM. Then, we provide a comprehensive review of...
Spectrum Sharing between Cellular and Wi-Fi Networks based on Deep Reinforcement Learning
International journal of Computer Networks & Communications
Recently, mobile traffic has been growing rapidly and spectrum resources are becoming scarce in wireless networks. As a result, wireless network capacity will not meet the traffic demand. To address this problem, using cellular systems in an unlicensed spectrum has emerged as an effective solution. In this case, cellular systems need to coexist with Wi-Fi and other systems. For that, we propose an efficient channel assignment method for Wi-Fi AP and cellular NB, based on the DRL method. To train the DDQN model, we implement an emulator as an environment for spectrum sharing with densely deployed NBs and APs in wireless heterogeneous networks. Our proposed DDQN algorithm improves the average throughput by 25.5% to 48.7% under different user arrival rates compared to the conventional method. We evaluated the generalization performance of the trained agent to confirm channel allocation efficiency in terms of average throughput under different user arrival rates.
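The sketch below illustrates the double-DQN (DDQN) target computation that distinguishes DDQN from vanilla DQN: the online network selects the next channel and the target network evaluates it, which reduces over-estimation bias. The observation size, channel count, and layer widths are assumptions for illustration, not the emulator or architecture used in the paper.

```python
import torch
import torch.nn as nn

def ddqn_targets(online: nn.Module, target: nn.Module,
                 rewards: torch.Tensor, next_obs: torch.Tensor,
                 gamma: float = 0.99) -> torch.Tensor:
    """Double-DQN target: the online network picks the next channel,
    the target network evaluates it."""
    with torch.no_grad():
        next_actions = online(next_obs).argmax(dim=1, keepdim=True)
        next_q = target(next_obs).gather(1, next_actions).squeeze(1)
    return rewards + gamma * next_q

# Example with a toy network over 8 candidate channels.
n_obs, n_channels = 12, 8            # assumed sizes
make_net = lambda: nn.Sequential(nn.Linear(n_obs, 64), nn.ReLU(),
                                 nn.Linear(64, n_channels))
online_net, target_net = make_net(), make_net()
batch_next_obs = torch.randn(32, n_obs)
batch_rewards = torch.rand(32)       # e.g. normalized throughput
print(ddqn_targets(online_net, target_net, batch_rewards, batch_next_obs).shape)
```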
Deep Multi-User Reinforcement Learning for Distributed Dynamic Spectrum Access
IEEE Transactions on Wireless Communications, 2019
We consider the problem of dynamic spectrum access for network utility maximization in multichannel wireless networks. The shared bandwidth is divided into K orthogonal channels. In the beginning of each time slot, each user selects a channel and transmits a packet with a certain transmission probability. After each time slot, each user that has transmitted a packet receives a local observation indicating whether its packet was successfully delivered or not (i.e., ACK signal). The objective is to find a multiuser strategy for accessing the spectrum that maximizes a certain network utility in a distributed manner without online coordination or message exchanges between users. Obtaining an optimal solution for the spectrum access problem is computationally expensive in general due to the large state space and partial observability of the states. To tackle this problem, we develop a novel distributed dynamic spectrum access algorithm based on deep multiuser reinforcement learning. Specifically, at each time slot, each user maps its current state to spectrum access actions based on a trained deep-Q network used to maximize the objective function. Game theoretic analysis of the system dynamics is developed for establishing design principles for the implementation of the algorithm. Experimental results demonstrate strong performance of the algorithm.
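One plausible way to encode the purely local feedback described above is a one-hot vector of the channel the user last accessed plus its ACK bit; the exact state encoding used in the paper may differ, and the function below is a hypothetical illustration.

```python
import numpy as np

def build_local_state(n_channels: int, last_channel: int, last_ack: int) -> np.ndarray:
    """A user's local observation: one-hot of the channel it last used plus the
    ACK bit, i.e. only local feedback, with no message exchange between users."""
    state = np.zeros(n_channels + 1, dtype=np.float32)
    if last_channel >= 0:                 # -1 means the user stayed silent
        state[last_channel] = 1.0
    state[-1] = float(last_ack)
    return state

# Example: the user previously transmitted on channel 2 and received an ACK.
print(build_local_state(n_channels=5, last_channel=2, last_ack=1))
```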
Deep Multi-User Reinforcement Learning for Dynamic Spectrum Access in Multichannel Wireless Networks
GLOBECOM 2017 - 2017 IEEE Global Communications Conference, 2017
We consider the problem of dynamic spectrum access for network utility maximization in multichannel wireless networks. The shared bandwidth is divided into K orthogonal channels, and the users access the spectrum using a random access protocol. In the beginning of each time slot, each user selects a channel and transmits a packet with a certain attempt probability. After each time slot, each user that has transmitted a packet receives a local observation indicating whether its packet was successfully delivered or not (i.e., ACK signal). The objective is to find a multiuser strategy that maximizes a certain network utility in a distributed manner without online coordination or message exchanges between users. Obtaining an optimal solution for the spectrum access problem is computationally expensive in general due to the large state space and partial observability of the states. To tackle this problem, we develop a distributed dynamic spectrum access algorithm based on deep multiuser reinforcement learning. Specifically, at each time slot, each user maps its current state to spectrum access actions based on a trained deep-Q network used to maximize the objective function. Experimental results have demonstrated that users are capable of learning good policies that achieve strong performance in this challenging partially observable setting only from their ACK signals, without online coordination, message exchanges between users, or carrier sensing.
Access and Radio Resource Management for IAB Networks Using Deep Reinforcement Learning
IEEE Access
Congestion in dense traffic networks is a prominent obstacle towards realizing the performance requirements of 5G new radio. Since traditional adaptive traffic signal control cannot resolve this type of congestion, realizing context in the network and adapting resource allocation based on real-time parameters is an attractive approach. This article proposes a radio resource management solution for congestion avoidance on the access side of an integrated access and backhaul (IAB) network using deep reinforcement learning (DRL). The objective of this article is to obtain an optimal policy under which the transmission throughput of all UEs is maximized under the dictates of environmental pressures such as traffic load and transmission power. Here, the resource management problem was converted into a constrained problem using Markov decision processes and dynamic power management, where a deep neural network was trained for optimal power allocation. By initializing a power control parameter, θ_t, with a zero-mean normal distribution, the DRL algorithm adopts a learning policy that aims to achieve logical allocation of resources by placing more emphasis on congestion control and user satisfaction. The performance of the proposed DRL algorithm was evaluated using two learning schemes, i.e., individual learning and nearest neighbor cooperative learning, and this was compared with the performance of a baseline algorithm. The simulation results indicate that the proposed algorithms give better overall performance when compared to the baseline algorithm. From the simulation results, there is a subtle but critically important insight that brings into focus the fundamental connection between learning rate and the two proposed algorithms. The nearest neighbor cooperative learning algorithm is suitable for IAB networks because its throughput has a good correlation with the congestion rate.
Index Terms: Congestion control, deep reinforcement learning, integrated access and backhaul, millimeter wave, nearest neighbor, resource allocation.
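A hedged sketch of initializing a power-control parameter θ_t from a zero-mean normal distribution and mapping it to a bounded transmit power, together with an illustrative throughput-minus-congestion reward. The sigmoid mapping, p_max, and the penalty weight lam are assumptions for illustration, not the paper's formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Power-control parameter theta_t drawn from a zero-mean normal, as in the
# initialization described above; the mapping to watts below is an assumption.
theta_t = rng.normal(loc=0.0, scale=1.0, size=8)        # one entry per UE

def to_transmit_power(theta: np.ndarray, p_max: float = 1.0) -> np.ndarray:
    """Squash the unconstrained parameter into [0, p_max] watts."""
    return p_max / (1.0 + np.exp(-theta))

def reward(throughputs: np.ndarray, congestion_rate: float, lam: float = 2.0) -> float:
    """Illustrative reward: sum throughput penalized by observed congestion."""
    return throughputs.sum() - lam * congestion_rate

print(to_transmit_power(theta_t))
```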
Dynamic spectrum access and sharing through actor-critic deep reinforcement learning
EURASIP Journal on Wireless Communications and Networking
When primary users of the spectrum use frequency channels intermittently, secondary users can selectively transmit without interfering with the primary users. The secondary users adjust the transmission power allocation on the frequency channels to maximize their information rate while reducing channel conflicts with the primary users. In this paper, the secondary users do not know the spectrum usage by the primary users or the channel gains of the secondary users. Based on the conflict warnings from the primary users and the signal-to-interference-plus-noise ratio measurement at the receiver, the secondary users adapt and improve spectrum utilization through deep reinforcement learning. The secondary users adopt the actor-critic deep deterministic policy gradient algorithm to overcome the challenges of large state space and large action space in reinforcement learning with continuous-valued actions. In addition, multiple secondary users implement multi-agent deep reinforcement lear...
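A minimal actor-critic sketch in the spirit of the deep deterministic policy gradient (DDPG) approach described above: the actor maps a secondary user's observation to a continuous per-channel power allocation, and the critic scores state-action pairs. Layer sizes, dimensions, and the sigmoid bound are illustrative assumptions; the training loop (target networks, replay buffer, exploration noise) is omitted.

```python
import torch
import torch.nn as nn

class Actor(nn.Module):
    """Maps a secondary user's observation (SINR, conflict warnings, ...)
    to a continuous per-channel power allocation in [0, p_max]."""
    def __init__(self, obs_dim: int, n_channels: int, p_max: float = 1.0):
        super().__init__()
        self.p_max = p_max
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                 nn.Linear(64, n_channels))

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.p_max * torch.sigmoid(self.net(obs))

class Critic(nn.Module):
    """Scores a (state, action) pair, i.e. the Q-value used to train the actor."""
    def __init__(self, obs_dim: int, n_channels: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim + n_channels, 64), nn.ReLU(),
                                 nn.Linear(64, 1))

    def forward(self, obs: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([obs, action], dim=-1))

obs_dim, n_channels = 10, 4           # assumed sizes
actor, critic = Actor(obs_dim, n_channels), Critic(obs_dim, n_channels)
obs = torch.randn(1, obs_dim)
power = actor(obs)                    # continuous-valued action
print(power, critic(obs, power))
```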