Drone-Assisted Cellular Networks: A Multi-Agent Reinforcement Learning Approach
Related papers
ICC 2022 - IEEE International Conference on Communications
Unmanned Aerial Vehicles (UAVs) promise to become an intrinsic part of next-generation communications, as they can be deployed to provide wireless connectivity to ground users to supplement existing terrestrial networks. The majority of the existing research into the use of UAV access points for cellular coverage considers rotary-wing UAV designs (i.e. quadcopters). However, we expect fixed-wing UAVs to be more appropriate for connectivity purposes in scenarios where long flight times are necessary (such as for rural coverage), as fixed-wing UAVs rely on a more energy-efficient form of flight than the rotary-wing design. As fixed-wing UAVs are typically incapable of hovering in place, their deployment optimisation involves optimising their individual flight trajectories in a way that allows them to deliver high-quality service to the ground users in an energy-efficient manner. In this paper, we propose a multi-agent deep reinforcement learning approach to optimise the energy efficiency of fixed-wing UAV cellular access points while still allowing them to deliver high-quality service to users on the ground. In our decentralized approach, each UAV is equipped with a Dueling Deep Q-Network (DDQN) agent which can adjust the 3D trajectory of the UAV over a series of timesteps. By coordinating with their neighbours, the UAVs adjust their individual flight trajectories in a manner that optimises the total system energy efficiency. We benchmark the performance of our approach against a series of heuristic trajectory planning strategies, and demonstrate that our method can improve the system energy efficiency by as much as 70%.
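As a rough illustration of the dueling architecture mentioned in this abstract, the sketch below shows the mean-subtracted aggregation that dueling networks commonly use to combine a state value with per-action advantages. It is not the authors' implementation: the function names, the three-action example, and all numeric values are assumptions made for illustration, and a real agent would compute the value and advantages with a neural network.

```python
from typing import List


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Combine V(s) and per-action advantages A(s, a) into Q(s, a).

    Uses the mean-subtracted aggregation common in dueling networks:
    Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a').
    """
    if not advantages:
        raise ValueError("at least one action is required")
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


def greedy_action(q_values: List[float]) -> int:
    """Return the index of the action with the highest Q-value."""
    return max(range(len(q_values)), key=lambda i: q_values[i])


# Hypothetical example with three flight actions (illustrative values only).
q = dueling_q_values(state_value=2.0, advantages=[1.0, 0.0, -1.0])
best = greedy_action(q)
```

Subtracting the mean advantage keeps the value and advantage streams identifiable, since otherwise a constant could be shifted between them without changing the resulting Q-values.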
Cognitive computation, 2018
Due to the unpredictability of natural disasters, whenever a catastrophe happens, it is vital that not only emergency rescue teams are prepared, but also that there is a functional communication network infrastructure. Hence, in order to prevent additional losses of human lives, it is crucial that network operators are able to deploy an emergency infrastructure as fast as possible. In this sense, the deployment of an intelligent, mobile, and adaptable network, through the use of drones (unmanned aerial vehicles), is being considered as one possible alternative for emergency situations. In this paper, an intelligent solution based on reinforcement learning is proposed in order to find the best position of multiple drone small cells (DSCs) in an emergency scenario. The proposed solution's main goal is to maximize the number of users covered by the system, while drones are limited by both backhaul and radio access network constraints. Results show that the proposed Q-learning solutio...
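For readers unfamiliar with the kind of reinforcement learning this abstract refers to, the following is a minimal sketch of a generic tabular Q-learning update. It is not taken from the paper: the state encoding (a grid position), the four-action set, the reward value, and the learning-rate and discount settings are all hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable, Tuple

QTable = DefaultDict[Tuple[Hashable, int], float]


def q_learning_update(
    q_table: QTable,
    state: Hashable,
    action: int,
    reward: float,
    next_state: Hashable,
    n_actions: int,
    alpha: float = 0.1,
    gamma: float = 0.9,
) -> float:
    """Apply one tabular Q-learning step and return the updated Q(s, a)."""
    # Best value reachable from the next state under the current estimates.
    best_next = max(q_table[(next_state, a)] for a in range(n_actions))
    td_target = reward + gamma * best_next
    q_table[(state, action)] += alpha * (td_target - q_table[(state, action)])
    return q_table[(state, action)]


# Hypothetical example: a drone at grid cell (0, 0) takes action 1 and
# receives a reward of 5.0 (e.g. five users covered).
q_table: QTable = defaultdict(float)
new_value = q_learning_update(
    q_table, state=(0, 0), action=1, reward=5.0,
    next_state=(0, 1), n_actions=4, alpha=0.5, gamma=0.9,
)
```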
A Deep Reinforcement Learning Approach to Efficient Drone Mobility Support
ArXiv, 2020
The growing deployment of drones in a myriad of applications relies on seamless and reliable wireless connectivity for safe control and operation of drones. Cellular technology is a key enabler for providing essential wireless services to flying drones in the sky. Existing cellular networks targeting terrestrial usage can support the initial deployment of low-altitude drone users, but there are challenges such as mobility support. In this paper, we propose a novel handover framework for providing efficient mobility support and reliable wireless connectivity to drones served by a terrestrial cellular network. Using tools from deep reinforcement learning, we develop a deep Q-learning algorithm to dynamically optimize handover decisions to ensure robust connectivity for drone users. Simulation results show that the proposed framework significantly reduces the number of handovers at the expense of a small loss in signal strength relative to the baseline case where a drone always connect...
Multi-Agent Reinforcement Learning in NOMA-Aided UAV Networks for Cellular Offloading
IEEE Transactions on Wireless Communications, 2021
A novel framework is proposed for cellular offloading with the aid of multiple unmanned aerial vehicles (UAVs), while the non-orthogonal multiple access (NOMA) technique is employed at each UAV to further improve the spectrum efficiency of the wireless network. The optimization problem of joint three-dimensional (3D) trajectory design and power allocation is formulated for maximizing the throughput. Since ground mobile users are considered as roaming continuously, the UAVs need to be redeployed in a timely manner based on the movement of users. In an effort to solve this pertinent dynamic problem, a K-means based clustering algorithm is first adopted for periodically partitioning users. Afterward, a mutual deep Q-network (MDQN) algorithm is proposed to jointly determine the optimal 3D trajectory and power allocation of UAVs. In contrast to the conventional DQN algorithm, the MDQN algorithm enables the experience of multiple agents to be input into a shared neural network to shorten the training time with the assistance of state abstraction. Numerical results demonstrate that: 1) the proposed MDQN algorithm is capable of converging under minor constraints and has a faster convergence rate than the conventional DQN algorithm in the multi-agent case; 2) the achievable sum rate of the NOMA-enhanced UAV network is 23% superior to the case of orthogonal multiple access (OMA); 3) by designing the optimal 3D trajectory of UAVs with the aid of the MDQN algorithm, the sum rate of the network enjoys gains of 142% and 56% over the circular trajectory and the 2D trajectory, respectively.
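The user-partitioning step described in this abstract can be pictured with a plain K-means sketch over 2D user positions. This is a generic illustration rather than the paper's algorithm: the deterministic initialisation from the first `k` points, the iteration cap, and the sample coordinates are assumptions chosen to keep the example small and reproducible.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def kmeans(points: List[Point], k: int, iterations: int = 20) -> Tuple[List[Point], List[int]]:
    """Cluster 2D points with Lloyd's algorithm; return (centroids, labels)."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Assumption: initialise centroids from the first k points for determinism.
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each user to the nearest centroid.
        labels = [
            min(range(k), key=lambda c: (p[0] - centroids[c][0]) ** 2
                + (p[1] - centroids[c][1]) ** 2)
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append((
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                ))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


# Hypothetical ground-user positions forming two well-separated groups.
users = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (10.0, 11.0), (1.0, 0.0), (11.0, 10.0)]
centroids, labels = kmeans(users, k=2)
```

Re-running the function periodically on updated positions would correspond to the periodic re-partitioning of roaming users that the abstract describes.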
Distributed Cooperative Spectrum Sharing in UAV Networks Using Multi-Agent Reinforcement Learning
2019 16th IEEE Annual Consumer Communications & Networking Conference (CCNC), 2019
In this paper, we develop a distributed mechanism for spectrum sharing among a network of unmanned aerial vehicles (UAVs) and licensed terrestrial networks. This method can provide a practical solution for situations where the UAV network may need external spectrum when dealing with congested spectrum or need to change its operational frequency due to security threats. Here we study a scenario where the UAV network performs a remote sensing mission. In this model, the UAVs are categorized into two clusters of relaying and sensing UAVs. The relay UAVs provide a relaying service for a licensed network to obtain spectrum access for the rest of the UAVs that perform the sensing task. We develop a distributed mechanism in which the UAVs locally decide whether they need to participate in relaying or sensing, considering the fact that communications among UAVs may not be feasible or reliable. The UAVs learn the optimal task allocation using a distributed reinforcement learning algorithm. Convergence of the algorithm is discussed and simulation results are presented for different scenarios to verify the convergence.
Machine Learning assisted Handover and Resource Management for Cellular Connected Drones
2020 IEEE 91st Vehicular Technology Conference (VTC2020-Spring), 2020
Enabling cellular connectivity for drones introduces a wide set of challenges and opportunities. Communication of cellular-connected drones is influenced by 3-dimensional mobility and line-of-sight channel characteristics, which result in a higher number of handovers with increasing altitude. Our cell planning simulations of coexisting aerial and terrestrial users indicate that the severe interference from drones to base stations is a major challenge for uplink communications of terrestrial users. Here, we first present the major challenges in the coexistence of terrestrial and drone communications by considering real geographical network data for Stockholm. Then, we derive analytical models for the key performance indicators (KPIs), including communications delay and interference over cellular networks, and formulate the handover and radio resource management (H-RRM) optimization problem. Afterwards, we transform this problem into a machine learning problem, and propose a deep reinforcement learning solution to solve the H-RRM problem. Finally, using simulation results, we present how the speed and altitude of drones, and the tolerable level of interference, shape the optimal H-RRM policy in the network. In particular, we present heatmaps of handover decisions at different drone altitudes and speeds, which motivate a revision of the legacy handover schemes and a redefinition of cell boundaries in the sky.
2022
Unmanned Aerial Vehicles (UAVs) are increasingly deployed to provide wireless connectivity to static and mobile ground users in situations of increased network demand or points of failure in existing terrestrial cellular infrastructure. However, UAVs are energy-constrained and experience the challenge of interference from nearby UAV cells sharing the same frequency spectrum, thereby impacting the system's energy efficiency (EE). We aim to address gaps in prior research, which focuses on optimising the system's EE through 2D trajectory optimisation of UAVs serving only static ground users and neglects the impact of interference from nearby UAV cells. Unlike previous work that assumes global spatial knowledge of ground users' locations via a central controller that periodically scans the network perimeter and provides real-time updates to the UAVs for decision making, we focus on a realistic decentralised approach suitable in emergencies. Thus, we apply a decentralised Multi-Agent Reinforcement Learning (MARL) approach that maximizes the system's EE by jointly optimising each UAV's 3D trajectory, number of connected static and mobile users, and the energy consumed, while taking into account the impact of interference and the UAVs' coordination on the system's EE in a dynamic network environment. To address this, we propose a direct collaborative Communication-Enabled Multi-Agent Decentralised Double Deep Q-Network (CMAD-DDQN) approach. The CMAD-DDQN is a collaborative algorithm that allows UAVs to explicitly share knowledge by communicating with their nearest neighbours based on existing 3GPP guidelines. Our approach is able to maximise the system's EE without hampering performance gains in the network. Simulation results show that the proposed approach outperforms existing baselines in terms of maximising the system's EE without degrading coverage performance in the network.
The CMAD-DDQN approach outperforms the MAD-DDQN approach, which neglects direct collaboration among UAVs, as well as the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) and random policy approaches, which consider a 2D UAV deployment design while neglecting interference from nearby UAV cells, by about 15%, 65% and 85%, respectively.
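The "Double Deep Q-Network" in the CMAD-DDQN name refers to a family of methods that decouple action selection from action evaluation when forming the learning target. The sketch below shows that generic target computation only; it is not the authors' code, and the function name, Q-value lists, and discount factor are hypothetical placeholders for what online and target networks would produce.

```python
from typing import Sequence


def double_dqn_target(
    reward: float,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
    gamma: float = 0.99,
    done: bool = False,
) -> float:
    """Compute the Double DQN target for a single transition.

    The online network's Q-values choose the next action, and the target
    network's Q-values evaluate it.
    """
    if done:
        return reward  # no bootstrapping from a terminal state
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# Hypothetical values: the online network prefers action 1, so the target
# network's estimate for action 1 (2.0) is used rather than its maximum (3.0).
target = double_dqn_target(
    reward=1.0,
    next_q_online=[0.2, 0.9, 0.5],
    next_q_target=[1.0, 2.0, 3.0],
    gamma=0.5,
)
```

Using the online network only to pick the action is what reduces the overestimation that arises when a single network both selects and evaluates the maximum.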
Drone Base Station Positioning and Power Allocation using Reinforcement Learning
2019 16th International Symposium on Wireless Communication Systems (ISWCS), 2019
Large scale natural disasters can cause unpredictable losses of human lives and man-made infrastructure. This can hinder the ability of both survivors as well as search and rescue teams to communicate, decreasing the probability of finding survivors. In such cases, it is crucial that a provisional communication network is deployed as fast as possible in order to re-establish communication and prevent additional casualties. As such, one promising solution for mobile and adaptable emergency communication networks is the deployment of drones equipped with base stations to act as temporary small cells. In this paper, an intelligent solution based on reinforcement learning is proposed to determine the best transmit power allocation and 3D positioning of multiple drone small cells in an emergency scenario. The main goal is to maximize the number of users covered by the drones, while considering user mobility and radio access network constraints. Results show that the proposed algorithm ca...
arXiv (Cornell University), 2022
Unmanned aerial vehicles (UAVs) are increasingly deployed to provide wireless connectivity to static and mobile ground users in situations of increased network demand or points of failure in existing terrestrial cellular infrastructure. However, UAVs are energy-constrained and experience the challenge of interference from nearby UAV cells sharing the same frequency spectrum, thereby impacting the system's energy efficiency (EE). Recent approaches focus on optimising the system's EE by optimising the trajectory of UAVs serving only static ground users and neglecting mobile users. Several others neglect the impact of interference from nearby UAV cells, assuming an interference-free network environment. Furthermore, some works assume global spatial knowledge of ground users' locations via a central controller (CC) that periodically scans the network perimeter and provides real-time updates to the UAVs for decision-making. However, this assumption may be unsuitable in disaster scenarios since it requires significant information exchange between the UAVs and CC. Moreover, it may not be possible to track users' locations in a disaster scenario. Despite growing research interest in decentralised over centralised UAV control, direct collaboration among UAVs to improve coordination while optimising the system's EE has not been adequately explored. To address this, we propose a direct collaborative communication-enabled multi-agent decentralised double deep Q-network (CMAD-DDQN) approach. The CMAD-DDQN is a collaborative algorithm that allows UAVs to explicitly share their telemetry via existing 3GPP guidelines by communicating with their nearest neighbours. This allows the agent-controlled UAVs to optimise their 3D flight trajectories by filling up knowledge gaps and converging to optimal policies. We account for the mobility of ground users, the UAVs' limited energy budget and interference in the environment.
Our approach can maximise the system's EE without hampering performance gains in the network. Simulation results show that the proposed approach outperforms existing baselines in terms of maximising the system's EE without degrading coverage performance in the network. The CMAD-DDQN approach outperforms the MAD-DDQN approach, which neglects direct collaboration among UAVs, as well as the multi-agent deep deterministic policy gradient (MADDPG) and random policy approaches, which consider a 2D UAV deployment design while neglecting interference from nearby UAV cells, by about 15%, 65% and 85%, respectively.
Scalable Multi-agent Reinforcement Learning Algorithm for Wireless Networks
2021
Reinforcement learning (RL) is known as a model-free and highly efficient intelligent algorithm and has proved useful in solving radio resource management problems in wireless networks. However, for large-scale networks with high-latency connections to a central server or a capacity-limited backbone, it is not realistic to employ a centralized RL algorithm to perform joint real-time decision making for the entire network. The dimensionality of the problem increases exponentially, which introduces a scalability issue. Multi-agent RL, which allows separate execution of the decision policy in each agent, has been applied to solve the scalability problem. In this paper, we propose a federated multi-agent RL architecture for large-scale wireless scenarios, where access points (agents) share parameters to reach consistency, save backbone traffic, and improve the convergence performance. Our results show that the federated frequency, which is critical for backbone traffic, has a limited effect on the...
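One simple way to picture the parameter sharing mentioned in this abstract is element-wise averaging of the agents' parameters. The sketch below assumes plain unweighted averaging over flat parameter lists; the abstract does not state which aggregation rule the paper uses, so the function name, data layout, and values here are purely illustrative.

```python
from typing import List


def federated_average(agent_params: List[List[float]]) -> List[float]:
    """Return the element-wise mean of the agents' parameter vectors.

    Assumption: every agent holds a flat parameter vector of equal length
    and all agents are weighted equally.
    """
    if not agent_params:
        raise ValueError("at least one agent is required")
    length = len(agent_params[0])
    if any(len(params) != length for params in agent_params):
        raise ValueError("all parameter vectors must have the same length")
    n_agents = len(agent_params)
    return [sum(values) / n_agents for values in zip(*agent_params)]


# Hypothetical example: two access points with two parameters each.
shared = federated_average([[1.0, 2.0], [3.0, 4.0]])
```

How often such an averaging round runs would play the role of the federated frequency that the abstract relates to backbone traffic.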