Communication-Enabled Deep Reinforcement Learning to Optimise Energy-Efficiency in UAV-Assisted Networks
2022
Unmanned Aerial Vehicles (UAVs) are increasingly deployed to provide wireless connectivity to static and mobile ground users in situations of increased network demand or points of failure in existing terrestrial cellular infrastructure. However, UAVs are energy-constrained and experience interference from nearby UAV cells sharing the same frequency spectrum, which impacts the system's energy efficiency (EE). We aim to address research gaps in prior work that optimises the system's EE using 2D trajectory optimisation for UAVs serving only static ground users and neglects the impact of interference from nearby UAV cells. Unlike previous work that assumes global spatial knowledge of ground users' locations via a central controller that periodically scans the network perimeter and provides real-time updates to the UAVs for decision making, we focus on a realistic decentralised approach suitable in emergencies. Thus, we apply a decentralised Multi-Agent Reinforcement Learning (MARL) approach that maximises the system's EE by jointly optimising each UAV's 3D trajectory, the number of connected static and mobile users, and the energy consumed, while taking into account the impact of interference and the UAVs' coordination on the system's EE in a dynamic network environment. To this end, we propose a direct collaborative Communication-Enabled Multi-Agent Decentralised Double Deep Q-Network (CMAD-DDQN) approach. The CMAD-DDQN is a collaborative algorithm that allows UAVs to explicitly share knowledge by communicating with their nearest neighbours based on existing 3GPP guidelines. Our approach maximises the system's EE without hampering performance gains in the network. Simulation results show that the proposed approach outperforms existing baselines in terms of maximising the system's EE without degrading coverage performance in the network. The CMAD-DDQN approach outperforms the MAD-DDQN that neglects direct collaboration among UAVs, the Multi-Agent Deep Deterministic Policy Gradient (MADDPG), and random policy approaches that consider a 2D UAV deployment design while neglecting interference from nearby UAV cells, by about 15%, 65% and 85%, respectively.
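To make the neighbour-communication step concrete, the following minimal Python sketch shows how each UAV agent might augment its local observation with information shared by nearby UAVs before selecting an action. The communication radius, the discrete action set, and the random placeholder policy are illustrative assumptions, not details taken from the paper.

# Minimal sketch (assumptions: 3 UAVs in a 1 km x 1 km area, a fixed
# communication radius standing in for the 3GPP-based neighbour criterion,
# and a random placeholder policy instead of a trained DDQN).
import numpy as np

COMM_RADIUS = 300.0          # metres; hypothetical neighbour threshold
ACTIONS = ["up", "down", "north", "south", "east", "west", "hover"]

def neighbour_augmented_obs(positions, own_energy, i):
    """Augment UAV i's local observation with its nearest neighbours' 3D positions."""
    own = positions[i]
    dists = np.linalg.norm(positions - own, axis=1)
    neighbour_idx = [j for j in range(len(positions))
                     if j != i and dists[j] <= COMM_RADIUS]
    shared = positions[neighbour_idx].flatten() if neighbour_idx else np.zeros(0)
    return np.concatenate([own, [own_energy[i]], shared])

positions = np.array([[100.0, 200.0, 60.0],
                      [150.0, 250.0, 80.0],
                      [900.0, 900.0, 70.0]])
energy = np.array([0.9, 0.8, 0.7])            # normalised remaining battery

for i in range(len(positions)):
    obs = neighbour_augmented_obs(positions, energy, i)
    action = np.random.choice(ACTIONS)         # placeholder for the DDQN policy
    print(f"UAV {i}: obs dim = {obs.size}, action = {action}")

In the full CMAD-DDQN approach the placeholder policy would be each agent's trained Double DQN, and the neighbour criterion would follow the 3GPP-based guidelines mentioned above.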
Optimising Energy Efficiency in UAV-Assisted Networks using Deep Reinforcement Learning
arXiv (Cornell University), 2022
In this letter, we study the energy efficiency (EE) optimisation of unmanned aerial vehicles (UAVs) providing wireless coverage to static and mobile ground users. Recent multiagent reinforcement learning approaches optimise the system's EE using a 2D trajectory design, neglecting interference from nearby UAV cells. We aim to maximise the system's EE by jointly optimising each UAV's 3D trajectory, number of connected users, and the energy consumed, while accounting for interference. Thus, we propose a cooperative Multi-Agent Decentralised Double Deep Q-Network (MAD-DDQN) approach. Our approach outperforms existing baselines in terms of EE by as much as 55-80%.
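As a rough illustration of the objective being maximised, the sketch below computes system EE as total delivered throughput divided by total energy consumed; the figures and the exact form of the metric are assumptions for illustration only, not values from the letter.

# Minimal sketch of the energy-efficiency objective (assumption: EE is taken as
# total delivered throughput divided by total energy consumed over a period;
# the numbers below are illustrative, not from the paper).
def energy_efficiency(throughput_bits, energy_joules):
    """System EE in bits per joule."""
    return throughput_bits / energy_joules if energy_joules > 0 else 0.0

# e.g. 3 UAVs serving users over one decision interval
per_uav_throughput = [4.2e8, 3.1e8, 2.5e8]   # bits delivered (illustrative)
per_uav_energy     = [1.8e3, 2.0e3, 1.6e3]   # joules consumed (illustrative)

ee = energy_efficiency(sum(per_uav_throughput), sum(per_uav_energy))
print(f"system EE = {ee:.1f} bits/J")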
Optimizing Energy Efficiency in UAV-Assisted Networks Using Deep Reinforcement Learning
IEEE Wireless Communications Letters
In this letter, we study the energy efficiency (EE) optimization of unmanned aerial vehicles (UAVs) providing wireless coverage to static and mobile ground users. Recent multiagent reinforcement learning approaches optimize the system's EE using a 2D trajectory design, neglecting interference from nearby UAV cells. We aim to maximize the system's EE by jointly optimizing each UAV's 3D trajectory, number of connected users, and the energy consumed, while accounting for interference. Thus, we propose a cooperative Multi-Agent Decentralized Double Deep Q-Network (MAD-DDQN) approach. Our approach outperforms existing baselines in terms of EE by as much as 55-80%.
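Complementing the EE sketch above, the following minimal example shows the Double DQN target that a decentralized agent of this kind would typically use when updating its Q-network; the discrete action set and the numerical values are illustrative assumptions.

# Minimal sketch of the Double DQN target each UAV agent would use
# (assumptions: discrete action set, Q-values already computed by an online
# and a target network; values below are illustrative placeholders).
import numpy as np

def double_dqn_target(reward, gamma, q_online_next, q_target_next, done):
    """Action selected by the online net, evaluated by the target net."""
    best_action = int(np.argmax(q_online_next))
    return reward + (0.0 if done else gamma * q_target_next[best_action])

q_online_next = np.array([1.2, 0.7, 1.9, 0.4])   # online net at next state
q_target_next = np.array([1.0, 0.8, 1.5, 0.5])   # target net at next state
y = double_dqn_target(reward=0.3, gamma=0.99,
                      q_online_next=q_online_next,
                      q_target_next=q_target_next, done=False)
print(f"TD target y = {y:.3f}")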
ICC 2022 - IEEE International Conference on Communications
Unmanned Aerial Vehicles (UAVs) promise to become an intrinsic part of next-generation communications, as they can be deployed to provide wireless connectivity to ground users to supplement existing terrestrial networks. The majority of the existing research into the use of UAV access points for cellular coverage considers rotary-wing UAV designs (i.e. quadcopters). However, we expect fixed-wing UAVs to be more appropriate for connectivity purposes in scenarios where long flight times are necessary (such as for rural coverage), as fixed-wing UAVs rely on a more energy-efficient form of flight when compared to the rotary-wing design. As fixed-wing UAVs are typically incapable of hovering in place, their deployment optimisation involves optimising their individual flight trajectories in a way that allows them to deliver high-quality service to the ground users in an energy-efficient manner. In this paper, we propose a multi-agent deep reinforcement learning approach to optimise the energy efficiency of fixed-wing UAV cellular access points while still allowing them to deliver high-quality service to users on the ground. In our decentralised approach, each UAV is equipped with a Dueling Deep Q-Network (DDQN) agent which can adjust the 3D trajectory of the UAV over a series of timesteps. By coordinating with their neighbours, the UAVs adjust their individual flight trajectories in a manner that optimises the total system energy efficiency. We benchmark the performance of our approach against a series of heuristic trajectory planning strategies, and demonstrate that our method can improve the system energy efficiency by as much as 70%.
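For readers unfamiliar with the dueling architecture named above, the following PyTorch sketch shows the usual value/advantage decomposition; the layer sizes, observation dimension, and seven-action 3D-trajectory set are assumptions made for illustration, not the paper's configuration.

# Minimal sketch of a dueling Q-network head like the per-UAV agent described
# above (assumptions: a small fully connected network, 7 discrete 3D-trajectory
# actions, and made-up layer sizes; PyTorch is used for illustration).
import torch
import torch.nn as nn

class DuelingQNet(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int = 7):
        super().__init__()
        self.features = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU())
        self.value = nn.Linear(64, 1)              # state-value stream V(s)
        self.advantage = nn.Linear(64, n_actions)  # advantage stream A(s, a)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        h = self.features(obs)
        v, a = self.value(h), self.advantage(h)
        # Combine streams: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)
        return v + a - a.mean(dim=-1, keepdim=True)

net = DuelingQNet(obs_dim=10)
q_values = net(torch.randn(1, 10))
print(q_values.shape)          # torch.Size([1, 7]); argmax picks the next move

The argmax over the resulting Q-values would select the UAV's next trajectory adjustment at each timestep.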
Density-Aware Reinforcement Learning to Optimise Energy Efficiency in UAV-Assisted Networks
arXiv (Cornell University), 2023
Unmanned aerial vehicles (UAVs) serving as aerial base stations can be deployed to provide wireless connectivity to mobile users, such as vehicles. However, the density of vehicles on roads often varies spatially and temporally, primarily due to mobility and traffic situations in a geographical area, making it difficult to provide ubiquitous service. Moreover, as energy-constrained UAVs hover in the sky while serving mobile users, they may face interference from nearby UAV cells or other access points sharing the same frequency band, thereby impacting the system's energy efficiency (EE). Recent multi-agent reinforcement learning (MARL) approaches applied to optimise user coverage work well when user densities are reasonably even, but may not perform as well under uneven user distributions, i.e., in urban road networks with an uneven concentration of vehicles. In this work, we propose a density-aware communication-enabled multi-agent decentralised double deep Q-network (DACEMAD-DDQN) approach that maximises the total system EE by jointly optimising the trajectory of each UAV, the number of connected users, and the UAVs' energy consumption, while keeping track of dense and uneven user distributions. Our approach outperforms state-of-the-art MARL approaches in terms of EE by as much as 65%-85%.
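As an illustration of what keeping track of a dense and uneven user distribution could look like in an agent's observation, the sketch below summarises nearby vehicle positions as a coarse density grid; the window size, grid resolution, and synthetic positions are illustrative assumptions rather than the paper's design.

# Minimal sketch of a density feature a UAV could track (assumption: user
# density is summarised as counts in a coarse grid around the UAV's 2D
# position; grid size and radius are illustrative).
import numpy as np

def local_density_grid(uav_xy, user_xy, radius=250.0, bins=3):
    """Histogram of users inside a (2*radius)^2 window centred on the UAV."""
    rel = user_xy - uav_xy
    inside = np.all(np.abs(rel) <= radius, axis=1)
    hist, _, _ = np.histogram2d(rel[inside, 0], rel[inside, 1],
                                bins=bins, range=[[-radius, radius]] * 2)
    return hist.flatten()      # appended to the agent's observation vector

uav = np.array([500.0, 500.0])
users = np.random.uniform(0, 1000, size=(200, 2))   # illustrative vehicle positions
print(local_density_grid(uav, users))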
IEEE Open Journal of the Communications Society
Control and performance optimization of wireless networks of Unmanned Aerial Vehicles (UAVs) require scalable approaches that go beyond architectures based on centralized network controllers. At the same time, the performance of model-based optimization approaches is often limited by the accuracy of the approximations and relaxations necessary to solve the UAV network control problem through convex optimization or similar techniques, and by the accuracy of the channel network models used. To address these challenges, this article introduces a new architectural framework to control and optimize UAV networks based on Deep Reinforcement Learning (DRL). Furthermore, it proposes a virtualized, 'ready-to-fly' emulation environment to generate the extensive wireless data traces necessary to train DRL algorithms, which are notoriously hard to generate and collect on battery-powered UAV networks. The training environment integrates previously developed wireless protocol stacks for UAVs into the CORE/EMANE emulation tool. Our 'ready-to-fly' virtual environment guarantees scalable collection of high-fidelity wireless traces that can be used to train DRL agents. The proposed DRL architecture enables distributed data-driven optimization (with up to 3.7x throughput improvement and 0.2x latency reduction in reported experiments), facilitates network reconfiguration, and provides a scalable solution for large UAV networks. Index Terms: UAV networks, non-terrestrial networks, deep reinforcement learning, AI for wireless networks, 6G.
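To indicate the kind of interface a DRL agent would train against in such an emulation-driven setup, here is a minimal trace-replay environment sketch; it deliberately does not use the real CORE/EMANE interfaces (which the abstract does not detail), and the reward weighting and synthetic traces are illustrative assumptions.

# Minimal sketch of a gym-style environment fed by pre-collected wireless
# traces (assumptions: each trace record holds state, next state, throughput,
# and latency; the action is ignored because this is pure trace replay).
import random

class TraceDrivenUAVEnv:
    def __init__(self, traces):
        self.traces = traces          # list of (state, next_state, throughput, latency)
        self.t = 0

    def reset(self):
        self.t = 0
        return self.traces[0][0]

    def step(self, action):
        state, next_state, throughput, latency = self.traces[self.t]
        reward = throughput - 0.1 * latency      # illustrative weighting
        self.t = min(self.t + 1, len(self.traces) - 1)
        done = self.t == len(self.traces) - 1
        return next_state, reward, done, {}

# illustrative synthetic traces
traces = [((i,), (i + 1,), random.uniform(5, 10), random.uniform(0.1, 0.5))
          for i in range(5)]
env = TraceDrivenUAVEnv(traces)
obs, done = env.reset(), False
while not done:
    obs, reward, done, _ = env.step(action=0)
    print(round(reward, 2))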
IEEE Transactions on Vehicular Technology
Due to their promising applications and intriguing characteristics, Unmanned Aerial Vehicles (UAVs) can be dispatched as flying base stations to serve multiple energy-constrained Internet-of-Things (IoT) sensors. Moreover, to ensure fresh data collection while providing sustainable energy support to a large set of IoT devices, a required number of UAVs should be deployed to carry out these two tasks efficiently and promptly. Indeed, data collection requires that UAVs first perform Wireless Energy Transfer (WET) to supply IoT devices with the necessary energy in the downlink. Then, IoT devices perform Wireless Information Transmission (WIT) to UAVs in the uplink based on the harvested energy. However, it turns out that when the same UAV performs WIT and WET, its energy usage and the data collection time are severely penalized. Worse yet, it is difficult to efficiently coordinate between UAVs to improve the performance in terms of WET and WIT. This work proposes to divide UAVs into two teams that behave as data collectors and energy transmitters, respectively. A Multi-Agent Deep Reinforcement Learning (MADRL) method, called TEAM, is leveraged to jointly optimize both teams' trajectories, minimize the expected Age of Information (AoI), maximize the throughput of IoT devices, minimize the energy utilization of UAVs, and enhance the energy transfer. Simulation results show that TEAM can effectively synchronize UAV teams and adapt their trajectories while serving a large-scale dynamic IoT environment.
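As a rough sketch of how the competing objectives listed above (AoI, throughput, UAV energy, and energy transfer) might be combined into a single training signal, the example below uses a simple weighted sum; the weights and the linear form are assumptions for illustration and not TEAM's actual reward design.

# Minimal sketch of a scalarised multi-objective reward of the kind TEAM
# optimises (assumptions: the weights and the linear combination are
# illustrative; the paper's exact reward shaping is not reproduced here).
def team_reward(aoi, throughput, uav_energy, harvested_energy,
                w=(1.0, 1.0, 0.5, 0.5)):
    """Lower AoI and energy use are better; higher throughput and transfer are better."""
    w_aoi, w_thr, w_en, w_wet = w
    return -w_aoi * aoi + w_thr * throughput - w_en * uav_energy + w_wet * harvested_energy

print(team_reward(aoi=4.0, throughput=6.5, uav_energy=2.0, harvested_energy=1.2))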
Energy-aware optimization of UAV base stations placement via decentralized multi-agent Q-learning
arXiv (Cornell University), 2021
Unmanned aerial vehicles serving as aerial base stations (UAV-BSs) can be deployed to provide wireless connectivity to ground devices in events of increased network demand, points of failure in existing infrastructure, or disasters. However, it is challenging to conserve the energy of UAVs during prolonged coverage tasks, considering their limited on-board battery capacity. Reinforcement learning (RL)-based approaches have previously been used to improve the energy utilization of multiple UAVs; however, a central cloud controller is assumed to have complete knowledge of the end-devices' locations, i.e., the controller periodically scans and sends updates for UAV decision-making. This assumption is impractical in dynamic network environments with UAVs serving mobile ground devices. To address this problem, we propose a decentralized Q-learning approach, where each UAV-BS is equipped with an autonomous agent that maximizes the connectivity of mobile ground devices while improving its energy utilization. Experimental results show that the proposed design significantly outperforms the centralized approaches in jointly maximizing the number of connected ground devices and the energy utilization of the UAV-BSs.
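The per-UAV learning step described above can be illustrated with a standard tabular Q-learning update, as in the minimal sketch below; the state/action space sizes, learning parameters, and the example reward trading off connected devices against energy are illustrative assumptions.

# Minimal sketch of the per-UAV tabular Q-learning update (assumptions: a
# small discretised state/action space and illustrative learning parameters).
import numpy as np

n_states, n_actions = 50, 6
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.95, 0.1

def q_update(s, a, r, s_next):
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])

def choose_action(s):
    if np.random.rand() < epsilon:                 # explore
        return np.random.randint(n_actions)
    return int(np.argmax(Q[s]))                    # exploit

# one illustrative transition: reward trades off connected devices vs energy
s, a = 3, choose_action(3)
reward = 5 - 0.2 * 1.5        # e.g. 5 connected devices minus an energy penalty
q_update(s, a, reward, s_next=4)
print(Q[3])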
IEEE Transactions on Vehicular Technology, 2023
Wireless sensor networks (WSNs) with ultra-dense sensors are crucial for several industries, such as smart agricultural systems deployed in fifth generation (5G) and beyond-5G Open Radio Access Networks (O-RAN). The WSNs employ multiple unmanned aerial vehicles (UAVs) to collect data from multiple sensor nodes (SNs) and relay it to the central controller for processing. UAVs also provide resources to SNs and extend the network coverage over a vast geographical area. The O-RAN allows the use of open standards and interfaces to create a wireless network for communications between the UAVs and ground SNs. It enables real-time data transfer, remote control, and other applications that require a reliable and high-speed connection by providing flexibility and reliability for UAV-assisted WSNs to meet the requirements of smart agricultural applications. However, the limited battery life and transmission power of UAVs, together with the shortage of energy resources at SNs, make it difficult to collect all the data and relay it to the base station, resulting in inefficient task computation and resource management in smart agricultural systems. In this paper, we propose a joint UAV task scheduling, trajectory planning, and resource-sharing framework for multi-UAV-assisted WSNs in smart agricultural monitoring scenarios that schedules UAVs' charging, data collection, and landing times and allows UAVs to share energy with SNs. The main objective of our proposed framework is to minimize UAV energy consumption and network latency for effective data collection within a specific time frame. We formulate this multi-objective, non-convex optimization problem, transform it into a Markov decision process (MDP), and solve it with a multi-agent deep reinforcement learning (MADRL) algorithm. The simulation results show that the proposed MADRL algorithm reduces the energy consumption cost, when compared to a deep Q-network, a Greedy baseline, and a mixed-integer linear program (MILP), by 61.92%, 68.02%, and 69.9%, respectively.
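As a final illustration, the sketch below shows one simple way the joint energy/latency/data-collection objective could be scalarised into a per-step reward for the MADRL agents; the weighted-sum form and all numbers are illustrative assumptions rather than the paper's exact formulation.

# Minimal sketch of a per-step reward balancing energy, latency, and data
# collected (assumptions: a weighted-sum scalarisation with made-up weights).
def scheduling_reward(uav_energy_j, latency_s, data_collected_bits,
                      w_energy=1e-3, w_latency=10.0, w_data=1e-6):
    return (w_data * data_collected_bits
            - w_energy * uav_energy_j
            - w_latency * latency_s)

print(scheduling_reward(uav_energy_j=1500.0, latency_s=0.8, data_collected_bits=3e6))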