Energy-aware optimization of UAV base stations placement via decentralized multi-agent Q-learning
Related papers
Energy-aware placement optimization of UAV base stations via decentralized multi-agent Q-learning
arXiv (Cornell University), 2021
Unmanned aerial vehicles serving as aerial base stations (UAV-BSs) can be deployed to provide wireless connectivity to ground devices in events of increased network demand, points-of-failure in existing infrastructure, or disasters. However, it is challenging to conserve the energy of UAVs during prolonged coverage tasks, considering their limited on-board battery capacity. Reinforcement learning-based (RL) approaches have previously been used to improve the energy utilization of multiple UAVs; however, a central cloud controller is assumed to have complete knowledge of the end devices' locations, i.e., the controller periodically scans and sends updates for UAV decision-making. This assumption is impractical in dynamic network environments with mobile ground devices. To address this problem, we propose a decentralized Q-learning approach, where each UAV-BS is equipped with an autonomous agent that maximizes the connectivity to ground devices while improving its energy utilization. Experimental results show that the proposed design significantly outperforms the centralized approaches in jointly maximizing the number of connected ground devices and the energy utilization of the UAV-BSs.
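As a rough illustration of the per-UAV design described above, the sketch below shows a minimal tabular Q-learning agent whose reward trades connected ground devices off against energy spent. The class name, state and action encodings, reward weights, and hyperparameters are assumptions made for this sketch, not the paper's actual formulation.

import random
from collections import defaultdict

class UAVQLearningAgent:
    """Minimal per-UAV tabular Q-learning agent (illustrative only).

    State/action encodings, reward weights, and hyperparameters are
    assumptions made for this sketch, not the paper's exact design.
    """

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.actions = actions              # e.g. discrete moves: N/S/E/W/up/down/hover
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(float)         # Q[(state, action)] -> value

    def act(self, state):
        # Epsilon-greedy exploration over the discrete action set.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning backup.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])

def local_reward(num_connected_devices, energy_used, w_energy=0.5):
    """Assumed per-agent reward: favour connectivity, penalise energy consumption."""
    return num_connected_devices - w_energy * energy_used

Each UAV-BS would run its own instance of such an agent on locally observed state, which is what makes the scheme decentralized.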
ICC 2022 - IEEE International Conference on Communications
Unmanned Aerial Vehicles (UAVs) promise to become an intrinsic part of next-generation communications, as they can be deployed to provide wireless connectivity to ground users to supplement existing terrestrial networks. The majority of the existing research into the use of UAV access points for cellular coverage considers rotary-wing UAV designs (i.e. quadcopters). However, we expect fixed-wing UAVs to be more appropriate for connectivity purposes in scenarios where long flight times are necessary (such as for rural coverage), as fixed-wing UAVs rely on a more energy-efficient form of flight when compared to the rotary-wing design. As fixed-wing UAVs are typically incapable of hovering in place, their deployment optimisation involves optimising their individual flight trajectories in a way that allows them to deliver high-quality service to the ground users in an energy-efficient manner. In this paper, we propose a multi-agent deep reinforcement learning approach to optimise the energy efficiency of fixed-wing UAV cellular access points while still allowing them to deliver high-quality service to users on the ground. In our decentralized approach, each UAV is equipped with a Dueling Deep Q-Network (DDQN) agent which can adjust the 3D trajectory of the UAV over a series of timesteps. By coordinating with their neighbours, the UAVs adjust their individual flight trajectories in a manner that optimises the total system energy efficiency. We benchmark the performance of our approach against a series of heuristic trajectory planning strategies, and demonstrate that our method can improve the system energy efficiency by as much as 70%.
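As a hedged sketch of the per-UAV agent architecture named in this abstract, the snippet below implements a dueling Q-network head in PyTorch. The state encoding (e.g. UAV position, heading, and neighbour information) and the discrete action set (3D trajectory adjustments) are assumptions for illustration; the paper's exact network and inputs may differ.

import torch
import torch.nn as nn

class DuelingDQN(nn.Module):
    """Illustrative dueling Q-network for a single fixed-wing UAV agent.

    The state and action definitions are assumptions for this sketch,
    not the paper's actual design.
    """

    def __init__(self, state_dim, num_actions, hidden=128):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU(),
                                   nn.Linear(hidden, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)                 # V(s)
        self.advantage = nn.Linear(hidden, num_actions)   # A(s, a)

    def forward(self, state):
        h = self.trunk(state)
        v, a = self.value(h), self.advantage(h)
        # Combine streams; subtracting the mean advantage keeps Q identifiable.
        return v + a - a.mean(dim=-1, keepdim=True)

In a full agent, experience replay and a periodically updated target network would wrap around this module; the dueling split itself only changes how Q-values are parameterised.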
2022
Unmanned Aerial Vehicles (UAVs) are increasingly deployed to provide wireless connectivity to static and mobile ground users in situations of increased network demand or points of failure in existing terrestrial cellular infrastructure. However, UAVs are energy-constrained and experience interference from nearby UAV cells sharing the same frequency spectrum, thereby impacting the system's energy efficiency (EE). We aim to address research gaps in prior work that optimises the system's EE through 2D trajectory optimisation of UAVs serving only static ground users and that neglects the impact of interference from nearby UAV cells. Unlike previous work that assumes global spatial knowledge of ground users' locations via a central controller that periodically scans the network perimeter and provides real-time updates to the UAVs for decision making, we focus on a realistic decentralised approach suitable in emergencies. Thus, we apply a decentralised Multi-Agent Reinforcement Learning (MARL) approach that maximises the system's EE by jointly optimising each UAV's 3D trajectory, the number of connected static and mobile users, and the energy consumed, while taking into account the impact of interference and of the UAVs' coordination on the system's EE in a dynamic network environment. To this end, we propose a direct collaborative Communication-Enabled Multi-Agent Decentralised Double Deep Q-Network (CMAD-DDQN) approach. The CMAD-DDQN is a collaborative algorithm that allows UAVs to explicitly share knowledge by communicating with their nearest neighbours based on existing 3GPP guidelines. Our approach is able to maximise the system's EE without hampering performance gains in the network. Simulation results show that the proposed approach outperforms existing baselines in terms of maximising the system's EE without degrading coverage performance in the network. The CMAD-DDQN approach outperforms the MAD-DDQN, which neglects direct collaboration among UAVs, as well as the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) and random policy approaches, which consider a 2D UAV deployment design while neglecting interference from nearby UAV cells, by about 15%, 65% and 85%, respectively.
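The joint objective described here, maximising the system's EE, can be read as delivered data per unit of energy aggregated over all UAVs. The helper below is a minimal sketch of such a metric under that assumption; the paper's exact EE definition may weight or normalise the terms differently.

def system_energy_efficiency(delivered_bits_per_uav, energy_joules_per_uav):
    """Illustrative system EE: total delivered bits per joule across all UAVs.

    Inputs are per-UAV totals over an episode; the aggregation used here is
    an assumption, not the paper's exact formulation.
    """
    total_bits = sum(delivered_bits_per_uav)
    total_energy = sum(energy_joules_per_uav)
    return total_bits / total_energy if total_energy > 0 else 0.0

For example, two UAVs delivering 4e9 and 6e9 bits while each consuming 2e5 J would score 2.5e4 bits per joule under this definition.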
Density-Aware Reinforcement Learning to Optimise Energy Efficiency in UAV-Assisted Networks
arXiv (Cornell University), 2023
Unmanned aerial vehicles (UAVs) serving as aerial base stations can be deployed to provide wireless connectivity to mobile users, such as vehicles. However, the density of vehicles on roads often varies spatially and temporally, primarily due to mobility and traffic conditions in a geographical area, making it difficult to provide ubiquitous service. Moreover, as energy-constrained UAVs hover in the sky while serving mobile users, they may face interference from nearby UAV cells or other access points sharing the same frequency band, thereby impacting the system's energy efficiency (EE). Recent multi-agent reinforcement learning (MARL) approaches applied to optimise users' coverage work well under reasonably even densities but might not perform as well under uneven user distributions, e.g., in urban road networks with uneven concentrations of vehicles. In this work, we propose a density-aware communication-enabled multi-agent decentralised double deep Q-network (DACEMAD-DDQN) approach that maximises the total system's EE by jointly optimising the trajectory of each UAV, the number of connected users, and the UAVs' energy consumption, while keeping track of dense and uneven user distributions. Our approach outperforms state-of-the-art MARL approaches in terms of EE by as much as 65%-85%.
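One way to read "density-aware" here is that each agent's coverage reward is weighted by how crowded its current cell is relative to the network-wide average, pulling UAVs toward dense road segments. The function below is a minimal sketch under that assumption; the weighting scheme and coefficients are illustrative and not taken from the paper.

def density_aware_reward(connected_users, local_density, mean_density,
                         energy_used_joules, w_energy=0.1):
    """Illustrative density-aware reward for one UAV agent (assumed form).

    Coverage is scaled by the local user density relative to the network-wide
    mean, so serving a crowded cell is worth more than serving a sparse one,
    while an energy term penalises consumption.
    """
    density_weight = local_density / max(mean_density, 1e-6)
    return density_weight * connected_users - w_energy * energy_used_joules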
arXiv (Cornell University), 2022
Unmanned aerial vehicles (UAVs) are increasingly deployed to provide wireless connectivity to static and mobile ground users in situations of increased network demand or points of failure in existing terrestrial cellular infrastructure. However, UAVs are energy-constrained and experience interference from nearby UAV cells sharing the same frequency spectrum, thereby impacting the system's energy efficiency (EE). Recent approaches focus on optimising the system's EE by optimising the trajectory of UAVs serving only static ground users and neglecting mobile users. Several others neglect the impact of interference from nearby UAV cells, assuming an interference-free network environment. Furthermore, some works assume global spatial knowledge of ground users' locations via a central controller (CC) that periodically scans the network perimeter and provides real-time updates to the UAVs for decision-making. However, this assumption may be unsuitable in disaster scenarios, since it requires significant information exchange between the UAVs and the CC, and it may not be possible to track users' locations in such scenarios. Despite growing research interest in decentralised over centralised UAV control, direct collaboration among UAVs to improve coordination while optimising the system's EE has not been adequately explored. To address this, we propose a direct collaborative communication-enabled multi-agent decentralised double deep Q-network (CMAD-DDQN) approach. The CMAD-DDQN is a collaborative algorithm that allows UAVs to explicitly share their telemetry, following existing 3GPP guidelines, by communicating with their nearest neighbours. This allows the agent-controlled UAVs to optimise their 3D flight trajectories by filling knowledge gaps and converging to optimal policies. We account for the mobility of ground users, the UAVs' limited energy budget, and interference in the environment. Our approach can maximise the system's EE without hampering performance gains in the network. Simulation results show that the proposed approach outperforms existing baselines in terms of maximising the system's EE without degrading coverage performance in the network. The CMAD-DDQN approach outperforms the MAD-DDQN, which neglects direct collaboration among UAVs, as well as the multi-agent deep deterministic policy gradient (MADDPG) and random policy approaches, which consider a 2D UAV deployment design while neglecting interference from nearby UAV cells, by about 15%, 65% and 85%, respectively.
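The neighbour-to-neighbour telemetry sharing described here can be sketched as each UAV discovering peers within a communication range and appending their shared state to its own observation before acting. The snippet below is a minimal, assumed illustration of that idea; the range, the telemetry contents, and the helper names are not taken from the paper.

import math

def nearest_neighbours(uav_positions, uav_id, comm_range_m=1000.0):
    """Assumed neighbour discovery: other UAVs within a fixed communication range.

    `uav_positions` maps integer UAV ids to (x, y, z) coordinates in metres.
    """
    own = uav_positions[uav_id]
    return [other_id for other_id, pos in uav_positions.items()
            if other_id != uav_id and math.dist(own, pos) <= comm_range_m]

def augmented_observation(local_obs, telemetry, neighbours):
    """Concatenate the UAV's own observation with its neighbours' shared telemetry."""
    shared = [telemetry[n] for n in sorted(neighbours)]
    return local_obs + [value for record in shared for value in record]

Each agent would then feed the augmented observation into its own double deep Q-network, so collaboration happens through shared inputs rather than a central controller.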
Optimizing Energy Efficiency in UAV-Assisted Networks Using Deep Reinforcement Learning
IEEE Wireless Communications Letters
In this letter, we study the energy efficiency (EE) optimization of unmanned aerial vehicles (UAVs) providing wireless coverage to static and mobile ground users. Recent multi-agent reinforcement learning approaches optimize the system's EE using a 2D trajectory design, neglecting interference from nearby UAV cells. We aim to maximize the system's EE by jointly optimizing each UAV's 3D trajectory, the number of connected users, and the energy consumed, while accounting for interference. Thus, we propose a cooperative Multi-Agent Decentralized Double Deep Q-Network (MAD-DDQN) approach. Our approach outperforms existing baselines in terms of EE by as much as 55%-80%.
Optimising Energy Efficiency in UAV-Assisted Networks using Deep Reinforcement Learning
arXiv (Cornell University), 2022
In this letter, we study the energy efficiency (EE) optimisation of unmanned aerial vehicles (UAVs) providing wireless coverage to static and mobile ground users. Recent multi-agent reinforcement learning approaches optimise the system's EE using a 2D trajectory design, neglecting interference from nearby UAV cells. We aim to maximise the system's EE by jointly optimising each UAV's 3D trajectory, the number of connected users, and the energy consumed, while accounting for interference. Thus, we propose a cooperative Multi-Agent Decentralised Double Deep Q-Network (MAD-DDQN) approach. Our approach outperforms existing baselines in terms of EE by as much as 55%-80%.
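Accounting for interference from nearby UAV cells typically means a ground user only counts as connected when its SINR from the serving UAV clears a threshold. The sketch below illustrates that bookkeeping under assumed noise and threshold values; the variable names and figures are illustrative, not the letter's actual parameters.

import math

def sinr_db(rx_power_w, interference_w, noise_w=1e-13):
    """Received SINR in dB given interference from co-channel UAV cells (assumed noise floor)."""
    return 10.0 * math.log10(rx_power_w / (sum(interference_w) + noise_w))

def connected_users(rx_powers, interference_map, sinr_threshold_db=0.0):
    """Count users whose SINR from their serving UAV clears an assumed threshold.

    `rx_powers[u]` is the power user u receives from its serving UAV and
    `interference_map[u]` lists powers received from all other UAV cells.
    """
    return sum(1 for u, p in rx_powers.items()
               if sinr_db(p, interference_map[u]) >= sinr_threshold_db)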
Drone Base Station Positioning and Power Allocation using Reinforcement Learning
2019 16th International Symposium on Wireless Communication Systems (ISWCS), 2019
Large scale natural disasters can cause unpredictable losses of human lives and man-made infrastructure. This can hinder the ability of both survivors as well as search and rescue teams to communicate, decreasing the probability of finding survivors. In such cases, it is crucial that a provisional communication network is deployed as fast as possible in order to re-establish communication and prevent additional casualties. As such, one promising solution for mobile and adaptable emergency communication networks is the deployment of drones equipped with base stations to act as temporary small cells. In this paper, an intelligent solution based on reinforcement learning is proposed to determine the best transmit power allocation and 3D positioning of multiple drone small cells in an emergency scenario. The main goal is to maximize the number of users covered by the drones, while considering user mobility and radio access network constraints. Results show that the proposed algorithm ca...
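The joint decision described here, 3D positioning plus transmit power allocation, is often handled by a single discrete action space that combines a displacement with a power level. The following is a minimal sketch under assumed step sizes and candidate power levels; none of these values come from the paper.

from itertools import product

# Illustrative joint action space: a 3D displacement plus a transmit power level.
MOVES_M = [(-50, 0, 0), (50, 0, 0), (0, -50, 0), (0, 50, 0),
           (0, 0, -10), (0, 0, 10), (0, 0, 0)]          # metres per decision step (assumed)
TX_POWERS_DBM = [20, 23, 26, 30]                         # assumed candidate power levels

JOINT_ACTIONS = list(product(MOVES_M, TX_POWERS_DBM))    # 28 discrete joint actions

def apply_action(position, action):
    """Apply one joint action: move the drone and select its transmit power."""
    (dx, dy, dz), tx_power_dbm = action
    x, y, z = position
    return (x + dx, y + dy, z + dz), tx_power_dbm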
A Simulation of UAV Power Optimization via Reinforcement Learning
ArXiv, 2019
This paper demonstrates a reinforcement learning approach to the optimization of power consumption in a UAV system in a simplified data collection task. Here, the architecture consists of two common reinforcement learning algorithms, Q-learning and Sarsa, which are implemented through a combination of the Robot Operating System (ROS) and Gazebo. The effect of wind as an influential factor was simulated. The implemented algorithms resulted in reasonable adjustment of the UAV's actions to the wind field in order to minimize its power consumption during task completion over the domain.
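For reference, the core difference between the two algorithms named above is the bootstrap target: Q-learning backs up the greedy next action, while Sarsa backs up the action the policy actually takes. The two update functions below sketch that distinction; the dictionary-based Q-table and hyperparameters are assumptions for illustration.

def q_learning_update(q, s, a, r, s_next, actions, alpha=0.1, gamma=0.95):
    """Off-policy backup: bootstrap from the greedy next action."""
    best_next = max(q.get((s_next, a2), 0.0) for a2 in actions)
    q[(s, a)] = q.get((s, a), 0.0) + alpha * (r + gamma * best_next - q.get((s, a), 0.0))

def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.95):
    """On-policy backup: bootstrap from the action the policy actually takes next."""
    q[(s, a)] = q.get((s, a), 0.0) + alpha * (
        r + gamma * q.get((s_next, a_next), 0.0) - q.get((s, a), 0.0))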
IEEE Transactions on Vehicular Technology
Due to their promising applications and intriguing characteristics, Unmanned Aerial Vehicles (UAVs) can be dispatched as flying base stations to serve multiple energy-constrained Internet-of-Things (IoT) sensors. Moreover, to ensure fresh data collection while providing sustainable energy support to a large set of IoT devices, a required number of UAVs should be deployed to carry out these two tasks efficiently and promptly. Indeed, data collection requires that UAVs first perform Wireless Energy Transfer (WET) to supply IoT devices with the necessary energy in the downlink. Then, IoT devices perform Wireless Information Transmission (WIT) to the UAVs in the uplink based on the harvested energy. However, it turns out that when the same UAV performs both WIT and WET, its energy usage and the data collection time are severely penalized. Worse yet, it is difficult to efficiently coordinate the UAVs to improve performance in terms of WET and WIT. This work proposes to divide the UAVs into two teams that behave as data collectors and energy transmitters, respectively. A Multi-Agent Deep Reinforcement Learning (MADRL) method, called TEAM, is leveraged to jointly optimize both teams' trajectories, minimize the expected Age of Information (AoI), maximize the throughput of IoT devices, minimize the energy utilization of UAVs, and enhance the energy transfer. Simulation results show that TEAM can effectively synchronize the UAV teams and adapt their trajectories while serving a large-scale dynamic IoT environment.
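The Age of Information (AoI) objective mentioned above is typically tracked per device: each device's age grows every timestep and resets when a collector UAV receives fresh data from it. The helper below is a small sketch of that bookkeeping under those assumptions; the function and variable names are illustrative, not taken from the paper.

def update_age_of_information(ages, devices_served_this_step):
    """Illustrative per-device AoI update used as part of the objective.

    Every device's age grows by one timestep; it resets to zero when a
    collector UAV successfully receives fresh data from that device.
    """
    return {dev: 0 if dev in devices_served_this_step else age + 1
            for dev, age in ages.items()}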