Deep Reinforcement Learning based Robot Navigation in Dynamic Environments using Occupancy Values of Motion Primitives
Related papers
2021 IEEE International Conference on Robotics and Automation (ICRA), 2021
We present a novel Deep Reinforcement Learning (DRL) based policy to compute dynamically feasible and spatially aware velocities for a robot navigating among mobile obstacles. Our approach combines the benefits of the Dynamic Window Approach (DWA) in terms of satisfying the robot's dynamics constraints with state-of-the-art DRL-based navigation methods that can handle moving obstacles and pedestrians well. Our formulation achieves these goals by embedding the environmental obstacles' motions in a novel low-dimensional observation space. It also uses a novel reward function to positively reinforce velocities that move the robot away from the obstacle's heading direction, leading to a significantly lower number of collisions. We evaluate our method in realistic 3-D simulated environments and on a real differential drive robot in challenging dense indoor scenarios with several walking pedestrians. We compare our method with state-of-the-art collision avoidance methods and observe significant improvements in terms of success rate (up to 33% increase), number of dynamics constraint violations (up to 61% decrease), and smoothness. We also conduct ablation studies to highlight the advantages of our observation space formulation and reward structure.
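To make the reward idea concrete, the sketch below illustrates one way such a heading-aware term could look: it rewards robot velocities that increase the robot's lateral offset from a moving obstacle's line of travel. The function name, weighting, and exact geometric construction are assumptions for illustration, not the paper's formulation.

```python
import numpy as np

def obstacle_heading_reward(robot_pos, robot_vel, obs_pos, obs_vel, w_avoid=1.0):
    """Illustrative reward term (not the paper's exact formula): reward robot
    velocities that move the robot away from a moving obstacle's heading
    direction. All inputs are 2-D numpy vectors in the world frame."""
    obs_heading = obs_vel / (np.linalg.norm(obs_vel) + 1e-6)
    to_robot = robot_pos - obs_pos
    # Lateral offset of the robot from the obstacle's line of motion.
    lateral = to_robot - np.dot(to_robot, obs_heading) * obs_heading
    lateral_dir = lateral / (np.linalg.norm(lateral) + 1e-6)
    # Positive when the robot's velocity increases that lateral offset,
    # i.e. when it steps out of the obstacle's path.
    return w_avoid * np.dot(robot_vel, lateral_dir)
```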
Deep Reinforcement Learning for Navigation in Cluttered Environments
Collision-free motion is essential for mobile robots. Most approaches to collision-free and efficient navigation with wheeled robots require parameter tuning by experts to obtain good navigation behavior. In this paper, we aim at learning an optimal navigation policy by deep reinforcement learning to overcome this manual parameter tuning. Our approach uses proximal policy optimization to train the policy and achieve collision-free and goal-directed behavior. The outputs of the learned network are the robot's translational and angular velocities for the next time step. Our method combines path planning on a 2D grid with reinforcement learning and does not need any supervision. Our network is first trained in a simple environment and then transferred to scenarios of increasing complexity. We implemented our approach in C++ and Python for the Robot Operating System (ROS) and thoroughly tested it in several simulated as well as real-world experiments. The experiments illustrate that our trained policy can be applied to solve complex navigation tasks. Furthermore, we compare the performance of our learned controller to the popular dynamic window approach (DWA) of ROS. As the experimental results show, a robot controlled by our learned policy reaches the goal significantly faster than one using the DWA by closely bypassing obstacles and thus saving time.
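As a rough illustration of such a policy head, the following PyTorch sketch maps an observation vector to a Gaussian over the two commanded velocities (translational and angular). The network sizes, the tanh squashing, and the class name are assumptions; the paper's actual architecture may differ.

```python
import torch
import torch.nn as nn

class VelocityPolicy(nn.Module):
    """Minimal sketch of a PPO actor that maps an observation (e.g. laser scan
    plus goal features) to a translational and an angular velocity for the
    next time step. Layer sizes and input dimensions are illustrative."""

    def __init__(self, obs_dim: int, hidden: int = 128):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.mean = nn.Linear(hidden, 2)             # means of [v, omega]
        self.log_std = nn.Parameter(torch.zeros(2))  # learned exploration noise

    def forward(self, obs: torch.Tensor):
        h = self.body(obs)
        mean = torch.tanh(self.mean(h))  # squash to [-1, 1]; rescale to robot limits outside
        return torch.distributions.Normal(mean, self.log_std.exp())
```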
arXiv (Cornell University), 2022
Learning agents can optimize standard autonomous navigation, improving the flexibility, efficiency, and computational cost of the system by adopting a wide variety of approaches. This work introduces the PIC4rl-gym, a fundamental modular framework to enhance navigation and learning research by combining ROS2 and Gazebo, the standard tools of the robotics community, with Deep Reinforcement Learning (DRL). The paper describes the whole structure of the PIC4rl-gym, which fully integrates the training and testing of DRL agents in several indoor and outdoor navigation scenarios and tasks. A modular approach is adopted to easily customize the simulation by selecting new platforms, sensors, or models. We demonstrate the potential of our novel gym by benchmarking the resulting policies, trained for different navigation tasks, with a complete set of metrics.
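For readers unfamiliar with gym-style navigation frameworks, the skeleton below sketches the kind of modular step/reset interface such a gym exposes, with swappable sensor, platform, and task modules. All class and method names are placeholders and do not reflect the actual PIC4rl-gym API.

```python
class NavigationEnvSketch:
    """Illustrative skeleton of a modular navigation environment in the spirit
    of a ROS2/Gazebo DRL gym. Sensor, platform, and task hooks are
    placeholders, not the PIC4rl-gym API."""

    def __init__(self, sensor, platform, task):
        self.sensor = sensor        # e.g. lidar or camera wrapper (swappable module)
        self.platform = platform    # e.g. differential-drive command interface
        self.task = task            # goal sampling, reward, and termination logic

    def reset(self):
        self.task.sample_goal()
        return self.sensor.read()

    def step(self, action):
        self.platform.apply(action)             # send velocity command
        obs = self.sensor.read()                # collect new observation
        reward, done = self.task.evaluate(obs)  # task-specific reward/termination
        return obs, reward, done, {}
```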
Deep Reinforcement Learning-Based Path Planning with Dynamic Collision Probability for Mobile Robots
WRC Symposium on Advanced Robotics and Automation (WRC SARA), 2024
This study proposes a novel approach for mobile robot path planning and collision avoidance that uses Collision Probability (CP) within a Soft Actor-Critic Lagrangian (SAC-L) framework. Our approach enables the mobile robot to deal with static and dynamic environments while ensuring safety and efficiency. The proposed SAC-L (CP) aims to minimize the total cost, which combines negative rewards and collision occurrences. This dual-focus strategy makes trajectory planning inherently safer and provides a robust solution for complex environments with dynamic obstacles. The framework's efficiency is validated through extensive simulations on the Gazebo platform involving three increasingly difficult scenarios, demonstrating the superior performance, adaptability, and safety of our approach compared to traditional Deep Reinforcement Learning (DRL) methods. Our results showcase significant improvements in social and ego safety scores, contributing to the advancement of autonomous navigation in complex environments. This framework marks a step towards safer, more reliable mobile robot navigation and opens new avenues for future research in mobile robot path planning. A supplementary video further demonstrates the effectiveness of our framework.
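A minimal sketch of the Lagrangian mechanism such a framework typically relies on is shown below: the multiplier is updated by dual gradient ascent whenever the measured collision probability exceeds a chosen limit, tightening the safety penalty in the actor objective. The limit, learning rate, and function name are illustrative assumptions.

```python
def update_lagrange_multiplier(lmbda, avg_collision_prob, cp_limit=0.05, lr=1e-2):
    """Sketch of the Lagrangian outer loop assumed by a SAC-Lagrangian setup:
    the multiplier grows while the measured collision probability exceeds the
    chosen limit. Limit and learning rate are illustrative values."""
    lmbda += lr * (avg_collision_prob - cp_limit)  # dual gradient ascent step
    return max(lmbda, 0.0)                         # multiplier stays non-negative

# Inside the actor loss, the multiplier trades off reward and collision cost:
#   actor_loss = -(Q_reward(s, a) - lmbda * Q_cost(s, a)) + alpha * log_pi(a | s)
```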
A Deep Learning Based Behavioral Approach to Indoor Autonomous Navigation
2018 IEEE International Conference on Robotics and Automation (ICRA), 2018
We present a semantically rich graph representation for indoor robotic navigation. Our graph representation encodes semantic locations such as offices or corridors as nodes, and navigational behaviors such as entering an office or crossing a corridor as edges. In particular, our navigational behaviors operate directly from visual inputs to produce motor controls and are implemented with deep learning architectures. This enables the robot to avoid explicit computation of its precise location or the geometry of the environment, and enables navigation at a higher level of semantic abstraction. We evaluate the effectiveness of our representation by simulating navigation tasks in a large number of virtual environments. Our results show that, using a simple set of perceptual and navigational behaviors, the proposed approach can successfully guide the robot as it completes navigational missions such as going to a specific office. Furthermore, our implementation proves effective at controlling the selection and switching of behaviors.
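The snippet below sketches how such a graph could be encoded and queried, with semantic places as nodes and behaviors attached to edges; a shortest path then yields the behavior sequence to execute. Place and behavior names are invented for the example and are not taken from the paper.

```python
import networkx as nx

# Nodes are semantic places; each edge carries the learned behavior used to
# traverse it. Names below are made up for illustration.
G = nx.DiGraph()
G.add_edge("office_1", "corridor_A", behavior="exit_office")
G.add_edge("corridor_A", "corridor_B", behavior="cross_corridor")
G.add_edge("corridor_B", "office_7", behavior="enter_office")

def plan_behaviors(graph, start, goal):
    """Return the sequence of behaviors to execute, one per traversed edge."""
    path = nx.shortest_path(graph, start, goal)
    return [graph.edges[u, v]["behavior"] for u, v in zip(path, path[1:])]

print(plan_behaviors(G, "office_1", "office_7"))
# ['exit_office', 'cross_corridor', 'enter_office']
```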
Local Navigation Among Movable Obstacles with Deep Reinforcement Learning
arXiv (Cornell University), 2023
In this paper, we introduce a method to deal with the problem of robot local path planning among pushable objects, an open problem in robotics. In particular, we achieve that by training multiple agents simultaneously in a physics-based simulation environment, utilizing an Advantage Actor-Critic algorithm coupled with a deep neural network. The developed online policy enables these agents to push obstacles in ways that are not limited to axial alignments, adapt to unforeseen changes in obstacle dynamics instantaneously, and effectively tackle local path planning in confined areas. We tested the method in various simulated environments to demonstrate its effective adaptation to various unseen scenarios in unfamiliar settings. Moreover, we have successfully applied this policy on an actual quadruped robot, confirming its capability to handle the unpredictability and noise associated with real-world sensors and the inherent uncertainties present in unexplored object-pushing tasks.
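For context, the following is a textbook sketch of the advantage actor-critic losses that this kind of approach builds on, computed over a single agent's rollout. It is a generic illustration under stated assumptions, not the authors' implementation.

```python
import torch

def a2c_losses(log_probs, values, rewards, last_value, gamma=0.99):
    """Generic advantage actor-critic losses. log_probs, values, rewards are
    1-D tensors over one rollout; last_value is a float bootstrap estimate
    for the final state. Illustrative, not the paper's exact code."""
    returns, R = [], float(last_value)
    for r in reversed(rewards.tolist()):      # bootstrapped discounted returns
        R = r + gamma * R
        returns.append(R)
    returns = torch.tensor(list(reversed(returns)))
    advantages = returns - values.detach()    # advantage = return - value baseline
    policy_loss = -(log_probs * advantages).mean()
    value_loss = (returns - values).pow(2).mean()
    return policy_loss, value_loss
```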
Deep Reinforcement Learning Based Mobile Robot Navigation in Unknown Indoor Environments
International Conference on Interdisciplinary Applications of Artificial Intelligence (ICIDAAI), 2021
The importance of autonomous robots has been increasing day by day with the development of technology. Performing tasks such as target recognition, navigation, and obstacle avoidance autonomously remains a challenge for mobile robots. In recent years, the use of deep reinforcement learning algorithms in robot navigation has been increasing. One of the most important reasons why deep reinforcement learning is preferred over traditional algorithms is that robots can learn environments with obstacles by themselves, without any prior knowledge or map. This study proposes a navigation system based on the dueling deep Q network algorithm, one of the deep reinforcement learning algorithms, for a mobile robot to reach its target in an unknown environment while avoiding obstacles. In the study, a 2D laser sensor and an RGB-D camera have been used so that the mobile robot can detect and recognize the static and dynamic obstacles in front of it and in its surroundings. The Robot Operating System (ROS) and the Gazebo simulator have been used to model the robot and the environment. The experimental results show that the mobile robot can reach its targets by avoiding static and dynamic obstacles in unknown environments.
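The dueling architecture mentioned above splits a shared encoder into a state-value stream and an advantage stream before recombining them into Q-values, as in this minimal PyTorch sketch (layer sizes and names are illustrative).

```python
import torch
import torch.nn as nn

class DuelingQNet(nn.Module):
    """Sketch of a dueling Q-network: a shared encoder feeds a state-value
    stream and an advantage stream, which are recombined into Q-values.
    Input and output sizes are illustrative assumptions."""

    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)               # V(s)
        self.advantage = nn.Linear(hidden, n_actions)   # A(s, a)

    def forward(self, obs):
        h = self.encoder(obs)
        v, a = self.value(h), self.advantage(h)
        # Subtract the mean advantage so V and A are identifiable.
        return v + a - a.mean(dim=-1, keepdim=True)
```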
Comparison of Deep Reinforcement Learning Policies to Formal Methods for Moving Obstacle Avoidance
2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2019
Deep Reinforcement Learning (RL) has recently emerged as a solution for moving obstacle avoidance. Deep RL learns to simultaneously predict obstacle motions and corresponding avoidance actions directly from robot sensors, even for obstacles with different dynamics models. However, deep RL methods typically cannot guarantee policy convergence, i.e., cannot provide probabilistic collision avoidance guarantees. In contrast, stochastic reachability (SR), a computationally expensive formal method that employs a known obstacle dynamics model, identifies the optimal avoidance policy and provides strict convergence guarantees. The availability of the optimal solution for versions of the moving obstacle problem provides a baseline against which to compare trained deep RL policies. In this paper, we compare the expected cumulative reward and actions of these policies to SR, and find the following. 1) The state-value function approximates the optimal collision probability well, thus explaining the high empirical performance. 2) RL policies deviate significantly from the optimal, negatively impacting collision avoidance in some cases. 3) Evidence suggests that the deviation is caused, at least partially, by the actor net failing to approximate the action corresponding to the highest state-action value.
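Finding 1 can be probed with a simple check like the one sketched below, which measures the gap between a learned state-value function and the SR-optimal collision probability over sampled states. The mapping V(s) ≈ -P(collision | s) assumes a reward of -1 on collision and 0 otherwise; that mapping, and the function names, are assumptions of this sketch rather than details taken from the paper.

```python
import numpy as np

def value_vs_sr_gap(states, value_fn, sr_collision_prob):
    """Illustrative comparison: how well does a learned state-value function
    track the optimal collision probability from stochastic reachability?"""
    v = np.array([value_fn(s) for s in states])
    p_opt = np.array([sr_collision_prob(s) for s in states])
    # With reward -1 on collision and 0 otherwise, V(s) ~ -P(collision | s).
    return np.mean(np.abs(-v - p_opt))
```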
Dynamic Obstacle Avoidance Technique for Mobile Robot Navigation Using Deep Reinforcement Learning
International Journal of Emerging Trends in Engineering Research, 2023
In the realm of mobile robotics, navigating around obstacles is a fundamental task, particularly in constantly changing situations. Deep reinforcement learning (DRL) techniques exist that use the robot's positional information and environmental states as inputs to neural networks. However, positional information alone does not provide sufficient insight into the motion trends of obstacles. To address this issue, this paper presents a dynamic obstacle mobility pattern approach for mobile robots (MRs) that relies on DRL. The method uses the time-dependent positional details of dynamic obstacles to establish a movement trend vector. This vector, in conjunction with another mobility state attribute, forms the MR mobility guidance matrix, which conveys how the motion pattern of the dynamic obstacles varies over a specified interval. Using this matrix, the robot can choose its avoidance action. The methodology also uses a DRL-based dynamic policy algorithm, implemented in Python, to test and validate the proposed technique. The experimental outcomes demonstrate that this technique substantially improves the safety of avoiding dynamic obstacles.
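One plausible reading of the movement trend construction is sketched below: a displacement-per-time estimate per obstacle, stacked with the robot's own mobility state into a guidance matrix. The exact definitions in the paper may differ; names and shapes here are illustrative.

```python
import numpy as np

def movement_trend_vector(positions, timestamps):
    """Sketch of a movement trend vector: displacement per second estimated
    from an obstacle's time-stamped positions (illustrative construction,
    not the paper's exact definition)."""
    positions = np.asarray(positions, dtype=float)
    dt = timestamps[-1] - timestamps[0]
    return (positions[-1] - positions[0]) / max(dt, 1e-6)

def guidance_matrix(obstacle_tracks, robot_state):
    """Stack each obstacle's trend vector with the robot's own mobility state
    into the matrix the policy conditions on."""
    rows = [np.hstack([movement_trend_vector(p, t), robot_state])
            for p, t in obstacle_tracks]
    return np.vstack(rows)
```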
DRL-VO: Using Velocity Obstacles to Learn Safe and Fast Navigation
2021
This paper presents a novel deep reinforcement learning-based control policy to enable a mobile robot to navigate safely and quickly through complex and human-filled environments. The policy uses a short history of lidar data, the kinematic data about nearby pedestrians, and a sub-goal point as its input, using an early fusion network to fuse these data. The reward function uses velocity obstacles to guide the robot to actively avoid pedestrians and move towards the goal. Through a series of simulated environments with 5-55 pedestrians, this policy is able to achieve a 31.6% higher success rate and 86.9% faster average speed than state-of-the-art model-based and learning-based policies. Hardware experiments demonstrate the ability of the policy to directly work in real-world environments.
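The velocity-obstacle idea behind the reward can be illustrated with the sketch below, which penalizes relative velocities that point inside the collision cone of an inflated pedestrian. The weighting and the binary penalty are assumptions for illustration, not the DRL-VO reward itself.

```python
import numpy as np

def vo_penalty(robot_vel, robot_pos, ped_pos, ped_vel, combined_radius, w_vo=1.0):
    """Illustrative velocity-obstacle penalty (not the authors' exact formula):
    penalize robot velocities whose relative velocity points inside the
    collision cone of a nearby pedestrian. Inputs are 2-D numpy vectors."""
    rel_pos = ped_pos - robot_pos
    dist = np.linalg.norm(rel_pos)
    if dist <= combined_radius:
        return -w_vo                         # already inside the collision margin
    rel_vel = robot_vel - ped_vel
    # Half-angle of the collision cone subtended by the inflated pedestrian.
    cone_half_angle = np.arcsin(combined_radius / dist)
    angle_to_ped = np.arccos(
        np.clip(np.dot(rel_vel, rel_pos) /
                (np.linalg.norm(rel_vel) * dist + 1e-9), -1.0, 1.0))
    return -w_vo if angle_to_ped < cone_half_angle else 0.0
```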