Double Deep Reinforcement Learning Techniques for Low Dimensional Sensing Mapless Navigation of Terrestrial Mobile Robots
Related papers
Deep Reinforcement Learning Based Mobile Robot Navigation in Unknown Indoor Environments
International Conference on Interdisciplinary Applications of Artificial Intelligence (ICIDAAI), 2021
The importance of autonomous robots has been increasing day by day with the development of technology. Difficulties in performing many tasks such as target recognition, navigation, and obstacle avoidance autonomously by mobile robots are problems that must be overcome. In recent years, the use of deep reinforcement learning algorithms in robot navigation has been increasing. One of the most important reasons why deep reinforcement learning is preferred over traditional algorithms is that robots can learn environments by themselves, without any prior knowledge or map, even in environments with obstacles. This study proposes a navigation system based on the dueling deep Q network algorithm, one of the deep reinforcement learning algorithms, for a mobile robot in an unknown environment to reach its target while avoiding obstacles. In the study, a 2D laser sensor and an RGB-D camera have been used so that the mobile robot can detect and recognize the static and dynamic obstacles in front of it and in its surroundings. The Robot Operating System (ROS) and the Gazebo simulator have been used to model the robot and the environment. The experimental results show that the mobile robot can reach its targets while avoiding static and dynamic obstacles in unknown environments.
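For reference, the dueling architecture mentioned above splits the Q-value estimate into a state-value stream and an advantage stream. The following minimal PyTorch sketch illustrates that idea only; the layer sizes and names are illustrative assumptions, not the network used in the paper.

```python
# Minimal dueling Q-network sketch (PyTorch); layer sizes are illustrative,
# not the architecture used in the paper above.
import torch
import torch.nn as nn

class DuelingQNet(nn.Module):
    def __init__(self, obs_dim, n_actions, hidden=128):
        super().__init__()
        self.feature = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)               # state-value stream V(s)
        self.advantage = nn.Linear(hidden, n_actions)   # advantage stream A(s, a)

    def forward(self, obs):
        h = self.feature(obs)
        v = self.value(h)
        a = self.advantage(h)
        # Combining the streams; subtracting the mean advantage keeps Q identifiable.
        return v + a - a.mean(dim=1, keepdim=True)
```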
Robot Navigation through the Deep Q-Learning Algorithm
The paper presents the results of an evaluation of the Deep Q-learning algorithm applied to a vehicular navigation robot. The robot's job was to transport parts through an environment; for this purpose, a decision system was built around the Deep Q-learning algorithm, with an artificial neural network that received the sensor data as input and enabled autonomous navigation in the environment. During the experiments, the mobile robot maintained communication over the network with other robotic components present in the environment through the MQTT protocol.
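As a rough illustration of how such a sensor-driven decision loop could sit on top of MQTT, the sketch below uses the paho-mqtt client. The topic names, payload format, and the greedy_action() helper are assumptions for illustration, not the interface described in the paper.

```python
# Hedged sketch: a DQN-style decision loop fed by sensor messages over MQTT
# (paho-mqtt). Topics, payload layout, and greedy_action() are illustrative.
import json
import paho.mqtt.client as mqtt

def greedy_action(sensor_values):
    # Placeholder for the trained Q-network's argmax over discrete actions.
    return 0

def on_message(client, userdata, msg):
    sensors = json.loads(msg.payload)                  # e.g. {"distances": [...]}
    action = greedy_action(sensors["distances"])
    client.publish("robot/cmd", json.dumps({"action": action}))

client = mqtt.Client()
client.on_message = on_message
client.connect("localhost", 1883)
client.subscribe("robot/sensors")
client.loop_forever()
```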
Position Control of a Mobile Robot through Deep Reinforcement Learning
Applied Sciences
This article proposes the use of reinforcement learning (RL) algorithms to control the position of a simulated Kephera IV mobile robot in a virtual environment. The simulated environment uses the OpenAI Gym library in conjunction with CoppeliaSim, a 3D simulation platform, to perform the experiments and control the position of the robot. The RL agents used correspond to the deep deterministic policy gradient (DDPG) and deep Q network (DQN), and their results are compared with two control algorithms called Villela and IPC. The results obtained from the experiments in environments with and without obstacles show that DDPG and DQN manage to learn and infer the best actions in the environment, allowing us to effectively perform the position control of different target points and obtain the best results based on different metrics and indices.
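The position-control setup described above can be pictured as a Gym-style interaction loop with a distance-based reward toward the target point. The loop below is only a sketch under that assumption; the agent/environment interface (agent.act, agent.observe, info["robot_xy"]) is hypothetical.

```python
# Hedged sketch of a Gym-style loop for position control toward a target point;
# the env/agent interface and the reward are illustrative assumptions.
import numpy as np

def distance_reward(robot_xy, target_xy):
    # Negative Euclidean distance encourages moving toward the target point.
    return -float(np.linalg.norm(np.asarray(robot_xy) - np.asarray(target_xy)))

def run_episode(env, agent, target_xy, max_steps=500):
    obs = env.reset()
    for _ in range(max_steps):
        action = agent.act(obs)                 # DQN argmax or DDPG actor output
        obs, _, done, info = env.step(action)
        reward = distance_reward(info["robot_xy"], target_xy)
        agent.observe(obs, reward, done)        # store transition / learn
        if done:
            break
```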
Autonomous Navigation in Search and Rescue Simulated Environment using Deep Reinforcement Learning
2021
Human-assisted search and rescue (SAR) robots are increasingly being used in zones of natural disasters, industrial accidents, and civil wars. Due to complex terrains, obstacles, and uncertainties in time availability, there is a need for these robots to have a certain level of autonomy to act independently when approaching certain SAR tasks. One of these tasks is autonomous navigation. Previous approaches to developing autonomous or semi-autonomous SAR navigating robots use heuristics-based methods. These algorithms, however, require environment-related prior knowledge and sufficient sensing capabilities, which are hard to maintain due to restrictions of size and weight in highly unstructured environments such as collapsed buildings. This study approaches the problem of autonomous navigation using a modified version of the Deep Q-Network algorithm. Unlike the classical usage of the entire game screen images to train the agent, our approach uses only the images captured by the agent's low-resolution camera to train the agent to navigate through an arena, avoid obstacles, and reach a victim. This approach is a much more relevant way of decision making in complex, uncertain contexts, since in real-world SAR scenarios it is almost impossible for SAR teams to have the area's full information. We simulated a SAR scenario, which consists of an arena full of randomly generated obstacles, a victim, and an autonomous SAR robot. The simulation results show that the agent was able to reach the victim in 56% of the evaluation episodes after 400 episodes of training.
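A DQN that consumes only low-resolution camera frames, as above, typically maps images to discrete action values with a small convolutional network. The PyTorch sketch below shows that general pattern; the 64x64 input size and layer widths are assumptions, not the paper's architecture.

```python
# Illustrative CNN Q-network over low-resolution camera frames (PyTorch);
# input size and layer widths are assumptions, not taken from the paper.
import torch
import torch.nn as nn

class ImageQNet(nn.Module):
    def __init__(self, n_actions, channels=3):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(channels, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        with torch.no_grad():  # infer the flattened size for a 64x64 input
            flat = self.conv(torch.zeros(1, channels, 64, 64)).shape[1]
        self.head = nn.Sequential(nn.Linear(flat, 128), nn.ReLU(),
                                  nn.Linear(128, n_actions))

    def forward(self, frames):
        return self.head(self.conv(frames))   # one Q-value per discrete action
```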
DEEP REINFORCEMENT LEARNING FOR NAVIGATION IN CLUTTERED ENVIRONMENTS
Collision-free motion is essential for mobile robots. Most approaches to collision-free and efficient navigation with wheeled robots require parameter tuning by experts to obtain good navigation behavior. In this paper, we aim at learning an optimal navigation policy by deep reinforcement learning to overcome this manual parameter tuning. Our approach uses proximal policy optimization to train the policy and achieve collision-free and goal-directed behavior. The outputs of the learned network are the robot's translational and angular velocities for the next time step. Our method combines path planning on a 2D grid with reinforcement learning and does not need any supervision. Our network is first trained in a simple environment and then transferred to scenarios of increasing complexity. We implemented our approach in C++ and Python for the Robot Operating System (ROS) and thoroughly tested it in several simulated as well as real-world experiments. The experiments illustrate that our trained policy can be applied to solve complex navigation tasks. Furthermore, we compare the performance of our learned controller to the popular dynamic window approach (DWA) of ROS. As the experimental results show, a robot controlled by our learned policy reaches the goal significantly faster compared to using the DWA by closely bypassing obstacles and thus saving time.
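Proximal policy optimization, referenced above, is usually implemented around a clipped surrogate objective. The snippet below is a minimal, generic sketch of that loss in PyTorch; the variable names and clip range are not taken from the paper.

```python
# Minimal sketch of the PPO clipped surrogate loss (PyTorch); names and the
# clip range are generic, not the paper's implementation.
import torch

def ppo_clip_loss(log_probs_new, log_probs_old, advantages, clip_eps=0.2):
    ratio = torch.exp(log_probs_new - log_probs_old)   # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    # PPO maximizes the minimum of the two terms; return the negative for SGD.
    return -torch.min(unclipped, clipped).mean()
```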
Autonomous Navigation of Robots: Optimization with DQN
Applied Sciences
In the field of artificial intelligence, control systems for mobile robots have undergone significant advancements, particularly within the realm of autonomous learning. However, previous studies have primarily focused on predefined paths, neglecting real-time obstacle avoidance and trajectory reconfiguration. This research introduces a novel algorithm that integrates reinforcement learning with the Deep Q-Network (DQN) to empower an agent with the ability to execute actions, gather information from a simulated environment in Gazebo, and maximize rewards. Through a series of carefully designed experiments, the algorithm’s parameters were meticulously configured, and its performance was rigorously validated. Unlike conventional navigation systems, our approach embraces the exploration of the environment, facilitating effective trajectory planning based on acquired knowledge. By leveraging randomized training conditions within a simulated environment, the DQN network exhibits superior...
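The exploration-driven training described above is commonly realized with epsilon-greedy action selection and an experience replay buffer. The fragment below sketches only that generic mechanism; the epsilon schedule, buffer size, and tensor interface are assumptions.

```python
# Hedged sketch of epsilon-greedy exploration with a replay buffer, as typically
# used to train a DQN agent in simulation; all hyperparameters are assumptions.
import random
from collections import deque

replay_buffer = deque(maxlen=100_000)

def select_action(q_net, obs, n_actions, epsilon):
    if random.random() < epsilon:
        return random.randrange(n_actions)      # explore the environment
    return int(q_net(obs).argmax())             # exploit the learned Q-values

def store(obs, action, reward, next_obs, done):
    replay_buffer.append((obs, action, reward, next_obs, done))
```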
On Reward Shaping for Mobile Robot Navigation: A Reinforcement Learning and SLAM Based Approach
ArXiv, 2020
We present a map-less path planning algorithm based on Deep Reinforcement Learning (DRL) for mobile robots navigating in unknown environments that relies only on 40-dimensional raw laser data and odometry information. The planner is trained using a reward function shaped based on online knowledge of the map of the training environment, obtained using a grid-based Rao-Blackwellized particle filter, in an attempt to enhance the obstacle awareness of the agent. The agent is trained in a complex simulated environment and evaluated in two unseen ones. We show that the policy trained using the introduced reward function not only outperforms standard reward functions in terms of convergence speed, by a reduction of 36.9% of the iteration steps, and reduction of the collision samples, but it also drastically improves the behaviour of the agent in unseen environments, respectively by 23% in a simpler workspace and by 45% in a more cluttered one. Furthermore, the policy trained in the sim...
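A map-aware shaped reward of the kind described above can be pictured as a goal-progress term plus an obstacle-proximity penalty derived from the online occupancy grid. The function below is only an illustration of that idea; the weights, thresholds, and terminal bonuses are assumptions, not the paper's actual reward.

```python
# Illustrative map-shaped reward: goal progress plus a penalty for getting
# close to occupied cells of the SLAM grid. All constants are assumptions.
def shaped_reward(reached_goal, collided, dist_to_goal, prev_dist_to_goal,
                  dist_to_nearest_occupied_cell, safe_dist=0.5):
    if reached_goal:
        return 100.0
    if collided:
        return -100.0
    progress = prev_dist_to_goal - dist_to_goal                    # goal-directed term
    proximity_penalty = max(0.0, safe_dist - dist_to_nearest_occupied_cell)
    return progress - 2.0 * proximity_penalty                      # map-aware shaping
```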
2021
Autonomous vehicle navigation in an unknown dynamic environment is crucial for both supervised- and Reinforcement Learning-based autonomous maneuvering. The cooperative fusion of these two learning approaches has the potential to be an effective mechanism to tackle indefinite environmental dynamics. Most state-of-the-art autonomous vehicle navigation systems are trained on a specific mapped model with familiar environmental dynamics. This research, however, focuses on the cooperative fusion of supervised and Reinforcement Learning technologies for autonomous navigation of land vehicles in a dynamic and unknown environment. Faster R-CNN, a supervised learning approach, identifies the ambient environmental obstacles for untroubled maneuvering of the autonomous vehicle. Meanwhile, the training policies of Double Deep Q-Learning, a Reinforcement Learning approach, enable the autonomous agent to learn effective navigation decisions from the dynamic environment. The proposed model is...
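Double Deep Q-Learning, used above and in the title paper, differs from plain DQN in how the bootstrap target is built: the online network selects the next action and the target network evaluates it, which reduces value overestimation. The PyTorch sketch below shows that target computation in generic form; the hyperparameters are assumptions.

```python
# Minimal Double DQN target computation (PyTorch): the online network picks the
# action, the target network evaluates it. Gamma and shapes are generic.
import torch

def double_dqn_target(reward, next_obs, done, online_net, target_net, gamma=0.99):
    with torch.no_grad():
        best_action = online_net(next_obs).argmax(dim=1, keepdim=True)
        next_q = target_net(next_obs).gather(1, best_action).squeeze(1)
    return reward + gamma * (1.0 - done.float()) * next_q
```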
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 2019
We introduce a new autonomous path planning algorithm for mobile robots for reaching target locations in an unknown environment where the robot relies on its on-board sensors. In particular, we describe the design and evaluation of a deep reinforcement learning motion planner with continuous linear and angular velocities that navigates to a desired target location based on the deep deterministic policy gradient (DDPG). Additionally, the algorithm is enhanced by making use of the available knowledge of the environment provided by a grid-based SLAM with Rao-Blackwellized particle filter algorithm in order to shape the reward function, in an attempt to improve the convergence rate, escape local optima, and reduce the number of collisions with obstacles. A comparison is made between a reward function shaped based on the map provided by the SLAM algorithm and a reward function when no knowledge of the map is available. Results show that the required learning time has been decreased in terms of the number of episodes required to converge, which is 560 episodes compared to 1450 episodes in the standard RL algorithm, after adopting the proposed approach, and the number of obstacle collisions is reduced as well, with a success ratio of 83% compared to 56% in the standard RL algorithm. The results are validated in a simulated experiment on a skid-steering mobile robot.
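Since this planner outputs continuous linear and angular velocities, the underlying DDPG critic is trained against a bootstrapped target built from target copies of the actor and critic. The fragment below sketches that standard target in generic PyTorch form; network names and gamma are assumptions, not the paper's code.

```python
# Sketch of the standard DDPG bootstrapped critic target for a continuous
# (linear, angular) velocity planner; names and gamma are generic assumptions.
import torch

def ddpg_critic_target(reward, next_obs, done,
                       target_actor, target_critic, gamma=0.99):
    with torch.no_grad():
        next_action = target_actor(next_obs)               # continuous (v, w) command
        next_q = target_critic(next_obs, next_action).squeeze(-1)
    return reward + gamma * (1.0 - done.float()) * next_q
```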
Dynamic Obstacle Avoidance Technique for Mobile Robot Navigation Using Deep Reinforcement Learning
International Journal of Emerging Trends in Engineering Research , 2023
In the realm of mobile robotics, navigating around obstacles is a fundamental task, particularly in constantly changing situations. Deep reinforcement learning (DRL) techniques exist that use the robots' positional information and environmental states as input to neural networks; however, positional information alone does not provide sufficient insight into the motion trends of obstacles. To address this issue, this paper presents a dynamic obstacle mobility pattern approach for mobile robots (MRs) based on DRL. The method uses the time-dependent positions of dynamic obstacles to establish a movement trend vector. This vector, in conjunction with other mobility state attributes, forms the MR mobility guidance matrix, which conveys how the pattern of the dynamic obstacles varies over a specified interval. Using this matrix, the robot can choose its avoidance action. The methodology also uses a DRL-based dynamic policy algorithm, and the proposed technique is tested and validated through Python programming. The experimental outcomes demonstrate that this technique substantially improves the safety of dynamic obstacle avoidance.
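One plausible reading of the movement trend vector above is an average velocity estimated from an obstacle's time-stamped positions, stacked with other mobility attributes into the guidance matrix. The Python sketch below illustrates that reading only; the shapes, attribute choices, and names are assumptions, not the paper's definition.

```python
# Hedged sketch of a movement trend vector and a mobility guidance matrix built
# from time-stamped obstacle positions; shapes and attributes are assumptions.
import numpy as np

def trend_vector(positions, timestamps):
    """Average velocity of a dynamic obstacle over the observation window."""
    positions = np.asarray(positions, dtype=float)     # shape (T, 2)
    dt = timestamps[-1] - timestamps[0]
    return (positions[-1] - positions[0]) / dt if dt > 0 else np.zeros(2)

def mobility_guidance_matrix(obstacle_tracks, robot_velocity):
    rows = []
    for positions, timestamps in obstacle_tracks:
        v = trend_vector(positions, timestamps)
        rows.append(np.concatenate([v, robot_velocity]))  # trend + robot state row
    return np.stack(rows)
```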