Formation control using GQ(λ) reinforcement learning
Related papers
Reinforcement Learning Algorithms for Multi-Robot Organization
Decentralized control and distributed artificial intelligence have become common in everyday systems, and techniques for coordinating multiple intelligent mobile robots are gaining popularity in the robotics community. Indeed, a decentralized, distributed solution is the only viable approach for many real-world application domains that are inherently distributed in time, space, and/or behavior. There has therefore been an upsurge of interest in multiple autonomous mobile robots engaged in collective behavior, supporting and complementing one another. Such real-world applications include space missions, operations in hazardous environments (fire fighting, cleanup of toxic waste, nuclear power plant decommissioning, security), and military operations. Groups of cooperating intelligent robots can be more flexible, more reliable, more fault-tolerant, and more economical than a single, monolithic robot, but only if efficient means of cooperation can be found. One of the problems that arises when multiple autonomous mobile robots engage in cooperative behavior is dynamic organization: the difficulty of determining proper coordination and task-allocation schemes during task execution, in response to conditions in the environment and within the team itself. Solving this problem is a major challenge and a step toward optimal cooperative robot teams.
Behavior-based formation control for multirobot teams
IEEE Transactions on Robotics and …, 1998
New reactive behaviors that implement formations in multirobot teams are presented and evaluated. The formation behaviors are integrated with other navigational behaviors to enable a robotic team to reach navigational goals, avoid hazards and simultaneously remain in formation. The behaviors are implemented in simulation, on robots in the laboratory and aboard DARPA's HMMWV-based Unmanned Ground Vehicles. The technique has been integrated with the Autonomous Robot Architecture (AuRA) and the UGV Demo II architecture. The results demonstrate the value of various types of formations in autonomous, human-led and communications-restricted applications, and their appropriateness in different types of task environments.
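To make the behavior-based idea concrete, here is a minimal sketch of the weighted-vector-sum scheme that motor-schema controllers of this kind use: each behavior (reach the goal, avoid hazards, hold the formation slot) contributes a vector, and the commanded motion is their weighted sum. The function names, gains, and 2-D point-robot model are illustrative assumptions, not details from the paper.

```python
import numpy as np

def move_to_goal(pos, goal):
    """Unit vector pulling the robot toward its navigational goal."""
    d = goal - pos
    n = np.linalg.norm(d)
    return d / n if n > 1e-9 else np.zeros(2)

def avoid_obstacles(pos, obstacles, radius=2.0):
    """Repulsive vector away from obstacles inside the safety radius."""
    v = np.zeros(2)
    for obs in obstacles:
        d = pos - obs
        dist = np.linalg.norm(d)
        if 1e-9 < dist < radius:
            v += (d / dist) * (radius - dist) / radius
    return v

def maintain_formation(pos, slot):
    """Attractive vector toward the robot's assigned formation slot,
    with magnitude capped at 1."""
    d = slot - pos
    n = np.linalg.norm(d)
    return d / n * min(n, 1.0) if n > 1e-9 else np.zeros(2)

def formation_step(pos, goal, obstacles, slot,
                   w_goal=1.0, w_avoid=1.5, w_form=1.2, speed=0.1):
    """Weighted sum of behavior vectors gives the commanded motion."""
    v = (w_goal * move_to_goal(pos, goal)
         + w_avoid * avoid_obstacles(pos, obstacles)
         + w_form * maintain_formation(pos, slot))
    n = np.linalg.norm(v)
    return pos + speed * v / n if n > 1e-9 else pos
```

Because all three behaviors stay active at once, the robot reaches its goal and avoids hazards while remaining in formation, rather than switching between modes.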
2012
This paper presents the design and implementation of a new reinforcement learning (RL)-based algorithm. The proposed algorithm, CQ(λ) (collaborative Q(λ)), allows several learning agents to acquire knowledge from each other. Acquiring knowledge learnt by one agent via collaboration with another accelerates the entire learning system, so learning can be utilized more efficiently. With collaborative learning, a learning task can be solved significantly faster than by a single agent alone; that is, the number of learning episodes needed to solve the task is reduced. The proposed algorithm proved to accelerate learning in a robotic navigation problem. The CQ(λ) algorithm was applied to autonomous mobile robot navigation, where several robot agents serve as learning processes. The robots learned to navigate an 11 × 11 world containing obstacles and boundaries, choosing the optimal path to reach a target. Simulated experiments based on 50 le...
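The abstract does not spell out the CQ(λ) update itself, so the sketch below is one plausible reading: each agent runs an ordinary Q(λ) learner over a tabular grid world, and a collaboration step periodically lets every agent adopt the best value estimate any team member has found. All names and hyperparameters here are illustrative assumptions.

```python
import numpy as np

def collaborative_update(q_tables):
    """Fuse knowledge across agents: each entry becomes the best
    estimate any agent has found so far (one plausible reading of
    'acquiring knowledge learnt by another agent')."""
    fused = np.maximum.reduce(q_tables)
    for q in q_tables:
        q[:] = fused

def q_lambda_step(q, e, s, a, r, s_next, alpha=0.1, gamma=0.95, lam=0.8):
    """Standard Q(lambda) update with replacing traces (the Watkins
    trace cut on exploratory actions is omitted for brevity)."""
    delta = r + gamma * q[s_next].max() - q[s, a]
    e[s, a] = 1.0                      # replacing trace
    q += alpha * delta * e             # credit all recently visited pairs
    e *= gamma * lam                   # decay traces
```

Agents would call q_lambda_step during their own episodes and collaborative_update every few episodes; the fusion step is what lets one agent's discoveries shorten the others' learning.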
Distributed Reinforcement Learning for Robot Teams: A Review
2022
Purpose of review: Recent advances in sensing, actuation, and computation have opened the door to multi-robot systems consisting of hundreds/thousands of robots, with promising applications to automated manufacturing, disaster relief, harvesting, last-mile delivery, port/airport operations, or search and rescue. The community has leveraged model-free multi-agent reinforcement learning (MARL) to devise efficient, scalable controllers for multi-robot systems (MRS). This review aims to provide an analysis of the state-of-the-art in distributed MARL for multi-robot cooperation. Recent findings: Decentralized MRS face fundamental challenges, such as non-stationarity and partial observability. Building upon the "centralized training, decentralized execution" paradigm, recent MARL approaches include independent learning, centralized critic, value decomposition, and communication learning approaches. Cooperative behaviors are demonstrated through AI benchmarks and fundamental real-world robotic capabilities such as multi-robot motion/path planning. Summary: This survey reports the challenges surrounding decentralized model-free MARL for multi-robot cooperation and existing classes of approaches. We present benchmarks and robotic applications along with a discussion on current open avenues for research.
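Of the approach classes the review names, value decomposition is the most compact to illustrate. Below is a minimal VDN-style sketch in PyTorch, assuming discrete actions: each agent keeps a utility network over its local observation (decentralized execution), and training sums the chosen utilities into a joint value fitted against a shared TD target (centralized training). The architecture and sizes are illustrative.

```python
import torch
import torch.nn as nn

class AgentQ(nn.Module):
    """Per-agent utility network over local observations only,
    so execution can remain decentralized."""
    def __init__(self, obs_dim, n_actions, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions))

    def forward(self, obs):
        return self.net(obs)

def joint_q(agent_nets, observations, actions):
    """VDN-style value decomposition: the joint action value is the
    sum of the per-agent utilities for the chosen actions; a single
    TD loss on this sum trains all agents centrally."""
    total = 0.0
    for net, obs, a in zip(agent_nets, observations, actions):
        total = total + net(obs).gather(-1, a.unsqueeze(-1)).squeeze(-1)
    return total
```

The additive factorization is what sidesteps the joint-action explosion: each robot still argmaxes only over its own action space at run time.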
Learning of Coordination Policies for Robotic Swarms
arXiv (Cornell University), 2017
Inspired by biological swarms, robotic swarms are envisioned to solve real-world problems that are difficult for individual agents. Biological swarms can achieve collective intelligence based on local interactions and simple rules; however, designing effective distributed policies for large-scale robotic swarms to achieve a global objective can be challenging. Although it is often possible to design an optimal centralized strategy for smaller numbers of agents, those methods can fail as the number of agents increases. Motivated by the growing success of machine learning, we develop a deep learning approach that learns distributed coordination policies from centralized policies. In contrast to traditional distributed control approaches, which are usually based on human-designed policies for relatively simple tasks, this learning-based approach can be adapted to more difficult tasks. We demonstrate the efficacy of our proposed approach on two different tasks, the well-known rendezvous problem and a more difficult particle assignment problem. For the latter, no known distributed policy exists. From extensive simulations, it is shown that the performance of the learned coordination policies is comparable to the centralized policies, surpassing state-of-the-art distributed policies. Thereby, our proposed approach provides a promising alternative for real-world coordination problems that would be otherwise computationally expensive to solve or intractable to explore.
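The core trick of learning distributed policies from centralized ones can be sketched as behavior cloning: label each agent's local observation with the action a centralized expert assigns to that agent in the full state, then fit a supervised model on those pairs. The interfaces below (centralized_policy, local_obs_fn) are hypothetical stand-ins for whatever the simulator actually provides.

```python
import numpy as np

def behavior_cloning_dataset(centralized_policy, local_obs_fn, global_states):
    """Build (local observation, expert action) pairs: the centralized
    expert sees the full state, but each training example pairs one
    agent's restricted local view with that agent's expert action."""
    X, y = [], []
    for state in global_states:
        joint_action = centralized_policy(state)   # expert, sees everything
        for i, a in enumerate(joint_action):
            X.append(local_obs_fn(state, i))       # agent i's local view only
            y.append(a)
    return np.array(X), np.array(y)
```

Any classifier or regressor can then be fit on (X, y); at run time each robot evaluates the learned mapping on its own observations alone, which is what makes the resulting policy distributed.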
Reinforcement learning-based group navigation approach for multiple autonomous robotic systems
Advanced Robotics, 2006
In several complex applications, the use of multiple autonomous robotic systems (ARS) becomes necessary to achieve different tasks, such as foraging and transport of heavy and large objects, with less cost and more efficiency. They have to achieve a high level of flexibility, adaptability and efficiency in real environments. In this paper, a reinforcement learning (RL)-based group navigation approach for multiple ARS is suggested. Indeed, the robots must have the ability to form geometric figures and navigate without collisions while maintaining the formation. Thus, each robot must learn how to take its place in the formation, and avoid obstacles and other ARS, from its interaction with the environment. This approach must provide ARS with the capability to acquire group navigation among several ARS from elementary behaviors by learning with trial-and-error search. Simulation results display the ability of the suggested approach to provide ARS with the capability to navigate in group formation in dynamic environments. With its cooperative behavior, this approach makes ARS able to work together to successfully fulfill the desired task.
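A hedged sketch of the kind of trial-and-error scheme described: a tabular Q-learner per robot, with a reward shaped to pay for reaching one's formation slot and to penalize collisions. The reward values and the env interface are assumptions for illustration, not the paper's design.

```python
import random
import numpy as np

def formation_reward(reached_slot, collided):
    """Illustrative shaping: reward taking one's place in the formation,
    punish collisions, charge a small step cost to encourage progress."""
    if collided:
        return -10.0
    return 10.0 if reached_slot else -0.1

def q_learning_episode(q, env, eps=0.1, alpha=0.1, gamma=0.95):
    """One trial-and-error episode for a single robot. `env` is assumed
    to expose reset()/step(a) returning (state, reward, done), with the
    reward computed by something like formation_reward above."""
    s = env.reset()
    done = False
    while not done:
        a = random.randrange(q.shape[1]) if random.random() < eps \
            else int(q[s].argmax())
        s_next, r, done = env.step(a)
        q[s, a] += alpha * (r + gamma * q[s_next].max() - q[s, a])
        s = s_next
```

Running one such learner per robot, with the other robots treated as moving obstacles in the state, matches the elementary-behaviors-to-group-navigation story the abstract tells.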
Reinforcement learning based group navigation approach for multiple autonomous robotic system
Journal of Computer and System Sciences, 2005
In several complex applications, the use of multiple autonomous robotic systems (ARS) becomes necessary to achieve different tasks, such as foraging and transport of heavy and large objects, with less cost and more efficiency. They have to achieve a high level of flexibility, adaptability and efficiency in real environments. In this paper, a reinforcement learning (RL)-based group navigation approach for multiple ARS is suggested. Indeed, the robots must have the ability to form geometric figures and navigate without collisions while maintaining the formation. Thus, each robot must learn how to take its place in the formation, and avoid obstacles and other ARS, from its interaction with the environment. This approach must provide ARS with the capability to acquire group navigation among several ARS from elementary behaviors by learning with trial-and-error search. Simulation results display the ability of the suggested approach to provide ARS with the capability to navigate in group formation in dynamic environments.
Aerospace
Cooperative formation control of unmanned ground vehicles (UGVs) has become an important research hotspot in UGV applications and has attracted more and more attention in both military and civil fields. Compared with traditional formation control algorithms, reinforcement-learning-based algorithms can provide a new, lower-complexity solution for real-time formation control by equipping UGVs with artificial intelligence. Therefore, in this paper, a distributed deep-reinforcement-learning-based cooperative formation control algorithm is proposed to solve the navigation, maintenance, and obstacle avoidance tasks of UGV formations. More importantly, the hierarchical triangular formation structure and the newly designed Markov decision process for UGVs with leader and follower attributes make the control strategy learned by the algorithm reusable, so that the formation can arbitrarily increase the number of UGVs and realize a more flexible expansion. The effective...
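The reusability argument rests on each follower tracking only an offset relative to its own leader in the triangular hierarchy, so adding a UGV just adds another follower running the same policy. A minimal sketch of the reward terms such a follower MDP might combine (the weights, distances, and point-mass model are illustrative, not from the paper):

```python
import numpy as np

def follower_reward(pos, leader_pos, desired_offset, obstacles,
                    safe_dist=1.0, w_form=1.0, w_obs=2.0):
    """Illustrative follower reward: track the slot defined relative to
    the immediate leader, and penalize approaching obstacles. Because the
    slot is leader-relative, the same reward (and the same learned policy)
    serves every follower in the hierarchy."""
    slot = leader_pos + desired_offset
    form_err = np.linalg.norm(pos - slot)
    r = -w_form * form_err
    for obs in obstacles:
        d = np.linalg.norm(pos - obs)
        if d < safe_dist:
            r -= w_obs * (safe_dist - d)
    return r
```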
Multi-robot team formation control in the GUARDIANS project
Industrial Robot-an International Journal, 2010
The GUARDIANS multi-robot team is to be deployed in a large warehouse in smoke. The team is to assist firefighters in searching the warehouse in the event, or danger, of a fire. The large dimensions of the environment, together with the development of smoke, which drastically reduces visibility, represent major challenges for search and rescue operations. The GUARDIANS robots guide and accompany the firefighters on site whilst indicating possible obstacles and the locations of danger, and maintaining communications links.
Fuzzy Policy Reinforcement Learning in Cooperative Multi-robot Systems
Journal of Intelligent and Robotic Systems, 2007
A multi-agent reinforcement learning algorithm with a fuzzy policy is addressed in this paper. This algorithm is used to deal with control problems in cooperative multi-robot systems. Specifically, a leader-follower robotic system and a flocking system are investigated. In the leader-follower robotic system, the leader robot tries to track a desired trajectory, while the follower robot tries to follow the leader to keep a formation. Two different fuzzy policies are developed for the leader and the follower, respectively. In the flocking system, multiple robots adopt the same fuzzy policy to flock. Initial fuzzy policies are manually crafted for these cooperative behaviors. The proposed learning algorithm finely tunes the parameters of the fuzzy policies through a policy gradient approach to improve control performance. Our simulation results demonstrate that control performance can be improved after learning.
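As a rough illustration of tuning fuzzy policy parameters by policy gradient, the sketch below uses a zero-order Takagi-Sugeno controller and a crude finite-difference gradient estimate of the episode return. The paper's actual controller structure and gradient estimator are not given in the abstract, so treat this as an assumption-laden stand-in.

```python
import numpy as np

def fuzzy_policy(x, centers, widths, consequents):
    """Zero-order Takagi-Sugeno controller: Gaussian rule activations
    over a scalar input, weighted average of rule consequents as output."""
    w = np.exp(-((x - centers) ** 2) / (2 * widths ** 2))
    return float(w @ consequents / (w.sum() + 1e-9))

def finite_difference_update(consequents, rollout_return, lr=0.01,
                             sigma=0.05, n_samples=10):
    """One crude policy-gradient step: estimate the gradient of the
    episode return w.r.t. the rule consequents by random perturbation.
    `rollout_return` is assumed to run an episode with fuzzy_policy
    under the given consequents and return the total reward."""
    grad = np.zeros_like(consequents)
    base = rollout_return(consequents)
    for _ in range(n_samples):
        eps = sigma * np.random.randn(*consequents.shape)
        grad += (rollout_return(consequents + eps) - base) * eps / sigma ** 2
    return consequents + lr * grad / n_samples
```

Starting from the hand-crafted rule base and only nudging the consequents preserves the interpretable fuzzy structure while letting learning improve tracking and flocking performance.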