Efficient learning of reactive robot behaviors with a Neural-Q_learning approach

An integrated architecture for learning of reactive behaviors based on dynamic cell structures

Robotics and Autonomous Systems, 1997

In this contribution we want to draw the reader's attention to the advantages of dynamic cell structures (DCSs) for learning reactive behaviors of autonomous robots. These include incremental on-line learning, fast output calculation, flexible integration of different learning rules, and a close connection to fuzzy logic. The latter allows for the incorporation of prior knowledge and for interpreting learning with DCSs as fuzzy rule generation and adaptation.
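
As a rough illustration of the idea, here is a minimal Python sketch of an incrementally grown cell structure whose cells read as fuzzy rules ("IF input near center THEN output"). The class name, Gaussian memberships, and thresholds are assumptions for illustration; the paper's DCS also adapts a lateral connection structure, which is omitted here.

```python
import numpy as np

class DynamicCells:
    # Each cell is a (center, output) pair, interpretable as a fuzzy rule.
    # Cells can be seeded from prior knowledge (hand-written rules) and then adapted.
    def __init__(self, width=0.5, err_thresh=0.2, lr=0.1):
        self.centers, self.outputs = [], []
        self.width, self.err_thresh, self.lr = width, err_thresh, lr

    def predict(self, x):
        if not self.centers:
            return 0.0
        acts = np.array([np.exp(-np.linalg.norm(x - c) ** 2 / self.width ** 2)
                         for c in self.centers])          # fuzzy memberships
        return float(acts @ np.asarray(self.outputs) / (acts.sum() + 1e-9))

    def learn(self, x, y):
        err = y - self.predict(x)
        if not self.centers or abs(err) > self.err_thresh:
            self.centers.append(np.asarray(x, float))     # grow: insert a new cell on-line
            self.outputs.append(float(y))
        else:
            # adapt the nearest cell's output (rule consequent) toward the target
            i = int(np.argmin([np.linalg.norm(x - c) for c in self.centers]))
            self.outputs[i] += self.lr * err
```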

Q-ran: A constructive reinforcement learning approach for robot behavior learning

Proceedings of IEEE/RSJ …, 2006

This paper presents a learning system that uses Q-learning with a resource-allocating network (RAN) for behavior learning in mobile robotics. The RAN is used as a function approximator, and Q-learning is used to learn the control policy in an 'off-policy' fashion, which enables learning to be bootstrapped by a prior-knowledge controller, thus speeding up the reinforcement learning. Our approach is verified on a PeopleBot robot executing a visual-servoing-based docking behavior in which the robot is required to reach a goal pose. Further experiments show that the RAN can also be used for supervised learning prior to reinforcement learning in a layered architecture, further improving the performance of the docking behavior.
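
A minimal sketch of the two ingredients named above, under several assumptions: the RAN is reduced to its classic two-condition growth rule (novel input AND large error), one network per discrete action approximates Q, and `prior_action` is a hypothetical stub standing in for the prior-knowledge controller that supplies behaviour actions off-policy.

```python
import numpy as np

class RAN:
    # Resource-allocating network: a Gaussian RBF net that adds a unit when the
    # input is novel AND the prediction error is large; otherwise it adapts
    # the existing weights by a gradient step.
    def __init__(self, novelty=0.5, err_thresh=0.1, lr=0.1, width=0.5):
        self.centers, self.weights = [], []
        self.novelty, self.err_thresh, self.lr, self.width = novelty, err_thresh, lr, width

    def _acts(self, x):
        return [np.exp(-np.linalg.norm(x - c) ** 2 / self.width ** 2) for c in self.centers]

    def __call__(self, x):
        return float(np.dot(self._acts(x), self.weights)) if self.centers else 0.0

    def update(self, x, target):
        err = target - self(x)
        dists = [np.linalg.norm(x - c) for c in self.centers] or [np.inf]
        if min(dists) > self.novelty and abs(err) > self.err_thresh:
            self.centers.append(np.asarray(x, float))   # allocate a new RBF unit
            self.weights.append(err)
        else:
            for i, a in enumerate(self._acts(x)):       # adapt existing weights
                self.weights[i] += self.lr * err * a

def prior_action(s):
    return 0   # hypothetical prior-knowledge controller (stub)

def q_ran_step(qnets, s, a, r, s2, gamma=0.95):
    # Off-policy update: the behaviour action `a` may come from prior_action(s),
    # but the target bootstraps from the greedy value max_a' Q(s', a').
    target = r + gamma * max(q(s2) for q in qnets)
    qnets[a].update(s, target)
```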

Collaborative Q(λ) reinforcement learning algorithm: a promising robot learning framework

International Conference on Robotics and Applications (RA 2005), Cambridge, U.S.A.

This paper presents the design and implementation of a new reinforcement learning (RL) based algorithm. The proposed algorithm, CQ(λ) (collaborative Q(λ)), allows several learning agents to acquire knowledge from each other. Acquiring knowledge learnt by one agent via collaboration with another accelerates the entire learning system: a learning task can be solved significantly faster than by a single agent alone, i.e., the number of learning episodes needed to solve a task is reduced. The proposed algorithm proved to accelerate learning in a robotic navigation problem. The CQ(λ) algorithm was applied to autonomous mobile robot navigation, where several robot agents serve as learning processes. The robots learned to navigate an 11 × 11 world containing obstacles and boundaries, choosing the optimum path to reach a target. Simulated experiments based on 50 le...
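
A tabular sketch of the collaborative idea, under one plausible reading of the abstract (not the exact CQ(λ) update, which also uses eligibility traces): several agents learn independently, and a periodic merge lets every agent adopt the best value any teammate has found for each state-action entry. Grid size follows the 11 × 11 world mentioned above; agent count and the max-merge rule are assumptions.

```python
import numpy as np

N_STATES, N_ACTIONS, N_AGENTS = 11 * 11, 4, 3
Q = np.zeros((N_AGENTS, N_STATES, N_ACTIONS))

def q_update(i, s, a, r, s2, alpha=0.1, gamma=0.9):
    # Ordinary per-agent Q-learning step for agent i.
    Q[i, s, a] += alpha * (r + gamma * Q[i, s2].max() - Q[i, s, a])

def collaborate():
    # Knowledge sharing: every agent adopts the element-wise best estimate
    # across the team, so one agent's discoveries propagate to all.
    Q[:] = Q.max(axis=0)
```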

A reinforcement learning control approach for underwater manipulation under position and torque constraints

2020

In marine operations, underwater manipulators play a vital role. However, due to uncertainties in the dynamic model and disturbances caused by the environment, low-level control methods require great capability to adapt to change. Furthermore, position and torque constraints greatly increase the requirements on the control system. Reinforcement learning is a data-driven control technique that can learn complex control policies without the need for a model. The learning capabilities of this type of agent allow great adaptability to changes in the operating conditions. In this article we present a novel reinforcement learning low-level controller for the position control of an underwater manipulator under torque and position constraints. The reinforcement learning agent is based on an actor-critic architecture using sensor readings as state information. Simulation results using the Reach Alpha 5 underwater manipulator show the advantages of the proposed control ...
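
A minimal actor-critic sketch showing one common way such constraints can be enforced: the torque command is hard-clipped to its limit, and position violations are penalized in the reward. Linear features stand in for the paper's networks; the limits, gains, and penalty are illustrative assumptions, not the Reach Alpha 5 values or the paper's exact method.

```python
import numpy as np

TORQUE_MAX, POS_MIN, POS_MAX = 5.0, -1.5, 1.5        # illustrative limits

def act(theta_actor, s, noise=0.1):
    # Exploratory torque command, hard-clipped to satisfy the torque constraint.
    u = float(theta_actor @ s) + noise * np.random.randn()
    return float(np.clip(u, -TORQUE_MAX, TORQUE_MAX))

def step_update(w_critic, theta_actor, s, u, r, s2, alpha=1e-3, gamma=0.99):
    if not (POS_MIN <= s2[0] <= POS_MAX):            # penalize position-constraint violation
        r -= 10.0
    td = r + gamma * float(w_critic @ s2) - float(w_critic @ s)   # TD error
    w_critic += alpha * td * s                        # critic: move value toward target
    theta_actor += alpha * td * s * u                 # actor: reinforce actions with positive TD
```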

Reinforcement Learning with Functional Approximation Using a Neural Network

2015

This work focuses on the implementation of reinforcement learning on a robot tank in the Robocode framework. The Q-learning algorithm was used, and the state-action pairs and their corresponding Q-values were learnt using a neural network. The neural network is pre-trained with values from a static look-up table. The first part contains the problem formulation, which introduces the notions of states, actions, and rewards considered in the system; this is followed by the system architecture, simulation results, and future work. Keywords: Reinforcement Learning, Neural Networks, Q-learning, Robocode.
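
A sketch of the pre-training idea: the same small network and gradient step serve both phases, first fitted to (state-action features, look-up-table value) pairs, then reused with bootstrapped Q-learning targets. The single-hidden-layer net and its sizes are assumptions for illustration, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(0, 0.1, (16, 4)), np.zeros(16)   # 4 state-action features -> 16 hidden
W2, b2 = rng.normal(0, 0.1, 16), 0.0                 # hidden -> scalar Q(s, a)

def q(x):
    h = np.tanh(W1 @ x + b1)
    return float(W2 @ h + b2)

def train_step(x, target, lr=0.01):
    # One squared-error gradient step; backprop by hand through the tanh layer.
    global b2
    h = np.tanh(W1 @ x + b1)
    err = float(W2 @ h + b2) - target
    grad_h = err * W2 * (1 - h ** 2)
    W2 -= lr * err * h; b2 -= lr * err
    W1 -= lr * np.outer(grad_h, x); b1 -= lr * grad_h

# Phase 1: supervised pre-training on look-up-table entries: train_step(x, lut_value).
# Phase 2: Q-learning with bootstrapped targets: train_step(x, r + gamma * max_a' q(x')).
```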

Reinforcement learning accelerated by using state transition model with robotic applications

2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE Cat. No.04CH37566)

This paper discusses a method to accelerate reinforcement learning. First, a concept is defined that reduces the state space while conserving the policy. An algorithm is then given that calculates the optimal cost-to-go and the optimal policy in the reduced space from those in the original space. Using the reduced state space, learning convergence is accelerated. Its usefulness for both DP (dynamic programming) iteration and Q-learning is compared through a maze example. Convergence of the optimal cost-to-go in the original state space takes approximately N or more times as long as in the reduced state space, where N is the ratio of the number of states in the original space to that in the reduced space. The acceleration effect for Q-learning is more remarkable than that for DP iteration. The proposed technique is also applied to a robot manipulator performing a peg-in-hole task with geometric constraints. The state-space reduction can be considered a model of the change of observation, i.e., one of the cognitive actions. The obtained results show that the change of observation is reasonable in terms of learning efficiency.
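
A sketch of the mechanism: a many-to-one map φ merges states that are equivalent under the policy, and ordinary Q-learning runs in the smaller space. The particular φ below (coarsening position by a factor of 4 per axis) is a hypothetical example, not the paper's reduction.

```python
import numpy as np

def phi(s):
    # Hypothetical policy-conserving reduction: drop the heading and coarsen
    # position, so many original states share one reduced state.
    x, y, heading = s
    return (x // 4, y // 4)

Q = {}   # Q-table over the *reduced* state space

def q_update(s, a, r, s2, alpha=0.1, gamma=0.9, n_actions=4):
    k, k2 = phi(s), phi(s2)
    q_s = Q.setdefault(k, np.zeros(n_actions))
    q_s2 = Q.setdefault(k2, np.zeros(n_actions))
    # Ordinary Q-learning step, but over N-times fewer states.
    q_s[a] += alpha * (r + gamma * q_s2.max() - q_s[a])
```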

Multistrategy Learning of Adaptive Reactive Controllers

Reactive controllers have been widely used in mobile robots since they are able to achieve successful performance in real time. However, the configuration of a reactive controller depends highly on the operating conditions of the robot and the environment; thus, a reactive controller configured for one class of environments may not perform adequately in another. This paper presents a formulation of learning adaptive reactive controllers. Adaptive reactive controllers inherit all the advantages of traditional reactive controllers, but in addition they are able to adjust themselves to the current operating conditions of the robot and the environment in order to improve task performance. Furthermore, learning adaptive reactive controllers can learn when and how to adapt the reactive controller so as to achieve effective performance under different conditions. The paper presents an algorithm for a learning adaptive reactive controller that combines ideas from case-based reasoning and reinforcement learning to construct a mapping between the operating conditions of a controller and the appropriate controller configuration; this mapping is in turn used to adapt the controller configuration dynamically. As a case study, the algorithm is implemented in a robotic navigation system that controls a Denning MRV-III mobile robot. The system is extensively evaluated using statistical methods to verify its learning performance and to understand the relevance of different design parameters to the performance of the system.
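
A very reduced sketch of the case-based half of that mapping: cases pair an operating-condition signature with a controller configuration and a learned quality estimate; the nearest case is retrieved for the current conditions, and a reward signal adjusts the stored estimate. Field names, the nearest-neighbour retrieval, and the perturbation rule are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

cases = []   # each case: {"signature": ndarray, "config": ndarray, "value": float}

def retrieve(signature):
    # Case retrieval: nearest stored operating-condition signature.
    if not cases:
        return None
    return min(cases, key=lambda c: np.linalg.norm(c["signature"] - signature))

def reinforce(case, reward, lr=0.1):
    # Reinforcement feedback on the retrieved case's quality estimate.
    case["value"] += lr * (reward - case["value"])
    if case["value"] < 0:   # persistently poor case: perturb its configuration
        case["config"] += lr * np.random.randn(*case["config"].shape)
```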

Simulation of the Navigation of a Mobile Robot by Q-Learning Using Artificial Neural Networks

CIIA, 2009

This paper presents reinforcement learning, a type of machine learning often used in the field of robotics, and applies it to determine a control law for a mobile robot in an unknown environment. This kind of technique applies when one assumes that the only information on the quality of the actions performed by the mobile robot is a scalar signal representing a reward or punishment; the learning process then consists of improving the choice of actions so as to maximize rewards. One of the most widely used algorithms for solving this problem is the Q-learning algorithm, which is based on the Q-function. To represent this function and keep the learning system working in changing environments where the mobile robot faces wide open spaces, an artificial neural network is used. The action performed by the mobile robot in its environment is chosen by a selection function and evaluated by a scalar signal taking the values -1, 0, or 1.
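
A short sketch of the selection function and the three-valued evaluation signal described above, assuming an epsilon-greedy selection rule and the usual reward assignments (collision, neutral step, goal); the Q-network is abstracted as a callable returning one value per action.

```python
import numpy as np

def select_action(q_net, state, epsilon=0.1, n_actions=8):
    # Selection function: explore with probability epsilon, else exploit
    # the neural network's Q-estimates.
    if np.random.rand() < epsilon:
        return np.random.randint(n_actions)
    return int(np.argmax(q_net(state)))

def evaluation_signal(collided, at_goal):
    # Scalar evaluation restricted to -1, 0, or 1, as in the abstract.
    if collided:
        return -1.0
    return 1.0 if at_goal else 0.0
```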

Incremental supervised learning for mobile robot reactive control

Robotics and Autonomous Systems, 1997

Reactive control for a mobile robot can be defined as a mapping from a perceptual space to a command space. This mapping can be hard-coded by the user (potential fields, fuzzy logic), and it can also be learnt. This paper is concerned with supervised learning of the perception-to-action mapping for a mobile robot. Among the existing neural approaches for supervised learning of a function, we have selected the grow-and-learn network for its properties adapted to robotic problems: incrementality and a flexible structure. We present the results we have obtained with this network, first using raw sensor data and then pre-processed measures with the automatic construction of virtual sensors.
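
A minimal sketch of the grow-and-learn-style incrementality: a new prototype is added only when the current network answers a training sample wrongly, so the structure grows with experience. Discrete command labels and nearest-prototype recall are assumptions; the GAL network's pruning ("sleep") phase is omitted for brevity.

```python
import numpy as np

prototypes, commands = [], []   # perception prototypes and their command labels

def command_for(percept):
    # Recall: command of the nearest stored prototype.
    if not prototypes:
        return None
    i = int(np.argmin([np.linalg.norm(percept - p) for p in prototypes]))
    return commands[i]

def learn(percept, command):
    # Grow only on error: incremental, and cheap when the mapping is already right.
    if command_for(percept) != command:
        prototypes.append(np.asarray(percept, float))
        commands.append(command)
```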

Learning fuzzy logic controller for reactive robot behaviours

Proceedings 2003 IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM 2003), 2003

Fuzzy logic plays an important role in the design of reactive robot behaviours. This paper presents a learning approach to the development of a fuzzy logic controller based on delayed rewards from the real world. The delayed rewards are apportioned to the individual fuzzy rules by using reinforcement Q-learning. Efficient exploration of the solution space is one of the key issues in reinforcement learning. A specific genetic algorithm is developed in this paper to trade off the exploration of learning spaces against the exploitation of learned experience. The proposed approach is evaluated on reactive behaviours of football-playing robots.
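
A sketch of the two mechanisms named above: the delayed reward is apportioned to fired rules in proportion to their firing strength (a common fuzzy-Q-learning scheme, assumed here rather than taken from the paper), and a simple genetic algorithm mutates rule consequents to explore. Population size, mutation rate, and the fitness hook are illustrative assumptions.

```python
import numpy as np

N_RULES, N_ACTIONS = 9, 5
q = np.zeros((N_RULES, N_ACTIONS))   # Q-value per (fuzzy rule, candidate consequent)

def apportion(fired, reward, alpha=0.1):
    # `fired` maps rule index -> (chosen consequent, firing strength); the
    # delayed reward is credited to each fired rule, weighted by its strength.
    for rule, (action, strength) in fired.items():
        q[rule, action] += alpha * strength * (reward - q[rule, action])

def ga_explore(population, fitness, mut=0.1):
    # population: (P, N_RULES) array of consequent indices (rule bases).
    # Keep the fitter half, refill by mutating survivors: exploitation of
    # learned experience balanced against exploration of the rule space.
    order = np.argsort(fitness)[::-1]
    survivors = population[order[: len(population) // 2]]
    children = survivors.copy()
    mask = np.random.rand(*children.shape) < mut
    children[mask] = np.random.randint(N_ACTIONS, size=mask.sum())
    return np.concatenate([survivors, children])
```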