Reinforcement Learning and Automated Planning: A Survey

2012

This article presents a detailed survey of Artificial Intelligence approaches that combine Reinforcement Learning and Automated Planning. The two areas are closely related, as both deal with guiding an agent, situated in a dynamic environment, toward a set of predefined goals. It is therefore straightforward to integrate learning and planning in a single guiding mechanism, and many approaches in this direction have appeared in recent years. The approaches are organized and presented according to various characteristics, such as the planning mechanism or the reinforcement learning algorithm used.

Forward and Bidirectional Planning Based on Reinforcement Learning and Neural Networks in a Simulated Robot

Lecture Notes in Computer Science, 2003

Building intelligent systems that are capable of learning, acting reactively and planning actions before their execution is a major goal of artificial intelligence. This paper presents two reactive and planning systems that contain important novelties with respect to previous neural-network planners and reinforcement-learning-based planners: (a) the introduction of a new component ("matcher") allows both planners to execute genuine taskable planning (while previous reinforcement-learning-based models have used planning only for speeding up learning); (b) the planners show for the first time that trained neural-network models of the world can generate long prediction chains that have an interesting robustness with regard to noise; (c) two novel algorithms are presented that generate chains of predictions in order to plan and that control the flows of information between the systems' different neural components; (d) one of the planners uses backward "predictions" to exploit the knowledge of the pursued goal; (e) the two systems integrate reactive behavior and planning on the basis of a measure of "confidence" in action. The soundness and potential of the two reactive and planning systems are tested and compared with a simulated robot engaged in a stochastic path-finding task. The paper also presents an extensive literature review on the relevant issues. Here "taskable" means able to re-use knowledge to pursue different goals (see the concept of "state anticipation" of Butz et al. in this volume).

A review of machine learning for automated planning

The Knowledge Engineering Review, 2012

Recent discoveries in automated planning are broadening the scope of planners, from toy problems to real applications. However, applying automated planners to real-world problems is far from simple. On the one hand, the definition of accurate action models for planning is still a bottleneck. On the other hand, off-the-shelf planners fail to scale up and to provide good solutions in many domains. In these problematic domains, planners can exploit domain-specific control knowledge to improve their performance in terms of both speed and quality of the solutions. However, defining control knowledge manually is quite difficult. This paper reviews recent techniques in machine learning for the automatic definition of planning knowledge. It is organized according to the target of the learning process: automatic definition of planning action models and automatic definition of planning control knowledge. In addition, the paper reviews the advances in the related field of reinforcemen...

Hierarchical Reinforcement Learning with AI Planning Models

arXiv (Cornell University), 2022

Two common approaches to sequential decision-making are AI planning (AIP) and reinforcement learning (RL). Each has strengths and weaknesses. AIP is interpretable, easy to integrate with symbolic knowledge, and often efficient, but requires an up-front logical domain specification and is sensitive to noise; RL only requires specification of rewards and is robust to noise, but is sample-inefficient and not easily supplied with external knowledge. We propose an integrative approach that combines high-level planning with RL, retaining interpretability, transfer, and efficiency, while allowing for robust learning of the lower-level plan actions. Our approach defines options in hierarchical reinforcement learning (HRL) from AIP operators by establishing a correspondence between the state transition model of an AI planning problem and the abstract state transition system of a Markov Decision Process (MDP). Options are learned by adding intrinsic rewards to encourage consistency between the MDP and AIP transition models. We demonstrate the benefit of our integrated approach by comparing the performance of RL and HRL algorithms in both MiniGrid and N-rooms environments, showing the advantage of our method over the existing ones.
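
The idea of deriving options from planning operators and rewarding consistency between the two transition models can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the `Operator` class, the `abstract` mapping from grid cells to rooms, the operator `move_a_to_b`, and the reward magnitudes in `intrinsic_reward` are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

AbstractState = FrozenSet[str]


@dataclass(frozen=True)
class Operator:
    """A STRIPS-style planning operator over symbolic facts (illustrative)."""
    name: str
    preconditions: FrozenSet[str]
    add_effects: FrozenSet[str]
    del_effects: FrozenSet[str]

    def applicable(self, s: AbstractState) -> bool:
        # The matching option may start wherever the preconditions hold.
        return self.preconditions <= s

    def apply(self, s: AbstractState) -> AbstractState:
        # Abstract successor predicted by the planning model.
        return (s - self.del_effects) | self.add_effects


def abstract(pos: Tuple[int, int]) -> AbstractState:
    """Hypothetical abstraction: map a grid cell to the room containing it."""
    x, _ = pos
    return frozenset({"in_room_a"} if x < 5 else {"in_room_b"})


def intrinsic_reward(op: Operator, before: AbstractState, after: AbstractState,
                     bonus: float = 1.0, step_cost: float = -0.01,
                     penalty: float = -1.0) -> float:
    """Reward low-level transitions that agree with the operator's effects.

    The three magnitudes are assumed values chosen for illustration.
    """
    if after == op.apply(before):
        return bonus        # MDP transition matches the planning model
    if after == before:
        return step_cost    # still inside the same abstract state
    return penalty          # abstract state changed in an unplanned way


move_a_to_b = Operator(
    name="move_a_to_b",
    preconditions=frozenset({"in_room_a"}),
    add_effects=frozenset({"in_room_b"}),
    del_effects=frozenset({"in_room_a"}),
)

# Crossing from x=4 to x=5 realises the operator's effect.
print(intrinsic_reward(move_a_to_b, abstract((4, 0)), abstract((5, 0))))
```

In this sketch the operator's preconditions play the role of the option's initiation set and its effects define the termination condition; a low-level RL learner would then be trained on the intrinsic reward alone.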

Reinforcement Learning - A Technical Introduction

Journal of autonomous intelligence, 2019

Reinforcement learning provides a cognitive science perspective on behavior and sequential decision making, in that reinforcement learning algorithms introduce a computational concept of agency to the learning problem. It addresses an abstract class of problems that can be characterized as follows: an algorithm confronted with information from an unknown environment is supposed to find, stepwise, an optimal way to behave, based only on sparse, delayed or noisy feedback from an environment that changes according to the algorithm's behavior. Reinforcement learning thus offers an abstraction of the problem of goal-directed learning from interaction. The paper offers an opinionated introduction to the advantages and drawbacks of several algorithmic approaches, in order to lay out design options.

Reinforcement learning as a motion planner—a survey

2012

This paper gives a brief overview of reinforcement learning as a motion planner. I cover the basics of reinforcement learning and an interpretation of its relationship to motion planning. The taxonomy of reinforcement learning is followed by a summary of the key algorithms. I put emphasis on the early papers and the historical perspective of the RL field. The second part of the paper discusses applications of reinforcement learning relevant to motion planning.

Goal-directed behaviours by reinforcement learning

Neurocomputing, 1999

Developing machines with a high potential for operating in outdoor or hostile environments requires an adaptive and versatile control system, in order to avoid the difficulties of modelling complex and unpredictable behaviour. Self-organisation allows artificial machines to approach these goals. To that end, reinforcement methods are investigated: by treating the relations between the task to perform and the environment as a supervisor, efficient learning is achieved. Starting from a very simple structure inspired by insect behaviour, the study presented in this paper is devoted to a neural-network-based control system which allows a simulated six-legged robot to walk and avoid obstacles even when it is partially damaged.

Reinforcement Learning through Supervision for Autonomous Agents

The 2006 International Arab Conference on Information Technology (ACIT'2006), 2015

Reinforcement Learning (RL) is a class of model-free learning control methods that can solve Markov Decision Process (MDP) problems. However, one difficulty in applying RL control is its slow convergence, especially in MDPs with continuous state spaces. In this paper, a modified RL structure is proposed to accelerate reinforcement learning control. This approach combines a supervision technique with the standard Q-learning algorithm of reinforcement learning. A priori information is provided to the RL agent by directly integrating a human operator's commands (i.e. human advice) or by an optimal LQ-controller, indicating preferred actions in some particular situations. It is shown that the convergence speed of the supervised RL agent is greatly improved compared to the conventional Q-learning algorithm. Simulation work and results on the cart-pole balancing problem and learning navigation tasks in an unknown grid world with obstacles are given to illustrate th...
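
A minimal sketch of how supervisor advice might be combined with tabular Q-learning is given below. It is an assumption-laden illustration rather than the paper's method: the corridor environment, the `train_q` function, the `advisor` callback and the `advice_prob` parameter are all hypothetical, and the advisor here is a simple rule rather than a human operator or an LQ-controller.

```python
import random


def train_q(episodes, advisor=None, advice_prob=0.5, size=5,
            alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a 1-D corridor whose goal is the rightmost cell.

    With probability `advice_prob` the action comes from `advisor(state)`;
    otherwise the agent acts epsilon-greedily on its own Q-table.
    """
    rng = random.Random(seed)
    actions = (-1, +1)  # step left, step right
    q = {(s, a): 0.0 for s in range(size) for a in actions}
    steps_per_episode = []
    for _ in range(episodes):
        s, steps = 0, 0
        while s != size - 1 and steps < 200:
            if advisor is not None and rng.random() < advice_prob:
                a = advisor(s)                      # supervisor's preferred action
            elif rng.random() < epsilon:
                a = rng.choice(actions)             # exploration
            else:
                best = max(q[(s, b)] for b in actions)
                a = rng.choice([b for b in actions if q[(s, b)] == best])
            s2 = min(max(s + a, 0), size - 1)       # walls clamp the move
            done = s2 == size - 1
            r = 1.0 if done else 0.0
            future = 0.0 if done else max(q[(s2, b)] for b in actions)
            # Standard Q-learning update; advice only changes action selection.
            q[(s, a)] += alpha * (r + gamma * future - q[(s, a)])
            s, steps = s2, steps + 1
        steps_per_episode.append(steps)
    return q, steps_per_episode


# An advisor that always recommends moving right, followed on every step.
q, steps = train_q(episodes=10, advisor=lambda s: +1, advice_prob=1.0)
print(steps[0])  # the fully advised agent reaches the goal in size - 1 steps
```

Because Q-learning is off-policy, the update rule is left untouched and the advice only biases which transitions are experienced; lowering `advice_prob` toward 0 recovers the conventional algorithm.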

Control Navigation in Robots Using Reinforcement Learning

International Research Journal of Computer Science

Reinforcement learning (RL) is a subfield of machine learning that is being developed within Artificial Intelligence (AI). The technique is a data-independent process. The primary aim of systems of this kind is to maximize their reward signal, which drives them toward behaviour that approaches the goal. Reinforcement learning differs from supervised and unsupervised learning in that the RL agent develops its own insights and learns which action to perform in a given situation. Supervised and unsupervised methods, on the other hand, have the answers already embedded in them. In the absence of new data, an RL agent can learn from its own experience, where the others cannot. RL is used widely: its best applications are in robotics, specifically in motion control and planning, and it is also used in finance, gaming, etc. This paper demonstrates the development of navigation and motion control for a two-wheeled differential-drive robot with the help of a reinforcement learning topology. Traditionally, designing the behaviour of robot controllers inevitably requires models of how the robot actually behaves in the environment. Here, instead, we propose an RL approach to designing the control structure that lets the robot navigate in an indoor environment.

Dynamic Path Planning Algorithm for Mobile Robots: Leveraging Reinforcement Learning for Efficient Navigation

Journal of internet services and information security, 2024

Traversing unfamiliar terrain presents a considerable challenge, particularly the task of locating a viable path when it is not known in advance whether one exists. This paper presents a novel navigation algorithm leveraging reinforcement learning, specifically the Markov Decision Process, to address the challenges of navigating dynamic environments. In contrast to traditional methods, this approach offers adaptability and efficiency in scenarios ranging from mobile robot navigation to complex industrial settings. The algorithm integrates an enhanced A* algorithm, showcasing its versatility in handling various tasks, from pathfinding to obstacle avoidance. To evaluate its effectiveness, the algorithm is tested across multiple scenarios, comparing its performance with and without reinforcement learning. In these experiments, the algorithm demonstrates superior performance in terms of efficiency and adaptability, particularly in scenarios. The results presented highlight the algorithm's learning progress and effectiveness in finding the shortest path. Notably, the algorithm's performance surpasses that of conventional approaches, underscoring its potential for real-world applications in mobile robot navigation and beyond. In conclusion, the proposed algorithm represents a significant advancement in navigation techniques, offering a robust solution for addressing the challenges posed by dynamic environments. Its integration of reinforcement learning enhances adaptability and efficiency, making it a promising tool for various industries and applications.
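
For orientation, the kind of shortest-path search that such a planner builds on can be sketched with a textbook A* on a 4-connected occupancy grid. This is plain A* with a Manhattan-distance heuristic, not the enhanced variant or the reinforcement learning component the abstract refers to; the `astar` function and the grid encoding (0 for free, 1 for obstacle) are assumptions made for illustration.

```python
import heapq


def astar(grid, start, goal):
    """Return a shortest 4-connected path as a list of cells, or None."""
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance: admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(h(start), 0, start)]   # entries are (f = g + h, g, cell)
    came_from = {start: None}
    cost = {start: 0}
    while frontier:
        _, g, cur = heapq.heappop(frontier)
        if cur == goal:
            path = []
            while cur is not None:      # walk parents back to the start
                path.append(cur)
                cur = came_from[cur]
            return path[::-1]
        if g > cost[cur]:
            continue                    # stale queue entry, already improved
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + dr, cur[1] + dc)
            inside = 0 <= nxt[0] < rows and 0 <= nxt[1] < cols
            if inside and grid[nxt[0]][nxt[1]] == 0:
                ng = g + 1
                if ng < cost.get(nxt, float("inf")):
                    cost[nxt] = ng
                    came_from[nxt] = cur
                    heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
    return None                         # no viable path exists


grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
print(astar(grid, (0, 0), (2, 0)))
```

Returning `None` covers the case where no path exists at all, which the abstract singles out as part of the difficulty of navigating unfamiliar terrain.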