Symbolic Relational Deep Reinforcement Learning based on Graph Neural Networks

Relational Deep Reinforcement Learning

arXiv, 2018

We introduce an approach for deep reinforcement learning (RL) that improves upon the efficiency, generalization capacity, and interpretability of conventional approaches through structured perception and relational reasoning. It uses self-attention to iteratively reason about the relations between entities in a scene and to guide a model-free policy. Our results show that in a novel navigation and planning task called Box-World, our agent finds interpretable solutions that improve upon baselines in terms of sample complexity, ability to generalize to more complex scenes than experienced during training, and overall performance. In the StarCraft II Learning Environment, our agent achieves state-of-the-art performance on six mini-games -- surpassing human grandmaster performance on four. By considering architectural inductive biases, our work opens new directions for overcoming important, but stubborn, challenges in deep RL.
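The core mechanism of this paper is iterated self-attention over a set of entity vectors extracted from the scene, so that each entity's representation is updated by its relations to all others. A minimal NumPy sketch of one such relational round (the names, dimensions, and random weights here are illustrative assumptions, not the paper's actual architecture):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def relational_block(entities, Wq, Wk, Wv):
    """One round of self-attention over a set of entity vectors.

    entities: (n, d) array, one row per entity in the scene.
    Each entity attends to all others; the attention-weighted sum
    is its relational update. Returns the updated entities and the
    (n, n) matrix of pairwise relation weights.
    """
    Q, K, V = entities @ Wq, entities @ Wk, entities @ Wv
    attn = softmax(Q @ K.T / np.sqrt(K.shape[1]))  # rows sum to 1
    return attn @ V, attn

rng = np.random.default_rng(0)
n, d = 5, 8
entities = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
updated, attn = relational_block(entities, Wq, Wk, Wv)
```

Stacking several such rounds lets the agent reason about multi-step relations; the attention matrix is also what makes the learned solutions inspectable.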

Scaling up reinforcement learning with a relational representation

2003

Reinforcement learning has been repeatedly suggested as a good candidate for learning in robotics. However, the large search spaces that normally occur in robotics and the expensive training experiences required by reinforcement learning algorithms have hampered its applicability. This paper introduces a new approach for reinforcement learning based on a relational representation which: (i) can be applied over large search spaces, (ii) can incorporate domain knowledge, and (iii) can reuse policies previously learned on different, although similar, problems. In the proposed framework, states are represented as sets of first-order relations, actions are defined in terms of those relations, and policies are learned over this generalized representation. It is shown how this representation can capture large search spaces with a relatively small set of actions and states, and that policies learned over this generalized representation can be directly applied to other problems characterized by the same set of relations.
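The key idea is that many concrete states collapse to the same set of first-order relations, so a single policy entry covers all of them. A toy sketch of this abstraction (the predicates and grid encoding are hypothetical, chosen only to illustrate the collapse):

```python
# A state is abstracted to the set of relations that hold in it; a policy
# is stored over these relational descriptions, not raw configurations.
def relational_abstraction(robot, goal, obstacles):
    """Map a concrete grid state to a set of (hypothetical) first-order relations."""
    rels = set()
    rels.add(("goal_north",) if goal[1] > robot[1] else ("goal_not_north",))
    rels.add(("goal_east",) if goal[0] > robot[0] else ("goal_not_east",))
    rels.add(("blocked_east",) if (robot[0] + 1, robot[1]) in obstacles
             else ("clear_east",))
    return frozenset(rels)

# Two different concrete states yield the same relational state,
# so one learned policy entry generalizes over both.
s1 = relational_abstraction(robot=(0, 0), goal=(5, 3), obstacles=set())
s2 = relational_abstraction(robot=(2, 1), goal=(9, 7), obstacles=set())
policy = {s1: "move_east"}
```

Because the number of distinct relational states is small even when the underlying grid is huge, the learned policy transfers to any problem describable by the same relations.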

Discovering symbolic policies with deep reinforcement learning

2021

Deep reinforcement learning (DRL) has proven successful for many difficult control problems by learning policies represented by neural networks. However, the complexity of neural network-based policies—involving thousands of composed nonlinear operators—can render them problematic to understand, trust, and deploy. In contrast, simple policies comprising short symbolic expressions can facilitate human understanding, while also being transparent and exhibiting predictable behavior. To this end, we propose deep symbolic policy, a novel approach to directly search the space of symbolic policies. We use an autoregressive recurrent neural network to generate control policies represented by tractable mathematical expressions, employing a risk-seeking policy gradient to maximize performance of the generated policies. To scale to environments with multidimensional action spaces, we propose an "anchoring" algorithm that distills pre-trained neural network-based policies into fully symbolic policies...
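The risk-seeking policy gradient mentioned here optimizes best-case rather than average performance: only the top-ε fraction of sampled expressions contributes a gradient, with the (1 − ε) reward quantile as baseline. A minimal sketch of that batch filter (the function name and example rewards are ours):

```python
import numpy as np

def risk_seeking_batch(rewards, epsilon=0.1):
    """Keep only samples above the (1 - epsilon) reward quantile.

    Gradients are taken only on the top-epsilon fraction of sampled
    symbolic expressions, with the quantile itself as the baseline;
    everything below the threshold gets zero advantage.
    """
    threshold = np.quantile(rewards, 1.0 - epsilon)
    mask = rewards >= threshold
    advantages = np.where(mask, rewards - threshold, 0.0)
    return threshold, advantages

rewards = np.array([0.1, 0.5, 0.9, 0.95, 0.2, 0.3, 0.85, 0.4, 0.6, 0.7])
threshold, adv = risk_seeking_batch(rewards, epsilon=0.2)
```

In the full method these advantages would weight the log-likelihood gradient of the RNN that generated each expression; here we only show the quantile filtering step.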

Towards Practical Robot Manipulation using Relational Reinforcement Learning

2019

Learning robotic manipulation tasks using reinforcement learning with sparse rewards is currently impractical due to the outrageous data requirements. Many practical tasks require manipulation of multiple objects, and the complexity of such tasks increases with the number of objects. Learning from a curriculum of increasingly complex tasks appears to be a natural solution, but unfortunately, only provides a small performance gain. We hypothesize that the inability of the state-of-the-art algorithms to learn from a curriculum stems from the absence of inductive biases for transferring knowledge from simpler to complex tasks. We show that graph-based relational architectures overcome this limitation and enable learning of complex tasks when provided with a simple curriculum of tasks with increasing numbers of objects. We demonstrate the utility of our framework on a simulated block-stacking task. Starting from scratch, our agent learns to stack six blocks into a tower. Despite using sparse rewards...

Deep reinforcement learning with relational inductive biases

2019

We introduce an approach for augmenting model-free deep reinforcement learning agents with a mechanism for relational reasoning over structured representations, which improves performance, learning efficiency, generalization, and interpretability. Our architecture encodes an image as a set of vectors, and applies an iterative message-passing procedure to discover and reason about relevant entities and relations in a scene. In six of seven StarCraft II Learning Environment mini-games, our agent achieved state-of-the-art performance, and surpassed human grandmaster level on four. In a novel navigation and planning task, our agent's performance and learning efficiency far exceeded non-relational baselines, and it was able to generalize to more complex scenes than it had experienced during training. Moreover, when we examined its learned internal representations, they reflected important structure about the problem and the agent's intentions. The main contribution of this work is to introduce...

Learning Relational Rules from Rewards

arXiv, 2022

Humans perceive the world in terms of objects and relations between them. In fact, for any given pair of objects, there is a myriad of relations that apply to them. How does the cognitive system learn which relations are useful to characterize the task at hand? And how can it use these representations to build a relational policy to interact effectively with the environment? In this paper we propose that this problem can be understood through the lens of a sub-field of symbolic machine learning called relational reinforcement learning (RRL). To demonstrate the potential of our approach, we build a simple model of relational policy learning based on a function approximator developed in RRL. We trained and tested our model in three Atari games that require considering an increasing number of potential relations: Breakout, Pong, and Demon Attack. In each game, our model was able to select adequate relational representations and build a relational policy incrementally. We discuss the r...

Creativity of AI: Hierarchical Planning Model Learning for Facilitating Deep Reinforcement Learning

arXiv, 2021

Despite achieving great success in real-world applications, Deep Reinforcement Learning (DRL) still suffers from three critical issues: limited data efficiency, lack of interpretability, and poor transferability. Recent research shows that embedding symbolic knowledge into DRL is promising in addressing these challenges. Inspired by this, we introduce a novel deep reinforcement learning framework with symbolic options. Our framework features a loop training procedure, which guides policy improvement by planning with symbolic planning models (including action models and hierarchical task network models) and symbolic options learned automatically from interaction trajectories. The learned symbolic options alleviate the heavy requirement for expert domain knowledge and provide inherent interpretability of policies. Moreover, transferability and data efficiency can be further improved by planning with the symbolic planning models. To validate the effectiveness of our framework, we conduct experiments on two domains, Montezuma's Revenge and Office World. The results demonstrate comparable performance together with improved data efficiency, interpretability, and transferability.

Deep Reinforcement Learning with Graph-based State Representations

arXiv, 2020

Deep RL approaches build much of their success on the ability of deep neural networks to generate useful internal representations. Nevertheless, they suffer from high sample complexity, and starting with a good input representation can have a significant impact on performance. In this paper, we exploit the fact that the underlying Markov decision process (MDP) represents a graph, which enables us to incorporate topological information for effective state-representation learning. Motivated by the recent success of node representations for several graph-analytic tasks, we specifically investigate the capability of node representation learning methods to effectively encode the topology of the underlying MDP in deep RL. To this end, we perform a comparative analysis of several models chosen from four different classes of representation learning algorithms for policy learning in grid-world navigation tasks, which are representative of a large class of RL problems. We find that a...
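To make the "MDP as a graph" premise concrete: states become nodes, transitions become edges, and any node-representation method can then produce topology-aware state features. One simple instance (our choice for illustration, not necessarily among the paper's four classes) is a spectral embedding from the graph Laplacian of a grid-world:

```python
import numpy as np

def grid_adjacency(width, height):
    """Adjacency matrix of a grid-world MDP: states are cells, edges are moves."""
    n = width * height
    A = np.zeros((n, n))
    for x in range(width):
        for y in range(height):
            i = y * width + x
            for dx, dy in ((1, 0), (0, 1)):  # right and up neighbors
                px, py = x + dx, y + dy
                if px < width and py < height:
                    j = py * width + px
                    A[i, j] = A[j, i] = 1.0
    return A

def spectral_embedding(A, dim):
    """Embed each state via the first non-trivial Laplacian eigenvectors,
    a basic spectral node-representation method (topology-aware features)."""
    L = np.diag(A.sum(axis=1)) - A  # unnormalized graph Laplacian
    vals, vecs = np.linalg.eigh(L)
    return vecs[:, 1:dim + 1]  # skip the constant eigenvector

A = grid_adjacency(4, 4)
Z = spectral_embedding(A, dim=3)  # one 3-d feature vector per state
```

States that are close in the grid get nearby embeddings, which is exactly the topological signal a value-function approximator can exploit.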

Deep Reinforcement Learning in Complex Structured Environments

2018

Creating general agents capable of learning useful policies in real-world environments is a difficult task. Reinforcement learning is the field that aims to solve this problem. It provides a general, rigorously defined framework within which algorithms can be designed to solve various problems. Complex real-world environments tend to have a structure that can be exploited. Humans are extremely proficient at this. Because the structure can vary dramatically between environments, creating agents capable of discovering and exploiting such structure without prior knowledge about the environment is a long-standing and unsolved problem in reinforcement learning. Hierarchical reinforcement learning is a sub-field focused specifically on finding and exploiting a hierarchical structure in the environment. In this work, we implement and study two hierarchical reinforcement learning methods, Strategic Attentive Writer and FeUdal Networks. We propose a modification of the FeUdal Networks model ...

Deep reinforcement learning using compositional representations for performing instructions

2018

Spoken language is one of the most efficient ways to instruct robots about performing domestic tasks. However, the state of the environment has to be considered to plan and execute actions successfully. We propose a system that learns to recognise the user's intention and map it to a goal. A reinforcement learning (RL) system then generates a sequence of actions toward this goal considering the state of the environment. A novel contribution in this paper is the use of symbolic representations for both input and output of a neural Deep Q-network (DQN), which enables it to be used in a hybrid system. To show the effectiveness of our approach, the Tell-Me-Dave corpus is used to train an intention detection model, and in a second step an RL agent generates the sequence of actions towards the detected objective, represented by a set of state predicates. We show that the system can successfully recognise command sequences from this corpus as well as train the deep-RL network with symbolic input. We further show that the performance can be significantly increased by exploiting the symbolic representation to generate intermediate rewards.
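Feeding state predicates into a DQN amounts to encoding them as a fixed-width feature vector, and a set-valued goal also permits shaping rewards by counting satisfied predicates. A toy sketch of both ideas (the predicate vocabulary and predicate strings are invented for illustration):

```python
import numpy as np

# Hypothetical predicate vocabulary; a state or goal is a set of true predicates.
PREDICATES = ["on(cup,table)", "open(door)", "holding(cup)", "at(robot,kitchen)"]

def encode(predicates):
    """Binary feature vector over the vocabulary (symbolic input to a DQN)."""
    return np.array([1.0 if p in predicates else 0.0 for p in PREDICATES])

def intermediate_reward(state, goal):
    """Shaping reward: fraction of goal predicates already satisfied."""
    return len(state & goal) / len(goal)

goal = {"holding(cup)", "at(robot,kitchen)"}
state = {"at(robot,kitchen)", "open(door)"}
x = encode(state)              # DQN input for this symbolic state
r = intermediate_reward(state, goal)
```

The encoded vector replaces raw pixels as network input, and the overlap-based reward is one simple way such a symbolic representation can supply the intermediate rewards the abstract credits for the performance gain.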