Learning optimal dialogue management rules by using reinforcement learning and inductive logic programming

Using reinforcement learning to build a better model of dialogue state

2006

Given the growing complexity of tasks that spoken dialogue systems are expected to handle, Reinforcement Learning (RL) has been increasingly used as a way of automatically learning the best policy for a dialogue system to follow. While most work has focused on generating better policies for a dialogue manager, very little has been done on using RL to construct a better dialogue state. This paper presents an RL approach for determining which dialogue features are important to a spoken dialogue tutoring system. Our experiments show that incorporating dialogue factors such as dialogue acts, emotion, repeated concepts and performance plays a significant role in tutoring and should be taken into account when designing dialogue systems.

Automatic optimization of dialogue management

Proceedings of the 18th Conference on Computational Linguistics, 2000

Designing the dialogue strategy of a spoken dialogue system involves many nontrivial choices. This paper presents a reinforcement learning approach for automatically optimizing a dialogue strategy that addresses the technical challenges in applying reinforcement learning to a working dialogue system with human users. We then show that our approach measurably improves performance in an experimental system.

Learning automata-based approach to learn dialogue policies in large state space

International Journal of Intelligent Information and Database Systems, 2012

This paper addresses the problem of scalable optimisation of dialogue policies in speech-based conversational systems using reinforcement learning. In large state spaces, several difficulties arise, such as unwieldy value tables, the need to incorporate prior knowledge, and data sparsity. Hence, we present an online policy learning algorithm based on hierarchically structured learning automata with the eligibility-trace method to find optimal dialogue strategies that cover large state spaces. The proposed algorithm derives an optimal policy that prescribes which action should be taken in each state of the conversation so as to maximise the expected total reward for attaining the goal, and it balances exploration and exploitation in its updates to improve the naturalness of human-computer interaction. The proposed model is evaluated on a travel information system using the PARADISE evaluation framework.
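The learning-automaton update underlying such approaches can be illustrated with a minimal sketch. This is not the paper's hierarchical algorithm, just a single classic linear reward-inaction (L_R-I) automaton in plain Python; the two-action set, learning rate, and always-rewarded action are illustrative assumptions.

```python
def lri_update(probs, chosen, reward, lr=0.1):
    """Linear reward-inaction update: when the environment rewards the
    chosen action, shift probability mass toward it; on failure
    ("inaction"), leave the probabilities unchanged."""
    if reward:
        for a in range(len(probs)):
            if a == chosen:
                probs[a] += lr * (1.0 - probs[a])
            else:
                probs[a] -= lr * probs[a]
    return probs

# Toy use: action 0 is always rewarded, so its probability grows toward 1.
probs = [0.5, 0.5]
for _ in range(50):
    probs = lri_update(probs, chosen=0, reward=True)
```

In a hierarchical scheme, each node of a tree of such automata would learn over a small action subset, which is what keeps the method tractable in large state spaces.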

Hybrid Reinforcement/Supervised Learning of Dialogue Policies from Fixed Data Sets

Computational Linguistics, 2008

We propose a method for learning dialogue management policies from a fixed data set. The method addresses the challenges posed by Information State Update (ISU)-based dialogue systems, which represent the state of a dialogue as a large set of features, resulting in a very large state space and a huge policy space. To address the problem that any fixed data set will only provide information about small portions of these state and policy spaces, we propose a hybrid model that combines reinforcement learning with supervised learning. The reinforcement learning is used to optimize a measure of dialogue reward, while the supervised learning is used to restrict the learned policy to the portions of these spaces for which we have data. We also use linear function approximation to address the need to generalize from a fixed amount of data to large state spaces. To demonstrate the effectiveness of this method on this challenging task, we trained this model on the COMMUNICATOR corpus, to which we have added annotations for user actions and Information States. When tested with a user simulation trained on a different part of the same data set, our hybrid model outperforms a pure supervised learning model and a pure reinforcement learning model. It also outperforms the hand-crafted systems on the COMMUNICATOR data, according to automatic evaluation measures, improving over the average COMMUNICATOR system policy by 10%. The proposed method will improve techniques for bootstrapping and automatic optimization of dialogue management policies from limited initial data sets.
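The linear function approximation mentioned above can be sketched as a one-step TD/Q-learning update on a weight vector. This is an illustrative sketch, not the paper's trained model; the block-one-hot featurizer, state vector, and learning parameters are all hypothetical.

```python
import numpy as np

def features(state, action, n_feats=4, n_actions=2):
    """Hypothetical featurizer: copy the state features into the block
    of the feature vector that corresponds to the chosen action."""
    phi = np.zeros(n_feats * n_actions)
    phi[action * n_feats:(action + 1) * n_feats] = state
    return phi

def q_value(w, state, action):
    """Q(s, a) approximated as a linear function w . phi(s, a)."""
    return w @ features(state, action)

def td_update(w, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One step of linear Q-learning: move w toward the bootstrapped
    target r + gamma * max_a' Q(s', a')."""
    target = reward + gamma * max(q_value(w, next_state, a) for a in (0, 1))
    delta = target - q_value(w, state, action)
    return w + alpha * delta * features(state, action)

w = np.zeros(8)                        # one 4-feature block per action
s = np.array([1.0, 0.0, 0.5, 0.2])     # toy Information-State feature vector
w = td_update(w, s, action=1, reward=1.0, next_state=s)
```

Because the weights generalize across all states sharing feature values, a fixed data set can inform estimates for states it never visited, which is the point of using function approximation here.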

An Application of Reinforcement Learning to Dialogue Strategy Selection in a Spoken Dialogue System

2002

This paper describes a novel method by which a spoken dialogue system can learn to choose an optimal dialogue strategy from its experience interacting with human users. The method is based on a combination of reinforcement learning and performance modeling of spoken dialogue systems. The reinforcement learning component applies Q-learning (Watkins, 1989), while the performance modeling component applies the PARADISE evaluation framework (Walker et al., 1997) to learn the performance function (reward) used in reinforcement learning. We illustrate the method with a spoken dialogue system named ELVIS (EmaiL Voice Interactive System), which supports access to email over the phone. We conduct a set of experiments for training an optimal dialogue strategy on a corpus of 219 dialogues in which human users interact with ELVIS over the phone. We then test that strategy on a corpus of 18 dialogues. We show that ELVIS can learn to optimize its strategy selection for agent initiative, for reading messages, and for summarizing email folders.
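The Q-learning component can be illustrated on a toy one-state "strategy selection" problem. This is a generic tabular Q-learning sketch, not the ELVIS system itself; the single state, the two initiative strategies, and the rewards are invented for illustration.

```python
import random

def q_learning(transitions, n_states, n_actions, episodes=500,
               alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning over a toy dialogue MDP.
    `transitions[(s, a)]` -> (next_state, reward); next_state None ends
    the episode. Epsilon-greedy exploration, standard Q-learning update."""
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s is not None:
            a = rng.randrange(n_actions) if rng.random() < epsilon \
                else max(range(n_actions), key=lambda x: Q[s][x])
            s2, r = transitions[(s, a)]
            best_next = 0.0 if s2 is None else max(Q[s2])
            Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])
            s = s2
    return Q

# State 0: choose an initiative strategy (0 = system-initiative,
# 1 = mixed-initiative); here mixed-initiative yields the higher reward.
T = {(0, 0): (None, 0.3), (0, 1): (None, 1.0)}
Q = q_learning(T, n_states=1, n_actions=2)
```

In the paper's setting the reward would come from the learned PARADISE performance function rather than being fixed by hand as it is here.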

Using Markov decision process for learning dialogue strategies

1998

In this paper we introduce a stochastic model for dialogue systems based on Markov decision process. Within this framework we show that the problem of dialogue strategy design can be stated as an optimization problem, and solved by a variety of methods, including the reinforcement learning approach. The advantages of this new paradigm include objective evaluation of dialogue systems and their automatic design and adaptation. We show some preliminary results on learning a dialogue strategy for an Air Travel Information System.
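A dialogue MDP of the kind described here can also be solved by standard dynamic programming once the model is known; the sketch below uses value iteration rather than reinforcement learning, on an invented three-state dialogue (greeting, reprompt, done) purely for illustration.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """P[s][a] = list of (prob, next_state); R[s][a] = immediate reward.
    Returns the optimal state values and the greedy policy."""
    n = len(P)
    V = [0.0] * n
    while True:
        newV = []
        for s in range(n):
            newV.append(max(R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])
                            for a in range(len(P[s]))))
        if max(abs(a - b) for a, b in zip(newV, V)) < tol:
            V = newV
            break
        V = newV
    policy = [max(range(len(P[s])),
                  key=lambda a: R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a]))
              for s in range(n)]
    return V, policy

# Toy dialogue: state 0 = greeting, 1 = reprompt, 2 = done (absorbing).
# In state 0, action 0 (open prompt) sometimes fails and forces a reprompt;
# action 1 (directed prompt) always succeeds but earns less reward.
P = [[[(0.5, 2), (0.5, 1)], [(1.0, 2)]],   # state 0: open vs directed
     [[(1.0, 2)]],                          # state 1: reprompt, then done
     [[(1.0, 2)]]]                          # state 2: absorbing
R = [[0.5, 0.4], [0.2], [0.0]]
V, policy = value_iteration(P, R)
```

An RL method such as Q-learning would recover the same policy without knowing P and R in advance, which is exactly the optimization problem the paper formulates.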

Optimizing Dialogue Management with Reinforcement Learning: Experiments with the NJFun System

2011

Designing the dialogue policy of a spoken dialogue system involves many nontrivial choices. This paper presents a reinforcement learning approach for automatically optimizing a dialogue policy, which addresses the technical challenges in applying reinforcement learning to a working dialogue system with human users. We report on the design, construction and empirical evaluation of NJFun, an experimental spoken dialogue system that provides users with access to information about fun things to do in New Jersey. Our results show that by optimizing its dialogue policy via reinforcement learning, NJFun measurably improves system performance.

A Survey on Reinforcement Learning for Dialogue Systems

viXra, 2019

Dialogue systems are computer systems which communicate with humans using natural language. The goal is not just to imitate human communication but to learn from these interactions and improve the system's behaviour over time. Therefore, different machine learning approaches can be implemented, with Reinforcement Learning being one of the most promising techniques for generating a contextually and semantically appropriate response. This paper outlines the current state-of-the-art methods and algorithms for integrating Reinforcement Learning techniques into dialogue systems.

Reinforcement Learning With Simulated User For Automatic Dialog Strategy Optimization

In this paper, we propose a solution to the problem of formulating strategies for a spoken dialog system. Our approach is based on reinforcement learning with a simulated user in order to identify an optimal dialog strategy. Our method uses the Markov decision process as a framework for representing spoken dialog, in which the states represent history and discourse context, the actions are dialog acts, and the transition strategies are decisions about which actions to take between states. We present our reinforcement learning architecture with a novel objective function that is based on dialog quality rather than dialog duration.
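The simulated-user setup can be sketched as a loop in which the system's policy interacts with a stochastic user model and receives a quality-based reward (task success) rather than a duration-based one. The user model, the dialog acts, and the two-slot success criterion below are illustrative assumptions, not the paper's actual components.

```python
import random

class SimulatedUser:
    """Hypothetical stochastic user: answers a slot-filling question
    correctly with probability `competence`, otherwise gives a
    confusing reply that fills nothing."""
    def __init__(self, competence=0.8, seed=1):
        self.competence = competence
        self.rng = random.Random(seed)

    def respond(self, system_act):
        if system_act == "ask_slot":
            return "fill" if self.rng.random() < self.competence else "confuse"
        return "bye"

def run_dialogue(user, policy, max_turns=10):
    """Run one simulated dialogue. Reward = 1 if the task succeeds (both
    slots filled), 0 otherwise -- a quality-based rather than
    duration-based score, so short failed dialogues are not rewarded."""
    filled = 0
    for _ in range(max_turns):
        act = policy(filled)
        if act == "close":
            break
        if user.respond(act) == "fill":
            filled += 1
    return 1.0 if filled >= 2 else 0.0

# Simple hand-written policy: keep asking until two slots are filled.
ask_until_done = lambda filled: "ask_slot" if filled < 2 else "close"
reward = run_dialogue(SimulatedUser(), ask_until_done)
```

An RL learner would replace the hand-written policy, running many such simulated dialogues and using the returned rewards to improve its action choices before ever facing real users.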