Multi-Domain Goal-Oriented Dialogues (MultiDoGO): Strategies toward Curating and Annotating Large Scale Dialogue Data

The JDDC Corpus: A Large-Scale Multi-Turn Chinese Dialogue Dataset for E-commerce Customer Service

2020

Human conversations are complicated, and building a human-like dialogue agent is an extremely challenging task. With the rapid development of deep learning techniques, data-driven models, which need a huge amount of real conversation data, have become more and more prevalent. In this paper, we construct a large-scale real-scenario Chinese E-commerce conversation corpus, JDDC, with more than 1 million multi-turn dialogues, 20 million utterances, and 150 million words. The dataset reflects several characteristics of human-human conversations, e.g., goal-driven behavior and long-term dependency among the context. It also covers various dialogue types including task-oriented, chitchat and question-answering. Extra intent information and three well-annotated challenge sets are also provided. We then evaluate several retrieval-based and generative models to provide basic benchmark performance on the JDDC corpus. And we hope JDDC can serve as an effective testbed and benefit the development of fundamenta...

A Multi-Task Hierarchical Approach for Intent Detection and Slot Filling

Knowledge-Based Systems, 2019

Spoken language understanding (SLU) plays an integral part in every dialogue system. Understanding the intention of the user and extracting the necessary information to help the user achieve desired goals is a challenging task. In this work, we propose an end-to-end hierarchical multi-task model that can jointly perform both intent detection and slot filling for datasets of varying domains. The primary aim is to capture context information in a dialogue to help the SLU module in a dialogue system correctly understand the user and assist the user in achieving the desired goals. It is vital for the SLU module to capture past information along with the present utterance of the user in order to retrieve correct information. The dependency and correlation between the two tasks, i.e., intent detection and slot filling, make the multi-task learning framework effective in capturing the desired information provided by the user. We use a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN) to capture contextual information for the utterances. We employ a Conditional Random Field (CRF) to model label dependency. Both character- and word-level embeddings are provided as input to the models. We create a benchmark corpus for the SLU tasks on the TRAINS and FRAMES datasets to capture more realistic and natural utterances spoken by the speakers in a human/machine dialogue system. Experimental results on multiple datasets of various domains (ATIS, SNIPS, TRAINS and FRAMES) show that our proposed approach is effective compared to the individual models and the state-of-the-art methods.

GODEL: Large-Scale Pre-Training for Goal-Directed Dialog

arXiv (Cornell University), 2022

We introduce GODEL (Grounded Open Dialogue Language Model), a large pretrained language model for dialog. In contrast with earlier models such as DialoGPT, GODEL leverages a new phase of grounded pre-training designed to better support adapting GODEL to a wide range of downstream dialog tasks that require information external to the current conversation (e.g., a database or document) to produce good responses. Experiments against an array of benchmarks that encompass task-oriented dialog, conversational QA, and grounded open-domain dialog show that GODEL outperforms state-of-the-art pre-trained dialog models in few-shot finetuning setups, in terms of both human and automatic evaluation. A novel feature of our evaluation methodology is the introduction of a notion of utility that assesses the usefulness of responses (extrinsic evaluation) in addition to their communicative features (intrinsic evaluation). We show that extrinsic evaluation offers improved inter-annotator agreement and correlation with automated metrics. Code and data processing scripts are publicly available.

Encoding Context in Task-Oriented Dialogue Systems Using Intent, Dialogue Acts, and Slots

2020

Extracting context from natural language conversations has been the focus of applications which communicate with humans. Understanding the meaning and the intent of the user input, and formulating responses based on a contextual analysis mimicking that of an actual person is at the heart of modern-day chatbots and conversational agents. For this purpose, dialogue systems often use context from previous dialogue history. Thus, present-day dialogue systems typically parse over user utterances and sort them into semantic frames. In this paper, a bidirectional RNN with LSTM and a CRF layer on top is used to classify each utterance into its resultant dialogue act. Furthermore, there is a separate bidirectional RNN with LSTM and attention for the purpose of slot tagging. Slot annotations use the inside-outside-beginning (IOB) scheme. Softmax regression is used to determine the intent of the entire conversation. The approach is demonstrated on data from three different domains.
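The inside-outside-beginning (IOB) scheme mentioned in the abstract above can be illustrated with a minimal sketch. The helper name `spans_to_iob`, the example sentence, and the slot names `destination` and `date` are assumptions made for illustration; they are not taken from the paper.

```python
def spans_to_iob(tokens, spans):
    """Convert slot spans into IOB tags.

    spans: list of (start, end_exclusive, slot_name) over token indices.
    Tokens outside every span are tagged "O"; the first token of a span
    is tagged "B-<slot>" and the remaining tokens "I-<slot>".
    """
    tags = ["O"] * len(tokens)
    for start, end, slot in spans:
        tags[start] = f"B-{slot}"
        for i in range(start + 1, end):
            tags[i] = f"I-{slot}"
    return tags


# Hypothetical utterance with two annotated slots.
tokens = ["book", "a", "flight", "to", "New", "York", "tomorrow"]
spans = [(4, 6, "destination"), (6, 7, "date")]
iob_tags = spans_to_iob(tokens, spans)

for token, tag in zip(tokens, iob_tags):
    print(f"{token}\t{tag}")
```

A slot tagger of the kind described would be trained to predict one such tag per token; the sketch only shows the label format, not the model.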

Dialog Simulation with Realistic Variations for Training Goal-Oriented Conversational Systems

2020

Goal-oriented dialog systems enable users to complete specific goals like requesting information about a movie or booking a ticket. Typically the dialog system pipeline contains multiple ML models, including natural language understanding, state tracking and action prediction (policy learning). These models are trained through a combination of supervised or reinforcement learning methods and therefore require collection of labeled domain specific datasets. However, collecting annotated datasets with language and dialog-flow variations is expensive, time-consuming and scales poorly due to human involvement. In this paper, we propose an approach for automatically creating a large corpus of annotated dialogs from a few thoroughly annotated sample dialogs and the dialog schema. Our approach includes a novel goal-sampling technique for sampling plausible user goals and a dialog simulation technique that uses heuristic interplay between the user and the system (Alexa), where the user trie...

Let's Discoh: Collecting an Annotated Open Corpus with Dialogue Acts and Reward Signals for Natural Language Helpdesks

2006 IEEE Spoken Language Technology Workshop, 2006

We motivate and explain the DISCOH project, which uses a publicly deployed spoken dialogue system for conference services to collect a richly annotated corpus of mixed-initiative human-machine spoken dialogues. System users are able to call a phone number and learn about a conference, including paper submission, program, venue, accommodation options and costs, etc. The collected corpus is (1) usable for training, evaluating and comparing statistical models, (2) naturally spoken and task oriented, (3) extendible / generalizable, (4) collected using state-of-the-art research and commercial technology, (5) freely available to researchers.

Multi-Task Learning for Supervised Pretraining of Goal-Oriented Dialogue Policies

2019

This paper describes the use of Multi-Task Neural Networks (NNs) for system dialogue act selection. These models leverage the representations learned by the Natural Language Understanding (NLU) unit to enable robust initialization/bootstrapping of dialogue policies from medium sized initial data sets. We evaluate the models on two goal-oriented dialogue corpora in the travel booking domain. Results show the proposed models improve over models trained without knowledge of NLU tasks.

Toward Data-Driven Collaborative Dialogue Systems: The JILDA Dataset

Italian Journal of Computational Linguistics, 2021

Today's goal-oriented dialogue systems are designed to operate in restricted domains and with the implicit assumption that the user goals fit the domain ontology of the system. Under these assumptions dialogues exhibit only limited collaborative phenomena. However, this is not necessarily true in more complex scenarios, where user and system need to collaborate to align their knowledge of the domain in order to improve the conversation and achieve their goals. To foster research on data-driven collaborative dialogues, in this paper we present JILDA, a fully annotated dataset of chat-based, mixed-initiative Italian dialogues related to the job-offer domain. As far as we know, JILDA is the first dialogic corpus completely annotated in this domain. The analysis carried out on the semantic annotations clearly shows the naturalness and greater complexity of JILDA's dialogues. In fact, the new dataset offers a large number of examples of pragmatic phenomena, such as proactivity (i.e., providing information not explicitly requested) and grounding, which are rarely investigated in AI conversational agents based on neural architectures. In conclusion, the annotated JILDA corpus, given its innovative characteristics, represents a new challenge for conversational agents and an important resource for tackling more complex scenarios, thus advancing the state of the art in this field.

Improved Goal Oriented Dialogue via Utterance Generation and Look Ahead

arXiv (Cornell University), 2021

Goal-oriented dialogue systems have become a prominent customer-care interaction channel for most businesses. However, not all interactions are smooth, and customer intent misunderstanding is a major cause of dialogue failure. We show that intent prediction can be improved by training a deep text-to-text neural model to generate successive user utterances from unlabeled dialogue data. For that, we define a multi-task training regime that utilizes successive user-utterance generation to improve the intent prediction. Our approach achieves the reported improvement due to two complementary factors: First, it uses a large amount of unlabeled dialogue data for an auxiliary generation task. Second, it uses the generated user utterance as an additional signal for the intent prediction model. Lastly, we present a novel look-ahead approach that uses user utterance generation to improve intent prediction at inference time. Specifically, we generate counterfactual successive user utterances for conversations with ambiguous predicted intents, and disambiguate the prediction by reassessing the concatenated sequence of available and generated utterances.
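The look-ahead idea described in the abstract above might be sketched roughly as follows. This is not the authors' implementation: the function `look_ahead_intent`, the probability-margin test for ambiguity, the `margin` value, and the two toy stand-ins for the intent classifier and the utterance generator are all assumptions made for illustration.

```python
def look_ahead_intent(context, classify, generate_next, margin=0.2):
    """Predict an intent, looking one user turn ahead when the prediction is ambiguous.

    context: list of utterance strings seen so far.
    classify: callable mapping a list of utterances to {intent: probability}.
    generate_next: callable producing a plausible next user utterance.
    margin: assumed ambiguity threshold on the gap between the top two intents.
    """
    probs = classify(context)
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    # Confident prediction: the top intent clearly beats the runner-up.
    if len(ranked) < 2 or ranked[0][1] - ranked[1][1] >= margin:
        return ranked[0][0]
    # Ambiguous prediction: append a generated user utterance and reassess.
    extended = context + [generate_next(context)]
    new_probs = classify(extended)
    return max(new_probs, key=new_probs.get)


# Toy stand-ins (hypothetical) for a trained classifier and generator.
def toy_classify(utterances):
    text = " ".join(utterances).lower()
    if "refund" in text:
        return {"refund": 0.9, "cancel_order": 0.1}
    return {"refund": 0.45, "cancel_order": 0.55}


def toy_generate_next(utterances):
    return "I want my money back as a refund."


context = ["I have a problem with my order."]
intent = look_ahead_intent(context, toy_classify, toy_generate_next)
print(intent)
```

In the toy run the first prediction is ambiguous (0.55 versus 0.45), so the generated follow-up utterance is appended and the concatenated sequence is classified again. A real system would replace both stand-ins with trained models.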

The Amitiés system: Data-driven techniques for automated dialogue

Speech Communication, 2006

We present a natural-language customer service application for a telephone banking call center, developed as part of the AMITIÉS dialogue project (Automated Multilingual Interaction with Information and Services). Our dialogue system, based on empirical data gathered from real call-center conversations, features data-driven techniques that allow for spoken language understanding despite speech recognition errors, as well as mixed system/customer initiative and spontaneous conversation.