Discovering Conversational Dependencies between Messages in Dialogs

Classifying Dialogue Acts in One-on-One Live Chats

Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, 2010

We explore the task of automatically classifying dialogue acts in 1-on-1 online chat forums, an increasingly popular means of providing customer service. In particular, we investigate the effectiveness of various features and machine learners for this task. While a simple bag-of-words approach provides a solid baseline, we find that adding information from dialogue structure and inter-utterance dependency provides some increase in performance; learners that account for sequential dependencies (CRFs) show the best performance. We report our results from testing using a corpus of chat dialogues derived from online shopping customer-feedback data.

Data-Driven Identification of Dialogue Acts in Chat Messages

2016

We present an approach to classify chat messages into dialogue acts, focusing on questions and directives (“to-dos”). Our multi-lingual system uses word lexica, a specialized tokenizer and rule-based shallow syntactic analysis to compute relevant features, and then trains statistical models (support vector machines, random forests, etc.) for dialogue act prediction. The classification scores we achieve are very satisfactory on question detection and promising on to-do detection, on English and German data collections.

Mining conversational text for procedures with applications in contact centers

2007

Many organizations provide dialog-based support through contact centers to sell their products, handle customer issues, and address product- and service-related issues. This support is usually provided through voice calls, although web-chat based support has lately been gaining prominence. In this paper, we consider any conversational text derived from web-chat systems, voice recognition systems, etc., and propose a method to identify procedures that are embedded in the text.

The JDDC Corpus: A Large-Scale Multi-Turn Chinese Dialogue Dataset for E-commerce Customer Service

2020

Human conversations are complicated, and building a human-like dialogue agent is an extremely challenging task. With the rapid development of deep learning techniques, data-driven models, which need a huge amount of real conversation data, have become more and more prevalent. In this paper, we construct a large-scale real-scenario Chinese E-commerce conversation corpus, JDDC, with more than 1 million multi-turn dialogues, 20 million utterances, and 150 million words. The dataset reflects several characteristics of human-human conversations, e.g., goal-driven behavior and long-term dependencies within the context. It also covers various dialogue types, including task-oriented, chitchat, and question-answering. Extra intent information and three well-annotated challenge sets are also provided. We then evaluate several retrieval-based and generative models to provide basic benchmark performance on the JDDC corpus. We hope JDDC can serve as an effective testbed and benefit the development of fundamenta...

Learning to detect conversation focus of threaded discussions

2006

In this paper, we present a novel feature-enriched approach that learns to detect the conversation focus of threaded discussions by combining NLP analysis and IR techniques. Using the graph-based algorithm HITS, we integrate different features such as lexical similarity, poster trustworthiness, and speech act analysis of human conversations with feature-oriented link generation functions.
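As a rough illustration of the HITS step mentioned in this abstract, the sketch below runs the standard hub/authority power iteration over a small weighted reply graph. The message IDs, the edge weights, and the function name `weighted_hits` are all hypothetical; the weights merely stand in for whatever scores the paper's link generation functions produce, which the abstract does not specify.

```python
import math

def weighted_hits(edges, iterations=50):
    """Hub/authority power iteration over weighted directed edges.

    edges: list of (source, target, weight) triples. The weights are
    hypothetical stand-ins for feature-derived link strengths.
    """
    nodes = sorted({n for src, dst, _ in edges for n in (src, dst)})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority update: weighted sum of hub scores of incoming links.
        auth = {n: 0.0 for n in nodes}
        for src, dst, weight in edges:
            auth[dst] += weight * hub[src]
        norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        auth = {n: v / norm for n, v in auth.items()}
        # Hub update: weighted sum of authority scores of outgoing links.
        hub = {n: 0.0 for n in nodes}
        for src, dst, weight in edges:
            hub[src] += weight * auth[dst]
        norm = math.sqrt(sum(v * v for v in hub.values())) or 1.0
        hub = {n: v / norm for n, v in hub.items()}
    return hub, auth

# Made-up thread: each edge points from a reply to the message it answers.
EDGES = [
    ("m2", "m1", 1.0),
    ("m3", "m1", 0.5),
    ("m4", "m1", 1.0),
    ("m4", "m3", 0.8),
]
hub_scores, auth_scores = weighted_hits(EDGES)
focus = max(auth_scores, key=auth_scores.get)
```

In this toy graph the message with the highest authority score is treated as the conversation focus; that reading is an assumption made for the example rather than the paper's stated decision rule.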

Conversation Analysis of Online Chat

This paper examined some dominant discourse features of online chat. Using the methods of Conversation Analysis (CA), it examined the discourse strategies chatters used to maintain conversation and to manage turn-taking, repair, and adjacency pairs. These principles were quite effective in online chat, but they were sometimes problematic, mainly because of the absence of extralinguistic features. Interactions in this context therefore needed specific discourse skills different from those of written or spoken discourse.

Towards a Fully Unsupervised Framework for Intent Induction in Customer Support Dialogues

arXiv (Cornell University), 2023

State-of-the-art models in intent induction require annotated datasets. However, annotating dialogues is time-consuming, laborious, and expensive. In this work, we propose a completely unsupervised framework for intent induction within a dialogue. In addition, we show how pre-processing the dialogue corpora can improve results. Finally, we show how to extract the dialogue flows of intentions by investigating the most common sequences. Although we test our work on the MultiWOZ dataset, the fact that this framework requires no prior knowledge makes it applicable to any possible use case and very relevant to real-world customer support applications across industry.
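One simple way to picture "investigating the most common sequences" is to count n-grams over the per-turn intent labels of each dialogue. The sketch below does only that; the intent names, the toy dialogues, and the function `common_intent_sequences` are hypothetical, and the paper's actual flow-extraction procedure may differ.

```python
from collections import Counter

def common_intent_sequences(dialogues, n=2, top=3):
    """Count length-n runs of consecutive intent labels across dialogues."""
    counts = Counter()
    for intents in dialogues:
        for i in range(len(intents) - n + 1):
            counts[tuple(intents[i:i + n])] += 1
    return counts.most_common(top)

# Made-up induced intents, one label per turn (in practice these would be
# cluster IDs produced by the unsupervised induction step).
DIALOGUES = [
    ["greet", "find_hotel", "book_hotel", "thank"],
    ["greet", "find_hotel", "book_hotel"],
    ["find_hotel", "book_hotel", "thank"],
]
top_flows = common_intent_sequences(DIALOGUES, n=2, top=1)
```

Raising `n` surfaces longer flows at the cost of sparser counts, so a small value is a reasonable starting point on limited data.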

Automated Speech Act Classification For Online Chat

In this paper, we present our investigation on using supervised machine learning methods to automatically classify online chat posts into speech act categories, which are semantic categories indicating speakers' intentions. Supervised machine learning methods presuppose the existence of annotated training data based on which machine learning algorithms can be used to learn the parameters of some model that was proposed to solve the task at hand. In our case, we used the annotated Linguistic Data Consortium chat corpus to tune our model which is based on the assumption that the first few tokens/words in each chat post are very predictive of the post's speech act category. We present results for predicting the speech act category of chat posts that were obtained using two machine learning algorithms, Naïve Bayes and Decision Trees, in conjunction with several variants of the basic model that include the first 2 to 6 words and their part-of-speech tags as features. The results support the validity of our initial assumption that the first words in an utterance can be used to predict its speech act category with very good accuracy.
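The core assumption here, that the first few tokens of a post predict its speech act, can be sketched with position-tagged token features and a small Naive Bayes classifier. Everything below is an illustrative assumption: the class `TinyNaiveBayes`, the helper `first_k_features`, the labels, and the toy training posts are invented for the example, and part-of-speech features are omitted. It is not the authors' implementation and does not use the LDC chat corpus.

```python
import math
from collections import Counter, defaultdict

def first_k_features(post, k=3):
    """Position-tagged features from the first k whitespace tokens."""
    tokens = post.lower().split()[:k]
    return [f"{i}:{tok}" for i, tok in enumerate(tokens)]

class TinyNaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, posts, labels, k=3):
        self.k = k
        self.label_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for post, label in zip(posts, labels):
            for feat in first_k_features(post, k):
                self.feature_counts[label][feat] += 1
                self.vocab.add(feat)
        return self

    def predict(self, post):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            # Log prior plus smoothed log likelihood of each feature.
            score = math.log(count / total)
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            for feat in first_k_features(post, self.k):
                score += math.log((self.feature_counts[label][feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Toy, made-up training posts with hypothetical speech act labels.
TRAIN = [
    ("what time is it", "question"),
    ("what is your name", "question"),
    ("how are you doing", "question"),
    ("hello everyone", "greeting"),
    ("hi there folks", "greeting"),
    ("hello again", "greeting"),
]
model = TinyNaiveBayes().fit([p for p, _ in TRAIN], [l for _, l in TRAIN])
```

Calling `model.predict("what do you think")` classifies an unseen post from its leading tokens alone; varying `k` between 2 and 6 mirrors the range of model variants the abstract describes.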

Utterances Assessment in Chat Conversations

Research in Computing …, 2010

With the continuous evolution of collaborative environments, the need for automatic analysis and assessment of participants in instant messenger conferences (chat) has become essential. To this end, on the one hand, a series of factors based on natural language processing (including lexical analysis and Latent Semantic Analysis) and data mining have been taken into consideration. On the other hand, in order to thoroughly assess participants, measures such as Page's essay grading, readability, and social network analysis metrics were computed. The weights of each factor in the overall grading system are optimized using a genetic algorithm whose entries are provided by a perceptron in order to ensure numerical stability. A gold standard has been used for evaluating the system's performance.

Mining the Web for Large-Scale Conversational Content

2012

One of the biggest bottlenecks for conversational systems is large-scale provision of suitable content. In this paper, we present the use of content mined from online question-and-answer forums to automatically construct system utterances. Although this content is mined in the form of question-answer pairs, our system is able to use it to formulate utterances that drive a conversation, not just for answering user questions as has been done in previous work. We use a collection of strategies that specify how and when the question-answer pairs can be used and augmented with a small number of generic hand-crafted text snippets to generate natural and coherent system utterances. Our experiments involving 11 human participants demonstrated that this approach can indeed produce relatively natural and coherent interaction.