Data-Driven Identification of Dialogue Acts in Chat Messages
Related papers
Classifying Dialogue Acts in One-on-One Live Chats
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, 2010
We explore the task of automatically classifying dialogue acts in 1-on-1 online chat forums, an increasingly popular means of providing customer service. In particular, we investigate the effectiveness of various features and machine learners for this task. While a simple bag-of-words approach provides a solid baseline, we find that adding information from dialogue structure and inter-utterance dependency provides some increase in performance; learners that account for sequential dependencies (CRFs) show the best performance. We report our results from testing using a corpus of chat dialogues derived from online shopping customer-feedback data.
Using a slim function word classifier to recognise instruction dialogue acts
2011
This paper extends a novel technique for the classification of short texts as Dialogue Acts, based on structural information contained in function words. It investigates the new challenge of discriminating between instructions and a non-instruction mix of questions and statements. The proposed technique extracts features by replacing function words with numeric tokens and replacing each content word with a standard numeric wildcard token. Consequently this is a potentially challenging task for the function-word based approach as the salient feature of an instruction is an imperative verb, which will always be replaced by a wildcard. Nevertheless, the results of the decision tree classifiers produced provide evidence for potentially highly effective classification and they are comparable with initial work on question classification. Improved classification accuracy is expected in future through optimisation of feature extraction.
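The feature extraction step described in this abstract can be illustrated with a minimal sketch. The function-word table, its token numbers, and the wildcard value 0 below are hypothetical choices made for illustration; the paper's actual word list and numbering are not given here.

```python
# Hypothetical function-word table: the words and their ids are assumptions,
# not the list used in the paper.
FUNCTION_WORDS = {
    "the": 1, "a": 2, "to": 3, "on": 4,
    "is": 5, "you": 6, "please": 7, "what": 8,
}
WILDCARD = 0  # one shared token standing in for every content word


def encode(utterance):
    """Map each function word to its numeric token and every other word to WILDCARD."""
    tokens = [t.strip(".,!?") for t in utterance.lower().split()]
    return [FUNCTION_WORDS.get(t, WILDCARD) for t in tokens if t]


instruction = encode("Turn on the light.")
question = encode("What is the time?")
print(instruction)  # the imperative verb "turn" becomes the wildcard
print(question)
```

In this toy encoding the imperative verb is indistinguishable from any other content word, which is the difficulty the abstract points out: the classifier has to rely on the pattern of function words around the wildcards.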
A multi-classifier approach to dialogue act classification using function words
2012
This paper extends a novel technique for the classification of sentences as Dialogue Acts, based on structural information contained in function words. Initial experiments on classifying questions in the presence of a mix of straightforward and "difficult" non-questions yielded promising results, with classification accuracy approaching 90%. However, this initial dataset does not fully represent the various permutations of natural language in which sentences may occur. Also, a higher classification accuracy is desirable for real-world applications. Following an analysis of categorisation of sentences, we present a series of experiments that show improved performance over the initial experiment and promising performance for categorising more complex combinations in the future.
Intent Classification for Dialogue Utterances
IEEE Intelligent Systems, 2020
In this work we investigate several machine learning methods to tackle the problem of intent classification for dialogue utterances. We start with Bag-of-Words (BoW) in combination with Naïve Bayes (NB). After that, we employ Continuous Bag-of-Words (CBoW) coupled with Support Vector Machines (SVM). Then follow Long Short-Term Memory (LSTM) networks, which are made bidirectional. The best performing model is hierarchical, such that it can take advantage of the natural taxonomy within classes. The main experiments are a comparison between these methods on an open-source academic dataset. In the first experiment we consider the full dataset. We also consider the given subsets of data separately, in order to compare our results with state-of-the-art vendor solutions. In general we find that the SVM models outperform the LSTM models. The former models achieve the highest macro-F1 for the full dataset, and in most of the individual datasets. We also found that incorporating the hierarchical structure of the intents improves performance.
Customer interaction is at the center of many organizations. In order to help customers efficiently, one could automate the interaction between the organization's representative and a customer. Customers usually contact the organization with a specific request or query. In order to help a customer, the intention of the customer needs to be classified. Intent classification tries to answer the question of why the customer contacted the organization and what the customer wants to achieve. The interaction can partly or fully be automated using a dialogue system, which uses intent classification. The classification can also be used to help the human representatives, namely, by using intent classification to direct the incoming messages to the representative that has the right expertise. Due to its importance for dialogue handling, intent classification needs to be done properly. Therefore, this research focuses on improving the existing practice of intent classification for dialogue utterances.
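As an illustration of the first method named in this abstract (Bag-of-Words with Naïve Bayes), here is a minimal sketch of a multinomial Naïve Bayes intent classifier over word counts. The class name, the add-one smoothing, the whitespace tokenization, and the toy intents "balance" and "card" are assumptions made for the example; they are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


class BowNaiveBayes:
    """Multinomial Naive Bayes over bag-of-words counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        scores = {}
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in text.lower().split():
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# Hypothetical toy training set with two made-up intents.
texts = ["what is my balance", "show my account balance",
         "i lost my card", "block my stolen card"]
labels = ["balance", "balance", "card", "card"]
model = BowNaiveBayes().fit(texts, labels)
print(model.predict("my card was stolen"))
```

A bag-of-words model of this kind ignores word order entirely, which is one reason the abstract goes on to compare it against embedding-based and sequential models.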
Dialogue Act Classification in Group Chats with DAG-LSTMs
2019
Dialogue act (DA) classification has been studied for the past two decades and has several key applications such as workflow automation and conversation analytics. To address this problem, researchers have used various traditional machine learning models and, more recently, deep neural network models such as hierarchical convolutional neural networks (CNNs) and long short-term memory (LSTM) networks. In this paper, we introduce a new model architecture, directed-acyclic-graph LSTM (DAG-LSTM) for DA classification. A DAG-LSTM exploits the turn-taking structure naturally present in a multi-party conversation, and encodes this relation in its model structure. Using the STAC corpus, we show that the proposed method performs roughly 0.8% better in accuracy and 1.2% better in macro-F1 score when compared to existing methods. The proposed method is generic and not limited to conversation applications.
Arabic Dialogue Act Recognition for Textual Chatbot Systems
2019
Automatic Dialogue Acts Recognition is considered a crucial step for semantic extraction in Natural Language Understanding and Dialogue Systems. In this paper, we introduce our work aiming to recognize the dialogue acts of the users in a Textual Dialogue system using Levantine Arabic dialect. Our Dialogue acts have 8 types: Greeting, Goodbye, Thanks, Confirm, Negate, Ask_repeat, Ask_for_alt, and Apology. Various Machine Learning algorithms with different features have been used to detect the correct speech act categories: Logistic Regression, SVM, Multinomial NB, Extra Trees Classifier, Random Forest Classifier. We also used the Voting Ensemble method to make the best prediction from each classifier. We compared the results of the proposed models on a hand-crafted corpus in the restaurant orders and airline ticketing domain. The SVM algorithm with 2-gram features has given the best results.
Robust Classification of Dialog Acts from the Transcription of Utterances
International Conference on Semantic Computing (ICSC 2007), 2007
This paper presents a robust classification of dialog acts from text utterances. Two different types of features, namely bag-of-words and syntactic relationships among words, were used to extract the discourse level features from the transcript of utterances. Subsequently a number of feature mining methods have been used to identify the most relevant features and their roles in classifying dialog acts. The selected features are used to learn the underlying models of dialog acts using a number of existing machine learning algorithms from the WEKA toolbox. Empirical analyses using the HCRC Map Task Corpus dialog data were conducted to evaluate the performance of the proposed approach.
Automatic Detection of Dialog Acts Based on Multi-level Information
Recently there has been growing interest in using dialog acts to characterize human-human and human-machine dialogs. This paper reports on our experience in the annotation and the automatic detection of dialog acts in human-human spoken dialog corpora. Our work is based on two hypotheses: first, word position is more important than the exact word in identifying the dialog act; and second, there is a strong grammar constraining the sequence of dialog acts. A memory based learning approach has been used to detect dialog acts. In a first set of experiments the number of utterances per turn is known, and in a second set, the number of utterances is hypothesized using a language model for utterance boundary detection. In order to verify our first hypothesis, the model trained on a French corpus was tested on a corpus for a similar task in English and for a second French corpus from a different domain. A correct dialog act detection rate of about 84% is obtained for the same domain and language condition and about 75% for the cross-language and cross-domain conditions.
Empirical determination of thresholds for optimal dialogue act classification
2005
We present recent experiments which build on our work in the area of Dialogue Act (DA) tagging. Identifying the dialogue acts of utterances is recognised as an important step towards understanding the content and nature of what speakers say. We describe a simple dialogue act classifier based on purely intra-utterance features, principally word n-gram cue phrases. Such a classifier performs surprisingly well, rivalling scores obtained using far more sophisticated language modelling techniques for the corpus we address. The approach requires the use of thresholds affecting the selection of n-gram cues, which have previously been manually supplied. We here describe a method of automatically determining these thresholds to optimise classifier performance.
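The idea of selecting n-gram cue phrases by thresholds, and then choosing those thresholds automatically, can be sketched as follows. This is not the authors' method: the two thresholds used here (a minimum n-gram frequency and a minimum "predictivity", the share of an n-gram's occurrences that carry its most frequent tag), the grid search on held-out data, and all function names and toy data are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import product


def ngrams(text, max_n=2):
    """Return the set of word unigrams and bigrams in the utterance."""
    words = text.lower().split()
    return {" ".join(words[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(words) - n + 1)}


def train_cues(data, min_count, min_pred):
    """Keep n-grams that pass both thresholds, mapped to (tag, predictivity)."""
    tags_by_ngram = defaultdict(Counter)
    for text, tag in data:
        for gram in ngrams(text):
            tags_by_ngram[gram][tag] += 1
    cues = {}
    for gram, tags in tags_by_ngram.items():
        total = sum(tags.values())
        tag, count = tags.most_common(1)[0]
        if total >= min_count and count / total >= min_pred:
            cues[gram] = (tag, count / total)
    return cues


def classify(text, cues, default):
    """Pick the tag of the most predictive cue found; fall back to the default tag."""
    hits = [(cues[g][1], len(g), g, cues[g][0]) for g in ngrams(text) if g in cues]
    return max(hits)[3] if hits else default


def tune(train, dev, counts=(1, 2, 3), preds=(0.5, 0.7, 0.9)):
    """Grid-search both thresholds, keeping the first pair with the best dev accuracy."""
    default = Counter(tag for _, tag in train).most_common(1)[0][0]
    best = None
    for min_count, min_pred in product(counts, preds):
        cues = train_cues(train, min_count, min_pred)
        acc = sum(classify(t, cues, default) == y for t, y in dev) / len(dev)
        if best is None or acc > best[0]:
            best = (acc, min_count, min_pred)
    return best


# Hypothetical toy data.
train = [("can you help me", "question"), ("can you open it", "question"),
         ("i need help", "statement"), ("i need it now", "statement")]
dev = [("can you see it", "question"), ("i need that", "statement")]
accuracy, min_count, min_pred = tune(train, dev)
print(accuracy, min_count, min_pred)
```

On realistic data the tuning set should be separate from the final test set, otherwise the selected thresholds overstate the classifier's performance.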
Conversation Management in Task-oriented Turkish Dialogue Agents with Dialogue Act Classification
2020
We study the problem of dialogue act classification to be used in conversation management of goal-oriented dialogue systems. Online chat behavior in human-machine dialogue systems differs from human-human spoken conversations. To this end, we develop 9 dialogue act classes by observing real-life human conversations from a banking domain Turkish dialogue agent. We then propose a dialogue policy based on these classes to correctly direct the users to their goals in a chatbot-human support hybrid dialogue system. To train a dialogue act classifier, we annotate a corpus of human-machine dialogues consisting of 426 conversations and 5020 sentences. Using the annotated corpus, we train a self-attentive bi-directional LSTM dialogue act classifier, which achieves a weighted F1-score of 0.90 on sentence-level classification. We deploy the trained model in the conversation manager to maintain the designed dialogue policy.
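The weighted F1-score reported here averages per-class F1 by class frequency, whereas the macro-F1 used in some of the other abstracts in this list gives every class equal weight. The sketch below shows the difference on made-up labels; the function names and the toy "greet"/"ask" data are assumptions, and only classes that occur in the gold labels are scored.

```python
from collections import Counter


def per_class_f1(gold, pred):
    """F1 for each class in the gold labels: 2*TP / (2*TP + FP + FN)."""
    scores = {}
    for label in set(gold):
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1: rare classes count as much as common ones."""
    scores = per_class_f1(gold, pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(gold, pred):
    """Mean of per-class F1 weighted by each class's number of gold examples."""
    scores = per_class_f1(gold, pred)
    support = Counter(gold)
    return sum(scores[label] * support[label] for label in scores) / len(gold)


# Hypothetical labels: a classifier that always predicts the majority class.
gold = ["greet", "greet", "greet", "ask"]
pred = ["greet", "greet", "greet", "greet"]
print(round(macro_f1(gold, pred), 4), round(weighted_f1(gold, pred), 4))
```

On imbalanced label sets the weighted score can look considerably better than the macro score, because a missed minority class barely moves the weighted average. Figures reported under the two conventions are therefore not directly comparable.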