Real-time decision detection in multi-party dialogue

Modelling and detecting decisions in multi-party dialogue

Proceedings of the 9th …, 2008

We describe a process for automatically detecting decision-making sub-dialogues in transcripts of multi-party, human-human meetings. Extending our previous work on action item identification, we propose a structured approach that takes into account the different roles utterances play in the decision-making process. We show that this structured approach outperforms the accuracy achieved by existing decision detection systems based on flat annotations, while enabling the extraction of more fine-grained information that can be used for summarization and reporting.

Toward Joint Segmentation and Classification of Dialog Acts in Multiparty Meetings

Lecture Notes in Computer Science, 2006

This paper investigates a scheme for joint segmentation and classification of dialog acts (DAs) of the ICSI Meeting Corpus based on hidden-event language models and a maximum entropy classifier for the modeling of word boundary types. Specifically, the modeling of the boundary types takes into account dependencies between the duration of a pause and its surrounding words. Results for the proposed method compare favorably with our previous work on the same task.
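The boundary-type modeling described above can be illustrated with a toy maximum-entropy-style (logistic) classifier. This is a minimal sketch, not the paper's system: the feature templates, pause threshold, and training data below are invented for illustration, but they mirror the paper's idea of conjoining pause duration with the surrounding words.

```python
import math

def features(prev_word, next_word, pause):
    # Bin the pause duration and conjoin it with the neighbouring words,
    # echoing the paper's pause/word dependency modelling (bins invented here).
    pause_bin = "long" if pause > 0.3 else "short"
    return {f"pause={pause_bin}": 1.0,
            f"prev={prev_word}": 1.0,
            f"next={next_word}": 1.0,
            f"prev={prev_word}+pause={pause_bin}": 1.0}

def prob_boundary(w, feats):
    # Logistic score: probability that this word gap is a DA boundary.
    z = sum(w.get(f, 0.0) * v for f, v in feats.items())
    return 1.0 / (1.0 + math.exp(-z))

def train(data, epochs=200, lr=0.5):
    # Plain stochastic gradient ascent on the log-likelihood.
    w = {}
    for _ in range(epochs):
        for prev_w, next_w, pause, label in data:
            feats = features(prev_w, next_w, pause)
            err = label - prob_boundary(w, feats)
            for f, v in feats.items():
                w[f] = w.get(f, 0.0) + lr * err * v
    return w

# Hypothetical examples: (previous word, next word, pause in sec, boundary?)
data = [("okay", "so", 0.6, 1), ("the", "meeting", 0.05, 0),
        ("yeah", "i", 0.5, 1), ("of", "the", 0.02, 0)]
w = train(data)
print(prob_boundary(w, features("okay", "so", 0.6)))  # high for a long pause
```

A real system would replace the toy features with the full lexical context and a hidden-event language model over word/boundary sequences.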

Automatic Detection of Dialog Acts Based on Multi-level Information

Recently there has been growing interest in using dialog acts to characterize human-human and human-machine dialogs. This paper reports on our experience in the annotation and the automatic detection of dialog acts in human-human spoken dialog corpora. Our work is based on two hypotheses: first, word position is more important than the exact word in identifying the dialog act; and second, there is a strong grammar constraining the sequence of dialog acts. A memory based learning approach has been used to detect dialog acts. In a first set of experiments the number of utterances per turn is known, and in a second set, the number of utterances is hypothesized using a language model for utterance boundary detection. In order to verify our first hypothesis, the model trained on a French corpus was tested on a corpus for a similar task in English and for a second French corpus from a different domain. A correct dialog act detection rate of about 84% is obtained for the same domain and language condition and about 75% for the cross-language and cross-domain conditions.
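The memory-based learning approach above can be sketched as a k-nearest-neighbour tagger: instances are stored verbatim and classified by feature overlap with stored examples. The features below (first word, second word, coarse length) are hypothetical, chosen to reflect the paper's hypothesis that word position matters more than the exact word sequence.

```python
def featurize(utterance):
    # Position-based features: the first two word slots plus a length cue.
    words = utterance.lower().split()
    return {"w1": words[0] if words else "",
            "w2": words[1] if len(words) > 1 else "",
            "len>3": len(words) > 3}

def knn_tag(memory, utterance, k=1):
    # Rank stored instances by how many feature values they share.
    feats = featurize(utterance)
    scored = sorted(memory,
                    key=lambda ex: sum(feats[f] == ex[0][f] for f in feats),
                    reverse=True)
    top = [label for _, label in scored[:k]]
    return max(set(top), key=top.count)  # majority label among k nearest

# Invented training instances (utterance, dialog act).
memory = [(featurize(u), da) for u, da in [
    ("do you agree", "question"),
    ("do we have time", "question"),
    ("yes that works", "answer"),
    ("yes absolutely", "answer")]]

print(knn_tag(memory, "do you think so"))
```

Because classification hinges on early word positions rather than full strings, such a model transfers more readily across domains and languages, which is what the cross-corpus experiments test.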

Joint Segmentation and Classification of Dialog Acts in Multiparty Meetings

2006 IEEE International Conference on Acoustics, Speech and Signal Processing Proceedings, 2006

This paper investigates a scheme for joint segmentation and classification of dialog acts (DAs) of the ICSI Meeting Corpus based on hidden-event language models and a maximum entropy classifier for the modeling of word boundary types. Specifically, the modeling of the boundary types takes into account dependencies between the duration of a pause and its surrounding words. Results for the proposed method compare favorably with our previous work on the same task.

Extracting decisions from multi-party dialogue using directed graphical models and semantic similarity

2009

We use directed graphical models (DGMs) to automatically detect decision discussions in multi-party dialogue. Our approach distinguishes between different dialogue act (DA) types based on their role in the formulation of a decision. DGMs enable us to model dependencies, including sequential ones. We summarize decisions by extracting suitable phrases from DAs that concern the issue under discussion and its resolution. Here we use a semantic-similarity metric to improve results on both manual and ASR transcripts.

Detecting Action Items in Multi-party Meetings: Annotation and Initial Experiments

2006

This paper presents the results of initial investigation and experiments into automatic action item detection from transcripts of multi-party human-human meetings. We start from the flat action item annotations of [1], and show that automatic classification performance is limited. We then describe a new hierarchical annotation schema based on the roles utterances play in the action item assignment process, and propose a corresponding approach to automatic detection that promises improved classification accuracy while also enabling the extraction of useful information for summarization and reporting.

Towards a Representation for Understanding the Structure of Multiparty Conversations

2011

Dialog is a crucially important mode of communication. Understanding its structure is vital to a great number of technological innovations. Within the domain of meeting dialog applications alone, there is a need for navigating, summarizing, and extracting action items in recorded meetings and automated facilitation of meetings. These efforts, however, are hampered by a lack of agreement on how best to annotate meeting dialog to serve downstream applications. In this technical report, I describe some of the efforts at representation and annotation of dialog acts, surveying some of the vast differences between approaches. In addition, in order to more directly compare different representations of discourse structure, two small, pilot annotations were carried out on a subset of the ICSI Meeting Corpus using different annotation schemes. The results support the idea that skew between different annotation systems renders them to some degree incompatible.

Multi-level information and automatic dialog act detection in human–human spoken dialogs

Speech Communication, 2008

This paper reports studies on annotating and automatically detecting dialog acts in human-human spoken dialogs. The work rests on three hypotheses: first, the succession of dialog acts is strongly constrained; second, the initial word and the semantic class of a word are more important for identifying dialog acts than the complete exact word sequence of an utterance; third, most of the important information is encoded in specific entities. A memory based learning approach is used to detect dialog acts. For each utterance unit, 8 dialog acts are systematically annotated. Experiments have been conducted using different levels of information, with and without the use of dialog history information. In order to assess the generality of the method, the specific entity tag based model trained on a French corpus was tested on an English corpus for a similar task and on a French corpus from a different domain. A correct dialog act detection rate of about 86% is obtained for the same domain/language condition and 77% for the cross-language or cross-domain conditions.

Automatic Dialog Act Segmentation and Classification in Multiparty Meetings

Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '05), 2005

We explore the two related tasks of dialog act (DA) segmentation and DA classification for speech from the ICSI Meeting Corpus. We employ simple lexical and prosodic knowledge sources, and compare results for human-transcribed versus automatically recognized words. Since there is little previous work on DA segmentation and classification in the meeting domain, our study provides baseline performance rates for both tasks. We introduce a range of metrics for use in evaluation, each of which measures different aspects of interest. Results show that both tasks are difficult, particularly for a fully automatic system. We find that a very simple prosodic model aids performance over lexical information alone, especially for segmentation. Both tasks, but particularly word-based segmentation, are degraded by word recognition errors. Finally, while classification results for meeting data show some similarities to previous results for telephone conversations, findings also suggest a potential difference with respect to the effect of modeling DA context.

Automatically Generated Prosodic Cues to Lexically Ambiguous Dialog Acts in Multiparty Meetings

2000

We investigate whether automatically extracted prosodic features can serve as cues to dialog acts (DAs) in naturally-occurring meetings. We focus on the classification of four short DAs, all of which can be conveyed by the same words. DAs were hand-labeled based on the discourse context. Results for classifiers trained on automatically extracted prosodic features show significant associations with DAs.
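The core idea, that prosody can disambiguate lexically identical short utterances, can be sketched as a toy rule-based classifier over automatically extractable features. The DA labels, features, and thresholds below are invented for illustration and are not taken from the paper; a real system would learn these boundaries from labeled data.

```python
def classify_short_da(duration_s, f0_slope_hz_per_s):
    """Toy classifier for a short utterance like "yeah" or "right".

    Hypothetical logic: rising pitch suggests a question-like check,
    very short flat tokens look like backchannels, and longer tokens
    with falling pitch look like agreements.
    """
    if f0_slope_hz_per_s > 20:    # rising contour
        return "question"
    if duration_s < 0.25:         # very short, non-rising
        return "backchannel"
    return "agreement"

# Invented feature values for three tokens of the same word "yeah".
print(classify_short_da(0.15, -5))   # short and flat
print(classify_short_da(0.40, 35))   # rising
print(classify_short_da(0.45, -10))  # longer, falling
```

The interesting point the paper tests is precisely that such features can be extracted fully automatically (no hand-marked pitch contours) and still correlate significantly with the hand-labeled DA categories.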