Identifying relevant phrases to summarize decisions in spoken meetings

Extracting decisions from multi-party dialogue using directed graphical models and semantic similarity

2009

We use directed graphical models (DGMs) to automatically detect decision discussions in multi-party dialogue. Our approach distinguishes between different dialogue act (DA) types based on their role in the formulation of a decision. DGMs enable us to model dependencies, including sequential ones. We summarize decisions by extracting suitable phrases from DAs that concern the issue under discussion and its resolution. Here we use a semantic-similarity metric to improve results on both manual and ASR transcripts.
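
As an editorial illustration of the phrase-ranking idea, the sketch below scores candidate phrases against the issue-under-discussion DA using a generic TF-IDF cosine similarity. This is a minimal stand-in, not the paper's actual semantic-similarity metric; the rank_phrases helper and the example utterances are hypothetical.

```python
# Minimal sketch: rank candidate summary phrases by lexical similarity to the issue DA.
# Generic TF-IDF cosine stand-in, not the paper's semantic-similarity metric.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def rank_phrases(issue_da: str, candidate_phrases: list[str]) -> list[tuple[str, float]]:
    """Score each candidate phrase against the issue-under-discussion utterance."""
    vectorizer = TfidfVectorizer(stop_words="english")
    matrix = vectorizer.fit_transform([issue_da] + candidate_phrases)
    scores = cosine_similarity(matrix[0:1], matrix[1:]).ravel()
    return sorted(zip(candidate_phrases, scores), key=lambda p: p[1], reverse=True)

# Hypothetical example utterances (illustrative only):
issue = "so do we go with the rubber case or the plastic one for the remote"
candidates = ["go with the rubber case", "the plastic one", "next week's meeting"]
print(rank_phrases(issue, candidates))
```

In practice the candidates would be drawn from the resolution DAs, and the similarity metric could equally be WordNet- or embedding-based; the cosine version above is just the simplest concrete choice.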

Modelling and detecting decisions in multi-party dialogue

Proceedings of the 9th …, 2008

We describe a process for automatically detecting decision-making sub-dialogues in transcripts of multi-party, human-human meetings. Extending our previous work on action item identification, we propose a structured approach that takes into account the different roles utterances play in the decision-making process. We show that this structured approach outperforms existing decision detection systems based on flat annotations, while enabling the extraction of more fine-grained information that can be used for summarization and reporting.

Detecting and Summarizing Action Items in Multi-Party Dialogue

This paper addresses the problem of identifying action items discussed in open-domain conversational speech, and does so in two stages: firstly, detecting the subdialogues in which action items are proposed, discussed and committed to; and secondly, extracting the phrases that accurately capture or summarize the tasks they involve. While the detection problem is hard, we show that we can improve accuracy by taking account of dialogue structure. We then describe a semantic parser that identifies potential summarizing phrases, and show that for some task properties these can be more informative than plain utterance transcriptions.
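
To make the second stage concrete, here is a rough sketch that stands in a shallow NLTK regexp chunker for the semantic parser described above; the grammar, the candidate_phrases helper, and the example utterance are illustrative assumptions rather than the paper's method.

```python
# Sketch of phrase-candidate extraction with a shallow chunker.
# The paper uses a semantic parser; this NLTK regexp chunker is a simplified stand-in.
# May require downloading NLTK tokenizer and POS-tagger models via nltk.download().
import nltk

GRAMMAR = r"""
  NP: {<DT|PRP\$>?<JJ.*>*<NN.*>+}     # noun phrase: optional determiner, adjectives, nouns
  VP: {<VB.*><RP>?<NP>}               # verb phrase: verb (plus particle) followed by an NP
"""
chunker = nltk.RegexpParser(GRAMMAR)

def candidate_phrases(utterance: str) -> list[str]:
    """Return chunked NP/VP spans as candidate task-summarizing phrases."""
    tagged = nltk.pos_tag(nltk.word_tokenize(utterance))
    tree = chunker.parse(tagged)
    return [" ".join(tok for tok, _ in subtree.leaves())
            for subtree in tree.subtrees()
            if subtree.label() in ("NP", "VP")]

# Hypothetical utterance from a detected action-item subdialogue:
print(candidate_phrases("John will send out the revised budget by Friday"))
```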

Towards a Representation for Understanding the Structure of Multiparty Conversations

2011

Dialog is a crucially important mode of communication. Understanding its structure is vital to a great number of technological innovations. Within the domain of meeting dialog applications alone, there is a need for navigating, summarizing, and extracting action items in recorded meetings, and for automated facilitation of meetings. These efforts, however, are hampered by a lack of agreement on how best to annotate meeting dialog to serve downstream applications. In this technical report, I describe some of the efforts at representation and annotation of dialog acts, surveying some of the vast differences between approaches. In addition, in order to more directly compare different representations of discourse structure, two small pilot annotations were carried out on a subset of the ICSI Meeting Corpus using different annotation schemes. The results support the idea that skew between different annotation systems renders them to some degree incompatible.

What are meeting summaries? An analysis of human extractive summaries in meeting corpus

2008

Significant research efforts have been devoted to speech summarization, including automatic approaches and evaluation metrics. However, fundamental questions remain open: what should summaries of speech data contain, and do humans agree with each other when producing them? This paper analyzes human-annotated extractive summaries of the ICSI meeting corpus, with the aim of examining their consistency and the factors affecting human agreement.
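
Consistency between extractive annotators is commonly quantified with chance-corrected agreement such as Cohen's kappa over per-sentence include/exclude decisions. The sketch below uses hypothetical selections and is not the paper's exact evaluation protocol.

```python
# Sketch: inter-annotator agreement on extractive summary selections.
# Each annotator marks every dialogue act as in-summary (1) or not (0);
# kappa corrects raw agreement for chance. Hypothetical data, not the paper's protocol.
from sklearn.metrics import cohen_kappa_score

# Hypothetical selections over ten dialogue acts from one meeting:
annotator_a = [1, 0, 0, 1, 1, 0, 0, 0, 1, 0]
annotator_b = [1, 0, 1, 1, 0, 0, 0, 0, 1, 0]

raw_agreement = sum(a == b for a, b in zip(annotator_a, annotator_b)) / len(annotator_a)
kappa = cohen_kappa_score(annotator_a, annotator_b)
print(f"raw agreement = {raw_agreement:.2f}, Cohen's kappa = {kappa:.2f}")
```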

Generating and validating abstracts of meeting conversations: a user study

2010

In this paper we present a complete system for automatically generating natural language abstracts of meeting conversations. The system comprises components for interpretation of the meeting documents according to a meeting ontology, transformation or content selection from that source representation to a summary representation, and generation of new summary text. In a formative user study, we compare this approach to gold-standard human abstracts and extracts to gauge the usefulness of the different summary types for browsing meeting conversations. We find that our automatically generated summaries are ranked significantly higher than human-selected extracts on coherence and usability criteria. More generally, users demonstrate a strong preference for abstract-style summaries over extracts.

Summarizing Online Conversations: A Machine Learning Approach

Summarization has emerged as an increasingly useful approach to tackling the problem of information overload. Extracting information from online conversations can be of considerable commercial and educational value, but the majority of this information is present as noisy, unstructured text, making traditional document summarization techniques difficult to apply. In this paper, we propose a novel approach to the problem of conversation summarization. We develop an automatic text summarizer that extracts sentences from the conversation to form a summary. Our approach consists of three phases. In the first phase, we prepare the dataset for use by correcting spellings and segmenting the text. In the second phase, we represent each sentence by a set of predefined features. These features capture statistical, linguistic, and sentiment-related aspects along with the dialogue structure of the conversation. Finally, in the third phase, we use a machine learning algorithm to train the summarizer on the set of feature vectors. Experiments performed on conversations from the technical domain show that our system significantly outperforms the baselines on ROUGE F-scores.
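
The three-phase pipeline reduces, in outline, to feature extraction plus binary sentence classification. The sketch below is a generic rendering of that outline with hypothetical features, toy data, and logistic regression as the learner; the paper's actual feature set and classifier may differ.

```python
# Sketch of feature-based extractive summarization as binary sentence classification.
# Features, data, and learner are illustrative; the paper's feature set differs.
import numpy as np
from sklearn.linear_model import LogisticRegression

def features(sentence: str, position: int, n_sentences: int) -> list[float]:
    tokens = sentence.split()
    return [
        len(tokens),                          # statistical: sentence length
        position / max(n_sentences - 1, 1),   # structural: relative position in the thread
        float("?" in sentence),               # dialogue cue: question
        sum(t[0].isupper() for t in tokens),  # rough proxy for named entities
    ]

# Hypothetical training data: (sentence, position, thread length, in-summary label)
train = [
    ("How do I reset the router to factory settings?", 0, 4, 1),
    ("Hold the reset button for ten seconds while powered on.", 1, 4, 1),
    ("thanks that worked!!", 2, 4, 0),
    ("np, glad to help", 3, 4, 0),
]
X = np.array([features(s, p, n) for s, p, n, _ in train])
y = np.array([label for *_, label in train])

clf = LogisticRegression().fit(X, y)
print(clf.predict_proba(X)[:, 1])  # per-sentence probability of inclusion in the summary
```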

Detecting Action Items in Multi-party Meetings: Annotation and Initial Experiments

2006

This paper presents the results of initial investigation and experiments into automatic action item detection from transcripts of multi-party human-human meetings. We start from the flat action item annotations of [1], and show that automatic classification performance is limited. We then describe a new hierarchical annotation schema based on the roles utterances play in the action item assignment process, and propose a corresponding approach to automatic detection that promises improved classification accuracy while also enabling the extraction of useful information for summarization and reporting.
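
One way to picture the proposed hierarchical approach is to run a separate detector per utterance role and hypothesize an action item only where several roles co-occur within a short window. The sketch below shows only that combination step; the role names, window size, threshold, and the toy keyword classifier are illustrative assumptions, not the paper's schema or models.

```python
# Sketch: combine per-role utterance classifications into action-item hypotheses.
# Role labels and the co-occurrence rule are illustrative assumptions.
from typing import Callable

ROLES = ("task_description", "timeframe", "owner", "agreement")

def detect_action_items(utterances: list[str],
                        classify: Callable[[str], set[str]],
                        window: int = 5,
                        min_roles: int = 3) -> list[tuple[int, int]]:
    """Return (start, end) utterance spans where at least `min_roles`
    distinct roles fire within a sliding window of `window` utterances."""
    labels = [classify(u) & set(ROLES) for u in utterances]
    spans = []
    for start in range(len(utterances)):
        end = min(start + window, len(utterances))
        covered = set().union(*labels[start:end])
        if len(covered) >= min_roles:
            spans.append((start, end))
    return spans

# Toy keyword classifier for illustration only (a real system would use trained models):
def toy_classifier(utterance: str) -> set[str]:
    cues = {"task_description": ["send", "prepare", "draft"],
            "timeframe": ["friday", "tomorrow", "next week"],
            "owner": ["i'll", "you should", "john will"],
            "agreement": ["okay", "sounds good", "sure"]}
    text = utterance.lower()
    return {role for role, words in cues.items() if any(w in text for w in words)}
```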

A Probabilistic Model of Meetings that Combines Words and Discourse Features

In order to determine the points at which meeting discourse changes from one topic to another, probabilistic models were used to approximate the process through which meeting transcripts were produced. Gibbs sampling was used to estimate the values of random variables in the models, including the locations of topic boundaries. The paper shows how discourse features were integrated into the Bayesian model, and reports empirical evaluations of the benefit obtained through the inclusion of each feature and of the suitability of alternative models of the placement of topic boundaries. It demonstrates how multiple cues to segmentation can be combined in a principled way, and empirical tests show a clear improvement over previous work.
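
A stripped-down illustration of this kind of model is a Gibbs sampler over boundary indicator variables with a Dirichlet-multinomial likelihood per segment. The sketch below is a toy version under those assumptions; it omits the discourse features that the paper integrates, and the hyperparameters and helper names are hypothetical.

```python
# Toy Gibbs sampler for topic segmentation: resample each candidate boundary
# given all others, using a Dirichlet-multinomial likelihood per segment.
# Illustrative only; the paper's model also incorporates discourse features.
import math
import random
from collections import Counter

BETA = 0.1   # symmetric Dirichlet prior over the vocabulary (assumed value)
PI = 0.2     # prior probability that any given position is a boundary (assumed value)

def seg_loglik(words: list[str], vocab_size: int) -> float:
    """Log marginal likelihood of one segment under a Dirichlet-multinomial."""
    counts = Counter(words)
    ll = math.lgamma(vocab_size * BETA) - math.lgamma(vocab_size * BETA + len(words))
    ll += sum(math.lgamma(BETA + c) - math.lgamma(BETA) for c in counts.values())
    return ll

def gibbs_segment(sentences: list[list[str]], iters: int = 200) -> list[int]:
    vocab_size = len({w for s in sentences for w in s})
    n = len(sentences)
    bounds = [random.random() < PI for _ in range(n - 1)]  # bounds[i]: boundary after sentence i

    def segments(b):
        seg, out = [], []
        for i, sent in enumerate(sentences):
            seg.extend(sent)
            if i == n - 1 or b[i]:
                out.append(seg)
                seg = []
        return out

    for _ in range(iters):
        for i in range(n - 1):
            logp = []
            for value in (False, True):
                bounds[i] = value
                ll = sum(seg_loglik(s, vocab_size) for s in segments(bounds))
                logp.append(ll + math.log(PI if value else 1 - PI))
            # sample the indicator from its conditional distribution
            diff = min(logp[0] - logp[1], 700.0)   # clamp to avoid overflow in exp
            bounds[i] = random.random() < 1.0 / (1.0 + math.exp(diff))
    return [i + 1 for i, b in enumerate(bounds) if b]  # sentence indices that start a new topic
```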