Flexible, Robust and Dynamic Dialogue Modeling with a Speech Dialogue Interface for Controlling a Hi-Fi Audio System
Related papers
Evaluation of a spoken dialogue system for controlling a Hifi audio system
2008 IEEE Spoken Language Technology Workshop, 2008
In this paper, a Bayesian Networks (BNs) approach to dialogue modelling [1] is evaluated in terms of a battery of both subjective and objective metrics. A significant effort has been made to improve the contextual information handling capabilities of the system. Consequently, besides typical dialogue usability measures such as task or dialogue completion rates, dialogue time, etc., we have included a new figure measuring the contextuality of the dialogue as the number of turns where contextual information is helpful for dialogue resolution. The evaluation is carried out through a set of predefined scenarios designed according to different initiative styles, focusing on the impact of the user's level of experience.
2005
This paper presents the development of a speech interface for controlling a high-fidelity system from natural language sentences. A Bayesian Belief Network approach is proposed for dialog modeling. This solution is applied to infer the user's goals corresponding to the processed utterances. Subsequently, from the inferred goals, missing or spurious concepts are automatically detected. This is used to drive the dialog, prompting for missing concepts and clarifying spurious ones, which allows more flexible and natural dialogs. A dialog strategy which makes use of the dialog history and the system's state is also presented.
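The goal inference and concept check described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's network: it assumes a single binary goal node whose concept nodes are conditionally independent given the goal, and it uses a simple threshold rule to flag missing and spurious concepts. The goal, the concept names, and all probabilities are hypothetical.

```python
from math import prod

# Hypothetical model for one goal ("play a CD"). Names and numbers are
# illustrative assumptions, not values taken from the paper.
PRIOR = 0.3  # P(goal = 1)

# concept -> (P(concept = 1 | goal = 1), P(concept = 1 | goal = 0))
CPT = {
    "action_play": (0.95, 0.10),
    "device_cd": (0.90, 0.20),
    "track_number": (0.70, 0.05),
    "radio_station": (0.05, 0.15),
}


def goal_posterior(observed):
    """P(goal = 1 | concepts), assuming concepts are independent given the goal."""
    def likelihood(goal_on):
        idx = 0 if goal_on else 1
        return prod(
            CPT[c][idx] if c in observed else 1.0 - CPT[c][idx] for c in CPT
        )

    on = PRIOR * likelihood(True)
    off = (1.0 - PRIOR) * likelihood(False)
    return on / (on + off)


def check_concepts(observed, goal_threshold=0.5, concept_threshold=0.5):
    """Return (goal_active, missing, spurious) using an assumed threshold rule."""
    if goal_posterior(observed) < goal_threshold:
        return False, [], []
    # Expected under the goal but not heard: the system should prompt for it.
    missing = [
        c for c, (p_on, _) in CPT.items()
        if p_on >= concept_threshold and c not in observed
    ]
    # Heard but unlikely under the goal: the system should ask for clarification.
    spurious = [
        c for c, (p_on, _) in CPT.items()
        if p_on < concept_threshold and c in observed
    ]
    return True, missing, spurious


result = check_concepts({"action_play", "device_cd", "radio_station"})
print(result)
```

With these made-up numbers the goal is inferred as active, `track_number` is flagged as missing and `radio_station` as spurious, which is the information a dialog manager would need to choose between prompting and clarifying.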
In this thesis we tackle the problem of identifying the best practices when designing and evaluating a spoken dialogue system. With the purpose of demonstrating that a more natural, flexible and robust dialogue is possible, and introducing a spoken dialogue system for controlling a Hi-Fi audio system as the selected prototype, we propose a Bayesian Networks (BNs) based solution for dialogue modelling combined with carefully designed contextual information handling strategies. Dynamic capabilities are also provided to keep the dialogue context permanently updated according to the evolution of the dialogue. All the thesis contributions have been evaluated, finding enough experimental support to demonstrate their relevance.
Interacting with Computers, 2012
In this paper a Bayesian Networks-based solution for dialogue modelling is presented. This solution is combined with carefully designed contextual information handling strategies. With the purpose of validating these solutions, and introducing a spoken dialogue system for controlling a Hi-Fi audio system as the selected prototype, a real-user evaluation has been conducted. Two different versions of the prototype are compared. Each version corresponds to a different implementation of the algorithm for the management of the actuation order, the algorithm for deciding the proper order to carry out the actions required by the user. The evaluation is carried out in terms of a battery of both subjective and objective metrics collected from speakers interacting with the Hi-Fi audio box through predefined scenarios. Defined metrics have been specifically adapted to measure: first, the usefulness and the actual relevance of the proposed solutions, and, secondly, their joint performance through their intelligent combination mainly measured as the level achieved with regard to the user satisfaction. A thorough and comprehensive study of the main differences between both approaches is presented. Two-way analysis of variance (ANOVA) tests are also included to measure the effects of both: the system used and the type of scenario factors, simultaneously. Finally, the effect of bringing this flexibility, robustness and naturalness into our home dialogue system is also analyzed through the results obtained. These results show that the intelligence of our speech interface has been well perceived, highlighting its excellent ease of use and its good acceptance by users, therefore validating the approached dialogue management solutions and demonstrating that a more natural, flexible and robust dialogue is possible thanks to them.
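The two-way ANOVA mentioned in this abstract tests two factors (system version and scenario type) and their interaction at once. A minimal sketch of the F statistics for a balanced design follows; the scores are invented for illustration, the function name is hypothetical, and no p-values are computed.

```python
from statistics import mean


def two_way_anova(cells):
    """F statistics for a balanced two-way design with replication.

    cells[i][j] is the list of scores for system i and scenario type j.
    Every cell must hold the same number (at least 2) of observations.
    """
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    grand = mean(y for row in cells for cell in row for y in cell)
    row_means = [mean(y for cell in row for y in cell) for row in cells]
    col_means = [mean(y for row in cells for y in row[j]) for j in range(b)]
    cell_means = [[mean(cell) for cell in row] for row in cells]

    # Sums of squares for each factor, the interaction, and the residual.
    ss_a = b * n * sum((m - grand) ** 2 for m in row_means)
    ss_b = a * n * sum((m - grand) ** 2 for m in col_means)
    ss_cells = n * sum((m - grand) ** 2 for row in cell_means for m in row)
    ss_ab = ss_cells - ss_a - ss_b
    ss_e = sum(
        (y - cell_means[i][j]) ** 2
        for i in range(a) for j in range(b) for y in cells[i][j]
    )

    ms_e = ss_e / (a * b * (n - 1))  # residual mean square
    return {
        "F_system": (ss_a / (a - 1)) / ms_e,
        "F_scenario": (ss_b / (b - 1)) / ms_e,
        "F_interaction": (ss_ab / ((a - 1) * (b - 1))) / ms_e,
    }


# Hypothetical satisfaction scores: 2 systems x 2 scenario types x 2 users.
scores = [
    [[1.0, 3.0], [3.0, 5.0]],
    [[5.0, 7.0], [7.0, 9.0]],
]
f_stats = two_way_anova(scores)
print(f_stats)
```

Each F value would then be compared against the F distribution with the matching degrees of freedom to judge significance; that step is left out here.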
A BAYESIAN NETWORKS APPROACH FOR DIALOG MODELING: THE FUSION BN
2009 IEEE International Conference on Acoustics, Speech, and Signal Processing, Vols 1-8, Proceedings, 2009
Bayesian Networks, BNs, are suitable for mixed-initiative dialog modeling allowing a more flexible and natural spoken interaction. This solution can be applied to identify the intention of the user considering the concepts extracted from the last utterance and the dialog context. Subsequently, in order to make a correct decision regarding how the dialog should continue, unnecessary, missing, wrong, optional and required concepts have to be detected according to the inferred goals. This information is useful to properly drive the dialog prompting for missing concepts, clarifying for wrong concepts, ignoring unnecessary concepts and retrieving those required and optional. This paper presents a novel BNs approach where a single BN is obtained from N goal-specific BNs through a fusion process. The new fusion BN enables a single concept analysis which is more consistent with the whole dialog context.
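The mapping from concept categories to dialog actions stated in this abstract (prompt for missing, clarify wrong, ignore unnecessary, retrieve required and optional) can be sketched as a small lookup. This does not implement the fusion BN itself: the category of each concept is assumed to come from the BN inference step, and the concept names and the `plan_turn` helper are hypothetical.

```python
from enum import Enum


class ConceptStatus(Enum):
    REQUIRED = "required"
    OPTIONAL = "optional"
    MISSING = "missing"
    WRONG = "wrong"
    UNNECESSARY = "unnecessary"


# Category-to-action mapping as described in the abstract.
ACTION_FOR_STATUS = {
    ConceptStatus.MISSING: "prompt",
    ConceptStatus.WRONG: "clarify",
    ConceptStatus.UNNECESSARY: "ignore",
    ConceptStatus.REQUIRED: "retrieve",
    ConceptStatus.OPTIONAL: "retrieve",
}


def plan_turn(concept_statuses):
    """Group concepts by the dialog action their category calls for."""
    plan = {"prompt": [], "clarify": [], "ignore": [], "retrieve": []}
    for concept, status in concept_statuses.items():
        plan[ACTION_FOR_STATUS[status]].append(concept)
    return plan


# Hypothetical output of the inference step for one user turn.
plan = plan_turn({
    "action_play": ConceptStatus.REQUIRED,
    "track_number": ConceptStatus.MISSING,
    "radio_station": ConceptStatus.UNNECESSARY,
    "volume_level": ConceptStatus.WRONG,
    "disc_number": ConceptStatus.OPTIONAL,
})
print(plan)
```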
Spoken dialogue technology - toward the conversational user interface
2004
What would you say if your refrigerator told you, "You're having some friends round for hot chocolate later. Maybe you should order two cartons of milk"? Of course, in Spoken Dialogue Technology, Michael McTear will not give an answer to the question of whether talking to domestic appliances makes sense, but he indicates that even a normal household, for instance, may offer a wide field of application for spoken-language dialogue systems in the near future. Consequently his book primarily focuses on theory and practice of these systems. Addressing undergraduate students as well as postgraduate researchers and practitioners in human-computer interfaces, the book is subdivided into three parts which meet the readers' needs: "Background to Spoken Dialogue Technology" (Chapters 1-5), "Developing Spoken Dialogue Applications" (Chapters 6-11), and "Advanced Applications" (Chapters 12-14). Chapter 1, "Talking with Computers: Fact or Fiction," and Chapter 2, "Spoken Dialogue Applications: Research Directions and Commercial Deployment," present recent products and aspects of dialogue technology as well as historical linguistic and artificial intelligence approaches to dialogue and simulated conversation. Aspects of present-day commercial use of spoken dialogue technology are also discussed. In Chapter 3, "Understanding Dialogue," the term dialogue is defined, and four of its key characteristics (dialogue as discourse, dialogue as purposeful activity, dialogue as collaborative activity, and utterances in dialogue) and its structures and processes are described in detail. Chapter 4 gives an overview of the components of a spoken language dialogue system: speech recognition, language understanding, language generation, and text-to-speech synthesis. The central component (i.e., dialogue management) is specified in Chapter 5.
Here, dialogue initiative (system initiative, user initiative, and mixed initiative), dialogue control (finite-state-based, frame-based, and agent-based control), and grounding (how to process the user's input) are described. Furthermore, knowledge sources (dialogue history, task record, world knowledge model, domain model, generic model, and user model) and problems that arise when interacting with an external knowledge source are discussed. The second part starts with dialogue engineering, which can be subdivided into analysis and specification of requirements, design, implementation, testing, and evaluation of a dialogue system. The use-case analysis includes user profile (type of user, language, user's experience level, etc.) and usage profile (frequency of use, input/output device type, environment, etc.). The spoken-language requirements can
Using a Bayesian model of the listener to unveil the dialogue information state
2012
Communicative listener feedback is a prevalent coordination mechanism in dialogue. Listeners use feedback to provide evidence of understanding to speakers, who, in turn, use it to reason about the listeners' mental state of listening, determine the groundedness of communicated information, and adapt their subsequent utterances to the listeners' needs. We describe a speaker-centric Bayesian model of listeners and their feedback behaviour, which can interpret the listener's feedback signal in its dialogue ...
We present an evaluation of a spoken language dialogue system with a module for the management of user-related information, stored as user preferences and privileges. The flexibility of our dialogue management approach, based on Bayesian Networks (BN), together with a contextual information module, which performs different strategies for handling such information, allows us to include user information as a new level in the Context Manager hierarchy. We propose a set of objective and subjective metrics to measure the relevance of the different contextual information sources. The analysis of our evaluation scenarios shows that the relevance of the short-term information (i.e. the system status) remains fairly stable throughout the dialogue, whereas the dialogue history and the user profile (i.e. the middle-term and the long-term information, respectively) play a complementary role, with their usefulness changing as the dialogue evolves.
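One way to picture the three context levels named in this abstract is a lookup that consults each source in turn when the user leaves a concept unspecified. The sketch below is an assumption-laden illustration: the abstract does not state the lookup order or the data layout, so the shortest-term-first order, the dictionary representation, and the `resolve_concept` helper are all hypothetical.

```python
def resolve_concept(concept, system_status, dialogue_history, user_profile):
    """Fill in an unspecified concept from context, shortest-term source first.

    Returns (value, source_name), or (None, None) if no source knows the concept.
    The lookup order is an assumption made for this sketch.
    """
    sources = [
        ("system_status", system_status),        # short-term information
        ("dialogue_history", dialogue_history),  # middle-term information
        ("user_profile", user_profile),          # long-term information
    ]
    for name, source in sources:
        if concept in source:
            return source[concept], name
    return None, None


# Hypothetical context contents for a Hi-Fi controller.
status = {"volume": 7, "device": "cd"}
history = {"volume": 5, "track_number": 3}
profile = {"genre": "jazz"}

print(resolve_concept("volume", status, history, profile))
print(resolve_concept("genre", status, history, profile))
```

Recording which source supplied each value is what would let an evaluation count how often each level of context was actually useful.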