Brigitte Krenn - Academia.edu
Papers by Brigitte Krenn
arXiv (Cornell University), May 28, 2021
With increased applications of collaborative robots (cobots) in industrial workplaces, behavioural effects of human-cobot interactions need to be further investigated. This is of particular importance as nonverbal behaviours of collaboration partners in human-robot teams significantly influence the experience of the human interaction partners and the success of the collaborative task. During the Ars Electronica 2020 Festival for Art, Technology & Society in Linz, we invited visitors to exploratively interact with an industrial robot, exhibiting restricted interaction capabilities: extending and retracting its arm, depending on the movements of the volunteer. The movements of the arm were pre-programmed and telecontrolled for safety reasons (which was not obvious to the participants). We recorded video data of these interactions and investigated general nonverbal behaviours of the humans interacting with the robot, as well as nonverbal behaviours of people in the audience. Our results showed that people were more interested in exploring the robot's action and perception capabilities than just reproducing the interaction game as introduced by the instructors. We also found that the majority of participants interacting with the robot approached it up to a distance which would be perceived as threatening or intimidating if it were a human interaction partner. Regarding bystanders, we found examples where people made movements as if trying out variants of the current participant's behaviour.
We present experiments on a multimodal dataset of situated task descriptions annotated specifically for crossmodal object-word learning. In particular, we investigate effects of attentional cues incorporated in statistical learning models. Attentional cues help to direct the listener's/learner's attention in multimodal communication, and thus facilitate learning the mapping between words in natural language and objects in the world. The results of our experiments indicate that, in the context of TAKE and PUT tasks, the object currently held by the instructor, or the object being moved next to another object, are important attentional cues. Further, we show that using the instructor's gaze as an attentional cue worsens the learning result for such tasks, which stands in contrast to previous work.
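To make the idea concrete, here is a minimal sketch of a cross-situational word-object learner with attentional weighting. The cue names, their weights, and the toy TAKE/PUT scene are illustrative assumptions, not the statistical models or values evaluated in the paper.

```python
# Sketch of attentionally weighted cross-situational learning. CUE_WEIGHT
# values are invented for illustration: held/moved objects dominate.
from collections import defaultdict

CUE_WEIGHT = {"held": 3.0, "moved_next_to": 2.0, "visible": 1.0}

assoc = defaultdict(float)  # (word, object) -> accumulated weighted count

def observe(words, objects_with_cues):
    """One situated utterance: words paired with (object, cue) candidates."""
    for w in words:
        for obj, cue in objects_with_cues:
            assoc[(w, obj)] += CUE_WEIGHT[cue]

def best_referent(word):
    candidates = {o: s for (w, o), s in assoc.items() if w == word}
    return max(candidates, key=candidates.get) if candidates else None

# Toy scene: the instructor holds the cup while saying "take the cup".
observe(["take", "cup"], [("cup", "held"), ("box", "visible")])
observe(["put", "cup", "box"], [("cup", "moved_next_to"), ("box", "visible")])
print(best_referent("cup"))  # -> "cup"
```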
In this paper we introduce an active learning extension to our incremental grounded language lear... more In this paper we introduce an active learning extension to our incremental grounded language learning system implemented on the Pepper robot. This approach is inspired by recent results from child language acquisition research, which shows that children deliberately use gestures like pointing to acquire new information about the world around them. In our system, the Pepper robot learns word-object and word-action mappings by observing a human tutor manipulating objects on a table and verbally describing the actions. Under certain conditions, the robot will interrupt the tutor and actively request information by pointing at an object. We describe our first approach of facilitating active information seeking strategies to enhance our system. We motivate when and how to apply them by reviewing research on question-asking during infancy and toddlerhood.
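One plausible trigger for such a pointing request is uncertainty over an object's label. The sketch below illustrates this with an entropy threshold; the threshold value and data structures are assumptions for illustration, not the system's actual implementation on Pepper.

```python
# Sketch of an uncertainty-triggered pointing strategy: point at an object
# and ask when the label distribution for it is too uncertain.
import math

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

def should_point_and_ask(label_probs, threshold=1.0):
    """Interrupt the tutor if the robot's belief about the label is diffuse."""
    return entropy(label_probs.values()) > threshold

# The robot is fairly sure about the cup, but the new object is ambiguous.
cup = {"cup": 0.9, "mug": 0.1}
unknown = {"box": 0.35, "block": 0.35, "cube": 0.30}
print(should_point_and_ask(cup))      # False: keep observing the tutor
print(should_point_and_ask(unknown))  # True: interrupt and point
```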
Deliverable number D9c Deliverable title D9c: Assessment of markup languages for avatars, multime... more Deliverable number D9c Deliverable title D9c: Assessment of markup languages for avatars, multimedia and multimodal systems Abstract (for dissemination) description and assessment of markup languages with respect to their feasibility for NECA NECA D9c NECA IST-2000-28580 2
Proceedings of the 19th ACM International Conference on Multimodal Interaction, 2017
When uttering referring expressions in situated task descriptions, humans naturally use verbal and non-verbal channels to transmit information to their interlocutor. To develop mechanisms for robot architectures capable of resolving object references in such interaction contexts, we need to better understand the multi-modality of human situated task descriptions. In current computational models, mainly pointing gestures, eye gaze, and objects in the visual field are included as non-verbal cues, if any. We analyse reference resolution to objects in an object manipulation task and find that only up to 50% of all referring expressions to objects can be resolved when language, eye gaze and pointing gestures are taken into account. Thus, we extract other non-verbal cues necessary for reference resolution to objects, investigate the reliability of the different verbal and non-verbal cues, and formulate lessons for the design of a robot's natural language understanding capabilities.
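A common way to combine such cues is a weighted score per candidate object. The following is a minimal sketch under that assumption; the cue names, weights, and scene values are invented for illustration and are not the model from the paper.

```python
# Sketch of multi-cue reference resolution: each candidate object is scored
# by a weighted combination of verbal and non-verbal cue evidence in [0, 1].
CUE_WEIGHTS = {"language_match": 0.5, "pointing": 0.3, "gaze": 0.2}

def resolve_reference(candidates):
    """candidates: {object_name: {cue_name: evidence score}}"""
    def score(cues):
        return sum(CUE_WEIGHTS[c] * v for c, v in cues.items())
    return max(candidates, key=lambda o: score(candidates[o]))

# "The cup" matches two objects equally well; pointing and gaze disambiguate.
scene = {
    "red_cup":  {"language_match": 0.9, "pointing": 0.2, "gaze": 0.4},
    "blue_cup": {"language_match": 0.9, "pointing": 0.8, "gaze": 0.7},
}
print(resolve_reference(scene))  # -> "blue_cup"
```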
Artificial Companions are computer systems that interact and collaborate with the user over a longer period of time, employing human-like communication. They have memory and are able to learn from the interaction with their users. Due to their long-term relationships with users and their ability to communicate in a human-like fashion, they have social effects on the user. We aim at improving the dialogue capabilities of companions by endowing them with episodic memory (EM) and integrating it with action selection and dialogue control. Our research builds upon the platform and companions developed in the RASCALLI project [1][2], but can be easily transferred to other applications.
Third International Conference on Automated Production of Cross Media Content for Multi-Channel Distribution (AXMEDIS'07), 2007
We present strategies for improving the accessibility of large music archives intended for use in commercial environments like music download platforms. Our approach is based on metadata enhancement and on the augmentation of traditional browsing interfaces with concise data visualizations. New metadata attributes are created by recombination and summarization of existing ones as well as through web-based data mining.
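As a toy illustration of the recombination idea, a new browsing facet can be derived from existing attributes. Field names and the decade rule below are hypothetical, not the archive's actual schema.

```python
# Sketch of metadata recombination: summarize release year and genre into
# one coarse, human-browsable facet.
def derive_era_tag(track):
    """Combine two existing attributes into a new derived one."""
    decade = (track["year"] // 10) * 10
    return f"{decade}s {track['genre']}"

track = {"title": "Example Song", "year": 1987, "genre": "Synth-Pop"}
print(derive_era_tag(track))  # -> "1980s Synth-Pop"
```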
Lecture Notes in Computer Science, 2010
We present an episodic memory component for enhancing the dialogue of artificial companions with the capability to refer to, take up and comment on past interactions with the user, and to take into account in the dialogue long-term user preferences and interests. The proposed episodic memory is based on RDF representations of the agent's experiences and is linked to the agent's semantic memory containing the agent's knowledge base of ontological data and information about the user's interests.
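A minimal sketch of this architecture, using the rdflib library: episodes are stored as RDF triples and joined against user-interest facts from semantic memory. All namespace URIs, class and property names below are illustrative placeholders, not the schema actually used in the paper.

```python
# Sketch of an RDF-based episodic memory linked to semantic memory.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, XSD

EX = Namespace("http://example.org/companion#")  # hypothetical namespace
g = Graph()
g.bind("ex", EX)

# One episode: the user asked the companion about a topic.
episode = EX.episode_042
g.add((episode, RDF.type, EX.Episode))
g.add((episode, EX.timestamp,
       Literal("2010-03-15T14:30:00", datatype=XSD.dateTime)))
g.add((episode, EX.dialogueAct, Literal("user_question")))
g.add((episode, EX.aboutTopic, EX.topic_indie_rock))

# Link into semantic memory: a long-term user interest the dialogue
# manager can take up later ("Last week you asked about indie rock ...").
g.add((EX.user_anna, EX.hasInterest, EX.topic_indie_rock))

# Retrieve past episodes about topics the user is interested in.
q = """
PREFIX ex: <http://example.org/companion#>
SELECT ?ep ?t WHERE {
  ex:user_anna ex:hasInterest ?topic .
  ?ep a ex:Episode ; ex:aboutTopic ?topic ; ex:timestamp ?t .
}
"""
for row in g.query(q):
    print(f"episode {row.ep} at {row.t}")
```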
Proceedings of the 8th Workshop on Performance Metrics for Intelligent Systems, 2008
The RASCALLI platform is both a runtime and a development environment for virtual systems augmented with cognition. It provides a framework for the implementation and execution of modular software agents. Due to the underlying software architecture and the modularity of the agents, it allows the parallel execution and evaluation of multiple agents. These agents might all be of the same kind or of vastly different kinds, or they might differ only in specific (cognitive) aspects, so that the performance of these aspects can be effectively compared and evaluated.
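The evaluation idea can be sketched as running agent variants that differ in one cognitive aspect on identical input and comparing the outcomes. The Agent interface and the placeholder scoring below are assumptions for illustration, not the platform's actual API.

```python
# Sketch of parallel evaluation of agent variants on the same task.
from concurrent.futures import ThreadPoolExecutor

class Agent:
    def __init__(self, name, memory_enabled):
        self.name, self.memory_enabled = name, memory_enabled

    def run(self, task):
        # Placeholder behaviour: the variant with memory "solves" more steps.
        return len(task) if self.memory_enabled else len(task) // 2

agents = [Agent("baseline", False), Agent("with_memory", True)]
task = ["step1", "step2", "step3", "step4"]

with ThreadPoolExecutor() as pool:
    scores = list(pool.map(lambda a: (a.name, a.run(task)), agents))
print(scores)  # compare the two variants on identical input
```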
Lecture Notes in Computer Science, 2003
In this paper, we describe an application where embodied conversational characters are integrated into an existing application which functions as a community-building tool accessible via the Web. We discuss a number of design criteria which arise, on the one hand, from the task of simulating animated human-like conversation (which is watched by the user) and, on the other hand, from technical restrictions imposed by the Web and the computing facilities of an average non-expert computer user.
Intelligent Virtual Agents, 2012
In sociolinguistics, language attitude studies based on natural voices have provided evidence that human listeners socially assess and evaluate their communication partners according to the language variety they use. Similarly, research on intelligent agents has demonstrated that the degree to which an artificial entity resembles a human correlates with the likelihood that the entity will evoke social and psychological processes in humans. Taking the two findings together, we hypothesize that synthetically generated language varieties have social effects similar to those reported from language attitude studies on natural speech. We present results from a language-attitude study based on three synthetic varieties of Austrian German. Our results on synthetic speech are in accordance with previous findings from natural speech. In addition, we show that language variety together with voice quality of the synthesized speech bring about attributions of different social aspects and stereotypes and influence the attitudes of the listeners toward the artificial speakers.
AI & SOCIETY, 2014
When developing artificial companions, social attribution significantly influences the attitudes of humans towards the agents. We present results from a language-attitude study based on three synthetic varieties of Austrian German (a standard and two Viennese varieties) in the context of a cultural-heritage application. We show that language variety together with voice quality elicit attributions of different personas and influence the attitudes of the listeners toward the speakers.
Cognitive Technologies, 2010
In order to be believable, embodied conversational agents (ECAs) must show expression of emotions in a consistent and natural-looking way across modalities. The ECA has to be able to display coordinated signs of emotion during realistic emotional behaviour. Such a capability requires one to study and represent emotions and the coordination of modalities during non-basic realistic human behaviour, to define languages for representing such behaviours to be displayed by the ECA, and to have access to mono-modal representations such as gesture repositories. This chapter is concerned with coordinating the generation of signs in multiple modalities in such an affective agent. Designers of an affective agent need to know how it should coordinate its facial expression, speech, gestures and other modalities in view of showing emotion. This synchronisation of modalities is a main feature of emotions.
Intelligent Virtual Agents, 2006
This paper describes an international effort to unify a multimodal behavior generation framework for Embodied Conversational Agents (ECAs). We propose a three-stage model we call SAIBA, where the stages represent intent planning, behavior planning and behavior realization. A Function Markup Language (FML), describing intent without referring to physical behavior, mediates between the first two stages, and a Behavior Markup Language (BML), describing desired physical realization, mediates between the last two stages. In this paper we will focus on BML. The hope is that this abstraction and modularization will help ECA researchers pool their resources to build more sophisticated virtual humans.
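The three-stage pipeline can be sketched as follows. The markup strings imitate the flavour of FML and BML but are simplified illustrations, not valid documents of either standard.

```python
# Sketch of the SAIBA pipeline: intent planning -> FML -> behavior
# planning -> BML -> behavior realization.
def plan_intent(goal):
    # Stage 1: communicative intent, with no reference to physical behavior.
    return f'<fml><performative type="{goal}"/></fml>'

def plan_behavior(fml):
    # Stage 2: desired physical realization of the intent.
    if "greet" in fml:
        return ('<bml><speech id="s1">Hello!</speech>'
                '<gesture id="g1" lexeme="WAVE" start="s1:start"/></bml>')
    return "<bml/>"

def realize(bml):
    # Stage 3: the behavior realizer animates the agent from the BML.
    print("realizing:", bml)

realize(plan_behavior(plan_intent("greet")))
```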
Künstliche Intelligenz - KI, 2003
In this paper we describe a commercial application of a net environment, and present the user data we have collected so far from three launches of this application. A net environment, in our definition, is a virtual space inhabited by avatars which have been created and are subsequently visited and instructed by users via the internet. Net environments are a useful means for studying user behaviour in general, and they are particularly well suited for the presentation of multimedia content and the systematic gathering of user responses on the appropriateness or effectiveness of the different presentations.
Artificial Intelligence, 2008
This paper presents the NECA approach to the generation of dialogues between Embodied Conversational Agents (ECAs). This approach consists of the automated construction of an abstract script for an entire dialogue (cast in terms of dialogue acts), which is incrementally enhanced by a series of modules and finally "performed" by means of text, speech and body language by a cast of ECAs. The approach makes it possible to automatically produce a large variety of highly expressive dialogues, some of whose essential properties are under the control of a user. The paper discusses the advantages and disadvantages of NECA's approach to Fully Generated Scripted Dialogue (FGSD), and explains the main techniques used in the two demonstrators that were built. The paper can be read as a survey of issues and techniques in the construction of ECAs, focussing on the generation of behaviour (i.e., focussing on information presentation) rather than on interpretation.
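The incremental enhancement idea can be sketched as a chain of modules, each adding a layer of detail to the abstract dialogue script. Module names and the data format below are illustrative assumptions, not the NECA system's actual components.

```python
# Sketch of Fully Generated Scripted Dialogue: an abstract script of
# dialogue acts is enriched module by module before being "performed".
def scene_generator():
    return [{"speaker": "A", "act": "greet"},
            {"speaker": "B", "act": "greet_back"}]

def text_generator(script):
    words = {"greet": "Hi there!", "greet_back": "Hello!"}
    for turn in script:
        turn["text"] = words[turn["act"]]
    return script

def gesture_assigner(script):
    for turn in script:
        turn["gesture"] = "wave" if "greet" in turn["act"] else "none"
    return script

script = scene_generator()
for module in (text_generator, gesture_assigner):
    script = module(script)
print(script)  # enhanced script, ready to be performed by the ECAs
```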