Hannes Pirker | Austrian Academy of Sciences

Papers by Hannes Pirker

GEMEP - GEneva Multimodal Emotion Portrayals: A corpus for the study of multimodal emotional expressions

This paper introduces the GEMEP (Geneva Multimodal Emotion Portrayals) corpus, a new repository of portrayed emotional expressions. Corpora of acted portrayals tend to include portrayals of very intense emotions, which are considered to occur infrequently in daily interactions between humans or between humans and machines. Acted portrayals have therefore been challenged as unsuited for applied research purposes. Taking a different stance, we argue that: (a) portrayals produced by appropriately instructed actors are analogous to expressions that do occur in selected real-life contexts; (b) acted portrayals, as opposed to induced or real-life sampled emotional expressions, display the most expressive variability and therefore constitute excellent material for the systematic study of nonverbal communication of emotions. We describe the guidelines used to record the corpus and some of our short-term research plans with the corpus.

The speech conductor: gestural control of speech synthesis

... The glove does not actually use MIDI protocol but Open Sound Control (OSC) instead. Contrary to MIDI which sets ... D. Overview of the work done The work has been organized along two main lines: text-to-speech synthesis and parametric voice quality synthesis. ...

From Information Structure to Intonation: A Phonological

The paper describes an interface between generator and synthesizer of the German language concept-to-speech system VieCtoS. It discusses phenomena in German intonation that depend on the interaction between grammatical dependencies (projection of information structure into syntax) and prosodic context (performance-related modifications to intonation patterns).

Thus spoke the user to the wizard

Wizard-of-Oz (WOZ) simulations are a popular means for investigating the properties of human-computer interaction. In this paper the findings from a WOZ experiment for evaluating different design options for a spoken dialogue system are presented.

From Information Structure to Intonation: A Phonological Interface for Concept-to-Speech

The paper describes an interface between generator and synthesizer of the German language concept-to-speech system VieCtoS. It discusses phenomena in German intonation that depend on the interaction between grammatical dependencies (projection of information structure into syntax) and prosodic context (performance-related modifications to intonation patterns).

Using Two-Level Morphology as a Generator-Synthesizer Interface in Concept-to-Speech Generation

In a project for the development of a concept-to-speech system for German, we apply extended two-level morphology (Trost 1991) to provide a unified solution to the tasks of morphotactics, segmental (morpho)phonology, syllabification and assignment of stress. Starting from a lexeme-based lexicon, we show that a declarative two-level implementation of a single rule-corpus complemented with feature filters is sufficient for a comprehensive account of the various mutual influences holding between separate phonological dimensions in the phonology of German.

Generating Emotional Speech With A Concatenative Synthesizer

Generating Intonation Contours Using Tonal Specifications

We present a novel approach to intonation modelling for speech synthesis based on a two-layer technique. The generator component of a concept-to-speech system produces an abstract phonological representation of intonation based on GToBI, interpreting the linguistic and discourse information available. This abstract representation must be translated into concrete acoustic parameters. The paper describes how this mapping is achieved with the use of stylized F0 contours.

No Peanuts! Affective Cues for the Virtual Bartender

Proceedings of the Twenty-Fourth International Florida Artificial Intelligence Research Society Conference

The aim of this paper is threefold: it explores methods for the detection of affective states in text, it presents the usage of such affective cues in a conversational system and it evaluates its effectiveness in a virtual reality setting. Valence and arousal values, used for generating facial expressions of users' avatars, are also incorporated into the dialog, helping to bridge the gap between textual and visual modalities. The system is evaluated in terms of its ability to: i) generate a realistic dialog, ii) create an enjoyable chatting experience, and iii) establish an emotional connection with participants. Results show that user ratings for the conversational agent match those obtained in a Wizard of Oz setting.

Learning Duration

In this paper, we investigate possibilities for enhancing the statistical modelling of segment duration for speech synthesis.

No Peanuts! Affective Cues for the Virtual Bartender

The aim of this paper is threefold: it explores methods for the detection of affective states in text, it presents the usage of such affective cues in a conversational system and it evaluates its effectiveness in a virtual reality setting. Valence and arousal values, used for generating facial expressions of users' avatars, are also incorporated into the dialog, helping to bridge the gap between textual and visual modalities. The system is evaluated in terms of its ability to: i) generate a realistic dialog, ii) create an enjoyable chatting experience, and iii) establish an emotional connection with participants. Results show that user ratings for the conversational agent match those obtained in a Wizard of Oz setting.

ON THE SPECIFICATION OF SENTENCE INITIAL F0-PATTERNS IN GERMAN

It is widely accepted that linguistic high level information like information structure (focus-background-division, topicalization) influences accent placement and accent type (rising, falling, hat pattern etc.) in German. Previous research was concentrated on tonal patterns associated with focus and topic ((1),(2)). We will demonstrate that 1) sentence initial tonal variation and 2) syllable duration depend on the location and the type

VieCtoS - Speech Synthesizer, Technical Overview

This report describes the overall architecture of the speech synthesis module developed for VieCtoS, the Vienna concept-to-speech system for Austrian German. A technical description of the methods used for the representation of inventory elements, their concatenation and the facilities for interpreting and superimposing prosody is presented. An overview of the implementation and the user environment as well as some details concerning program and test design

Feature-Based Allomorphy

Meeting of the Association for Computational Linguistics, 1993

Morphotactics and allomorphy are usually modeled in different components, leading to interface problems. To describe both uniformly, we define finite automata (FA) for allomorphy in the same feature description language used for morphotactics. Nonphonologically conditioned allomorphy is problematic in FA models but submits readily to treatment in a uniform formalism.

From information structure to intonation

Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics, 1998

The paper describes an interface between generator and synthesizer of the German language concept-to-speech system VieCtoS. It discusses phenomena in German intonation that depend on the interaction between grammatical dependencies (projection of information structure into syntax) and prosodic context (performance-related modifications to intonation patterns). Phonological processing in our system comprises segmental as well as suprasegmental dimensions such as syllabification, modification of word stress positions, and a symbolic encoding of intonation. Phonological phenomena often touch upon more than one of these dimensions, so that mutual accessibility of the data structures on each dimension had to be ensured. We present a linear representation of the multidimensional phonological data based on a straightforward linearization convention, which suffices to bring this conceptually multilinear data set under the scope of the well-known processing techniques for two-level morphology.

Embodied Conversational Characters: Representation Formats for Multimodal Communicative Behaviours

Cognitive Technologies, 2010

This contribution deals with the requirements on representation languages employed in planning and displaying communicative multimodal behaviour of Embodied Conversational Agents (ECAs). We focus on the role of behaviour representation frameworks as part of the processing chain from intent planning to the planning and generation of multimodal communicative behaviours. On the one hand, the field is fragmented, with almost everybody working on ECAs developing their own tailor-made representations, which is amongst others reflected in the extensive references list. On the other hand, there are general aspects that need to be modelled in order to generate multimodal behaviour. Throughout the chapter we take different perspectives on existing representation languages and outline the fundament of a common framework.

Feature-based allomorphy

Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics, 1993

Morphotactics and allomorphy are usually modeled in different components, leading to interface problems. To describe both uniformly, we define finite automata (FA) for allomorphy in the same feature description language used for morphotactics. Nonphonologically conditioned allomorphy is problematic in FA models but submits readily to treatment in a uniform formalism.

CYBEREMOTIONS – Collective Emotions in Cyberspace

Procedia Computer Science, 2011

Emotions are an important part of most societal dynamics. As with face-to-face meetings, Internet exchanges may not only include factual information but may also elicit emotional responses: how participants feel about the subject discussed or other group members. The development of automatic sentiment analysis has made large scale emotion detection and analysis possible using text messages collected from the web. We present results of two years of studies performed in the EU Large Scale Integrating Project CYBEREMOTIONS (Collective emotions in cyberspace). Our goal is to understand the role of collective emotions in creating, forming and breaking-up ICT mediated communities and to prepare the background for the next generation of emotionally-intelligent ICT services. Project results have already attracted a lot of attention from various mass media and research journals including the Science and New Scientist magazines. Nine Project teams are organised in three layers (data, theory and ICT output). © Selection and peer-review under responsibility of FET11 conference organizers and published by Elsevier B.V.

A system of stylized intonation contours in German

Modeling intonation, i.e., specifying adequate fundamental frequency (F0) contours, remains a challenging task for speech synthesis systems. This paper discusses the development of a system for phonetically specifying intonation contours for German. It deals with the problem of translating an abstract phonological representation of intonation, namely the tone-sequence model, into a concrete phonetic model. Design options and evaluation methods are discussed.

Some Questions and Answers on the Prosodic Correlates of Information Structure

ofai.at

In this paper a study on the effects of varying information structure and syntactic structure on prosody is presented. For this purpose a corpus of German question-answer pairs was designed and established. The structure and encoding of this corpus is described and some ...
