Understanding emotional expression using prosodic analysis of natural speech: Refining the methodology
Related papers
A laboratory-based procedure for measuring emotional expression from natural speech
Behavior research methods, 2009
Despite dramatic advances in the sophistication of tools for measuring prosodic and content channels of expression from natural speech, methodological issues have limited their simultaneous measurement for laboratory research. This is particularly unfortunate given the importance of emotional expression in daily living and how it can be disrupted in many psychological disorders (e.g., schizophrenia). The present study examined the Computerized assessment of Affect from Natural Speech (CANS), a laboratory-based procedure designed to measure both lexical and prosodic expression from natural speech across a range of evocative conditions. The verbal responses of 38 males and 31 females were digitally recorded as they verbalized their reactions to separate pleasant, unpleasant, and neutral stimuli. Lexical and prosodic expression variables changed significantly across these conditions, providing support for using the CANS in further laboratory research. The implications for understanding the interface between lexical and prosodic expression are also discussed.
Gestural prosody and the expression of emotions: A perceptual and acoustic experiment
2015
This paper presents a perceptual and acoustic experiment and introduces methodological procedures for dealing with qualitative and quantitative variables. Its objectives are: investigating the functions of vocal and facial gestures in the appraisal of six basic emotions (Anger, Distaste, Fear, Happiness, Sadness and Shame) and valence (positive, neutral and negative); and discussing the interaction between the visual, vocal and semantic dimensions in the evaluation of audio, visual and audiovisual stimuli corresponding to 30 utterances (10 of them semantically positive, 10 neutral and 10 negative). Correlations among the variables were computed with non-parametric tests, applying FAMD and MFA. Among the perceptual and acoustic variables investigated, the most influential for the identification of valence/emotions were found to be the VPAS and the ExpressionEvaluator measures. Judgments concerning the positive, negative and neutral valence of the utterances and the type of emotion varied according...
Acoustical Correlates of Affective Prosody
Journal of Voice, 2007
The word "Anna" was spoken by 12 female and 11 male subjects with six different emotional expressions: "rage/hot anger," "despair/lamentation," "contempt/disgust," "joyful surprise," "voluptuous enjoyment/sensual satisfaction," and "affection/tenderness." In an acoustical analysis, 94 parameters were extracted from the speech samples and reduced by correlation analysis to 15 parameters that entered subsequent statistical tests. The results show that each emotion can be characterized by a specific acoustic profile, differentiating that emotion significantly from all others. If aversive emotions are tested against hedonistic emotions as a group, it turns out that the best indicator of aversiveness is the ratio of peak frequency (the frequency with the highest amplitude) to fundamental frequency, followed by the peak frequency, the percentage of time segments with nonharmonic structure ("noise"), the frequency range within single time segments, and the time of the maximum of the peak frequency within the utterance. Only the last parameter, however, codes aversiveness independent of the loudness of an utterance.
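The peak-frequency-to-F0 ratio described above can be estimated from a short voiced segment with standard spectral tools. The following is a minimal sketch, not the paper's actual analysis pipeline: peak frequency is taken as the strongest FFT bin, and F0 is estimated by a simple autocorrelation peak pick (the function name and the 50–500 Hz search range are assumptions for illustration).

```python
# Hypothetical sketch of the peak-frequency / F0 ratio, using only NumPy.
# Not the published method; real pipelines use more robust F0 trackers.
import numpy as np

def peak_to_f0_ratio(signal, sample_rate):
    """Return (f0, peak_freq, ratio) for one voiced segment."""
    # Peak frequency: bin with the highest windowed-spectrum magnitude.
    spectrum = np.abs(np.fft.rfft(signal * np.hanning(len(signal))))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    peak_freq = freqs[np.argmax(spectrum)]

    # F0 via autocorrelation: strongest lag in a plausible pitch range.
    ac = np.correlate(signal, signal, mode="full")[len(signal) - 1:]
    min_lag = int(sample_rate / 500)   # ignore F0 above 500 Hz
    max_lag = int(sample_rate / 50)    # ignore F0 below 50 Hz
    lag = min_lag + np.argmax(ac[min_lag:max_lag])
    f0 = sample_rate / lag
    return f0, peak_freq, peak_freq / f0

# Synthetic "vowel": 200 Hz fundamental with a dominant 3rd harmonic,
# so the peak frequency (600 Hz) differs from F0 (200 Hz).
sr = 16000
t = np.arange(0, 0.1, 1.0 / sr)
sig = 0.4 * np.sin(2 * np.pi * 200 * t) + 1.0 * np.sin(2 * np.pi * 600 * t)
f0, peak, ratio = peak_to_f0_ratio(sig, sr)
```

On this synthetic signal the ratio comes out near 3, since the third harmonic dominates the spectrum while the autocorrelation still locks onto the 200 Hz period.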
Perception of levels of emotion in prosody
2015
Prosody conveys information about the emotional state of the speaker. In this study we test whether listeners are able to detect different levels in the emotional state of the speaker based on prosodic features such as intonation, speech rate and intensity. We ran a perception experiment in which we asked Swiss German and Chinese listeners to recognize the intended emotions that a professional speaker produced. The results indicate that both Chinese and Swiss German listeners could identify the intended emotions. However, Swiss German listeners could detect different levels of happiness and sadness better than the Chinese listeners. This finding suggests that emotional prosody does not function only categorically, distinguishing different emotions, but also indicates the degree of the expressed emotion.
Purpose: Our aim is to explore the complex interplay of prosody (tone of speech) and semantics (verbal content) in the perception of discrete emotions in speech. Method: We implement a novel tool, the Test for Rating of Emotions in Speech. Eighty native English speakers were presented with spoken sentences made of different combinations of 5 discrete emotions (anger, fear, happiness, sadness, and neutral) presented in prosody and semantics. Listeners were asked to rate the sentence as a whole, integrating both speech channels, or to focus on one channel only (prosody or semantics). Results: We observed supremacy of congruency, failure of selective attention, and prosodic dominance. Supremacy of congruency means that a sentence that presents the same emotion in both speech channels was rated highest; failure of selective attention means that listeners were unable to selectively attend to one channel when instructed; and prosodic dominance means that prosodic information plays a larger role than semantics in processing emotional speech. Conclusions: Emotional prosody and semantics are separate but not separable channels, and it is difficult to perceive one without the influence of the other. Our findings indicate that the Test for Rating of Emotions in Speech can reveal specific aspects in the processing of emotional speech and may in the future prove useful for understanding emotion-processing deficits in individuals with pathologies.
Exploring the prosody of affective speech
ExLing Conferences, 2022
This paper introduces a research project on voice quality and affect expression. It explores affective prosody by investigating the relationship between voice source parameter changes and perceived affect. Firstly, it aims to examine the relative contribution of voice source shifts occurring globally across an utterance and shifts that are aligned to the prosodic structure of the utterance. Secondly, it aims to formulate a simple model for affect expression that could, in principle, be applied to text-to-speech synthesis systems for Irish (Gaelic) dialects. The analytic methods to be used include voice source and intonation analysis of utterances produced to portray a range of emotions, and perception experiments with stimuli varying in terms of global vs. local, structured source manipulations.
Communicating Emotion: Linking Affective Prosody and Word Meaning
Journal of Experimental Psychology-human Perception and Performance, 2008
The present study investigated the role of emotional tone of voice in the perception of spoken words. Listeners were presented with words that had either a happy, sad, or neutral meaning. Each word was spoken in a tone of voice (happy, sad, or neutral) that was congruent, incongruent, or neutral with respect to affective meaning, and naming latencies were collected. Across experiments, tone of voice was either blocked or mixed with respect to emotional meaning. The results suggest that emotional tone of voice facilitated linguistic processing of emotional words in an emotion-congruent fashion. These findings suggest that information about emotional tone is used in the processing of linguistic content, influencing the recognition and naming of spoken words in an emotion-congruent manner.
Psychopathology, 2014
Background: Human speech is greatly influenced by the speaker's affective state, such as sadness, happiness, grief, guilt, fear, anger, aggression, faintheartedness, shame, sexual arousal, and love, among others. Attentive listeners discover a lot about the affective state of their dialog partners with no great effort, and without having to talk about it explicitly during a conversation or on the phone. On the other hand, speech dysfunctions, such as slow, delayed or monotonous speech, are prominent features of affective disorders. Methods: This project comprised four studies with healthy volunteers from Bristol (English: n = 117), Lausanne (French: n = 128), Zurich (German: n = 208), and Valencia (Spanish: n = 124). All samples were stratified according to gender, age, and education. The specific study design, with different types of spoken text along with repeated assessments at 14-day intervals, allowed us to estimate the 'natural' variation of speech parameters over time, and to analyze the sensitivity of speech parameters with respect to the form and content of spoken text. Additionally, our project included a longitudinal self-assessment study with university students from Zurich (n = 18) and unemployed adults from Valencia (n = 18) in order to test the feasibility of the speech analysis method in home environments. Results: The normative data showed that speaking behavior and voice sound characteristics can be quantified in a reproducible and language-independent way. The high resolution of the method was verified by a computerized assignment of speech parameter patterns to languages at a success rate of 90%, while the correct assignment to texts was 70%. In the longitudinal self-assessment study we calculated individual 'baselines' for each test person along with deviations thereof. The significance of such deviations was assessed through the normative reference data. Conclusions: Our data provided gender-, age-, and language-specific thresholds that allow one to reliably distinguish...
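The baseline-and-deviation idea in this abstract can be illustrated with a toy calculation. This is a sketch under stated assumptions, not the study's published method: the individual baseline is taken as the mean of a person's repeated measurements, and a deviation is judged against an assumed normative standard deviation using an ordinary two-sided z-score; the parameter, values, and 1.96 cutoff are all illustrative.

```python
# Illustrative only: individual "baseline" for one speech parameter and
# a z-scored deviation checked against normative reference spread.
# Names, values, and the 1.96 threshold are assumptions, not the study's.
import statistics

def baseline(values):
    """Individual baseline: mean of a person's repeated measurements."""
    return statistics.mean(values)

def deviation_z(new_value, person_baseline, norm_sd):
    """Deviation from the baseline, scaled by the normative SD."""
    return (new_value - person_baseline) / norm_sd

# Repeated 14-day measurements of, e.g., mean F0 (Hz) for one person.
history = [118.0, 121.5, 119.2, 120.8, 118.9]
norm_sd = 4.0            # assumed spread in the normative reference data
b = baseline(history)    # 119.68 Hz
z = deviation_z(112.0, b, norm_sd)
flagged = abs(z) > 1.96  # conventional two-sided 5% cutoff
```

A new measurement of 112 Hz gives z ≈ -1.92 here, just inside the cutoff, so it would not be flagged as a significant deviation under these assumptions.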
Describing the emotional states that are expressed in speech
Speech Communication, 2003
To study relations between speech and emotion, it is necessary to have methods of describing emotion. Finding appropriate methods is not straightforward, and there are difficulties associated with the most familiar. The word emotion itself is problematic: a narrow sense is often seen as "correct", but it excludes what may be key areas in relation to speech, including states where emotion is present but not full-blown, and related states (e.g., arousal, attitude). Everyday emotion words form a rich descriptive system, but it is intractable because it involves so many categories, and the relationships among them are undefined. Several alternative types of description are available. Emotion-related biological changes are well documented, although reductionist conceptions of them are problematic. Psychology offers descriptive systems based on dimensions such as evaluation (positive or negative) and level of activation, or on logical elements that can be used to define an appraisal of the situation. Adequate descriptive systems need to recognise the importance of both time course and interactions involving multiple emotions and/or deliberate control. From these conceptions of emotion come various tools and techniques for describing particular episodes. Different tools and techniques are appropriate for different purposes.