Oliver Niebuhr | University of Southern Denmark
Papers by Oliver Niebuhr
Cadernos de Linguística
Phonetic research on the prosodic sources of perceived charisma has taken a big step towards making a speaker’s tone-of-voice a tangible, quantifiable, and trainable matter. However, the tone-of-voice includes a complex bundle of acoustic features, and a lot of parameters have not even been looked at so far. Moreover, all previous studies focused on political or religious leaders and left aside the large field of managers and CEOs in the world of business. These are the two research gaps addressed in the present study. An acoustic analysis of about 1,350 prosodic phrases from keynotes given by a more charismatic CEO (Steve Jobs) and a less charismatic CEO (Mark Zuckerberg) suggests that the same tone-of-voice settings that make political or religious leaders sound more charismatic also work for business speakers. In addition, results point to further charisma-relevant acoustic parameters related to rhythm, emphasis, pausing, and voice quality - as well as to audience type as a signi...
EURASIP Journal on Audio, Speech, and Music Processing
In normal modally voiced utterances, voiceless fricatives like [s], [ʃ], [f], and [x] vary such that their aperiodic pitch impressions mirror the pitch level of the adjacent F0 contour. For instance, if the F0 contour creates a high or low pitch context, then the aperiodic pitch impression of the fricative in this context will also be high or low. This context-matching effect has been termed "segmental intonation". While there is accumulating evidence for segmental intonation in speech production, less is known about whether and how segmental intonation is actually integrated in the perception of utterance tunes. This question is addressed here in a perception experiment in which listeners identified target words ending in either [ʃ] or [s]. The two sibilants inherently create low or high aperiodic pitch impressions in listeners due to their characteristically different spectral energy distributions. The sibilants were preceded by high or low F0 contexts in the target words. Results show a clear F0-context effect. The context effect triggered more [ʃ] identifications in high-F0 and/or more [s] identifications in low-F0 contexts. The effect was larger for sibilants that were less clearly identifiable as either /ʃ/ or /s/. The effect represents strong supporting evidence that listeners in fact perceive the segmental intonation of fricatives and integrate its aperiodic pitch with the F0-based pitch when perceiving utterance intonation. Thus, the term "segmental intonation" is perceptually appropriate. Furthermore, the results are discussed with respect to reaction-time measurements and an additional effect of the quality of the adjacent vowel phoneme on sibilant identification.
Segmental, prosodic and fluency features in phonetic learner corpora
We tested the usability of prosody visualization techniques for second language (L2) learners. Eighteen Danish learners realized target sentences in German based on different visualization techniques. The sentence realizations were annotated by means of the phonological Kiel Intonation Model and then analyzed in terms of (a) prosodic-pattern consistency and (b) correctness of the prosodic patterns. In addition, the participants rated the usability of the visualization techniques. The results from the phonological analysis converged with the usability ratings in showing that iconic techniques, in particular the stylized “hat pattern” visualization, performed better than symbolic techniques, and that marking prosodic information beyond intonation can be more confusing than instructive. In discussing our findings, we also provide a description of the new Danish-German learner corpus we created: DANGER. It is freely available for interested researchers upon request.
Speech Prosody 2016, 2016
The Journal of the Acoustical Society of America, 2008
This paper is concerned with the role of alignment and peak shape of F0 peaks in the perception of intonational categories. The term intonation is restricted to speech melody. Several experiments on early and medial peaks in German show that both dimensions are relevant in this intonational contrast, mirroring the findings for comparable intonational contrasts in other languages. Hence, my findings support the assumption that all these intonational contrasts originate from a common psychophonetic mechanism which is linked to holistic contours rather than to their local features.
Revista Leitura, Oct 11, 2014
The paper presents a combined production and perception study on speech rhythm in German. The perception part shows that identifying complex rhythm patterns is only possible for speaking rates of 4-8 syll/sec. Even acoustically monotonous stimuli within this range trigger "subjective rhythms". In contrast, rhythm perception is flattened for speaking rates outside this range, irrespective of acoustic cues to rhythm. The production part accords with this finding. Speaking rates in everyday conversation vary between 4 and 8 syll/sec, and only fall below this range when speakers flatten their rhythm for emphatic purposes. Together, the production and perception evidence revealed a "rhythm window", which is targeted or avoided by speakers.
Kieler Forschungen zur Sprachwissenschaft, 2014
The line of (ongoing) research presented in this talk was inspired by the notion of truncation of falling intonations by voiceless consonants at the end of utterances. The term truncation (in contrast to compression) was originally introduced for Swedish by Erikson and Altermark (1972) and refined by Bannert and Bredvad-Jensen (1975) in order to account for variation in word-accent realizations due to changes of vowel duration. Later, Grabe (1998) applied the concept of truncation to German, claiming that utterance-final falling intonations are truncated by voiceless segments, whereas utterance-final rising intonations are compressed and hence realized entirely before the voiceless segments. In the course of his prosodic labelling of the Kiel Corpus of Spontaneous Speech, it was observed by the author that, even though utterance-final F0 falls were indeed truncated to different degrees, the spectral energy distributions of aspiration noises of plosives following the truncated F0 fal...
The paper deals with perceived speech rhythm, starting from the observation that two nouns with a conjunction in between ('X and/or Y', cf. title) sound more rhythmical in a particular noun order. A perception experiment on German with real and pseudo nouns provides evidence that speech rhythm is not just created prosodically by means of high and low or long and short syllables, but that the phonetic properties of the vowel nuclei and of the consonantal onsets and offsets of the stressed syllables are separate segmental constituents of speech rhythm.
The study scrutinizes the role of alignment of F0 movements in identifying two different pitch accents. Although this general issue was addressed for German, the pitch-accent contrast that was studied occurs cross-linguistically and is known as 'early' vs. 'medial' or H+L* vs. L+H*. The early pitch accent reaches the F0-peak maximum before the accented-vowel onset and hence falls into the vowel, while the medial pitch accent peaks after the vowel onset. This alignment-based identification model was recently undermined by studies that varied the slopes and ranges of the F0 movements or the extension of the F0-peak maximum. The latter parameter is taken up in the present perception experiments. Starting from a pointed rising-falling peak aligned at the accented-vowel onset, a peak and a plateau series were resynthesized by shifting either the entire peak or just the rising or falling movement into and away from the accented vowel. The peak and plateau stimuli were...
German has two plateau-based phrase-final intonation contours: the high level plateau of the continuation rise and the descending plateau sequence of the calling contour. They occur within a narrow scaling range of only a few semitones. The paper presents production and perception evidence for a third plateau-based phrase-final intonation contour inside this narrow scaling range. The new plateau contour shows an F0 decrease of 1-3 st (in the form of a slightly declining plateau or a descending plateau sequence), involves additional lengthening of the vowels underneath the plateau, and occurs when resistance is futile, i.e. when speakers signal that they finally, but reluctantly, give in to a demand of the dialogue partner. Phonological implications are briefly outlined.
The present pilot study revives an old approach to intonation and reintroduces it as a new experimental method: the successive drawing of perceived intonation contours. It has been shown that intonation drawings made by untrained native German listeners for sets of controlled stimulus utterances can yield valid and reliable result patterns. Additionally, intonation drawings are more straightforward than other reproduction methods and allow more detailed insights into the perception of intonation than other meaning-based 2AFC tasks. Based on the results obtained for two classes of meaningful intonational units, namely (nuclear) pitch accents and phrase-final intonation movements, it is argued that the relationship between the production and perception of intonation is characterized by a multiparametric coding that goes far beyond F0 and additionally crosses the traditional segment-prosody divide. Since the acoustic complexity can be translated into simpler perceptual patterns, phonolog...
The presented production experiment analyzes the phonetic differences between neutral (i.e. sincere) and sarcastically ironic utterances in German. Results show, in line with previous studies, that sarcastic irony is expressed by longer utterance durations, lower and flatter F0 contours, and a lower intensity level. Moreover, extending previous findings, sarcastic irony is also characterized by a more variable (in tendency breathier) voice quality and a higher degree of segmental reduction, probably reflecting the speakers' dissociation from the wording of their utterances.
The acoustic differences that underlie the production and perception of the elements of intonation in German are much more complex than predicted by the simple synchronisation concept of the Kiel Intonation Model, which relies on the timing of local peak maxima or valley minima relative to the boundaries of the accented vowel. While this relative timing is undoubtedly important, signalling intonation elements is also based on contour shape. Additionally, it goes beyond F0 and involves syllable duration and intensity, as well as variation in the quality of speech sounds. This quality variation was called "segmental intonation", as it is suitable to support the perception of intonational forms (i.e. pitch contours) and their functions. The implications of the acoustic complexity of intonation for phonological modelling are outlined.
Proc. 15th ICPhS, Barcelona, 2003
This paper is concerned with the role of alignment and peak shape of F0 peaks in the perception of intonational categories. The term intonation is restricted to speech melody. Several experiments on early and medial peaks in German show that both dimensions are relevant in this intonational contrast, mirroring the findings for comparable intonational contrasts in other languages. Hence,