Speech Prosody Research Papers - Academia.edu

The analysis of the first ten minutes of an audio-recorded psychoanalytic session leads to the discovery and description of a psychoanalytic procedure that we call the "dance of insight". The session is transcribed according to the standards of conversation analysis and is analyzed on three levels. First, a clinical analysis concludes that on two occasions the analyst changes her conversational position; she speaks as if she were the patient. The clinical perspective relates this observation to the concept of the model scene and the methodological principle of "completing the scene", as well as to several further findings from infant research. Second, the tools of conversation analysis make it possible to see how the therapist carries out this maneuver and how the patient responds to it. This analysis describes in detail the "slots" at which such a maneuver is carried out in a natural and unforced way. ...

- by
- Psychoanalysis, Philosophy, Languages and Linguistics, Art
- by Bistra Andreeva
- Psychology, Speech Prosody, Intonation, Bulgarian Language
- by Daniel Hirst
- Speech Prosody
- by Laszlo Hunyadi
- Speech Prosody, Metrical Phonology, Generative linguistics
- by Bistra Andreeva
- Speech Prosody

This study presents two experiments investigating tune-to-text alignment and pitch scaling in Lifou French, a variety spoken by bilingual speakers of French and Drehu. Descriptions of New Caledonian French have focussed on the language use of European descendants or on the variety spoken in the urban region, neglecting emergent varieties spoken by the indigenous population in rural areas such as the island of Lifou. Because French has a reduced inventory of pitch accents, dialectal variation in its intonation has proved difficult to detect, which has led to the assumption that French has a relatively homogeneous intonation system across its varieties. This study shows that fine-grained phonetic differences in speaking tempo, in tonal alignment, and in the scaling of AP-final peaks can be attributed to dialectal variation.

« Dak mevrouw, ich eet François en ich spel krak tennis ». Most teachers of Dutch who teach French speakers have heard this (rather caricatural) sentence at some point. It shows how difficult some Dutch sounds are to pronounce for second-language learners, including French speakers. Beyond that observation, we may ask whether French speakers can perceive those difficult Dutch sounds, and which factors influence the discrimination of those sounds. This raises the question of whether learners' innate or acquired musicality can influence the discrimination of speech sounds: people who perceive musical sounds accurately may also show sensitive perception of non-musical sounds. This contribution examines how difficult it is for French-speaking students to discriminate Dutch sounds and how (non-)musical factors influence that skill.
First, the article focuses on the factors that influence learning (Section 2). Section 3 then turns to the link between language and music and the possible effect of musical aptitude or musical practice on language skills. Section 4 looks more closely at the main research questions and the methodology used. The results and their discussion follow in Section 5. The article closes with possible pedagogical implications of this analysis.

- by Pauline Degrave
- Music, Speech Prosody, Teaching of Foreign Languages, Language Learning
- by Simone Simonetti
- Speech Prosody, Cross-modal Perception
- by Elena Yagunova
- Speech Prosody, Speech perception, Speech Recognition, Speech Communication
- by Viorica Marian
- Speech Prosody, Speech Production, Repetition, Korean language

This article aims to prompt teachers of Basque to reflect on the pronunciation and prosody of Basque and, at the same time, to offer guidelines and pointers for developing prosodic competence. To that end, drawing on the section titled "Irakurtzaileari" ("To the Reader") of Axular's masterpiece Gero, the origin of the core of the matter is set out in a foreword titled "Irakasleari" ("To the Teacher"). Next, it describes the sparse and rugged path that pronunciation has followed in the teaching of Basque and, more recently, how the phonological dimension is reflected within phonological competence in the Helduen Euskalduntzearen Oinarrizko Curriculuma (HEOC, the Basic Curriculum for Adult Basque Language Learning); it also appears in assessment, level by level. Pronunciation has also received the attention of Euskaltzaindia, and the article reports on the EBAZ norm: the careful pronunciation of Standard Basque (euskara batua). Further on, taking up proposals and studies on pronunciation and prosody, it develops the following: Txillardegi's proposal on the Basque accent and the pioneering character of the article about it; guidelines and activities for working on pronunciation, prepared by lecturers at the University of the Basque Country; and a very interesting work on pronunciation. Finally, to put all the theory into practice, a curricular and didactic proposal is made, so as to integrate prosodic competence into the HEOC and to spread knowledge and awareness of it.

- by Kepa Dieguez
- Speech Prosody, Basque linguistics, Pronunciation

The present paper compares the rhythmic properties of two contact varieties, Olivenza Portuguese (OP) and Olivenza Spanish (OS), with those of Castilian Spanish (CS). Based on the analysis of a corpus comprising recordings of declarative, interrogative, and imperative sentences, we show that OS generally displays %V, VarcoV, and VnPVI scores intermediate between those for CS and OP. The greater or lesser differences between the three varieties are explained by referring to phonological properties such as (presence or absence of) vowel reduction, vowel and consonant elision, and specific lengthening effects. Our results suggest that sentence modality contrasts are conveyed by rhythmic differences in the varieties under investigation: while durational differences between declaratives and imperatives were found in all three varieties (the differences being greater in OP and OS than in CS), declaratives and interrogatives differ from one another only in OP and OS.

- by Elena Kireva
- Portuguese, Speech Prosody, Spanish Linguistics, Prosody
- by Chaxiraxi Díaz
- Speech Prosody
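
The rhythm metrics named in the Olivenza abstract above (%V, VarcoV, VnPVI) can be illustrated with a minimal sketch. The listing does not say how the authors computed them, so this follows the commonly used definitions and makes its own assumptions: durations are given per vocalic or consonantal interval, the standard deviation is the population one, and the function names and sample values are hypothetical.

```python
from statistics import mean, pstdev


def percent_v(vowel_durs, cons_durs):
    """%V: vocalic share of the total utterance duration, in percent."""
    total = sum(vowel_durs) + sum(cons_durs)
    return 100.0 * sum(vowel_durs) / total


def varco(durs):
    """Varco: standard deviation of the interval durations divided by
    their mean, times 100 (population SD is an assumption here)."""
    return 100.0 * pstdev(durs) / mean(durs)


def npvi(durs):
    """Normalized Pairwise Variability Index: mean absolute difference
    between successive intervals, each normalized by the pair's mean,
    times 100."""
    terms = [abs(a - b) / ((a + b) / 2.0) for a, b in zip(durs, durs[1:])]
    return 100.0 * mean(terms)


if __name__ == "__main__":
    # Hypothetical interval durations in milliseconds, not data from the paper.
    vowels = [80, 120, 60, 150]
    consonants = [70, 90, 110, 65]
    print(round(percent_v(vowels, consonants), 1))
    print(round(varco(vowels), 1))   # VarcoV
    print(round(npvi(vowels), 1))    # VnPVI
```

Passing the vocalic intervals to `varco` and `npvi` gives VarcoV and VnPVI; the same functions applied to consonantal intervals would give the consonantal counterparts.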

The study is part of a series of studies which examine the acoustic correlates of lexical stress in several typologically different languages, in three speech styles: spontaneous speech, phrase reading, and wordlist reading. This study focuses on Czech, a language with stress fixed on the first syllable of a prosodic word, with no contrastive function at the level of individual words. The acoustic parameters examined here are F0-level, F0-variation, Duration, Sound Pressure Level, and Spectral Emphasis. Values for over 6,000 vowels were analyzed. Unlike the other languages examined so far, lexical stress in Czech is not manifested by clear prominence markings on the first, stressed syllable: the stressed syllable is not higher, is not realized with greater F0 variation, and is not longer, nor does it have a higher SPL or higher Spectral Emphasis. There are slight but statistically insignificant tendencies pointing to a delayed rise, that is, to higher values of some of the acoustic parameters on the second, post-stressed syllable. Since lexical stress does not serve a contrastive function in Czech, the absence of acoustic marking on the stressed syllable is not surprising.

- by Radek Skarnitzl
- Speech Prosody, Phonetics, Speech Acoustics, Acoustic Phonetics

Each person's melodic profile, their accent, is not simply a more or less exotic or annoying characteristic; it reveals their identity as a speaker and enables (or, at times, hinders) their connection with their interlocutors: the wall that blocks human communication or the door that opens the conversation.
This volume brings together a series of studies that explore two of the least studied aspects of Spanish intonation:
• The dialectal accent of Peninsular Spanish (represented by nine varieties: Asturias, Navarre, the Basque Country, Castile and León, Madrid, Andalusia, Castilla-La Mancha, Murcia, Extremadura) and the Canary Islands.
• The foreign accent, exemplified by the characteristic features of Italians, Brazilians, Swedes and Hungarians speaking Spanish (to which we add an analysis of the Spanish spoken by Catalans, which, although it cannot be considered a foreign accent, is not genuinely dialectal either).

- by Dolors Font-Rotchés and +1
- Spanish, Speech Prosody, Applied Linguistics, Spanish as a Foreign Language

This case study aims to evaluate the effects of incorporating LSVT (Lee Silverman Voice Treatment) into traditional therapy for spastic dysarthria. The study involves a 33-year-old adult patient with bilateral spastic dysarthria of the upper neural pathways resulting from an inflammation at the age of 2. The patient did not undergo therapy in childhood. The study combines classic exercises for articulation and swallowing with LSVT methods. The measures targeted are: clarity of speech (subjective and diadochokinetic examination), facial expression (level of hypomimia), prosody, increased mobility of the buccofacial musculature, and aspiration. The therapy was intensive: four times a week for one month (16 hours), about 50 minutes per session. Measurements for data collection were taken before therapy, immediately after its completion, and 3 months later. Short-term results showed improvement in aspiration, in the range of movement of the oral musculature, and in prosody. Longer-term improvements, up to 3 months afterwards, were observed in the volume, quality and pitch of the voice, along with improved clarity and intelligibility of speech. The study suggests that LSVT adapted with swallowing and articulation exercises can bring long-term gains in speech intelligibility, phonation duration, and hypomimia, but the patient needs to practice daily, without long breaks, to maintain these results.

- by Rei Halilaj
- Speech Prosody, Dysarthria, Spasticity, Speech and language therapy
- by Antonio Galves
- Computer Science, Speech Prosody

Studies of L1 and L2 productions from the same participants might contribute significantly to our understanding of the language acquisition process. In this study, the researchers investigated read-speech pausing patterns around coordinating conjunctions produced by Turkish, Swahili, Hausa, and Arabic speakers of English. The data were collected in two phases: in the first phase, the participants read out a short story in English, and in the second (follow-up) phase, they produced independent sentences in their mother tongues. In total, 2995 pauses in 1498 coordinating conjunctions were measured in Praat, and the findings were analyzed with a paired-samples t-test. The results showed that pauses differed in favor of the preceding position and that the differences were statistically significant. Speakers with the same mother tongue showed similar pausing patterns, which could be an important indicator that L1 read-speech habits carry over to L2 productions.

- by Ömer Eren and +3
- Second Language Acquisition, Speech Prosody, Multilingualism, Applied Linguistics
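
The paired-samples t-test mentioned in the pausing study above can be sketched as follows. This is a generic illustration, not the authors' analysis: it assumes each conjunction contributes one pause before it and one after it, and the function name and the millisecond values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t statistic, degrees of freedom) for two paired samples.

    t is the mean of the pairwise differences divided by its standard
    error, which is the sample SD of the differences over sqrt(n).
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


if __name__ == "__main__":
    # Hypothetical pause durations (ms) before and after four conjunctions.
    pre_pause = [300, 320, 310, 330]
    post_pause = [200, 210, 220, 210]
    t, df = paired_t(pre_pause, post_pause)
    print(f"t({df}) = {t:.2f}")
```

A positive t here means the pauses preceding the conjunction are longer on average; the p-value would then be read from a t distribution with the returned degrees of freedom.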

This study uses musical notation to describe speech prosody in connected speech in Brazilian Portuguese and Mexican Spanish, using English as a comparison where needed. Through this research we establish the basis on which to expand our future work on speech prosody, from methodology to data collection and analyses, and then make initial observations regarding potentially significant prosodic patterns. This study shows that musical notation can inform us about: 1) the pitch ranges of the speakers in connected speech; 2) speech rate; 3) patterns of moraic and non-moraic syllables; 4) syllable timing; 5) intonation patterns, especially speakers' tessitura. The methodology that we have developed in this exploratory study may help account for unpredictable patterns of speech prosody, especially with regard to intonation, and consequently lead to the improvement of current speech prosody models.

- by Antonio Simoes and +1
- Music, Speech Prosody, Notation (Music), Speech Rhythm
- by Dafydd Gibbon
- Psychology, Computer Science, Speech Prosody, arXiv
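
As a rough illustration of how F0 measurements can be related to musical notation, as in the Brazilian Portuguese and Mexican Spanish study above, the sketch below maps frequencies to equal-tempered note names and expresses a pitch range in semitones. The abstract does not describe the authors' transcription procedure; the A4 = 440 Hz reference, the rounding to the nearest semitone, and all names and values are assumptions made for the example.

```python
from math import log2

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def hz_to_midi(f0_hz, a4_hz=440.0):
    """Convert a frequency in Hz to a (fractional) MIDI note number."""
    return 69 + 12 * log2(f0_hz / a4_hz)


def midi_to_name(midi):
    """Name of the nearest equal-tempered note, e.g. 69 -> 'A4'."""
    n = round(midi)
    return f"{NOTE_NAMES[n % 12]}{n // 12 - 1}"


def range_in_semitones(f0_values):
    """Distance in semitones between the lowest and highest F0 value."""
    return 12 * log2(max(f0_values) / min(f0_values))


if __name__ == "__main__":
    # Hypothetical F0 samples (Hz) from one utterance.
    f0_track = [110.0, 146.8, 196.0, 220.0]
    print([midi_to_name(hz_to_midi(f)) for f in f0_track])
    print(round(range_in_semitones(f0_track), 1))
```

Rounding to the nearest semitone discards microtonal detail, so a finer-grained transcription would keep the fractional MIDI value instead.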

This study explores the effect of different intonation strategies applied to audio commercials on the cognitive processing of the listener. A within-subjects experiment was conducted in which participants listened to 16 radio ads that had been created with different models of intonation designed to vary the announcer’s pitch range. Dependent variables were self-reported effectiveness and adequacy, skin conductance level (SCL) and recognition memory. Results showed that radio ads presented by announcers whose pitch moved from high to low frequency achieved better perceived effectiveness, elicited greater sympathetic nervous system activation and attained better recognition accuracy than ads with no pitch variation or with variation from low to high pitch.

- by Emma Rodero
- Cognitive Psychology, Speech Prosody, Advertising, Radio
- by Joaquim Llisterri
- Speech Prosody, Phonetics, Speech perception, Perceptual Phonetics
- by Robert Fuchs
- Languages, Languages and Linguistics, Speech Prosody, English language

In speech communication, listeners must accurately decode vocal cues that refer to the speaker's mental state, such as their confidence or 'feeling of knowing'. However, the time course and neural mechanisms associated with online inferences about speaker confidence are unclear. Here, we used event-related potentials (ERPs) to examine the temporal neural dynamics underlying a listener's ability to infer speaker confidence from vocal cues during speech processing. We recorded listeners' real-time brain responses while they evaluated statements wherein the speaker's tone of voice conveyed one of three levels of confidence (confident, close-to-confident, unconfident) or were spoken in a neutral manner. Neural responses time-locked to event onset show that the perceived level of speaker confidence could be differentiated at distinct time points during speech processing: unconfident expressions elicited a weaker P2 than all other expressions of confidence (or neutral-intending utterances), whereas close-to-confident expressions elicited a reduced negative response in the 330–500 msec and 550–740 msec time windows. Neutral-intending expressions, which were also perceived as relatively confident, elicited a more delayed, larger sustained positivity than all other expressions in the 980–1270 msec window for this task. These findings provide the first piece of evidence of how quickly the brain responds to vocal cues signifying the extent of a speaker's confidence during online speech comprehension; first, a rough dissociation between unconfident and confident voices occurs as early as 200 msec after speech onset. At a later stage, further differentiation of the exact level of speaker confidence (i.e., close-to-confident, very confident) is evaluated via an inferential system to determine the speaker's meaning under current task settings.
These findings extend three-stage models of how vocal emotion cues are processed in speech comprehension (e.g., Schirmer & Kotz, 2006) by revealing how a speaker's mental state (i.e., feeling of knowing) is simultaneously inferred from vocal expressions.

- by Xiaoming Jiang
- Speech Prosody, EEG, Nonverbal Communication, Feeling of Knowing
- by Antonio Pamies Bertrán
- Speech Prosody, Speech perception, Intonation, Perceptual Phonetics

In this article, we present the results of a perception test in which 28 Finnish-speaking informants with an advanced level of French listened to the first five pages of Albert Camus's L’étranger. Their task was to add all the full stops and commas to the corresponding written text on the basis of prosodic cues. The results show that a falling intonation contour, in which the intonation drops to the floor of the speaker's pitch range, is strongly associated with the presence of a full stop; a shallower falling contour is not as strongly associated with a full stop. Commas are realized in different ways, and in perception they are interpreted either as commas or as continuatives without punctuation.

- by Mari Wiklund
- Speech Prosody, French language, Intonation, Albert Camus

The perception of lexical stress in Spanish by French speakers with and without knowledge of the language has been studied with a technique that allows the individual or combined effects of the acoustic parameters related to the perception of stress to be evaluated. Results suggest, in the first place, that exposure to the L2 makes French speakers more sensitive to stress. Secondly, although F0 seems to constitute the crucial cue in the identification of stress position, results indicate that, when stress is accurately perceived, the time necessary to detect it is affected by manipulations involving amplitude.

- by Joaquim Llisterri
- Second Language Acquisition, Speech Prosody, Phonetics, Speech perception

This paper describes the intonation of the Spanish spoken in the city of Oaxaca, Mexico, as part of the work carried out in building the Corpus oral del español de México (COEM); the main aim of that project is to produce dialectal and sociolinguistic descriptions leading to the development of a geoprosody of Mexican Spanish. To that end, it describes a series of utterances performing various speech acts, mainly assertive and directive, in data produced by nine people in semi-spontaneous speech. The perspective is essentially qualitative, although several quantitative summaries are presented. All the material was analyzed acoustically and labeled following the conventions of the Sp_ToBI system. The discussion of the results includes a comparison with the intonation of Mexico City. The most relevant conclusion is that the semi-spontaneous intonation of Oaxaca belongs to the set of varieties of central Mexico, although with nuances of its own. It is stressed that the tonal patterns used to project most speech acts prosodically are tendencies rather than categorical solutions.

- by Pedro Martín Butragueño
- Speech Prosody, Dialectology, Sociolinguistics, Intonation

Type: Poster
Occasion: 19th International Congress of Linguists
Location: Geneva
Year: 2013

- by Marina Snesareva
- Irish Studies, Irish linguistics, Speech Prosody, Dialectology

This study is a first approximation to the acoustic description of polite phrases in the Spanish spoken in Mérida. The pragmatic difference between a question and a polite phrase (a mitigated order) can also be observed at the prosodic level. Our variables were i) type of phrase, with its variants: polite and question; ii) fundamental frequency; and iii) syllable duration. We recorded and analysed, with Speech Analyzer 1.3, the production of a polite phrase and a question by eight women from the city of Mérida, Venezuela. Our data show that politeness is encoded by greater variability of the fundamental frequency and a higher tone, and also by a more regular and longer syllable duration. To validate our results, it will be necessary to determine the tonal and rhythmic pattern of politeness in Venezuelan Spanish in a larger corpus.

- by Alexandra Alvarez Muro
- Speech Prosody, Pragmatics, Linguistic Politeness
- by Stig Eliasson
- Speech Prosody, English, Intonation, Prosody

A tutorial and reference manual for Prosogram, a tool for analyzing speech prosody and simulating prosody perception.

- by Piet Mertens
- Speech Prosody

The present study investigates the effect of explicit teaching of segmentals and suprasegmentals on developing listening comprehension skills for Farsi-English interpreter trainees. Three groups of student interpreters were formed. All were native speakers of Farsi who studied English translation and interpreting at the BA level at the University of Applied Sciences in Tehran, Iran. Participants were assigned to groups at random, but with equal division between genders (6 female and 6 male students in each group). No significant differences in English language skills (TOEFL scores) could be established between the groups prior to the experiment. Participants took a pretest of listening comprehension before starting the program. The control group listened to authentic audio tracks in English and discussed their contents, watched authentic English movies, and discussed issues in the movies in pairs in the classroom. The first experimental group spent part of the time on theoretical explanation of, and practical exercises with, English suprasegmentals. The second experimental group spent part of the time on theoretical explanation of, and practical exercises with, English segmentals. The total instruction time was the same for all three groups, i.e. 12 hours. Students then took a posttest in listening comprehension skills. The results show that the explicit teaching of segmentals improved the students' listening comprehension skills significantly more than the instruction given to the other groups. These results have pedagogical implications for curriculum designers, programs that train future interpreters, material producers and all who are involved in language study and pedagogy.

A production study is presented that investigates the effects of word order and information structural context on the prosodic realization of declarative sentences in Hindi. Previous work on Hindi intonation has shown that: (i) non-final content words bear rising pitch accents (Moore 1965, Dyrud 2001, Nair 1999); (ii) focused constituents show greater pitch excursion and longer duration and that post-focal material undergoes pitch range reduction (Moore 1965, Harnsberger 1994, Harnsberger and Judge 1996); and (iii) focused constituents may be followed by a phrase break (Moore 1965). By means of a controlled experiment, we investigated the effect of focus in relation to word order variation using 1200 utterances produced by 20 speakers. Fundamental frequency (F0) and duration of constituents were measured in Subject-Object-Verb (SOV) and Object-Subject-Verb (OSV) sentences in different information structural conditions (wide focus, subject focus and object focus). The analyses indicate that (i) regardless of word order and focus, the constituents are in a strict downstep relationship; (ii) focus is mainly characterized by post-focal pitch range reduction rather than pitch raising of the element in focus; (iii) given expressions that occur pre-focally appear to undergo no reduction; (iv) pitch excursion and duration of the constituents are greater in OSV than in SOV sentences. A phonological analysis suggests that focus affects pitch scaling and that word order influences prosodic phrasing of the constituents.

- by Frank Kügler and +3
- Speech Prosody, Intonation, Information Structure, Word order

Drawing on pioneering research (Wilde 1938; Strang 1964; Jarman & Cruttenden 1976; Knowles 1975, 1978, 1981; Currie 1979; Pellowe & Jones 1978; Local 1986; McElholm 1986; Ladd & Lindsay 1991) and on his own observations, Cruttenden (1995) identified a number of intonation features common to different cities in Northern Britain, which he grouped under the label ‘UNB’ (‘Urban North British’). Cruttenden subsequently defined ‘Urban North British Intonation’ as “an intonational system that operates in a number of cities in northern Britain […] characterised by a default intonation involving rising or rising-slumping nuclear pitch patterns” (Cruttenden 2007).
Investigating the same phenomenon, Wilhelm (2011) came to the conclusion that UNBI was not properly an intonation system, but rather a “metasystem” or a set of distinct systems, insofar as Northern British accents exhibit great intonational variety.
Based on a provisional corpus made up of recordings from the PAC project, supplemented by additional material taken from other corpora, this presentation argues (contra Wilhelm 2011) that UNBI is essentially a unique system and not a set of indirectly related sequences of features. It attempts to provide an inventory of the main tones that make it up, and suggests that the fundamental unity of UNBI could shed light on the vexed question of its origins (Cruttenden 1995; Hirst 2008).
This paper is intended as the first stage of an investigation of intonational variation in Britain within the framework of the PAC project, "La Phonologie de l’Anglais Contemporain: usages, variétés et structure: The Phonology of Contemporary English: usage, varieties and structure" coordinated by Philip Carr, Jacques Durand and Anne Przewozny-Desriaux from France (http://www.projet-pac.net/).

- by Stephan WILHELM
- Speech Prosody, Phonetics, Sociolinguistics, Sociophonetics
- by Yi Xu
- Speech Prosody, Speech Synthesis, Intonation, Prosody
- by Francisco José Cantero Serena
- Emotion, Speech Prosody, Applied Linguistics, Intonation

When speaking or singing, vocalists often move their heads in an expressive fashion, yet the influence of emotion on vocalists' head motion is unknown. Using a comparative speech/song task, we examined whether vocalists' intended emotions influence head movements, and whether those movements influence the perceived emotion. In Experiment 1, vocalists were recorded with motion capture while speaking and singing each statement with different emotional intentions (very happy, happy, neutral, sad, very sad). Functional data analyses showed that head movements differed in translational and rotational displacement across emotional intentions, yet were similar across speech and song, transcending differences in F0 (varied freely in speech, fixed in song) and lexical variability. Head motion specific to emotional state occurred before and after vocalizations, as well as during sound production, confirming that some aspects of movement were not simply a by-product of sound production. In Experiment 2, observers accurately identified vocalists’ intended emotion based on silent, face-occluded videos of head movements during speech and song. These results provide the first evidence that head movements encode a vocalist’s emotional intent, and that observers decode emotional information from these movements. We discuss implications for models of head motion during vocalizations, and applied outcomes in social robotics and automated emotion recognition.

- by Steven R Livingstone
- Emotion, Speech Prosody, Speech Communication, Affect/Emotion
- by Elena Yagunova
- Languages, Speech Prosody, Text Linguistics