The phonetic reduction of nasals and voiced stops in Japanese across speech styles
Related papers
An Acoustic Profile Of Consonant Reduction
1996
Vowel reduction has been studied for years. It is a universal phenomenon that reduces the distinction of vowels in informal speech and unstressed syllables. How consonants behave in situations where vowels are reduced is much less well known. In this paper we compare durational and spectral data (for both intervocalic consonants and vowels) segmented from read speech with otherwise identical segments from spontaneous speech. On a global level, the comparison shows that consonants reduce like vowels when the speaking style becomes informal. On a more detailed level, there are differences related to the type of consonant.
Why reduce? Phonological neighborhood density and phonetic reduction in spontaneous speech
Journal of Memory and Language, 2012
Frequent or contextually predictable words are often phonetically reduced, i.e. shortened and produced with articulatory undershoot. Explanations for phonetic reduction of predictable forms tend to take one of two approaches: Intelligibility-based accounts hold that talkers maximize intelligibility of words that might otherwise be difficult to recognize; production-based accounts hold that variation reflects the speed of lexical access and retrieval in the language production system. Here we examine phonetic variation as a function of phonological neighborhood density, capitalizing on the fact that words from dense phonological neighborhoods tend to be relatively difficult to recognize, yet easy to produce. We show that words with many phonological neighbors tend to be phonetically reduced (shortened in duration and produced with more centralized vowels) in connected speech, when other predictors of phonetic variation are brought under statistical control. We argue that our findings are consistent with the predictions of production-based accounts of pronunciation variation.
Phonetic reduction, vowel duration, and prosodic structure
Word frequency, phonological neighborhood density, semantic predictability in context, and discourse mention have all been previously found to cause reduction of vowels. Other researchers have suggested that reduction based on these factors is reflective of a unified process in which "redundant" or "predictable" elements are reduced, and that this reduction is largely mediated by prosody. Using a large read corpus, we show that these four factors show different types of reduction effects, and that there are reduction effects of prosody independent of duration, and vice versa, suggesting the existence of multiple processes underlying reduction.
Mechanism of extreme phonetic reduction: Evidence from Taiwan Mandarin
2012
"Extreme reduction refers to the phenomenon where intervocalic consonants are so severely reduced that two or more adjacent syllables appear to be merged into one. Such severe reduction is often considered a characteristic of natural speech and to be closely related to factors including lexical frequency, information load, social context and speaking style. This thesis takes a novel approach to investigating this phenomenon by testing the time pressure account of phonetic reduction, according to which time pressure is the direct cause of extreme reduction. The investigation was done with data from Taiwan Mandarin, a language where extreme reduction (referred to as contraction) has been reported to frequently occur. Three studies were conducted to test the main hypothesis. In Study 1, native Taiwan Mandarin speakers produced sentences containing nonsense disyllabic words with varying phonetic structures at differing speech rates. Spectral analysis showed that extreme reduction occurred frequently in nonsense words produced under high time pressure. In Study 2a, further examination of formant peak velocity as a function of formant movement amplitude in experimental data suggested that articulatory effort was not decreased during reduction, but in fact likely to be increased. Study 2b examined high frequency words from three spontaneous speech corpora for reduction variations. Results demonstrate that patterns of reduction in high frequency words in spontaneous speech (Study 2b) were similar to those in nonsense words spoken under experimental conditions (Study 2a). Study 3 investigated tonal reduction with varying tonal contexts and found that tonal reduction can also be explained in terms of time pressure. Analysis of F0 trajectories demonstrates that speakers attempt to reach the original underlying tonal targets even in the case of extreme reduction and that there was no weakening of articulatory effort despite the severe reduction. 
To further test the main hypothesis, two computational modelling experiments were conducted. The first applied the quantitative Target Approximation model (qTA) for tone and intonation and the second applied the Functional Linear Model (FLM). Results showed that severely reduced F0 trajectories in tone dyads can be regenerated to a high accuracy by qTA using generalized canonical tonal targets with only the syllable duration modified. Additionally, it was shown that using FLM and adjusting duration alone can give a fairly good representation of contracted F0 trajectory shapes. In summary, results suggest that target undershoot under time pressure is likely to be the direct mechanism of extreme reduction, and factors that have been commonly associated with reduction in previous research very likely have an impact on duration, which in turn determines the degree of target attainment through the time pressure mechanism. "
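The duration-only manipulation described in this abstract can be illustrated with a small sketch. It assumes the third-order critically damped response usually given for qTA, simplified here to a static target with zero initial velocity and acceleration. The function names and all parameter values are hypothetical and are not taken from the thesis.

```python
from math import exp


def qta_f0(t, target, onset_f0, rate):
    """F0 at time t (seconds) while approaching a static target from rest.

    Assumed simplification of qTA: third-order critically damped response,
    zero initial velocity and acceleration, target with zero slope.
    """
    d = onset_f0 - target
    decay = (1.0 + rate * t + 0.5 * (rate * t) ** 2) * exp(-rate * t)
    return target + d * decay


def undershoot(duration, target=120.0, onset_f0=90.0, rate=40.0):
    """Distance left to the target when the syllable ends (hypothetical units)."""
    return abs(target - qta_f0(duration, target, onset_f0, rate))


# Same target and same approach rate; only the syllable duration differs.
for duration in (0.20, 0.10, 0.05):
    print(duration, round(undershoot(duration), 2))
```

With the target and rate held constant, the shorter syllables end further from the target, which is the sense in which duration alone can produce undershoot in this kind of model.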
Phonetic reduction can lead to lengthening, and enhancement can lead to shortening
Contextually probable, high-frequency, or easily accessible words tend to be phonetically reduced, a pattern usually attributed to faster lexical access. In principle, word forms that are frequent in their inflectional paradigms should also enjoy faster lexical access, leading again to phonetic reduction. Yet research has found evidence of both reduction and enhancement on paradigmatically probable inflectional affixes. The current corpus study uses pronunciation data from conversationally produced English verbs and nouns to test the predictions of two accounts. In an exemplar account, paradigmatically probable forms seem enhanced because their denser exemplar clouds resist influence from related word forms on the average production target. A second pressure reduces such forms because they are, after all, more easily accessed. Under this account, paradigmatically probable forms should have longer affixes but shorter stems. An alternative account proposes that paradigmatically probable forms are produced in such a way as to enhance not articulation, but contrasts between related word forms. This account predicts lengthening of suffixed forms, and shortening of unsuffixed forms. The results of the corpus study support the second account, suggesting that characterizing pronunciation variation in terms of phonetic reduction and enhancement oversimplifies the relationship between lexical storage, retrieval, and articulation.
An introduction to reduced pronunciation variants
Journal of Phonetics, 2011
Words are often pronounced very differently in formal speech than in everyday conversations. In conversational speech, they may contain weaker segments, fewer sounds, and even fewer syllables. The English word yesterday, for instance, may be pronounced as [jɛʃeɪ]. This article forms an introduction to the phenomenon of reduced pronunciation variants and to the eight research articles in this issue on the characteristics, production, and comprehension of these variants. We provide a description of the phenomenon, addressing its high frequency of occurrence in casual conversations in various languages, the gradient nature of many reduction processes, and the intelligibility of reduced variants to native listeners. We also describe the relevance of research on reduced variants for linguistic and psychological theories as well as for applications in speech technology and foreign language acquisition. Since reduced variants occur more often in spontaneous than in formal speech, they are hard to study in the laboratory under well controlled conditions. We discuss the advantages and disadvantages of possible solutions, including the research methods employed in the articles in this special issue, based on corpora and experiments. This article ends with a short overview of the articles in this issue.
Towards a gradual scale of vowel reduction: a pilot study
The study reports the results of an acoustic analysis of reduction of the /iː/ vowel, considering all three traditionally explored aspects of vowel reduction, i.e. duration, F1 and F2, in read speech produced by 12 native speakers of English. Starting from the observation that the standard literature considers only duration as a proxy for overall reduction, the aim of the study is to verify whether duration, F1 and F2 exhibit reduction (construed as shortening of duration and centralization of formants, respectively) to the same degree. The r test reveals no robust linear correlation between duration, F1 and F2: the highest values are 0.51 (duration and F1) and 0.24 (duration and F2), neither of which is a strong correlation. In light of the results, the study seeks to establish a gradual scale of vowel reduction, combining the spatial and the temporal aspects by averaging the distances between the least and the most reduced tokens across duration, F1 and F2 on an equal basis. The resulting degree is expressed on a scale of reduction, ranging from 0 (no reduction whatsoever) to 100 per cent (reduction to schwa).
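One possible reading of such a scale, as a rough sketch: place each token between the least and the most reduced reference value on each measure, then average the three shares with equal weight. The function name, the linear normalization, the clamping, and the sample values are assumptions made for illustration; they are not the study's procedure or data.

```python
from statistics import mean


def reduction_degree(token, least_reduced, most_reduced):
    """Return a 0-100 reduction score averaged over duration, F1 and F2."""
    shares = []
    for key in ("duration", "f1", "f2"):
        span = most_reduced[key] - least_reduced[key]
        # Fraction of the way from the least to the most reduced reference value.
        share = (token[key] - least_reduced[key]) / span
        shares.append(min(max(share, 0.0), 1.0))  # clamp to [0, 1]
    return 100.0 * mean(shares)


# Hypothetical /iː/ reference points (duration in ms, formants in Hz).
least = {"duration": 200.0, "f1": 300.0, "f2": 2300.0}
most = {"duration": 50.0, "f1": 500.0, "f2": 1500.0}
token = {"duration": 125.0, "f1": 400.0, "f2": 1900.0}

print(reduction_degree(token, least, most))  # halfway on every measure: 50.0
```

Because each measure is scaled by its own range, a token can score differently on duration and on the formants, which is the situation the weak correlations reported above point to.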
An Effort Based Approach to Consonant Lenition
2001
LIST OF FIGURES
2-1. Schema of force of a gesture
2-2. Inferences from equation of effort with force
2-3. Schema of computational mass-spring model of consonant constriction
2-4. "Impulse" restriction on mass-spring model
2-5. Force and displacement vs. time graphs for a singleton stop and fricative
2-6. Force and displacement vs. time graph for a geminate fricative
2-7. Total force of gesture with positive and negative force impulses
2-8. Gestural score for voiceless vs. voiced medial stops
3-1. The standard (representational) treatment of contrastiveness
3-2. A constraint-ranking treatment of contrastiveness
3-3. Relation between auditory and articulatory representations in this approach
3-4. Contrastive and non-contrastive PRs
3-5. Free variation
3-6. Vowel height continuum subdivided using two binary features
3-7. Vowel height continuum subdivided using 100 binary features
4-1. Schematic displacement-vs.-time graphs for a stop and non-strident fricative
4-2. Schematic displacement-vs.-time graphs for a strident fricative
4-3. Schematic: sustained constriction achieved by isometric tension
4-4. Outputs of the mass-spring model, for a stop, and strident and nonstrident fricative
4-5. Output of mass-spring model for affricate
5-1. Icelandic geminate preaspiration as shortening of oral closure gesture
5-2. Outputs of mass-spring model for singleton and geminate stops
5-3. Schemata of displacement-vs.-time, for singleton and geminate fricatives
5-4. Output of mass-spring model for geminate fricative and affricate
5-5. Output of mass-spring model for half-spirantized geminate
5-6. Output of mass-spring model for stop with attenuated transition
5-7. Gestural scores for full and partial geminates
5-8. Aerodynamic considerations in fricative voicing and glottal aperture
5-9. Hasse diagram of previously inferred effort relations among geminate and singleton consonant types
5-10. Reduction of magnitude and duration
5-11. Reduction to a prolonged close constriction
6-1. Comparison of displacement in consonants of differing constriction degrees
6-2. Comparison of displacement in consonants of the same constriction degree, but different starting/ending points
6-3. Schemata of strategies of jaw/articulator displacement in intervocalic stops
6-4. Comparison of fortis and lenis stops, with different starting/ending points
6-5. Syllabic apportionment of effort cost of consonants in [akka] vs. [aka]
6-6. Schemata of fast-speech shortening strategies
6-7. Register lowering as demotion of lenition-blocking constraints relative to effort thresholds
7-1. Flapping as coarticulatory retraction of the tongue tip
7-2. Schemata of tongue tip/tongue body ensemble displacement vs. time, without and with compensatorily attenuated transition
7-3. Hasse diagram of constraint hierarchy for Tümpisa Shoshone
8-1. Stridency scale
8-2. Crucial ranking ranges of lenition-blocking constraints, relative to LAZY, for Florentine lenition variation
LIST OF TABLES
1-1. Places of articulation targeted by lenition processes
1-2. Cases of lenition in word-final position
1-3. Cases of lenition in coda position
1-4. Fortition / blocking of lenition in word-initial position
1-5. Phrase- or utterance-initial blocking of lenition
1-6. Fortition / blocking of lenition in onset of stressed syllable
3-1. Values for feature F, in URs and PRs
4-1. Frequency of fricatives in Maddieson's (1984) segment inventory database
4-2. Spirantization of coronal stops to non-strident fricatives or approximants
4-3. Lenition of coronal stops to flaps
4-4. Assibilatory spirantizations, from Lavoie 1996
4-5. Spirantization of labials to non-strident fricatives or approximants
4-6. Fricative debuccalization outcomes
5-1. Blocking of spirantization in geminates
5-2. Blocking of voicing in geminates
5-3. Blocking of spirantization, flapping in partial geminates
5-4. Segment inventories: geminate stops and fricatives
5-5. Segment inventories: geminate voiced and voiceless obstruents
5-6. Point of passive devoicing, medial geminate fricative, in msec
6-1. Lenition in Intervocalic Position
6-2. Lenition in post- and prevocalic position
6-3. Lenition in laxer-than-vocalic contexts
6-4. Lenition in stricter-than-vocalic contexts
7-1. Consonant "phonemes" of Tümpisa Shoshone
8-1. Consonant "phonemes" of Florentine Italian
8-2. Lenition variation according to rate/register
8-3. Lenition variation according to rate/register, younger speakers
8-4. Hypothetical effort values (in abstract units) for consonant allophones in weak position
8-5. Hypothetical effort values (in abstract units) for consonant allophones in strong position
8-6. Weak position chart (repeated from Table 8-2)
8-7. Lenition variation in strong position
ACKNOWLEDGMENTS
Though few who work with me may have discerned it, beneath my mild-mannered, taciturn exterior lie innards of quivering gelatinous mush; and here's where I put them on public display. Brace yourselves.
First, to Donca Steriade: In making it possible for me to come to UCLA, in giving me all manner of vital professional and personal advice and support, in addition to your role as my thesis advisor, my debt to you is profound, as is my admiration for you. Though we agree on many theoretical issues, I've had more fun arguing with you about the areas of disagreement than I could derive from successfully persuading nearly anyone else. I hope our arguments continue.
To Bruce Hayes: For the same range of advice and support (not to mention help with the programming of the mass-spring model), and the same profundity of debt and admiration, ditto. In particular, I want to express my appreciation for your uncanny ability to take in my often half-baked ideas, figure out what I mean to say, and offer insightful comments on the rephrased ideas. This is a rare skill, which I hope I can develop in my own teaching.
To Pat Keating and Donka Minkova, thanks for helpful feedback on this dissertation. Thanks in particular to Pat for recognizing that I'm not an experimentalist (but who knows what the future may bring?). Thanks also to Ian Maddieson, who helped me as much as my official committee members, offering many suggestions and corrections that (I hope) have saved me from appearing a complete ignoramus to any phonetician who reads this work.
My fellow students at UCLA all helped create an atmosphere of creativity and discovery, for which I am grateful. But in particular, I wish to thank Edward Flemming, one of the most brilliant people I've ever met, without whose pioneering work this dissertation would be inconceivable; Richard Wright, whose brain I have shamelessly and repeatedly picked (I hope I can somehow eventually repay the debt); Rod Casali, for helping me pretend to know something about physics; and Peggy (sorry, Margaret) MacEachern, for general camaraderie, thought-provoking discussions on a variety of linguistic and philosophical issues, and drawings of horses for Miriam.
Thanks to Amy Weinberg, my first phonology teacher(!), who helped rescue me from a legal career (shudder), and to Marie Huffman, who got me excited about phonology and very wisely told me to go to UCLA.
To Donna Albino: thanks for helping me stay centered as a human being.
To my wife Suzanne: I started graduate school as an anti-sexist egalitarian idealist, and as this dissertation took over my life, I gradually degenerated into a typical nerd husband who spent all day hunched over the computer while you cooked, cleaned, and took care of the girls, as well as taught. Thank you. And now it's your turn.
10th International Conference on Speech Prosody 2020, 2020
Perception of duration is critically influenced by the speaking rate of the surrounding context. However, to what extent this speaking rate normalization depends on a specific talker's voice is still understudied. The present study investigated whether listeners' perception of temporally contrastive phonemes is influenced by the speaking rate of the surrounding context, and more importantly, whether the effect of the contextual speaking rate persists across different talkers for different types of contrasts: Japanese singleton-geminate stop contrast (/k/-/kk/) and short-long vowel contrast (/e/-/ee/). The vowel contrast carries more reliable talker information than the stop contrast; hence, listeners' rate-based adjustments may be more talker-specific for vowels than for stops. The current results showed that context speaking rate impacted the perception of the target contrast across different talkers, and this influence was evident for both types of the contrasts tested. These results suggest that listeners generalized their rate-based adjustments to different talkers' speech regardless of whether the target segment carried reliable talker information (i.e., vowel contrast) or not (i.e., stop contrast). The current results bear on the issue of how speaking rate information is processed with respect to talker information.
The study aims to compare vowel reduction in read and fully spontaneous speech in English and Polish. It hypothesizes that (i) vowels exhibit stronger reduction in fully spontaneous speech than in read speech in both languages, (ii) vowel reduction is more robust in English than in Polish, and (iii) high speech rate triggers vowel reduction. These hypotheses were tested through an acoustic analysis of interviews and word lists from PAC (9 speakers) and the Corpus of Modern Spoken Polish in the area of Greater Poland (9 speakers). The study treats centralization of formants and shortened vowel duration as vowel reduction (Lindblom 1963); both measures were normalized to allow comparison across speakers. For Polish subjects, each speaker's canonical schwa was operationalized as the average of the peripheral vowels /i/, /a/ and /u/, because Polish has no schwa (Jassem 2003). The two speech styles were compared by measuring spectral and temporal properties of vowel tokens from the word lists and from the interviews. The rate-reduction hypothesis was tested by comparing vowel reduction for the three fastest and three slowest speakers of each language and by using Pearson correlation. The results confirmed the first two hypotheses; the third produced mixed results. The study establishes a significant difference in vowel reduction between two speech styles, read and fully spontaneous, across two unrelated languages. All vowel tokens were shorter and occupied less peripheral (more centralized) positions in spontaneous speech than in read speech. Reduction in English was shown to be considerably stronger than in Polish. With respect to the third hypothesis, which assumes a straightforward relationship between speech rate and reduction, the findings of the current study did not provide a definite answer. 
To a certain extent, the correlation between rate and duration was found in Polish but not in English. As Zwicky notes, “casual speech need not to be fast; some speakers [...] use a quite informal speech even at fairly slow rates of speech, while others [...] give the impression of great precision even in hurried speech” (Zwicky 1972: 607).
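The schwa operationalization and the correlation test described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, Euclidean distance in the F1-F2 plane is an assumed centralization measure, and the formant values are invented rather than taken from the study.

```python
from math import hypot, sqrt


def centroid(corner_vowels):
    """Average (F1, F2) of the peripheral vowels, standing in for schwa."""
    f1 = sum(v[0] for v in corner_vowels) / len(corner_vowels)
    f2 = sum(v[1] for v in corner_vowels) / len(corner_vowels)
    return f1, f2


def distance_from_centre(token, centre):
    """Euclidean distance in the F1-F2 plane; smaller means more centralized."""
    return hypot(token[0] - centre[0], token[1] - centre[1])


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical (F1, F2) values in Hz for one speaker's /i/, /a/ and /u/.
corners = [(300.0, 2300.0), (750.0, 1300.0), (330.0, 900.0)]
centre = centroid(corners)

# Hypothetical /i/ tokens from a word list and from an interview.
read_token = (310.0, 2250.0)
spontaneous_token = (380.0, 1900.0)
print(distance_from_centre(read_token, centre))
print(distance_from_centre(spontaneous_token, centre))
```

In this sketch, `pearson_r` would take one speech-rate value and one mean distance per speaker; a negative coefficient would indicate that faster speakers centralize more.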