Plosives of the German dialects in Western Hungary.
The relevance of some acoustic parameters for the perception of a foreign accent
New Sounds, 2000
L2 speech is characterized by deviations from L1 speech along several dimensions, and these deviations may also depend upon the speaker's L1. Among speakers of a common L1, L2 speech is to a large extent characterized by bundles of deviating patterns that are described as an accent. This accent is usually detectable by native speakers, negatively as a form deviating from their L1 and/or positively as an L2 variant based on a particular L1. An L2 speech sample may deviate along dimensions such as the quality and/or duration of segments, F0 pattern, rhythm, voice quality, etc.
An analysis of acoustic properties of selected sounds
The present paper analyses qualitative features of selected sounds of standard Slovak in relation to Received Pronunciation (hereinafter RP) and General American (hereinafter GA) on the level of phonetics. The paper attempts to point out which of the two English varieties in question bears articulatory features more similar to those of Slovak, in order to suggest the seemingly easier variety of English to be mastered by a Slovak learner of English as a foreign language (EFL). Correct pronunciation of English sounds plays a vital role within an academic setting, particularly for majors using English as the language of interaction. Previous research has tackled the main focus of this paper only partially, and in the literature several theories have been proposed about the importance of English language learning. However, "there is extraordinary diversity in the ways in which English is taught and learnt around the world, but some orthodoxies have arisen. [...] EFL, as we know it today, is a largely 19th century creation" (Graddol, 2006, p. 82). The target variety of a learner of English is that of a native speaker, usually a British or an American one.
On the internal perceptual structure of phonological features: The [voice] distinction
The Journal of the Acoustical Society of America, 1995
In a direct-realist theory of speech perception, listeners are in immediate (in the sense of unmediated) contact with the phonological units of their language when they use structure in acoustic speech signals as information for its causal source: phonological gestures of the vocal tract. In the theory, phonological categories include, minimally, the sets of motor-equivalent articulatory movements producible by a synergy of the vocal tract, each set thereby counting as a token of the same phonological gesture for producer/perceivers of speech. Maximally, categories include a set of similar gestures that members of a language community do not distinguish. Categories, thus, are defined gesturally, not acoustically, as, for example, research on prototypes in speech has been interpreted as suggesting. A striking behavior of listeners that indexes their extraction of information about phonetic gestures from the acoustic speech signal is their parsing of acoustic signals. A literature review suggests that listeners do not hear such unitary acoustic dimensions as fundamental frequency or duration as unitary. Rather, they parse each dimension into its distinct, converging gestural causes. Complementarily, listeners use as information for a phonological unit the constellation of diverse acoustic consequences of the unit's gestural realization. [Work supported by NICHD.]
Evaluation and integration of acoustic features in speech perception
The Journal of the Acoustical Society of America, 1980
Identification of synthetic stop consonants as either /bae/, /pae/, /dae/, or /tae/ was examined in two experiments in which the stimuli varied independently on voice onset time (VOT), the consonantal second and third formant (F2-F3) transitions, and, in experiment 2, the intensity of the aspiration noise during the VOT period. In both experiments, the patterns of the resulting identification probabilities were complex, but systematic, functions of each of the independent variables. Of most interest was the fact that the likelihood of identifying a stimulus as /bae/ or /pae/, rather than /dae/ or /tae/, was strongly influenced by the VOT as well as by the F2-F3 transitions. Analogously, the likelihood of identifying a stimulus as /bae/ or /dae/, rather than /pae/ or /tae/, depended on the F2-F3 transitions as well as on VOT. Three explanations of these results were considered within a fuzzy logical model of speech perception: (1) that there is interaction in the evaluation of acoustic features, (2) that the listener requires more extreme values of acoustic features for some speech sounds than for other speech sounds, and (3) that the aspiration noise during the VOT period serves as an independent acoustic feature to distinguish /pae/ and /bae/ from /tae/ and /dae/. PACS numbers: 43.70.Dn, 43.70.Ve

One of the concepts which has had a profound influence on the study of speech perception during the last quarter of a century is that of distinctive features. Trubetzkoy, Jakobson, and other members of the "Prague school" argued against the idea that the phonemes of a language are minimal units of analysis which cannot be further reduced (Lyons, 1968). According to the Prague school, a phoneme can be characterized by distinctive features that represent its similarities and differences with respect to the other phonemes in the same language. For example, Jakobson, Fant, and Halle (1961) proposed that a small set of orthogonal, binary properties or features was sufficient to distinguish among the larger set of phonemes of a language. Jakobson et al. were able to classify 28 English phonemes on the basis of only nine distinctive features. The binary nature of distinctive features means that in the linguistic classification of a phoneme, each feature is either present or absent in an all-or-none fashion. The orthogonality of distinctive features means that the different features are conceptually independent, so that, in principle, any combination of features could correspond to a phoneme. Distinctive feature analysis as performed by Jakobson et al. (1961) for phonemes or by Chomsky and Halle (1968) for other levels of phonetic and phonological representation is not based entirely on physical measurements of acoustic properties of speech sounds [...]. Nevertheless, while originally intended only to capture linguistic generalities, distinctive feature analysis has been widely adopted as a framework for human speech perception. The attraction of this framework is that, since these features are sufficient to distinguish among the different phonemes, it is possible that phoneme identification could be reduced to the problem of determining which features are present in any given speech sound.
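For readers unfamiliar with the fuzzy logical model of perception (FLMP) invoked in this abstract, the sketch below illustrates FLMP-style integration for the four response alternatives: each acoustic feature contributes a fuzzy truth value of support, supports are combined multiplicatively, and identification probabilities follow a relative-goodness rule. The support values and the function name are hypothetical illustrations, not the stimuli or fitted parameters of these experiments.

```python
# FLMP-style integration sketch (hypothetical values, not the paper's fits).

def flmp_probabilities(support_voiceless: float, support_alveolar: float) -> dict:
    """Identification probabilities for /bae/, /pae/, /dae/, /tae/ given the
    fuzzy support (0..1) each feature lends to 'voiceless' and 'alveolar'."""
    # Multiplicative integration of feature supports for each alternative.
    goodness = {
        "bae": (1 - support_voiceless) * (1 - support_alveolar),
        "pae": support_voiceless * (1 - support_alveolar),
        "dae": (1 - support_voiceless) * support_alveolar,
        "tae": support_voiceless * support_alveolar,
    }
    # Relative-goodness (Luce choice) decision rule.
    total = sum(goodness.values())
    return {label: g / total for label, g in goodness.items()}

# An ambiguous VOT (0.5) combined with clearly labial transitions (0.1)
# splits responses between /bae/ and /pae/, with little support for the alveolars.
print(flmp_probabilities(support_voiceless=0.5, support_alveolar=0.1))
```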
This approach gained credibility with the finding, originally by Miller and Nicely (1955) and since by many others (e.g., Campbell, 1974; Cole and Scott, 1972; Peters, 1963; Singh and Woods, 1971; Wang and Bilger, 1973), that the more distinctive features two sounds share, the more likely they are to be perceptually confused for one another. Given that distinctive features are not directly manifested in the speech signal, the implicit assumption underlying most perceptual studies is that acoustic characteristics mediate the perception of speech. Following our previous work, the acoustic characteristics functional in speech perception will be referred to as acoustic features or acoustic cues (Massaro, 1975; Massaro and Cohen, 1976, 1977; Oden and Massaro, 1978), in contrast to distinctive features, in order to maintain the distinction between the psychoacoustic and linguistic levels of description. Although we do not expect that there is a direct correspondence between distinctive features and acoustic features, the study of the latter is aided by the distinctions made by the former. Thus, for example, one tends to ask what the acoustic features are for the voiced-voiceless distinction rather than what the acoustic features are that distinguish [b] and [p]. As pointed out by Lisker (1975) and others, a plethora of acoustic features are possible for the single linguistic distinction of voicing. One might also expect that a single acoustic characteristic could be relevant to more than one distinctive feature. In the present research, we have found some evidence for the aspiration noise functioning as an acoustic feature both for a place of articulation distinction and for a voicing distinction. If this framework is accepted, then two questions are of central concern: (1) which characteristics of speech sounds actually function as acoustic features, and (2) what are the processes by which this featural information is evaluated and integrated to identify a given speech sound? The seminal work on the first question was carried out at the Haskins Laboratories using synthetic speech produced by the pattern playback synthesizer (Liberman, Delattre, and Cooper, 1952). These investigators studied the contribution of various acoustic properties to the identification of speech sounds which differ by various distinctive features and must, therefore, also differ by one or more acoustic features. Liberman, Delattre, and Cooper (1958), for example, demonstrated that for stop consonants in initial position, increases in the time between the onset of the release burst and the onset of vocal cord vibration [voice onset time (VOT)] were sufficient to change the identification from a voiced to a voiceless stop. The distinctive feature of voicing characterizes whether or not the vocal cords vibrate during a significant portion of a particular sound. Stop consonants with short VOTs were usually identified as voiced stops (/b/, /d/, or /g/), whereas consonants that were identical except for having long VOTs were identified as voiceless stops (/p/, /t/, or /k/). Liberman et al. (1958) also found that the presence or absence of aspiration during the VOT period contributed to the identification of stops with respect to voicing, and more recently, it has been found that the onset frequencies of the fundamental (F0) and the first formant (F1) also cue voicing (Haggard, Ambler, and Callow, 1970; Lisker, 1975; Summerfield and Haggard, 1977).
It should be pointed out that voice onset time may not actually be an acoustic feature itself, even though changes in this variable are sufficient to change the identification from a voiced to a voiceless stop. Some other concomitant change or changes in the stimulus may function as the critical feature or features. As an example, Winitz, LaRiviere, and Herriman (1975) [...] predominantly voiceless sounds occurred at the point where the VOT was about 23 ms for labials, 37 ms for alveolars, and 42 ms for velars. Two types of featural nonindependence can account for [...]
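As a concrete illustration of the place-dependent boundary values just cited (23, 37, and 42 ms), the sketch below applies a hard VOT decision rule; the dictionary, function name, and strict-threshold behaviour are assumptions for illustration, not the classification procedure of the original studies.

```python
# Illustrative place-dependent VOT decision rule (boundary values from the
# text above; the hard threshold and naming are assumptions for illustration).

VOT_BOUNDARY_MS = {"labial": 23, "alveolar": 37, "velar": 42}

def classify_voicing(vot_ms: float, place: str) -> str:
    """Label a syllable-initial stop as 'voiced' or 'voiceless' from its VOT,
    using a place-specific category boundary."""
    return "voiceless" if vot_ms >= VOT_BOUNDARY_MS[place] else "voiced"

print(classify_voicing(30, "labial"))  # 'voiceless' (30 ms is past the 23 ms boundary)
print(classify_voicing(30, "velar"))   # 'voiced'    (30 ms is short of the 42 ms boundary)
```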
Perception of place of articulation for plosives and fricatives in noise
Speech Communication, 2011
This study aims at uncovering perceptually relevant acoustic cues for the labial versus alveolar place of articulation distinction in syllable-initial plosives and fricatives. Acoustic analyses using logistic regression show that formant frequency measurements, relative spectral amplitude measurements, and burst/noise durations are generally reliable cues for labial/alveolar classification. In a subsequent perceptual experiment, each pair of syllables with the labial/alveolar distinction (e.g., /ba, da/) was presented to listeners at various signal-to-noise ratios (SNRs) in a 2-AFC task. A threshold SNR was obtained for each syllable pair using sigmoid fitting of the percent-correct scores. Results show that the perception of the labial/alveolar distinction in noise depends on the manner of articulation, the vowel context, and the interaction between voicing and manner of articulation. Correlation analyses of the acoustic measurements and threshold SNRs show that formant frequency measurements (such as F1 and F2 onset frequencies and F2 and F3 frequency changes) become increasingly important for the perception of labial/alveolar distinctions as the SNR degrades.
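The threshold-SNR estimation described above can be sketched roughly as follows; the logistic form rising from 50% (2-AFC chance) to 100%, the made-up percent-correct data, and the 75%-correct criterion are assumptions for illustration, not the paper's exact fitting procedure.

```python
# Rough sketch of sigmoid fitting of percent-correct scores vs. SNR to obtain
# a threshold SNR (made-up data; not the study's stimuli or exact procedure).

import numpy as np
from scipy.optimize import curve_fit

def logistic_2afc(snr_db, midpoint, slope):
    """Percent correct in a 2-AFC task, rising from 50% (chance) to 100%."""
    return 50.0 + 50.0 / (1.0 + np.exp(-slope * (snr_db - midpoint)))

# Hypothetical percent-correct scores for one /ba, da/ pair across SNRs (dB).
snr_db = np.array([-18.0, -12.0, -6.0, 0.0, 6.0])
pct_correct = np.array([52.0, 61.0, 78.0, 93.0, 99.0])

(midpoint, slope), _ = curve_fit(logistic_2afc, snr_db, pct_correct, p0=[-6.0, 0.5])

# Taking 75% correct as the criterion, the threshold SNR is the fitted
# midpoint, since the logistic above reaches 75% exactly at its midpoint.
print(f"threshold SNR ~ {midpoint:.1f} dB")
```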
Experiments in Speech Perception: Phonetics Research Seminar 1978-1979
1979
An experiment was designed to investigate whether the perception of vowel quality is affected by the interaction that occurs at the acoustic level between a formant peak and the strongest harmonics within that formant. Synthetic one-formant stimuli were used in an identification task involving [a], [ɔ], and [o] as responses. Three formant frequencies were experimentally established for these vowels, and F0 values were selected for each vowel so as to produce harmonic configurations with the strongest harmonic deviating both maximally and minimally from the formant frequency. The results support the claim that the phonetic quality of a given formant pattern is a monotonic function of F0 and is not influenced by a "strongest harmonic" effect.
Why does [a] change to [ɔ] when F0 is increased? Interplay between harmonic structure and formant frequency in the perception of vowel quality.
Analysis and prediction of difference limen data for formant frequencies.
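As a rough illustration of how such F0 values can be chosen, the sketch below computes how far the harmonic nearest a one-formant peak (i.e., the harmonic most strongly amplified by the formant) lies from the formant frequency; the formant value and the helper's name are hypothetical, not the original stimulus-generation parameters.

```python
# Sketch: distance between a one-formant peak and the nearest harmonic of F0
# (hypothetical formant value; not the original stimulus-generation code).

def nearest_harmonic_deviation(f0_hz: float, formant_hz: float) -> float:
    """Return the distance in Hz from the formant peak to the closest
    harmonic of F0 (the harmonic the formant amplifies most)."""
    k = max(1, round(formant_hz / f0_hz))  # harmonic number nearest the peak
    return abs(k * f0_hz - formant_hz)

formant_hz = 700.0  # hypothetical formant frequency for an [a]-like vowel
for f0 in (100.0, 127.0, 140.0):
    # Small deviations put the strongest harmonic on the formant peak;
    # large deviations place it roughly midway between two harmonics.
    print(f"F0 = {f0:5.1f} Hz -> deviation = {nearest_harmonic_deviation(f0, formant_hz):5.1f} Hz")
```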