Units of speech perception: Phoneme and syllable
Related papers
Proceedings of International Conferences of Experimental Linguistics, 2019
The study of speech perception over the past 60 years has tried to determine the human processes that underlie the rapid understanding of fluent speech. A first step was to determine the units of speech that could be experimentally manipulated. Years of examining the acoustic properties associated with phonemes led to theories, such as the Motor Theory, that postulate larger units integrating vowel and consonant information. Current approaches find better support for the syllable as the most robust and coherent unit of speech. A complete theory of speech perception should systematically map how speech acoustic information is processed bottom-up through the peripheral and central auditory system, as well as how linguistic knowledge interacts top-down with the acoustic-phonetic information to extract meaning.
The nonperceptual reality of the phoneme
Journal of Verbal Learning and Verbal Behavior
Subjects responded as soon as they heard a preset target in a sequence of nonsense syllables. The target was a complete syllable (e.g., "baeb", "saeb") or a phoneme from that syllable: the syllable-initial consonant phoneme for some subjects (e.g., "b-" or "s-"), and the medial vowel phoneme for other subjects (e.g., "-ae-"). Subjects responded more slowly to phoneme targets than to syllable targets (by 40 msec for /s-/, 70 msec for /b-/, and 250 msec for medial /ae/). These results indicate that phoneme identification is subsequent to the perception of larger phonological units. The reality of the phoneme is demonstrated independently of speech perception and production by the natural presence of alphabets, rhymes, spoonerisms, and interphonemic contextual constraints.
The role of phonology in a letter detection task
2000
In two experiments, we investigated whether onsets and rimes have a role in the processing of written English. In both experiments, participants detected letter targets (e.g., t) in nonwords like vult faster than in nonwords like vust. This finding is consistent with the view that sonorants (e.g., the /l/ of vult) cohere with preceding short vowels and are part of the vowel nucleus. In contrast, the /t/ of vust is part of the syllable's coda st and so is harder to isolate. Experiment 2 demonstrated that the time required to detect single-member codas following vowel digraphs (e.g., the t in veet) was similar to the time required to detect the same target letter following a postvocalic sonorant (e.g., the t in vult). No evidence was found for onsets. The results provide support for a phonological organization among letters of printed rimes.
Syllable processing in English
We describe a reaction time study in which listeners detected word or nonword syllable targets (e.g. zoo, trel) in sequences consisting of the target plus a consonant or syllable residue (trelsh, trelshek). The pattern of responses differed from an earlier word-spotting study with the same material, in which words were always harder to find if only a consonant residue remained. The earlier results should thus not be viewed in terms of syllabic parsing, but in terms of a universal role for syllables in speech perception; words which are accidentally present in spoken input (e.g. sell in self) can be rejected when they leave a residue of the input which could not itself be a word.
Role of the syllable in the processing of spoken English: Evidence from a nonword comparison task
1995
Previous research using monitoring tasks suggests that syllables do not play a role in the initial processing of speech by English listeners. The role of syllables in a different task, one involving the speeded comparison of 2 nonwords, was investigated. In 2 experiments, responses to nonword pairs that shared a complete syllable were significantly faster than responses to pairs that shared part of a syllable when the shared unit was at the beginning or in the middle of the nonwords. Results were mixed when the shared unit was at the end of the nonwords, possibly reflecting a confounding effect of rhyme. Findings suggest that syllabified representations of the nonwords may be used in a comparison task, even in English. Results are interpreted relative to different demands of the nonword comparison and monitoring tasks.
Identification of Phonemes: Differences between Phoneme Classes and the Effect of Class Size
Phonetica, 2008
This study reports general and language-specific patterns in phoneme identification. In a series of phoneme monitoring experiments, Castilian Spanish, Catalan, Dutch, English, and Polish listeners identified vowel, fricative, and stop consonant targets that are phonemic in all these languages, embedded in nonsense words. Fricatives were generally identified more slowly than vowels, while the speed of identification for stop consonants was highly dependent on the onset of the measurements. Moreover, listeners' response latencies and accuracy in detecting a phoneme correlated with the number of categories within that phoneme's class in the listener's native phoneme repertoire: more native categories slowed listeners down and decreased their accuracy. We excluded the possibility that this effect stems from differences in the frequencies of occurrence of the phonemes in the different languages. Rather, the effect of the number of categories can be explained by general properties of the perception system, which cause language-specific patterns in speech processing.
Contextual Effects In Vowel Perception II: Evidence for Two Processing Mechanisms
Perception & Psychophysics, 1980
Recent experiments have indicated that contrast effects can be obtained with vowels by anchoring a test series with one of the endpoint vowels. These contextual effects cannot be attributed to feature detector fatigue or to the induction of an overt response bias. In the present studies, anchored ABX discrimination functions and signal detection analyses of identification data (before and after anchoring) for an [i]-[I] vowel series were used to demonstrate that [i] and [I] anchoring produce contrast effects by affecting different perceptual mechanisms. The effects of [i] anchoring were to increase within-[i] category sensitivity, while [I] anchoring shifted criterion placements. When vowels were placed in CVC syllables to reduce available auditory memory, there was a significant decrease in the size of the [I]-anchor contrast effects. The magnitude of the [i]-anchor effect was unaffected by the reduction in vowel information available in auditory memory. These results suggest that [i] and [I] anchors affect mechanisms at different levels of processing. The [i] anchoring results may reflect normalization processes in speech perception that operate at an early level of perceptual processing, while the [I] anchoring results represent changes in response criterion mediated by auditory memory for vowel information.
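The distinction the abstract draws between a sensitivity change and a criterion shift can be illustrated with the standard equal-variance signal detection measures, d' = z(H) - z(F) and c = -(z(H) + z(F)) / 2. The sketch below is only an illustration: the abstract does not say which variant of the analysis the authors used, the function name `sdt_measures` is hypothetical, and the response counts in the usage lines are invented, not data from the study.

```python
from statistics import NormalDist


def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion) under the equal-variance Gaussian model.

    Assumption: a simple yes/no design; the counts are hypothetical inputs.
    """
    # Log-linear correction (add 0.5 per cell) keeps the rates away from 0 and 1,
    # where the inverse normal CDF is undefined.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)

    z = NormalDist().inv_cdf  # z-transform (inverse standard normal CDF)
    d_prime = z(hit_rate) - z(fa_rate)            # sensitivity
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # response bias
    return d_prime, criterion


# Invented counts: a symmetric baseline, then a more conservative responder.
baseline = sdt_measures(80, 20, 20, 80)
conservative = sdt_measures(60, 40, 10, 90)
print(baseline, conservative)
```

In these terms, an anchor that raises d' while leaving c unchanged alters sensitivity, whereas one that moves c alters where the listener places the category boundary.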
Categorial discrimination of vowels produced in syllable context and in isolation
1985
An innovative experimental paradigm that avoids certain problems of response bias in speech perception studies is presented. The paradigm was tested in a replication of an important finding in the perception of American English vowels. The problem was the relative identifiability of vowels in different syllable contexts, /t/-vowel-/t/ (TVT) and isolated vowels (V). The traditional ABX discrimination procedure was converted to a categorial discrimination task by having the three stimuli on each trial spoken by different people. This task requires a match according to vowel category, not acoustic identity. The technique eliminates the response-alternative problems of keyword identification tasks. Although overall error rates were low, the original findings were replicated: Listeners were more accurate when discriminating some vowels in TVT than in V syllables. Results are interpreted as support for a theory that considers dynamic acoustic information important for vowel perception.
The contribution of consonants versus vowels to word recognition in fluent speech
Acoustics, Speech, and …, 1996
Three perceptual experiments were conducted to test the relative importance of vowels vs. consonants to recognition of fluent speech. Sentences were selected from the TIMIT corpus to obtain approximately equal numbers of vowels and consonants within each sentence and equal durations across the set of sentences. In experiments 1 and 2, subjects listened to (a) unaltered TIMIT sentences; (b) sentences in which all of the vowels were replaced by noise; or (c) sentences in which all of the consonants were replaced by noise. The subjects listened to each sentence five times, and attempted to transcribe what they heard. The results of these experiments show that recognition of words depends more upon vowels than consonants: about twice as many words are recognized when vowels are retained in the speech. The effect was observed when occurrences of [l], [r], [w], [y], [m], [n] were included in the sentences (experiment 1) or replaced by noise (experiment 2). Experiment 3 tested the hypothesis that vowel boundaries contain more information about the neighboring consonants than vice versa.
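The stimulus manipulation described above, replacing every vowel or every consonant segment with noise, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function name `replace_with_noise`, the `(start, end, label)` segment format, the Gaussian noise, and the `noise_std` level are all hypothetical choices, and the abstract does not describe how the noise was generated or scaled.

```python
import random


def replace_with_noise(samples, segments, targets, noise_std=0.1, seed=0):
    """Return a copy of `samples` with the targeted segments replaced by noise.

    samples:  list of float audio samples
    segments: list of (start, end, label) tuples, end exclusive (assumed format)
    targets:  set of phone labels to replace (e.g. all vowels)
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    out = list(samples)        # leave the original signal untouched
    for start, end, label in segments:
        if label not in targets:
            continue
        # Overwrite the segment sample by sample with Gaussian noise.
        for i in range(max(start, 0), min(end, len(out))):
            out[i] = rng.gauss(0.0, noise_std)
    return out


# Toy signal with two labeled segments; the labels are illustrative only.
signal = [1.0] * 6
labels = [(0, 3, "s"), (3, 6, "iy")]
vowel_masked = replace_with_noise(signal, labels, targets={"iy"})
consonant_masked = replace_with_noise(signal, labels, targets={"s"})
```

Running the same function with the vowel set and then with the consonant set yields the two altered conditions, (b) and (c), from a single labeled sentence.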