Maintenance of Perceptual Information in Speech Perception
Related papers
Memory maintenance of gradient speech representations is mediated by their expected utility
2019
Language understanding requires listeners to quickly compress large amounts of perceptual information into abstract linguistic categories. Critical cues to those categories are distributed across the speech signal, with some cues appearing substantially later. Speech perception would thus be facilitated if gradient sub-categorical representations of the input are maintained in memory, allowing optimal cue integration. However, indiscriminate maintenance of the high-dimensional signal would tax memory systems. We hypothesize that speech perception balances these pressures by maintaining gradient representations that are expected to facilitate category recognition. Two perception experiments test this hypothesis. Between participants, an initial exposure phase manipulated the utility of information maintenance: in the High-Informativity group, the following context was always informative; in the Low-Informativity group, the following context was always uninformative. A subsequent test phase me...
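The claimed benefit of maintaining gradient representations can be illustrated with a toy Bayesian cue-integration sketch. This is not the authors' model: the two-category setup, the Gaussian cue distributions, and all numeric values are illustrative assumptions. The point it shows is that keeping graded evidence lets a later context cue revise an early tendency, whereas committing to a hard category decision early does not.

```python
import math

def normalize(d):
    z = sum(d.values())
    return {k: v / z for k, v in d.items()}

def gaussian_likelihood(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

# Hypothetical categories with (mean, sd) on one acoustic cue dimension
categories = {"s": (1.0, 1.0), "sh": (-1.0, 1.0)}

# Early, ambiguous acoustic cue slightly favoring /s/
early_cue = 0.2
graded = normalize({c: gaussian_likelihood(early_cue, m, sd)
                    for c, (m, sd) in categories.items()})

# Later context cue arrives, strongly favoring /sh/
late_cue = -1.5
late = {c: gaussian_likelihood(late_cue, m, sd) for c, (m, sd) in categories.items()}

# Gradient maintenance: graded early evidence combines with the late cue
maintained = normalize({c: graded[c] * late[c] for c in categories})

# Early commitment: only the winner-take-all label survives, so the
# late cue has nothing graded left to integrate with
committed = max(graded, key=graded.get)

print(maintained)  # the late cue reverses the early tendency
print(committed)   # the early hard decision ("s") cannot be revised
```

The sketch makes the trade-off in the abstract concrete: maintaining the full posterior costs memory but supports optimal integration of late-arriving cues, which is exactly the utility the exposure manipulation varies between groups.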
Contingent categorisation in speech perception
Language and Cognitive Processes, 2013
The speech signal is notoriously variable, with the same phoneme realised differently depending on factors like talker and phonetic context. Variance in the speech signal has led to a proliferation of theories of how listeners recognise speech. A promising approach, supported by computational modelling studies, is contingent categorisation, wherein incoming acoustic cues are computed relative to expectations. We tested contingent encoding empirically. Listeners were asked to categorise fricatives in CV syllables constructed by splicing the fricative from one CV syllable with the vowel from another CV syllable. The two spliced syllables always contained the same fricative, providing consistent bottom-up cues; however on some trials, the vowel and/or talker mismatched between these syllables, giving conflicting contextual information. Listeners were less accurate and slower at identifying the fricatives in mismatching splices. This suggests that listeners rely on context information beyond bottom-up acoustic cues during speech perception, providing support for contingent categorisation.
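Contingent categorization, as described above, means the same raw acoustic cue is interpreted relative to what the current talker and vowel context lead the listener to expect. A minimal sketch of that idea (the expected-value table, the cue units, and the /s/ vs /sh/ decision rule are all hypothetical, chosen only to show how a context mismatch changes the outcome):

```python
# Hypothetical expected cue value (arbitrary units) given (talker, vowel) context
expected = {("talkerA", "a"): 5.0, ("talkerA", "u"): 3.0,
            ("talkerB", "a"): 7.0, ("talkerB", "u"): 5.0}

def relative_cue(raw_cue, talker, vowel):
    """Encode the incoming cue relative to the context's expected value."""
    return raw_cue - expected[(talker, vowel)]

def categorize(rel):
    # Illustrative rule: positive residual -> /s/, negative -> /sh/
    return "s" if rel > 0 else "sh"

raw = 6.0

# Matching splice: fricative and vowel share a context, residual is positive
print(categorize(relative_cue(raw, "talkerA", "a")))  # "s"

# Mismatching splice: same raw cue, but the conflicting context predicts a
# different baseline, flipping the relative encoding and the category
print(categorize(relative_cue(raw, "talkerB", "a")))  # "sh"
```

Identical bottom-up input yields different percepts once the contextual expectation changes, which is the signature the splicing manipulation in the experiment was designed to detect.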
Auditory discontinuities interact with categorization: Implications for speech perception
Journal of The Acoustical Society of America, 2004
Behavioral experiments with infants, adults, and nonhuman animals converge with neurophysiological findings to suggest that there is a discontinuity in auditory processing of stimulus components differing in onset time by about 20 ms. This discontinuity has been implicated as a basis for boundaries between speech categories distinguished by voice onset time (VOT). Here, it is investigated how this discontinuity interacts with the learning of novel perceptual categories. Adult listeners were trained to categorize nonspeech stimuli that mimicked certain temporal properties of VOT stimuli. One group of listeners learned categories with a boundary coincident with the perceptual discontinuity. Another group learned categories defined such that the perceptual discontinuity fell within a category. Listeners in the latter group required significantly more experience to reach criterion categorization performance. Evidence of interactions between the perceptual discontinuity and the learned categories extended to generalization tests as well. It has been hypothesized that languages make use of perceptual discontinuities to promote distinctiveness among sounds within a language inventory. The present data suggest that discontinuities interact with category learning. As such, "learnability" may play a predictive role in selection of language sound inventories.
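Why a boundary aligned with the ~20 ms discontinuity should be easier to learn can be sketched with a toy perceptual mapping: stimuli on either side of the discontinuity are perceived as farther apart than their physical difference alone would predict. The mapping below is illustrative only (the jump size and stimulus values are assumptions, not the paper's measurements).

```python
def perceived(onset_ms):
    # Toy mapping: a perceptual jump at the ~20 ms onset-time discontinuity
    return onset_ms + (8.0 if onset_ms >= 20 else 0.0)

stimuli = [0, 10, 20, 30, 40, 50, 60]  # hypothetical onset-time continuum

def boundary_separation(boundary_ms):
    """Perceived distance between the closest stimuli straddling a boundary."""
    below = max(s for s in stimuli if s < boundary_ms)
    above = min(s for s in stimuli if s >= boundary_ms)
    return perceived(above) - perceived(below)

# Boundary coincident with the discontinuity: extra perceptual separation
print(boundary_separation(20))  # 18.0 (10 ms physical + 8 ms jump)

# Boundary placed within a perceptually uniform region: no such boost
print(boundary_separation(40))  # 10.0
```

Under this sketch, a category boundary at the discontinuity inherits "free" between-category distinctiveness, consistent with the finding that the misaligned group needed more training to reach criterion.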
The Dynamic Nature of Speech Perception
Language and Speech, 2006
The speech perception system must be flexible in responding to the variability in speech sounds caused by differences among speakers and by language change over the lifespan of the listener. Indeed, listeners use lexical knowledge to retune perception of novel speech (Norris, McQueen, & Cutler, 2003). In that study, Dutch listeners made lexical decisions to spoken stimuli, including words with an ambiguous fricative (between [f] and [s]), in either [f]- or [s]-biased lexical contexts. In a subsequent categorization test, the former group of listeners identified more sounds on an [εf]—[εs] continuum as [f] than the latter group. In the present experiment, listeners received the same exposure and test stimuli, but did not make lexical decisions to the exposure items. Instead, they counted them. Categorization results were indistinguishable from those obtained earlier. These adjustments in fricative perception therefore do not depend on explicit judgments during exposure. This learning...
Psychological Science, 2010
Speech sounds are highly variable, yet listeners readily extract information from them and transform continuous acoustic signals into meaningful categories during language comprehension. A central question is whether perceptual encoding captures acoustic detail in a one-to-one fashion or whether it is affected by phonological categories. We addressed this question in an event-related potential (ERP) experiment in which listeners categorized spoken words that varied along a continuous acoustic dimension (voice-onset time, or VOT) in an auditory oddball task. We found that VOT effects were present through a late stage of perceptual processing (N1 component, ~100 ms poststimulus) and were independent of categorization. In addition, effects of within-category differences in VOT were present at a postperceptual categorization stage (P3 component, ~450 ms poststimulus). Thus, at perceptual levels, acoustic information is encoded continuously, independently of phonological information. Further, at phonological levels, fine-grained acoustic differences are preserved along with category information.