What do you expect from an unfamiliar talker?

Acoustic differences, listener expectations, and the perceptual accommodation of talker variability

Journal of Experimental Psychology: Human Perception and Performance, 2007

Two talkers' productions of the same phoneme may be quite different acoustically, whereas their productions of different speech sounds may be virtually identical. Despite this lack of invariance in the relationship between the speech signal and linguistic categories, listeners experience phonetic constancy across a wide range of talkers, speaking styles, linguistic contexts, and acoustic environments. The authors present evidence that perceptual sensitivity to talker variability involves an active cognitive mechanism: Listeners expecting to hear 2 different talkers differing only slightly in average pitch showed performance costs typical of adjusting to talker variability, whereas listeners hearing the same materials but expecting a single talker or given no special instructions did not show these performance costs. The authors discuss the implications for understanding phonetic constancy despite variability between talkers (and other sources of variability) and for theories of speech perception. The results provide further evidence for active, controlled processing in real-time speech perception and are consistent with a model of talker normalization that involves contextual tuning.

Talker familiarity and the accommodation of talker variability

Attention, Perception & Psychophysics, 2021

A fundamental problem in speech perception is how (or whether) listeners accommodate variability in the way talkers produce speech. One view of the way listeners cope with this variability is that talker differences are normalized: a mapping between talker-specific characteristics and phonetic categories is computed such that speech is recognized in the context of the talker's vocal characteristics. Consistent with this view, listeners process speech more slowly when the talker changes randomly than when the talker remains constant. An alternative view is that speech perception is based on talker-specific auditory exemplars in memory clustered around linguistic categories that allow talker-independent perception. Consistent with this view, listeners become more efficient at talker-specific phonetic processing after voice identification training. We asked whether phonetic efficiency would increase with talker familiarity by testing listeners with extremely familiar talkers (family members), newly familiar talkers (based on laboratory training), and unfamiliar talkers. We also asked whether familiarity would reduce the need for normalization. As predicted, phonetic efficiency (word recognition in noise) increased with familiarity (unfamiliar < trained-on < family). However, we observed a constant processing cost for talker changes even for pairs of family members. We discuss how normalization and exemplar theories might account for these results, and constraints the results impose on theoretical accounts of phonetic constancy.

Reliably biased: The role of listener expectation in the perception of second language speech

2013

Second language pronunciation research and teaching relies on human listeners to assess second language speakers’ performance. Most applied linguists working in this area have been satisfied that listener ratings are reasonably reliable when well-controlled research protocols are implemented. We argue, however, that listeners demonstrate a certain amount of reliability in their ratings of speakers stemming from shared expectations of a speaker's language and social groups, rather than from the speech itself. In this article, we discuss evidence from perceptual psychology, sociolinguistics, and phonetics demonstrating a sizable listener influence on speech perception. We conclude by suggesting ways for research and teaching to acknowledge and contend with the role of the listener.

Speech perception and generalization across speakers and accents

The seeming ease with which we usually understand each other belies the complexity of the processes that underlie speech perception. One of the biggest computational challenges is that different talkers realize the same speech categories (e.g., /p/) in physically different ways. We review the mixture of processes that enable robust speech understanding across talkers despite this lack of invariance. These processes range from automatic pre-speech adjustments of the distribution of energy over acoustic frequencies (normalization) to implicit statistical learning of talker-specific properties (adaptation, perceptual recalibration), to the generalization of these patterns across groups of talkers (e.g., gender differences).

First Impressions and Last Resorts: How Listeners Adjust to Speaker Variability

PsycEXTRA Dataset, 2007

Perceptual theories must explain how perceivers extract meaningful information from a continuously variable physical signal. In the case of speech, the puzzle is that little reliable acoustic invariance seems to exist. We tested the hypothesis that speech-perception processes recover invariants not about the signal, but rather about the source that produced the signal. Findings from two manipulations suggest that the system learns those properties of speech that result from idiosyncratic characteristics of the speaker; the same properties are not learned when they can be attributed to incidental factors. We also found evidence for how the system determines what is characteristic: In the absence of other information about the speaker, the system relies on episodic order, representing those properties present during early experience as characteristic of the speaker. This "first-impressions" bias can be overridden, however, when variation is an incidental consequence of a temporary state (a pen in the speaker's mouth), rather than characteristic of the speaker.

Cross-talker generalization in the perception of non-native speech: a large-scale replication

Speech perception depends on the ability to generalize previously experienced input effectively across talkers. How such cross-talker generalization is achieved has remained an open question. In a seminal study, Bradlow & Bent (2008, henceforth BB08) found that exposure to just five minutes of accented speech can elicit improved recognition that generalizes to an unfamiliar talker of the same accent (N=70 participants). Cross-talker generalization was, however, only observed after exposure to multiple talkers of the accent, not after exposure to a single accented talker. This contrast between single- and multi-talker exposure has been highly influential beyond research on speech perception, suggesting a critical role of exposure variability in learning and generalization. We assess the replicability of BB08’s findings in two large-scale perception experiments (total N=640) including 20 unique combinations of exposure and test talkers. Like BB08, we find robust evidence for cross-talker generalization after multi-talker exposure. Unlike BB08, we also find evidence for generalization after single-talker exposure. The degree of cross-talker generalization depends on the specific combination of exposure and test talker. This and other recent findings suggest that exposure to cross-talker variability is not necessary for cross-talker generalization. Variability during exposure might affect generalization only indirectly, mediated through the informativeness of exposure about subsequent speech during test: similarity-based inferences can explain both the original BB08 and the present findings. We present Bayesian data analysis, including Bayesian meta-analyses and replication tests for generalized linear mixed models. All data, stimuli, and reproducible literate (R markdown) code are shared via OSF.

Talker-specific learning in speech perception

Attention Perception & Psychophysics, 1998

The effects of perceptual learning of talker identity on the recognition of spoken words and sentences were investigated in three experiments. In each experiment, listeners were trained to learn a set of 10 talkers’ voices and were then given an intelligibility test to assess the influence of learning the voices on the processing of the linguistic content of speech. In the first experiment, listeners learned voices from isolated words and were then tested with novel isolated words mixed in noise. The results showed that listeners who were given words produced by familiar talkers at test showed better identification performance than did listeners who were given words produced by unfamiliar talkers. In the second experiment, listeners learned novel voices from sentence-length utterances and were then presented with isolated words. The results showed that learning a talker’s voice from sentences did not generalize well to identification of novel isolated words. In the third experiment, listeners learned voices from sentence-length utterances and were then given sentence-length utterances produced by familiar and unfamiliar talkers at test. We found that perceptual learning of novel voices from sentence-length utterances improved speech intelligibility for words in sentences. Generalization and transfer from voice learning to linguistic processing was found to be sensitive to the talker-specific information available during learning and test. These findings demonstrate that increased sensitivity to talker-specific information affects the perception of the linguistic properties of speech in isolated words and sentences.

Speech perception as probabilistic inference under uncertainty based on social-indexical knowledge

2016

Word count: 10,184 (excluding abstract, references and appendices). This manuscript is currently under review. Please don't cite this without the authors' permission. We'll appreciate your comments and feedback. Please direct them to

Abstract: Talkers differ from each other in how they pronounce the same phonetic contrast. In speech perception, such inter-talker variability contributes to the lack of invariance problem, creating uncertainty about the mapping between acoustic cues and linguistic representations. However, inter-talker variability is not random: talker-specific cue-to-category mappings are often systematically conditioned on social group membership (including, e.g., a talker's gender and age, but also sociolect and dialect). There is now substantial evidence that listeners take advantage of these statistical contingencies. We provide an introduction to how such sensitivity can be productively understood in terms of ideal obse...

Short-Term Perceptual Tuning to Talker Characteristics

PsycEXTRA Dataset, 2012

When a listener encounters an unfamiliar talker, the ensuing perceptual accommodation to the unique characteristics of the talker has two aspects: (1) the listener assesses acoustic characteristics of speech to resolve the properties of the talker's sound production; and, (2) the listener appraises the talker's idiolect, subphonemic phonetic properties that compose the finest grain of linguistic production. A new study controlled a listener's exposure to determine whether the perceptual benefit rests on specific segmental experience. Effects of sentence exposure were measured using a spoken word identification task of Easy words

Perceptual learning, talker specificity, and sound change

Papers in Historical Phonology, 2020

Perceptual learning is when listeners hear novel speech input and shift their subsequent perceptual behavior. In this paper we consider the relationship between sound change and perceptual learning. We spell out the connections we see between perceptual learning and different approaches to sound change and explain how a deeper empirical understanding of the properties of perceptual learning might benefit sound change models. We propose that questions about when listeners generalize their perceptual learning to new talkers might be of particular interest to theories of sound change. We review the relevant literature, noting that studies of perceptual learning generalization across talkers of the same gender are lacking. Finally, we present new experimental data aimed at filling that gap by comparing cross-talker generalization of fricative boundary perceptual learning in same-gender and different-gender pairs. We find that listeners are much more likely to generalize what they hav...

