Vsevolod Kapatsinski | University of Oregon

Papers by Vsevolod Kapatsinski

The Architecture of Grammar in Artificial Grammar Learning: Formal Biases in the Acquisition of Morphophonology and the Nature of the Learning Task

Submitted to the faculty of the University Graduate School in partial fulfillment of the requirements for the degree Doctor of Philosophy in the Departments of Linguistics and Cognitive Science Indiana University May 2009 ... Accepted by the Graduate Faculty, Indiana University, in ...

Syntagmatic paradigms: learning correspondence from contiguity

Learning from a single cue: Is phonetic learning dimension-based?

The Journal of the Acoustical Society of America

Phonetic cue-weighting, the process of altering the weights of certain dimensions (e.g., F0) in the speech signal, is a fundamental process in speech perception. Cue-reweighting is the process of adaptation required for understanding new accents and learning second language speech contrasts; however, little is understood about the underlying mechanisms. Harmon et al. (2019) examined three candidate mechanisms (distributional, supervised, and reinforcement learning), showing evidence for reinforcement learning. The current study investigates Harmon et al.’s (2019) assumed phonetic dimensions by asking how a single cue in a phonetic dimension (e.g., a single voice onset time (VOT) value) of a phonological contrast ([b]/[p]) generalizes to other values of the phonetic dimension. Put more simply, is phonetic learning dimension-based? Native English listeners (N = 270) participated in an online perceptual training experiment in which participants were asked to identify word contrasts like pe...

Learning to unlearn: The role of negative evidence in morphophonological learning

Phonological forms often express many different morphological functions. Learning these functions is quite challenging. How do learners accomplish this? Linguistic studies often assume that learning involves co-occurrence of cues and outcomes (contiguity), but research shows that mere contiguity cannot explain learning effects (Nixon, 2020; Ramscar et al., 2010). Error-driven learning theories (Rescorla and Wagner, 1972) instead assume learning to be contingency-based: cues predict outcomes, and learning is viewed as a continuous process of adjusting predictions on the basis of errors. Cues compete with each other to predict outcomes. This cue competition drives learning. One effect of cue competition is unlearning (a cue losing its association to an outcome), which has not yet been investigated empirically in morphophonological learning. This paper reports an artificial language learning experiment, as well as computational simulations, in which we investigated unlearning in morp...
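The error-driven update described in this abstract can be illustrated with a minimal sketch of the Rescorla-Wagner rule. This is not the paper's simulation: the cue names `A` and `B`, the learning rate, and the trial schedule are all hypothetical choices made for illustration. On each trial, every cue that is present has its weight adjusted by the same prediction error, so cues compete to predict the outcome.

```python
def rescorla_wagner(trials, rate=0.1, lam=1.0):
    """Run the Rescorla-Wagner update over a sequence of trials.

    trials: iterable of (cues, outcome_present) pairs, where cues is a
    tuple of cue names. The rate and lam values are illustrative only.
    """
    weights = {}
    for cues, outcome_present in trials:
        # The prediction is the summed weight of all cues present on the trial.
        prediction = sum(weights.get(c, 0.0) for c in cues)
        target = lam if outcome_present else 0.0
        error = target - prediction
        # Every present cue is adjusted by the same shared error term.
        for c in cues:
            weights[c] = weights.get(c, 0.0) + rate * error
    return weights


# Hypothetical schedule: A and B together predict the outcome,
# but A on its own is followed by no outcome (negative evidence).
trials = [(("A", "B"), True), (("A",), False)] * 200
weights = rescorla_wagner(trials)
print(weights)
```

Under this schedule, `A` first gains some association with the outcome and then loses it on the trials where it occurs alone, while `B` ends up carrying nearly all of the prediction. That loss is one simple instance of a cue losing its association to an outcome.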

How Agglutinative? Searching for Cues to Meaning in Choguita Rarámuri (Tarahumara) Using Discriminative Learning

Morphological Diversity and Linguistic Cognition

The best-laid plans of mice and men: Competition between top-down and preceding-item cues in plan execution

There is evidence that the process of executing a planned utterance involves the use of both preceding-context and top-down cues. Utterance-initial words are cued only by the top-down plan. In contrast, non-initial words are cued both by top-down cues and preceding-context cues. Co-existence of both cue types raises the question of how they interact during learning. We argue that this interaction is competitive: items that tend to be preceded by predictive preceding-context cues are harder to activate from the plan without this predictive context. A novel computational model of this competition is developed. The model is tested on a corpus of repetition disfluencies and shown to account for the influences on patterns of restarts during production. In particular, this model predicts a novel Initiation Effect: following an interruption, speakers re-initiate production from words that tend to occur in utterance-initial position, even when they are not initial in the interrupted utterance.

Determinants of Lengths of Repetition Disfluencies: Probabilistic syntactic constituency in speech production

Usage-based theories of grammar suggest that constituent structure emerges in part from co-occurrence: items used together fuse together, forming cohesive, hard-to-interrupt units (Bybee 2002, see also Gregory et al. 1999, Kapatsinski 2010, Stefanowitsch & Gries 2003). This study is an effort to investigate the effects of co-occurrence on constituent structure in language production. We investigate these effects by looking at repetition disfluencies, in which one or more elements in the sentence are repeated after an interruption point in speech. Repetition disfluencies have been argued to be sensitive to constituency: the speaker restarts production from the major constituent boundary nearest to the point at which the flow of speech was interrupted (the "interruption point", Clark from the more general hypothesis that the more cohesive a unit, the less likely it is to be interrupted (Kapatsinski 2010). For example, speech production is never restarted f...

Fuse to be used: A weak cue's guide to attracting attention

Cognitive Science, 2016

Several studies examined cue competition in human learning by testing learners on a combination of conflicting cues rooting for different outcomes, with each cue perfectly predicting its outcome. A common result has been that learners faced with cue conflict choose the outcome associated with the rare cue (the Inverse Base Rate Effect, IBRE). Here, we investigate cue competition including IBRE with sentences containing cues to meanings in a visual world. We do not observe IBRE. Instead we find that position in the sentence strongly influences cue salience. Faced with conflict between an initial cue and a non-initial cue, learners choose the outcome associated with the initial cue, whether frequent or rare. However, a frequent configuration of non-initial cues that are not sufficiently salient on their own can overcome a competing salient initial cue rooting for a different meaning. This provides a possible explanation for certain recurring patterns in language change.

Frequency Effects in Morphologisation of Korean /n/-Epenthesis

This study accounts for Korean /n/-epenthesis from a usage-based perspective, by describing the reduced productivity of epenthesis as an analogical change in progress. We found that epenthesis probability rises as whole-word frequency increases, supporting the hypothesis that analogical change begins in low-frequency words (Bybee 2002). We interpret the findings as support for the idea that frequent forms are stored and retrieved in production directly while rare words may be derived using grammar. The results further support the existence of morphological strata in Korean. We show that the constituents undergoing /n/-epenthesis are largely limited to native rather than Sino-Korean morphemes. However, not all native morphemes are able to trigger /n/-epenthesis. We argue that particular native morphemes are associated with, and able to trigger, epenthesis to the extent that they tend to occur in epenthesis-favoring contexts (Bybee 2002, Raymond & Brown 2012).

A Hebbian account of entrenchment and (over)-extension in language learning

Cognitive Science, 2017

In production, frequently used words are preferentially extended to new, though related meanings. In comprehension, frequent exposure to a word instead makes the learner confident that all of the word’s legitimate uses have been experienced, resulting in an entrenched form-meaning mapping between the word and its experienced meaning(s). This results in a perception-production dissociation, where the forms speakers are most likely to map onto a novel meaning are precisely the forms that they believe can never be used that way. At first glance, this result challenges the idea of bidirectional form-meaning mappings, assumed by all current approaches to linguistic theory. In this paper, we show that bidirectional form-meaning mappings are not in fact challenged by this production-perception dissociation. We show that the production-perception dissociation is expected even if learners of the lexicon acquire simple symmetrical form-meaning associations through simple Hebbian learning.

A theory of repetition and retrieval in language production

Psychological Review, 2021

Repetition appears to be part of error correction and action preparation in all domains that involve producing an action sequence. The present work contends that the ubiquity of repetition is due to its role in resolving a problem inherent to planning and retrieval of action sequences: the Problem of Retrieval. Repetitions occur when the production to perform next is not activated enough to be executed. Repetitions are helpful in this situation because the repeated action sequence activates the likely continuation. We model a corpus of natural speech using a recurrent network, with words as units of production. We show that repeated material makes upcoming words more predictable, especially when more than one word is repeated. Speakers are argued to produce multiword repetitions by using backward associations to reactivate recently produced words. The existence of multiword repetitions means that speakers must decide where to reinitiate execution from. We show that production restarts from words that have seldom occurred in a predictive preceding-word context and have often occurred utterance-initially. These results are explained by competition between preceding-context and top-down cues over the course of language learning. The proposed theory improves on structural accounts of repetition disfluencies, and integrates repetition disfluencies in language production with repetitions observed in other domains of skilled action.

Evaluating Logistic Mixed-Effects Models of Corpus-Linguistic Data in Light of Lexical Diffusion

Quantitative Methods in the Humanities and Social Sciences, 2018

Evaluating the performance of mixed-effects models on the data they are trained on leads to problems in estimating model goodness. Nonetheless, mixed-effects models are preferable for corpus data, where some items have many more observations than others, because not having random effects in the model can cause fixed-effects coefficients to be overly influenced by frequent items, which are often exceptional. We explore methods for evaluating logistic mixed-effects models of both corpus and experimental data types through simulations. We suggest that the model should be tested on data it has not been trained on using some method of cross-validation and that all items (e.g., words), rather than observations, should contribute equally to estimated accuracy of the model.
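The difference between observation-weighted and item-weighted accuracy can be shown with a small sketch. The function names, the toy item labels, and the correctness scores below are hypothetical and are not taken from the chapter. The sketch assumes that each prediction was made for an observation held out from training, for example by cross-validation, and that `correct` records whether that prediction matched the observed response.

```python
from collections import defaultdict


def observation_accuracy(correct):
    """Proportion of correct predictions, with every observation counted once."""
    return sum(correct) / len(correct)


def item_balanced_accuracy(items, correct):
    """Mean of per-item accuracies, so every item counts equally
    regardless of how many observations it contributes."""
    per_item = defaultdict(list)
    for item, ok in zip(items, correct):
        per_item[item].append(ok)
    item_means = [sum(scores) / len(scores) for scores in per_item.values()]
    return sum(item_means) / len(item_means)


# Toy held-out predictions: one frequent item that the model gets right
# and one rare item that it gets wrong.
items = ["frequent"] * 8 + ["rare"] * 2
correct = [1] * 8 + [0] * 2

obs_acc = observation_accuracy(correct)            # 8 of 10 observations
item_acc = item_balanced_accuracy(items, correct)  # mean of 1.0 and 0.0
print(obs_acc, item_acc)
```

In this toy case the observation-weighted score is 0.8 while the item-weighted score is 0.5, because the frequent item no longer dominates the estimate.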

Acoustic cues of vowel quality to coda nasal perception in southern Min

The spoken language research laboratories (SLRL) at the University of Oregon

The Journal of the Acoustical Society of America, 2020

The Spoken Language Research Laboratories (SLRL) at the University of Oregon houses 5 integrated laboratories that focus on speech communication. The SLRL occupies nearly 4000 square feet and includes the following state-of-the-art facilities: 10 sound-attenuated subject running rooms; 3 sound-attenuated (clinical-type) observation rooms; 2 single-wall sound booths in another sound-attenuated room; 2 waiting rooms; 1 large computer lab; 2 graduate student workrooms; and 8 offices. The SLRL is further equipped with all necessary high-quality audio and audio-visual recording equipment. Research at the SLRL ranges from work on language variation and change to first and second language acquisition to the perception and production of spoken language. The labs provide a dynamic, supportive environment for collaborative research and training in experimental design, acoustic and speech movement analysis, statistical analysis, grant writing, science communication and community outreach. The S...

What happens to large changes? Saltation produces well-liked outputs that are hard to generate

Laboratory Phonology: Journal of the Association for Laboratory Phonology, 2018

Saltatory alternations 'skip over' intermediate sounds, as in k~s skipping over [t]. Recent research has argued that saltation is diachronically unstable and documented one possible cause of instability: Learners exposed to saltatory alternations may overgeneralize them to intermediate sounds. However, this research has trained participants to criterion or excluded participants who did not reach criterion accuracy on familiar sounds. In first language acquisition, learners of languages with saltatory patterns cannot hope to receive more exposure to the pattern than those learning non-saltatory patterns. For this reason, we examined learning of saltatory and non-saltatory patterns after a constant amount of training. We compared saltatory labial palatalization to non-saltatory alveolar and velar palatalization. Participants showed overgeneralization of saltatory palatalization in a judgment task. However, saltatory alternations did not result in increased rates of palatalizing similar sounds, compared to non-saltatory alternations. Instead, saltatory alternations were less likely to be produced than non-saltatory alternations. These results suggest that large, saltatory alternations may be diachronically unstable because they are harder to (learn to) produce. Instead of being overgeneralized to intermediate sounds, saltatory alternations may disappear from the language by losing productivity and being replaced with faithful mappings.

Distributional learning is error-driven: the role of surprise in the acquisition of phonetic categories

Linguistics Vanguard, 2018

Much previous research on distributional learning and phonetic categorization assumes that categories are either faithful reproductions or parametric summaries of experienced frequency distributions, acquired through a Hebbian learning process in which every experience contributes equally to the category representation. We suggest that category representations may instead be formed via error-driven predictive learning. Rather than passively storing tagged category exemplars or updating parametric summaries of token counts, learners actively anticipate upcoming events and update their beliefs in proportion to how surprising/unexpected these events turn out to be. As a result, rare category members exert a disproportionate influence on the category representation. We present evidence for this hypothesis from a distributional learning experiment on acquiring a novel phonetic category, and show that the results are well described by a classic error-driven learning model (Rescorla, R. A....

The power of a unimodal distribution in cue reweighting: Unimodality vs prediction error as signs of cue irrelevance

The Journal of the Acoustical Society of America, 2017

Maye & Gerken (2000) proposed that sound categories can be learned from probability distributions: a unimodal distribution suggests a single category, while a bimodal one suggests two contrasting ones. Research on distributional learning has focused on developing a contrast through exposure to a bimodal distribution. Here, we instead investigate how exposure to a unimodal distribution affects perception of a pre-existing multidimensional contrast (voicing, for which the primary cue is VOT). A total of 60 adult native English speakers were exposed to either bimodal or unimodal VOT distributions spanning the unaspirated/aspirated boundary (bear/pear). However, we paired acoustic stimuli with pictures of bears and pears independently of VOT in training. For each stimulus, participants were asked to guess the referent and received (random) feedback, generating an error signal that suggested VOT is no longer informative and should be downweighted. In this design, the bimodal distribution ...

Lay Listener Classification and Evaluation of Typical and Atypical Children's Speech

Language and Speech, 2017

Verbal children with autism spectrum disorder (ASD) often also have atypical speech. In the context of the many challenges associated with ASD, do speech sound pattern differences really matter? The current study addressed this question. Structured spontaneous speech was elicited from 34 children: 17 with ASD, whose clinicians reported unusual speech prosody; and 17 typically developing, age-matched controls. Multiword utterances were excerpted from each child's speech sample and presented to young adult listeners, who had no clinical training or experience. In Experiment 1, listeners classified band-pass filtered and unaltered excerpts as "typical" or "disordered". Children with ASD were only distinguished from typical children based on unaltered speech, but the analyses indicated unique contributions from speech sound patterns. In Experiment 2, listeners provided likeability ratings on the filtered and unaltered excerpts. Again, lay listeners only distingui...

Putting old tools to novel uses: The role of form accessibility in semantic extension

Cognitive Psychology, Jan 19, 2017

An increase in frequency of a form has been argued to result in semantic extension (Bybee, 2003; Zipf, 1949). Yet, research on the acquisition of lexical semantics suggests that a form that frequently co-occurs with a meaning gets restricted to that meaning (Xu & Tenenbaum, 2007). The current work reconciles these positions by showing that, through its effect on form accessibility, frequency causes semantic extension in production, while at the same time causing entrenchment in comprehension. Repeatedly experiencing a form paired with a specific meaning makes one more likely to re-use the form to express related meanings, while also increasing one's confidence that the form is never used to express those meanings. Recurrent pathways of semantic change are argued to result from a tug of war between the production-side pressure to reuse easily accessible forms and the comprehension-side confidence that one has seen all possible uses of a frequent form.

Temporal structure of repetition disfluencies in American English

Journal of the Acoustical Society of America, 2016

A repetition disfluency involves an interruption in the flow of speech followed by a restart, leading to repetition of one or more words. We analyzed the temporal structure of one-word, two-word, and three-word repetition disfluencies in the Switchboard Corpus (one-word: n = 30546; two-word: n = 8102; three-word: n = 1606). Comparing durations of words preceding an interruption point to their repeated counterparts, we find that repetition is typically accompanied by prolongation, which mainly influences the last word preceding the interruption point. When prolongation does not provide enough time for planning upcoming speech (as there seems to be a temporal limit to prolongation), the speaker repeats parts of the utterance just produced. Our results demonstrate that the number of words repeated is determined both by word duration and by co-occurrence relations between words. Mixed effects logistic regression analysis revealed that longer words are less likely to be repeated (z=-24.45, p<....

Research paper thumbnail of The Architecture of Grammar In Artificial Grammar Learning: Formal Biases In the Acquisition of Morphophonology and the Nature of the Learning Task

Submitted to the faculty of the University Graduate School in partial fulfillment of the requirem... more Submitted to the faculty of the University Graduate School in partial fulfillment of the requirements for the degree Doctor of Philosophy in the Departments of Linguistics and Cognitive Science Indiana University May 2009 ... Accepted by the Graduate Faculty, Indiana University, in ...

Research paper thumbnail of Syntagmatic paradigms: learning correspondence from contiguity

Research paper thumbnail of Learning from a single cue: Is phonetic learning dimension-based?

The Journal of the Acoustical Society of America

Phonetic cue-weighting, the process of altering the weights of certain dimensions (e.g., F0) in t... more Phonetic cue-weighting, the process of altering the weights of certain dimensions (e.g., F0) in the speech signal, is a fundamental process in speech perception. Cue-reweighting is the process of adaptation required for understanding new accents and learning second language speech contrasts; however, little is understood about the underlying mechanisms. Harmon et al. (2019) examined three candidate mechanisms (distributional, supervised, and reinforcement learning) showing evidence for reinforcement learning. The current study investigates Harmon et al.’s (2019) assumed phonetic dimensions by asking how a single cue in a phonetic dimension (e.g., a single voice onset time (VOT) value) of a phonological contrast ([b]/[p]) generalizes to other values of the phonetic dimension. Said simpler, is phonetic learning dimension-based? Native English listeners (N = 270) participated in an online perceptual training experiment in which participants were asked to identify word contrasts like pe...

Research paper thumbnail of Learning to unlearn: The role of negative evidence in morphophonological learning

Phonological forms often express many different morphological functions. Learning these functions... more Phonological forms often express many different morphological functions. Learning these functions is quite challenging. How do learners accomplish this? Linguistic studies often assume that learning involves co-occurrence of cues and outcomes –contiguity–, but research shows that mere contiguity cannot explain learning effects (Nixon, 2020, Ramscar et al., 2010). Error-driven learning theories (Rescorla and Wagner, 1972) instead assume learning to be contingency-based: Cues predict outcomes, and learning is viewed as a continuous process of adjusting predictions on the basis of errors. Cues compete with each other to predict outcomes. This cue competition drives learning. One effect of cue competition is unlearning – a cue losing its association to an outcome –, which has not yet been investigated empirically in morphophonological learning. This paper reports an artificial language learning experiment, as well as computational simulations, in which we investigated unlearning in morp...

Research paper thumbnail of How Agglutinative? Searching for Cues to Meaning in Choguita Rarámuri (Tarahumara) Using Discriminative Learning

Morphological Diversity and Linguistic Cognition

Research paper thumbnail of The best-laid plans of mice and men: Competition between top-down and preceding-item cues in plan execution

There is evidence that the process of executing a planned utterance involves the use of both prec... more There is evidence that the process of executing a planned utterance involves the use of both preceding-context and topdown cues. Utterance-initial words are cued only by the topdown plan. In contrast, non-initial words are cued both by top-down cues and preceding-context cues. Co-existence of both cue types raises the question of how they interact during learning. We argue that this interaction is competitive: items that tend to be preceded by predictive preceding-context cues are harder to activate from the plan without this predictive context. A novel computational model of this competition is developed. The model is tested on a corpus of repetition disfluencies and shown to account for the influences on patterns of restarts during production. In particular, this model predicts a novel Initiation Effect: following an interruption, speakers re-initiate production from words that tend to occur in utterance-initial position, even when they are not initial in the interrupted utterance.

Research paper thumbnail of Determinants of Lengths of Repetition Disfluencies : Probabilistic syntactic constituency in speech production

1 Introduction Usage-based theories of grammar suggest that constituent structure emerges in part... more 1 Introduction Usage-based theories of grammar suggest that constituent structure emerges in part from co-occurrence: items used together fuse together forming cohesive, hard-to-interrupt units (Bybee 2002, see also Gregory et al. 1999, Kapatsinski 2010, Stefanowitsch & Gries 2003). This study is an effort to investigate the effects of co-occurrence on constituent structure in language production. We investigate these effects by looking at repetition disfluencies, in which one or more elements in the sentence are repeated after an interruption point in speech. Repetition disfluencies have been argued to be sensititive to constituency: the speaker restarts production from the major constituent boundary nearest to the point at which the flow of speech was interrupted (the " interruption point " , Clark from the more general hypothesis that the more cohesive a unit, the less likely it is to be interrupted (Kapatsinski 2010). For example, speech production is never restarted f...

Research paper thumbnail of Fuse to be used: A weak cue's guide to attracting attention

Cognitive Science, 2016

Several studies examined cue competition in human learning by testing learners on a combination o... more Several studies examined cue competition in human learning by testing learners on a combination of conflicting cues rooting for different outcomes, with each cue perfectly predicting its outcome. A common result has been that learners faced with cue conflict choose the outcome associated with the rare cue (the Inverse Base Rate Effect, IBRE). Here, we investigate cue competition including IBRE with sentences containing cues to meanings in a visual world. We do not observe IBRE. Instead we find that position in the sentence strongly influences cue salience. Faced with conflict between an initial cue and a non-initial cue, learners choose the outcome associated with the initial cue, whether frequent or rare. However, a frequent configuration of non-initial cues that are not sufficiently salient on their own can overcome a competing salient initial cue rooting for a different meaning. This provides a possible explanation for certain recurring patterns in language change.

Research paper thumbnail of Frequency Effects in Morphologisation of Korean / n /-Epenthesis

This study accounts for Korean /n/-epenthesis from a usage-based perspective, by describing the r... more This study accounts for Korean /n/-epenthesis from a usage-based perspective, by describing the reduced productivity of epenthesis as an analogical change in progress. We found that epenthesis probability rises as whole-word frequency increases, supporting the hypothesis that analogical change begins in lowfrequency words (Bybee 2002). We interpret the findings as support for the idea that frequent forms are stored and retrieved in production directly while rare words may be derived using grammar. The results further support the existence of morphological strata in Korean. We show that the constituents undergoing /n/epenthesis are largely limited to native rather than Sino-Korean morphemes. However, not all native morphemes are able to trigger /n/ epenthesis. We argue that particular native morphemes are associated with, and able to trigger, epenthesis to the extent that they tend to occur in epenthesis-favoring contexts (Bybee 2002, Raymond & Brown 2012).

Research paper thumbnail of A Hebbian account of entrenchment and (over)-extension in language learning

Cognitive Science, 2017

In production, frequently used words are preferentially extended to new, though related meanings.... more In production, frequently used words are preferentially extended to new, though related meanings. In comprehension, frequent exposure to a word instead makes the learner confident that all of the word’s legitimate uses have been experienced, resulting in an entrenched form-meaning mapping between the word and its experienced meaning(s). This results in a perception-production dissociation, where the forms speakers are most likely to map onto a novel meaning are precisely the forms that they believe can never be used that way. At first glance, this result challenges the idea of bidirectional form-meaning mappings, assumed by all current approaches to linguistic theory. In this paper, we show that bidirectional form-meaning mappings are not in fact challenged by this production-perception dissociation. We show that the production-perception dissociation is expected even if learners of the lexicon acquire simple symmetrical form-meaning associations through simple Hebbian learning.

Research paper thumbnail of A theory of repetition and retrieval in language production

Psychological Review, 2021

Repetition appears to be part of error correction and action preparation in all domains that invo... more Repetition appears to be part of error correction and action preparation in all domains that involve producing an action sequence. The present work contends that the ubiquity of repetition is due to its role in resolving a problem inherent to planning and retrieval of action sequences: the Problem of Retrieval. Repetitions occur when the production to perform next is not activated enough to be executed. Repetitions are helpful in this situation because the repeated action sequence activates the likely continuation. We model a corpus of natural speech using a recurrent network, with words as units of production. We show that repeated material makes upcoming words more predictable, especially when more than one word is repeated. Speakers are argued to produce multiword repetitions by using backward associations to reactivate recently produced words. The existence of multiword repetitions means that speakers must decide where to reinitiate execution from. We show that production restarts from words that have seldom occurred in a predictive preceding-word context and have often occurred utterance-initially. These results are explained by competition between preceding-context and top-down cues over the course of language learning. The proposed theory improves on structural accounts of repetition disfluencies, and integrates repetition disfluencies in language production with repetitions observed in other domains of skilled action. (PsycInfo Database Record (c) 2021 APA, all rights reserved).

Evaluating Logistic Mixed-Effects Models of Corpus-Linguistic Data in Light of Lexical Diffusion

Quantitative Methods in the Humanities and Social Sciences, 2018

Evaluating the performance of mixed-effects models on the data they are trained on leads to problems in estimating model goodness. Nonetheless, mixed-effects models are preferable for corpus data, where some items have many more observations than others, because not having random effects in the model can cause fixed-effects coefficients to be overly influenced by frequent items, which are often exceptional. We explore methods for evaluating logistic mixed-effects models of both corpus and experimental data types through simulations. We suggest that the model should be tested on data it has not been trained on using some method of cross-validation and that all items (e.g., words), rather than observations, should contribute equally to estimated accuracy of the model.
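The recommendation that items, rather than observations, contribute equally to estimated accuracy can be illustrated with a minimal sketch. The function names and the toy held-out data below are hypothetical and are not taken from the paper. The sketch only contrasts observation-level accuracy with item-averaged accuracy on predictions assumed to come from held-out data.

```python
from collections import defaultdict

def observation_accuracy(correct):
    """Accuracy in which every observation counts equally."""
    return sum(correct) / len(correct)

def item_balanced_accuracy(items, correct):
    """Accuracy in which every item (e.g., word) counts equally."""
    per_item = defaultdict(list)
    for item, ok in zip(items, correct):
        per_item[item].append(ok)
    # Average within each item first, then average across items.
    item_means = [sum(v) / len(v) for v in per_item.values()]
    return sum(item_means) / len(item_means)

# Hypothetical held-out predictions: one frequent item, two rare items.
items = ["the"] * 8 + ["cat", "dog"]
correct = [1] * 8 + [0, 0]  # 1 = model predicted the held-out token correctly

obs_acc = observation_accuracy(correct)
item_acc = item_balanced_accuracy(items, correct)
print(obs_acc, item_acc)
```

In this toy case the frequent item dominates the observation-level score, while the item-averaged score shows that two of the three items are predicted poorly.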

Acoustic cues of vowel quality to coda nasal perception in southern Min

The spoken language research laboratories (SLRL) at the University of Oregon

The Journal of the Acoustical Society of America, 2020

The Spoken Language Research Laboratories (SLRL) at the University of Oregon houses 5 integrated laboratories that focus on speech communication. The SLRL occupies nearly 4000 square feet and includes the following state-of-the-art facilities: 10 sound-attenuated subject running rooms; 3 sound-attenuated (clinical-type) observation rooms; 2 single-wall sound booths in another sound-attenuated room; 2 waiting rooms; 1 large computer lab; 2 graduate student workrooms; and 8 offices. The SLRL is further equipped with all necessary high-quality audio and audio-visual recording equipment. Research at the SLRL ranges from work on language variation and change to first and second language acquisition to the perception and production of spoken language. The labs provide a dynamic, supportive environment for collaborative research and training in experimental design, acoustic and speech movement analysis, statistical analysis, grant writing, science communication and community outreach. The S...

What happens to large changes? Saltation produces well-liked outputs that are hard to generate

Laboratory Phonology: Journal of the Association for Laboratory Phonology, 2018

Saltatory alternations 'skip over' intermediate sounds, as in k~s skipping over [t]. Recent research has argued that saltation is diachronically unstable and documented one possible cause of instability: Learners exposed to saltatory alternations may overgeneralize them to intermediate sounds. However, this research has trained participants to criterion or excluded participants who did not reach criterion accuracy on familiar sounds. In first language acquisition, learners of languages with saltatory patterns cannot hope to receive more exposure to the pattern than those learning non-saltatory patterns. For this reason, we examined learning of saltatory and non-saltatory patterns after a constant amount of training. We compared saltatory labial palatalization to non-saltatory alveolar and velar palatalization. Participants showed overgeneralization of saltatory palatalization in a judgment task. However, saltatory alternations did not result in increased rates of palatalizing similar sounds, compared to non-saltatory alternations. Instead, saltatory alternations were less likely to be produced than non-saltatory alternations. These results suggest that large, saltatory alternations may be diachronically unstable because they are harder to (learn to) produce. Instead of being overgeneralized to intermediate sounds, saltatory alternations may disappear from the language by losing productivity and being replaced with faithful mappings.

Distributional learning is error-driven: the role of surprise in the acquisition of phonetic categories

Linguistics Vanguard, 2018

Much previous research on distributional learning and phonetic categorization assumes that categories are either faithful reproductions or parametric summaries of experienced frequency distributions, acquired through a Hebbian learning process in which every experience contributes equally to the category representation. We suggest that category representations may instead be formed via error-driven predictive learning. Rather than passively storing tagged category exemplars or updating parametric summaries of token counts, learners actively anticipate upcoming events and update their beliefs in proportion to how surprising/unexpected these events turn out to be. As a result, rare category members exert a disproportionate influence on the category representation. We present evidence for this hypothesis from a distributional learning experiment on acquiring a novel phonetic category, and show that the results are well described by a classic error-driven learning model (Rescorla, R. A....
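The idea that beliefs are updated in proportion to surprise can be sketched with a generic single-cue delta rule. This is an assumption made for illustration: the model citation above is truncated in the source, so the update below is not claimed to be the paper's exact model, and the function name, learning rate, and starting weight are hypothetical.

```python
def delta_rule_update(weight, outcome, learning_rate=0.1):
    """One error-driven step: the change is proportional to prediction error."""
    prediction_error = outcome - weight  # surprise: observed minus expected
    return weight + learning_rate * prediction_error

# A learner who already strongly expects the outcome (hypothetical weight).
w = 0.9

# An expected event (outcome present) barely changes the weight...
change_expected = delta_rule_update(w, 1.0) - w
# ...whereas a surprising event (outcome absent) changes it much more.
change_surprising = delta_rule_update(w, 0.0) - w

print(change_expected, change_surprising)
```

Under a count-based (Hebbian) scheme both events would shift the representation by the same fixed amount. Here the unexpected event moves the weight roughly nine times as far, which is the sense in which rare events can have a disproportionate influence.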

The power of a unimodal distribution in cue reweighting: Unimodality vs prediction error as signs of cue irrelevance

The Journal of the Acoustical Society of America, 2017

Maye & Gerken (2000) proposed that sound categories can be learned from probability distributions: a unimodal distribution suggests a single category, while a bimodal one suggests two contrasting ones. Research on distributional learning has focused on developing a contrast through exposure to a bimodal distribution. Here, we instead investigate how exposure to a unimodal distribution affects perception of a pre-existing multidimensional contrast (voicing, for which the primary cue is VOT). A total of 60 adult native English speakers were exposed to either bimodal or unimodal VOT distributions spanning the unaspirated/aspirated boundary (bear/pear). However, we paired acoustic stimuli with pictures of bears and pears independently of VOT in training. For each stimulus, participants were asked to guess the referent and received (random) feedback, generating an error signal that suggested VOT is no longer informative and should be downweighted. In this design, the bimodal distribution ...

Lay Listener Classification and Evaluation of Typical and Atypical Children's Speech

Language and Speech, 2017

Verbal children with autism spectrum disorder (ASD) often also have atypical speech. In the context of the many challenges associated with ASD, do speech sound pattern differences really matter? The current study addressed this question. Structured spontaneous speech was elicited from 34 children: 17 with ASD, whose clinicians reported unusual speech prosody; and 17 typically-developing, age-matched controls. Multiword utterances were excerpted from each child's speech sample and presented to young adult listeners, who had no clinical training or experience. In Experiment 1, listeners classified band pass filtered and unaltered excerpts as "typical" or "disordered". Children with ASD were only distinguished from typical children based on unaltered speech, but the analyses indicated unique contributions from speech sound patterns. In Experiment 2, listeners provided likeability ratings on the filtered and unaltered excerpts. Again, lay listeners only distingui...

Putting old tools to novel uses: The role of form accessibility in semantic extension

Cognitive Psychology, Jan 19, 2017

An increase in frequency of a form has been argued to result in semantic extension (Bybee, 2003; Zipf, 1949). Yet, research on the acquisition of lexical semantics suggests that a form that frequently co-occurs with a meaning gets restricted to that meaning (Xu & Tenenbaum, 2007). The current work reconciles these positions by showing that, through its effect on form accessibility, frequency causes semantic extension in production, while at the same time causing entrenchment in comprehension. Repeatedly experiencing a form paired with a specific meaning makes one more likely to re-use the form to express related meanings, while also increasing one's confidence that the form is never used to express those meanings. Recurrent pathways of semantic change are argued to result from a tug of war between the production-side pressure to reuse easily accessible forms and the comprehension-side confidence that one has seen all possible uses of a frequent form.

Temporal structure of repetition disfluencies in American English

Journal of the Acoustical Society of America, 2016

A repetition disfluency involves an interruption in the flow of speech followed by a restart, leading to repetition of one or more words. We analyzed the temporal structure of one-word, two-word, and three-word repetition disfluencies in the Switchboard Corpus (one-word: n = 30546; two-word: n = 8102; three-word: n = 1606). Comparing durations of words preceding an interruption point to their repeated counterparts, we find that repetition is typically accompanied by prolongation, which mainly influences the last word preceding the interruption point. When prolongation does not provide enough time for planning upcoming speech (there seems to be a temporal limit to prolongation), the speaker repeats parts of the utterance just produced. Our results demonstrate that the number of words repeated is determined both by word duration and by co-occurrence relations between words. Mixed effects logistic regression analysis revealed that longer words are less likely to be repeated (z=-24.45, p<....