William D. Raymond | Universität Heidelberg

Papers by William D. Raymond

The ViC corpus of conversational speech

IEEE Trans. Acoust., Speech, Signal Processing, 2003

Buckeye corpus of conversational speech (2nd release)

Columbus, OH: Department of Psychology, Ohio State University, 2007

An Optimality-Theoretic Typology of Case and Grammatical Voice Systems

Annual Meeting of the Berkeley Linguistics Society, 1993

Proceedings of the Nineteenth Annual Meeting of the Berkeley Linguistics Society: General Session and Parasession on Semantic Typology and Semantic Universals (1993)

Breaking Into the Mind: George A. Miller's Early Work in the American Journal of Psychology

The American Journal of Psychology

Actually: We know what you meant by the way you said it

The effects of collocational strength and contextual predictability in lexical production

The Proceedings of the Institute of Medicine of Chicago, Feb 1, 1999

Word frequency and word predictability have both been proposed in the literature as explanations for word shortening or reduction. Traditionally, these two explanations have been modeled separately. Frequency models focus on the fact that words with high use frequency are shortened compared to low frequency words, whether in the lexicon or during phonetic production (Bybee 1999a). Predictability models focus on the fact that words that are highly predictable from the context are shortened during production (Fowler & Housum 1987). We propose that these "predictability" and "frequency" effects are actually variants of the same basic factor: the informativeness of a word as measured by its probability. In this account, words which are highly predictable or very frequent are highly probable, and hence have a lower information value. A consequence of considering frequency and predictability as probabilities is that they can be unified into a probabilistic model of processing together with other types of probabilistic knowledge, ultimately providing a more complete explanation of language use. Probabilistic models of human language comprehension claim that probabilistic information about words, phrases, and other linguistic structure is represented in the minds of language users and plays a role in language comprehension (Jurafsky 1996, Narayanan & Jurafsky 1998). This paper extends this probabilistic hypothesis to language production, suggesting that speakers use their knowledge of the probability of a word or combinations of words in sentence production. In particular, we present evidence that highly probable (less informative) words are shorter or more reduced in conversational speech. This is true whether the high probability of the word is based on frequency, collocation with neighboring words, repetition of the word in the conversation, or the semantic association of the word with its conversation context.

Reduction of English function words in Switchboard

ICSLP, 1998

The causes of pronunciation reduction in 8458 occurrences of ten frequent English function words in a four-hour sample from conversations from the Switchboard corpus were examined. Using ordinary linear and logistic regression models, we examined the length of the words, the form of their vowel (basic, full, or reduced), and final obstruent deletion. For all of these we found strong, independent effects of speaking rate, predictability, the form of the following word, and planning problem disfluencies. The results bear on issues in speech recognition, models of speech production, and conversational analysis.

The Buckeye corpus of conversational speech: labeling conventions and a test of transcriber reliability

Optimality in Wh-Chains

Optimality and Wh-Extraction

The study of wh-question formation has historically served as the empirical basis for major constructs in Government-Binding (GB) such as the Empty Category Principle (ECP), the existence of Logical Form (LF) as a separate level of representation (motivated in part by the abstract wh-movement at LF analysis of wh-in-situ in languages like Chinese; Huang, 1982), and the central but controversial issue of which principles apply at which levels of representation. For example, argues, based on Chinese, that the ECP applies at S-structure and LF while subjacency and his Condition on Extraction Domain (CED) apply only at S-structure.

Optimizing the durability and generalizability of knowledge and skills

Chapter 7 (p. 133): Optimizing the Durability and Generalizability of Knowledge and Skills. Alice F. Healy, Carolyn J. Buck-Gengler, Immanuel Barshi, James T. Parker, Vivian I. Schneider, William D. Raymond, N. Noelle LaVoie ...

An analysis of coding consistency in the transcription of spontaneous speech from the Buckeye corpus

Are effects of word frequency effects of context of use? An analysis of initial fricative reduction in Spanish

The connection between frequency of form use and form reduction in language has been widely studied. After controlling for multiple contextual factors associated with reduction, word frequency, which reflects a speaker’s cumulative experience with a word, has been reported to predict several types of pronunciation reduction. However, word frequency effects are not found consistently. Some studies have alternatively reported effects on reduction of the cumulative exposure of words to specific reducing environments or measures of contextual predictability. The current study examines cumulative and contextual effects of reducing environments, as well as non-contextual frequency measures, on the reduction of word-initial /s/ in a corpus of spoken New Mexican Spanish. The results show effects of non-cumulative factors on reduction, argued to occur on-line during articulation. There are also effects of the cumulative exposure of words to specific reducing environments and of contextual predictability, but not of the cumulative experience with a word overall (word frequency). The results suggest representational change in the lexicon through repeated exposure of words to reducing environments and call into question proposals that frequency of use per se causes reduction.

The asymmetric effect of local context on word duration: Consequences for models of production

The important role of following local context on word duration in production is well established. Word durations are longer for words that occur in phrase-final position (Klatt 1975, inter alia), pre-pausally (Bell et al. 2001; Jurafsky et al. 1998), before repetitions (Shriberg 1999) and other dysfluencies (Fox Tree & Clark 1997), or for words that are highly predictable from following words (Jurafsky et al. 2000). However, little is known about the effects of these local factors on duration when they occur in the context immediately preceding a word. This study examines the effects of three local factors in the previous and following contexts of 17000 tokens of phonetically transcribed content words from the Switchboard corpus of spoken American English. Two of the factors examined are known to affect duration in following context: utterance boundary and dysfluency. The third factor, the predictability of adjacent words as measured by their frequency and conditional probabilities,...

The effect of language model probability on pronunciation reduction

2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221), 2001

We investigate how the probability of a word affects its pronunciation. We examined 5618 tokens of the 10 most frequent (function) words in Switchboard: I, and, the, that, a, you, to, of, it, and in, and 2042 tokens of content words whose lexical form ends in a t or d. Our observations were drawn from the phonetically hand-transcribed subset [1] of the Switchboard corpus [2], enabling us to code each word with its pronunciation and duration. Using linear and logistic regression to control for contextual factors, we show that words which have a high unigram, bigram, or reverse bigram (given the following word) probability are shorter, more likely to have a reduced vowel, and more likely to have a deleted final t or d. These results suggest that pronunciation models in speech recognition and synthesis should take into account word probability given both the previous and following words, for both content and function words.
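
The three probability measures named in this abstract can be illustrated with a minimal sketch. This is not the authors' code: it assumes plain maximum-likelihood estimates from raw counts over a toy token list, with no smoothing, and the function name `ngram_probabilities` is hypothetical.

```python
from collections import Counter


def ngram_probabilities(tokens):
    """Build maximum-likelihood estimators from a list of tokens.

    Returns three functions (all hypothetical helpers for illustration):
      unigram(w)             -> P(w)
      bigram(w, prev)        -> P(w | previous word)
      reverse_bigram(w, nxt) -> P(w | following word)
    """
    total = len(tokens)
    unigram_counts = Counter(tokens)
    pair_counts = Counter(zip(tokens, tokens[1:]))
    # Context counts: a word serves as a "previous" context only if
    # something follows it, and as a "following" context only if
    # something precedes it.
    prev_counts = Counter(tokens[:-1])
    next_counts = Counter(tokens[1:])

    def unigram(w):
        return unigram_counts[w] / total if total else 0.0

    def bigram(w, prev):
        seen = prev_counts[prev]
        return pair_counts[(prev, w)] / seen if seen else 0.0

    def reverse_bigram(w, nxt):
        seen = next_counts[nxt]
        return pair_counts[(w, nxt)] / seen if seen else 0.0

    return unigram, bigram, reverse_bigram


if __name__ == "__main__":
    toy = "i want to go to the store and i want to eat".split()
    unigram, bigram, reverse_bigram = ngram_probabilities(toy)
    print(unigram("to"))                 # P(to)
    print(bigram("to", "want"))          # P(to | previous = want)
    print(reverse_bigram("want", "to"))  # P(want | following = to)
```

In a real study these estimates would come from a large corpus and would typically be smoothed; the sketch only shows how the conditioning direction differs between the bigram and reverse bigram measures.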

Probabilistic relations between words

Typological Studies in Language, 2001

Affordances and Choices in Strategy Shifts in Skill Acquisition Tasks

Modeling Speech Production and Processing: Evidence From Lexical Substitution Errors

Speech errors have been a major source of data in the development of models of speech production. The contributions to production modeling from speech error studies have been especially important in modeling phonological encoding and lexicalization. The current study continues this line of research by examining in some detail effects from an array of factors that relate error words and their intended targets in a corpus of whole word lexical substitutions. The multi-dimensional approach of the analysis of relational factors investigates what substitutions can tell us about lexical organization and lexical processing. Where the results extend or modify previous findings, their consequences for current models of speech production are examined. The parameters of speech production models that are addressed by the data analyzed include the levels and nature of lexical representations, the degree of functional modularity within the production system, the degree of competition during lexical access, the control of serial timing during production, variable modes of lexical access, the types and locations of frequency effects, and competitor effects in lexical neighborhood structure.

Acquisition of morphological variation: The case of the English definite article (by William D. Raymond, Alice F. Healy, Samantha McDonnel, and Charlotte A. Healy)

Language and Cognitive Processes, 2009

Morphological systems have been pivotal in exploring cognitive mechanisms of language use and acquisition. Adult English definite article form preference seems to depend non-deterministically on multiple factors. A corpus study of adult spontaneous speech revealed similar patterns of variability. In an experiment, article variant preferences of three age groups were compared. Children were sensitive to the same phonological factors as adults, but showed effects of more limited experience with articulation and orthography. Preferences across age groups suggest developmental changes, but no evidence that children initially use a default form. Corpus studies of children's and adults’ speech also revealed no evidence for a default. The results point to overgeneralisation of both article variants, resulting from extended competition between variant forms.