Tom Wasow | Stanford University
Papers by Tom Wasow
Both clause types were tested in four ways, using two methods of comparison and two populations of participants. One method had participants distribute 100 points between the versions with and without ‘that’; the other was a binary forced choice under time constraints. Both methods were employed in a traditional lab setting and also crowd-sourced via Amazon’s Mechanical Turk (AMT) facility (Munro et al., 2010).
Expecting the Unexpected: Exceptions in Grammar, 2011
Kempson's interesting commentary raises two important points. First, while extolling the value of probabilistic corpus data, she is not ready to accept “that speakers manipulate probability estimates as input to the decisions as to how to say what they do”. Second, she suggests an alternative to our attempt to explain the correlation between predictability of non-subject relative clauses and the absence of that in such clauses. We discuss these points in reverse order and raise some additional questions for future research. Our proposed ...
Natural Language and Linguistic Theory, 1996
Language Variation and Change, 2010
This paper examines a short-lived innovation, quotative all, in real and apparent time. We used a two-pronged method to trace the trajectory of all over the past two decades: (i) Quantitative analyses of the quotative system of young Californians from different decades; this reveals a startling crossover pattern: in 1990/1994, all predominates, but by 2005, it has given way to like. (ii) Searches of Internet newsgroups; these confirm that after rising briskly in the 1990s, all is declining. Tracing the changing usage of quotative options provides year-to-year evidence that all has recently given way to like. Our paper has two aims: We provide insights from ongoing language change regarding short-term innovations in the history of English. We also discuss our collaboration with Google Inc. and argue for the value of newsgroups to research projects investigating linguistic variation and change in real time, especially where recorded conversational tokens are relatively sparse.
American Speech, 2007
This article presents a synchronic and diachronic investigation of the lexeme all in its intensifier and quotative functions. We delimit the new from the old functions of the lexeme and present a variationist account of all's external and internal constraints in various syntactic environments. Our analysis is based on a variety of data sets, which include traditional sociolinguistic interviews as well as data culled from internet searches and a new Google-based search tool. On the basis of these data sets, we show that intensifier all is not new but has expanded in syntactic environments. We further pinpoint the syntactic and semantic niches which all has appropriated for itself among California adolescents and compare its patterning with that of other intensifiers in our data and the data of other researchers. All's extension to quotative function, however, is new, apparently originating in California in the 1980s. Our investigation of its development spans across data sets...
AMLaP-2005, Sep 5, 2005
The predictability of a word has been shown to correlate positively with its phonetic reduction (Bell et al., 2003). Surprisingly, little work has been done on the correlation between predictability and word omission (i.e., variations where a word can be omitted without leading to ungrammaticality; Resnik, 1996). We present a large-scale corpus study of predictability effects on such a phenomenon, relativizer omission in non-subject-extracted relative clauses (NSRCs):
The 18th Annual CUNY Sentence …, 2005
We also investigated four measures of an NSRC's predictability. The general idea was to see how information that is available prior to the beginning of the NSRC (and may therefore help to predict the presence of an NSRC) influences relativizer likelihood.
The Handbook of Linguistics
Journal of Linguistics, 2019
The English auxiliary system exhibits many lexical exceptions and subregularities, and considerable dialectal variation, all of which are frequently omitted from generative analyses and discussions. This paper presents a detailed, movement-free account of the English Auxiliary System within Sign-Based Construction Grammar (Sag 2010, Michaelis 2011, Boas & Sag 2012) that utilizes techniques of lexicalist and construction-based analysis. The resulting conception of linguistic knowledge involves constraints that license hierarchical structures directly (as in context-free grammar), rather than by appeal to mappings over such structures. This allows English auxiliaries to be modeled as a class of verbs whose behavior is governed by general and class-specific constraints. Central to this account is a novel use of the feature aux, which is set both constructionally and lexically, allowing for a complex interplay between various grammatical constraints that captures a wide range of excepti...
Non-Transformational Syntax, 2011
Proceedings of the 1978 workshop on Theoretical issues in natural language processing, 1978
THE LEXICON* Thomas Wasow, Stanford University. Linguists have long recognized the desirability of embedding a theory of grammar within a theory of linguistic performance (see, e.g., Chomsky (1965; 10-15)). It has been widely assumed by transformationalists that an adequate model of a language user would include as one component some sort of generative grammar. Yet transformational grammarians have devoted relatively little energy to the problem that Bresnan (in press) calls "the grammatical realization problem": "How would a reasonable model of language use incorporate a transformational grammar?" When this question has been
Stanford University's new Symbolic Systems Program is an interdisciplinary undergraduate program focusing on understanding the nature of intelligent behavior. It brings together the disciplines of cognitive psychology, logic, computer science and artificial intelligence, philosophy, and linguistics in a newly emerging field of research concerned with the structure, content, and processing of information. Linguistics plays a central role in the program because, as the systematic study of human language, it can contribute greatly to the development of a general theory about how information is conveyed through symbols. The field of linguistics also deals with an exceptional range of phenomena, and is intellectually and practically accessible to an undergraduate student, making it an especially suitable vehicle for teaching undergraduates how to evaluate theories. (MSE)
this paper is to argue, to the contrary, that the highly ambiguous character of natural languages is surprising, and that the very existence of ambiguity calls for an explanation. Section 1 clarifies what we mean by ambiguity, discussing the distinction between vagueness and ambiguity. We go on to identify several distinct types of ambiguity. Section 2 presents evidence that English is massively ambiguous. Section 3 elaborates our central argument: if (as is widely claimed) ambiguity impedes efficient communication, then one would expect languages to evolve so as to reduce ambiguity; but this does not appear to have happened. Section 4 responds to some possible objections to the argument in Section 3. Section 5 explores some possible strategies for explaining ambiguity, concluding with pointers to our ongoing research on the subject. 1. Characterizing Ambiguity Ambiguity is a semantic property. Semanticists argue over exactly what meaning is, but it surely involves associating expr...
Studies in Theoretical Psycholinguistics, 1986
Generative grammarians have been studying anaphora for two decades, since the publication of Lees and Klima’s seminal paper, ‘Rules for English Pronominalization’. During this period, the generative literature on anaphora has grown to massive proportions. While many of the avenues explored in that literature have proved to be dead ends, and many issues remain unresolved, I believe that important insights have been attained. My purpose in this paper is to survey what I consider to be the major achievements of the work on anaphora of the past twenty years. I do not pretend to be presenting any novel discoveries; my aim, rather, is to distill what is most significant out of a large and often confusing literature.
Journal of Psycholinguistic Research, 2015
We explore the consequences of letting the incremental and integrative nature of language processing inform the design of competence grammar. What emerges is a view of grammar as a system of local monotonic constraints that provide a direct characterization of the signs (the form-meaning correspondences) of a given language. This "sign-based" conception of grammar has provided precise solutions to the key problems long thought to motivate movement-based analyses, has supported three decades of computational research developing large-scale grammar implementations, and is now beginning to play a role in computational psycholinguistics research that explores the use of underspecification in the incremental computation of partial meanings.
Karttunen actually proposes that NP3 should pronominalize NP4. This seems objectionable to me on two grounds: first, there is no evidence to indicate that relative pronouns are derived from full NP's identical to the heads of the relatives; and second, even if relative pronouns ...
Explanations of the tendency to put long, complex constituents at the ends of sentences ("end-weight") usually take the listener's perspective, claiming it facilitates parsing. I argue for a speaker-oriented explanation of end-weight, based on how it facilitates utterance planning. Parsing is facilitated when as much tree structure as possible can be determined early in the string, but production is easiest when options for how to continue are kept open. That is, listeners should prefer early commitment and speakers should prefer late commitment. Corpus data show that different verbs exhibit different rates of word-order variation that are systematically related to differences in subcategorization possibilities in just the way predicted by a strategy of late commitment. Thus, a speaker-based account of lexical preferences in word ordering does a better job of explaining variation in weight effects than a listener-based account.