Can Statistical Parsers WOW ! You : A Cognitive Assessment

Developing An Experimental Basis For Determining Grammaticality

Introduction. It is well known to social scientists that, in building any theory, theoreticians must account for the general pattern first, and only then can they turn to the problematic cases commonly referred to as "exceptions." Thus, the claim by Chomsky (1965:3-4) that linguistic theory is concerned primarily with an ideal speaker-listener in a completely homogeneous speech community was a necessary first step towards building a coherent syntactic theory. However, GB theory, together with other branches of linguistics, has progressed quite far in the past 28 years, and the time is ripe to concentrate more on the "exceptions" or unclarities that theoreticians have not been addressing, by developing experimental methods to examine them. The development of syntactic theory appears to proceed in the following manner: data consisting of isolated sentences are gathered, their grammaticality is determined, and a theory is developed which accounts for both th...

Grammar Induction Profits from Representative Stimulus Sampling

Proceedings of the Annual Meeting of the Cognitive Science Society (UC Merced), 2006

Sensitivity to distributional characteristics of sequential linguistic and nonlinguistic stimuli has been shown to play a role in learning the underlying structure of these stimuli. A growing body of experimental and computational research with (artificial) grammars suggests that learners are sensitive to various distributional characteristics of their environment (Kuhl, 2004; Onnis, Monaghan, Richmond & Chater, 2005; Rohde & Plaut, 1999). We propose that, at a higher level, statistical characteristics of the full sample of stimuli on which learning is based also affect learning. We provide a statistical model that accounts for such an effect, and experimental data obtained with the Artificial Grammar Learning (AGL) methodology, showing that learners are also sensitive to distributional characteristics of a full sample of exemplars.

Three Benchmarks for Statistical Models of Syntax

Human language abilities are far richer than what is represented in the kinds of monolingual corpora that are standardly used to evaluate statistical models of language learning. This article summarizes a series of findings from language acquisition, cross-language typology, and language processing that illustrate the challenges that any serious model of natural language syntax must meet.

The Psychological Reality of Syntactic Principles

2017

In this chapter, I survey a variety of grammars that have played a role in psycholinguistics, tracing the coevolution of theories in formal syntax and the computational parsing models that they inspired. In Chomsky’s “Standard Theory” the output of context-free rules is fed into the transformational component of a grammar. Many researchers incorrectly interpreted early psycholinguistic experiments as casting doubt on the psychological reality of transformational operations. These arguments, based on the Derivational Theory of Complexity, ultimately fail. But transformational parsers were rejected anyway, on computational grounds. Augmented Transition Networks (ATNs) rose to prominence, offering a promising framework for describing the surface syntax of natural language, as well as a natural implementation of the grammar as a parsing model. ATN parsers thus serve as a clear example of how grammatical rules can be viewed as procedural dispositions. A strong criticism of the ATN architecture, due...

Syntactic structure and artificial grammar learning: The learnability of embedded hierarchical structures

Cognition, 2008

Embedded hierarchical structures, such as "the rat the cat ate was brown", constitute a core generative property of a natural language theory. Several recent studies have reported learning of hierarchical embeddings in artificial grammar learning (AGL) tasks, and described the functional specificity of Broca's area for processing such structures. In two experiments, we investigated whether alternative strategies can explain the learning success in these studies. We trained participants on hierarchical sequences, and found no evidence for the learning of hierarchical embeddings in test situations identical to those from other studies in the literature. Instead, participants appeared to solve the task by exploiting surface distinctions between legal and illegal sequences, and applying strategies such as counting or repetition detection. We suggest alternative interpretations for the observed activation of Broca's area, in terms of the application of calculation rules or of a differential role of working memory. We claim that the learnability of hierarchical embeddings in AGL tasks remains to be demonstrated.

Tily, H.J., Frank, M.C. & Jaeger, T.F. (2011). The learnability of constructed languages reflects typological patterns. The 33rd Annual Meeting of the Cognitive Science Society (CogSci11). Boston, MA. July, 2011, 1364-1369.

A small number of the logically possible word order configurations account for a large proportion of actual human languages. To explain this distribution, typologists often invoke principles of human cognition which might make certain orders easier or harder to learn or use. We present a novel method for carrying out very large-scale artificial language learning tasks over the internet, which allows us to test large batteries of systematically designed languages for differential learnability. An exploratory study of the learnability of all possible configurations of subject, verb, and object finds that the two most frequent orders in human languages are the most easily learned, and yields suggestive evidence compatible with other typological and psycholinguistic observations.

Keywords: artificial grammar; language acquisition; language typology; psycholinguistics; word order

Syntactic Judgment Experiments

Language and Linguistics Compass, 2009

Informal judgments of sentence acceptability have long been the primary source of evidence about grammaticality in syntax, and have been controversial just as long. In the past decade, there has been growing interest in collecting and analyzing acceptability judgments according to the formal protocols of experimental psycholinguistics, an approach sometimes called experimental syntax. This article reviews the major issues relevant to this approach, namely, the relative reliability, validity, sensitivity, and convenience of formal vs. informal methods.

Frequency of basic English grammatical structures: A corpus analysis

Journal of Memory and Language, 2007

Many recent models of language comprehension have stressed the role of distributional frequencies in determining the relative accessibility or ease of processing associated with a particular lexical item or sentence structure. However, there exist relatively few comprehensive analyses of structural frequencies, and little consideration has been given to the appropriateness of using any particular set of corpus frequencies in modeling human language. We provide a comprehensive set of structural frequencies for a variety of written and spoken corpora, focusing on structures that have played a critical role in debates on normal psycholinguistics, aphasia, and child language acquisition, and compare our results with those from several recent papers to illustrate the implications and limitations of using corpus data in psycholinguistic research.