Syntactic Parsing Research Papers - Academia.edu
In this article, we present an application of X-bar syntax in a computational environment: Grammar Play, a syntactic parser written in Prolog. The parser analyses simple declarative sentences of Brazilian Portuguese, identifying their constituent structure. The grammar is implemented in Prolog using DCGs and is based on X-bar theory (HAEGEMAN 1994, MIOTO et al. 2004). The parser is an attempt to broaden the coverage of similar syntactic analyzers, such as those presented in Pagani (2004) and Othero (2006). The main goals of the present version of Grammar Play are not related to broad coverage, but to the computational implementation of coherent linguistic models applied to the description of Portuguese, and to the development of a computational linguistics tool that can be used didactically in introductory classes on Syntax or Linguistics.
Keywords: X-bar Theory; Computational Syntax; Automatic Processing of Portuguese.
- Syntactic Parsing, X-bar theory, Computational Syntax
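To illustrate how a DCG-style grammar with X-bar projections yields constituent structure, here is a minimal sketch. It is written in Python rather than Prolog, and the rules, category names and five-word lexicon are hypothetical; they are not the Grammar Play grammar. Each rule lists alternative right-hand sides, much as a DCG lists alternative clauses, and the parser tries every alternative.

```python
# Toy X-bar style grammar (hypothetical; not the Grammar Play rules).
# Each category maps to alternative right-hand sides, like DCG clauses.
GRAMMAR = {
    "IP": [["DP", "I'"]],
    "I'": [["VP"]],
    "VP": [["V'"]],
    "V'": [["V", "DP"], ["V"]],
    "DP": [["D", "NP"]],
    "NP": [["N'"]],
    "N'": [["N"]],
}
# Hypothetical mini-lexicon of Brazilian Portuguese words.
LEXICON = {
    "o": "D", "a": "D",
    "menino": "N", "bola": "N",
    "chutou": "V", "dormiu": "V",
}

def parse(cat, words, i):
    """Yield (tree, next_index) for every way `cat` can start at words[i]."""
    if cat not in GRAMMAR:  # lexical category: consume one word
        if i < len(words) and LEXICON.get(words[i]) == cat:
            yield (cat, words[i]), i + 1
        return
    for rhs in GRAMMAR[cat]:
        yield from expand(cat, rhs, words, i, [])

def expand(cat, rhs, words, i, children):
    """Match the symbols of one right-hand side from left to right."""
    if not rhs:
        yield (cat, *children), i
        return
    for subtree, j in parse(rhs[0], words, i):
        yield from expand(cat, rhs[1:], words, j, children + [subtree])

def parse_sentence(sentence):
    """Return all IP trees that cover the whole sentence."""
    words = sentence.lower().split()
    return [tree for tree, j in parse("IP", words, 0) if j == len(words)]

trees = parse_sentence("o menino chutou a bola")
```

Because the toy grammar has no left-recursive rules, the top-down search terminates; a sentence outside the grammar simply returns an empty list.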
There is a mismatch between the documentation of spaCy's English dependency labels referenced on the spaCy website and the labels the parser actually produces. This document contains a comprehensive list of all the dependency relation labels used by the spaCy parser.
Abstract: Understanding how language is processed has been a central interest of psycholinguistics for several decades, since understanding the processes involved contributes not only to the study of language but also to the study and understanding of the mind. This article presents a review of syntactic parsing models and their contributions to our understanding of how language and cognition work. First, some of the classical discussions about language processing and cognitive processes in general are revisited. Second, different models of syntactic parsing proposed in recent decades are reviewed, along with their characteristics and assumptions about the language faculty. Finally, the different proposals are compared in terms of the classical debates presented and their position on the universality of certain cognitive processes. Keywords: cognitive processes; language comprehension; syntactic parsing; adjunction.
This study will present a tool designed for meaning extraction with monoclausal sentences in Italian. Its main features will be illustrated with instances of the Italian causative clause type featuring the verb fare 'make' (e.g. Egli fa piangere il bambino 'He makes the child cry'), a construction which invariably embeds a clause (e.g. Il bambino piange 'The child cries') with which it establishes an entailment relationship. The tool automatically accomplishes the following tasks: it answers relevant wh-questions (e.g. Who cries? Who makes someone cry?) and detects the entailment. Concurrent with this presentation, this study will also encourage a reflection on the research currently being conducted in Computational Linguistics and Natural Language Processing, in which data-driven approaches prevail over rule-based systems.
This work examines the role of the comma in the reading process. Previously the comma was interpreted merely as a graphic device tied to rhetoric and, subsequently, to grammar, so its psycholinguistic relevance as an aid to the reader during syntactic parsing went unnoticed; it has been studied more intensively only in recent years, with the rise of a language-processing-oriented perspective. After setting out the theoretical foundations, the work gives an overview of the research conducted so far and then presents a study that examines the relevance of the comma at particular syntactic positions by means of the processing time for pairs of corresponding sentences that differed only in whether the comma was realized. Finally, the results are discussed and theoretical conclusions are drawn.
One of the most prominent current paradigms in automatic syntactic analysis is data-driven dependency parsing. In this approach, manually annotated treebanks are used in training and evaluating parsers that create tree representations, where each word depends on a head word and is assigned a label depicting its relation to the head word. This paper describes the Greek Dependency Treebank (GDT), a resource that contains more than 130 thousand tokens in 5669 manually annotated sentences from texts including transcripts of parliamentary sessions and several types of web documents. The GDT has been used in experiments for training and evaluating a dependency parser for the Greek language.
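The tree representation described above, in which every word points to a head and carries a relation label, can be sketched as a small data structure together with the attachment scores commonly used to evaluate dependency parsers against a treebank. This is an illustrative sketch only: the `Token` class, the English example sentence and its labels are assumptions, not the GDT annotation scheme or the evaluation code used in the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Token:
    index: int   # 1-based position in the sentence
    form: str    # the word itself
    head: int    # index of the head word; 0 marks the root
    label: str   # relation of this word to its head

def attachment_scores(gold, predicted):
    """Return (UAS, LAS): share of tokens with the correct head,
    and share with both the correct head and the correct label."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("need two non-empty token lists of equal length")
    correct_heads = sum(g.head == p.head for g, p in zip(gold, predicted))
    correct_both = sum(
        g.head == p.head and g.label == p.label
        for g, p in zip(gold, predicted)
    )
    return correct_heads / len(gold), correct_both / len(gold)

# Hypothetical gold annotation and parser output for one sentence.
gold = [
    Token(1, "She", 2, "nsubj"),
    Token(2, "reads", 0, "root"),
    Token(3, "books", 2, "obj"),
]
predicted = [
    Token(1, "She", 2, "nsubj"),
    Token(2, "reads", 0, "root"),
    Token(3, "books", 2, "nmod"),  # right head, wrong label
]
uas, las = attachment_scores(gold, predicted)
```

In this example every head is correct, so the unlabeled score is 1.0, while one wrong label lowers the labeled score to two thirds.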
Much research has been conducted on the processing of garden-path sentences (Waters & Caplan, 2001; Sturt, Scheepers & Pickering, 2002; Vos et al., 2001; Gordon, Hendrick & Levine, 2002; Fiebach et al., 2005). In Spanish and in other languages, studies have focused on temporarily ambiguous structures, concentrating on the possibility of double attachment of temporal subordinate clauses (Meseguer, Carreiras & Clifton, 2002; Véliz et al., 2011; van Gompel et al., 2005; Pickering, Traxler & Crocker, 2000; Traxler & Frazier, 2008; among others), which appear to be one of the classic constructions that show garden-path effects.
This study presents the preliminary results of a pilot experiment with Spanish speakers. The main objective was to verify similarities or differences in the processing of ambiguous garden-path sentences with temporal or locative subordination. The working hypothesis was that both types of structure, with temporal and with locative subordination, would involve the same reading complexity and would show garden-path effects.
Two tasks were designed using a reaction-time paradigm: 1) reading sentences in two conditions (ambiguous vs. unambiguous) with two types of subordination (locative vs. temporal), segmented into 5 phrases, followed by recognition of a word presented immediately afterwards; 2) reading the same sentences segmented into phrases, followed by verification of their correspondence with a complete sentence presented immediately afterwards.
Reading times, word and sentence recognition times, and response accuracy were recorded and analyzed. The preliminary results seem to indicate a significant task effect: the complete-sentence task produces a bias that reduces reading times for all phrases and seems to trigger a whole syntactic representation, even during segmented presentation. The second preliminary result, which requires further observation, concerns the distinction between temporal and locative garden-path sentences: the classic effect found in structurally ambiguous temporal sentences was not clearly replicated in the locative sentences in the pilot experiment.
This thesis discusses and implements Arabic data from the Syrian dialect within the Minimalist Machine program first created and developed for English and Japanese by Sandiway Fong and Jason Ginsberg. This thesis centers on implementing data found in Kristen Brustad’s The Syntax of Spoken Arabic: A Comparative Study (2000) and introduces a minimalist Arabic grammar into the Minimalist Machine program, a Prolog-based system [that uses an online server to self-contain the information recall]. The data and insights from Brustad, as well as other prevalent sources in Arabic syntax, were used to inform and compose the syntactic constraints for the Prolog-based system.
The second round of the forum "Evaluation of Methods for Automatic Text Analysis", held in 2011–2012, was devoted to syntactic parsers for Russian-language texts. The article describes the principles and procedure of the forum's tracks, the participants, the test collection and the Gold Standard against which the evaluation was carried out, the principles for comparing system outputs, cases that were difficult to evaluate, and some problem areas in the operation of syntactic parsers that were revealed by expert review of the results.
Two eye-tracking experiments were designed to investigate a novel temporary ambiguity between object relative clauses (object RCs; 'the claim that John made is false') and complement clauses of a noun (CCs; 'the claim that John made a mistake...') in Italian and English. This study has three main goals: the first is to assess whether a temporary ambiguity between an RC and a CC structure gives rise to a garden-path effect; the second is to consider the potential implications of this effect in relation to current parsing theories and determine whether it is compatible with the predictions drawn from the family of reanalysis-based two-stage models (a.o. Frazier 1987; Traxler, Pickering & Clifton 1998; Van Gompel, Pickering & Traxler 1999); the third is to evaluate competing syntactic analyses of CC structures. A more traditional analysis of CCs will be compared with a recent proposal presented in Cecchetto & Donati (2011) and Donati & Cecchetto (2015). We will show that only this latter account is consistent with our experimental findings.
- by Caterina Donati and +2
- Languages and Linguistics, Eye tracking, Psycholinguistics, Syntax
In this paper, we present experimental results for a Vietnamese sentence parser built using Chomsky's subcategorization theory and PDCG (Probabilistic Definite Clause Grammar). The efficiency of this subcategorized PDCG parser has been demonstrated by experiments in which we built by hand a treebank with 1000 syntactic structures of Vietnamese training sentences and used different testing datasets to evaluate the results. The precision, recall and F-measure in these experiments are all over 98%.
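The abstract reports precision, recall and F-measure without saying how they were computed. One common convention for constituency parsers is to compare labeled brackets, that is (label, start, end) spans, between the gold tree and the parser output. The sketch below assumes that convention; the function name and the example spans are hypothetical and are not taken from the paper.

```python
from collections import Counter

def bracket_prf(gold_brackets, predicted_brackets):
    """Precision, recall and F1 over labeled brackets (label, start, end)."""
    gold = Counter(gold_brackets)
    pred = Counter(predicted_brackets)
    matched = sum((gold & pred).values())  # multiset intersection
    precision = matched / sum(pred.values()) if pred else 0.0
    recall = matched / sum(gold.values()) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical brackets for a three-word sentence.
gold_spans = [("S", 0, 3), ("NP", 0, 1), ("VP", 1, 3), ("NP", 2, 3)]
pred_spans = [("S", 0, 3), ("NP", 0, 1), ("VP", 1, 3), ("NP", 1, 3)]
precision, recall, f1 = bracket_prf(gold_spans, pred_spans)
```

Three of the four predicted brackets match the gold brackets here, so all three measures come out at 0.75.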
This study investigates the syntactic structures of teachers' spoken discourse in academic settings. Knowledge of the syntactic structure of a language helps in understanding spoken discourse, so the study identifies wh-movement in the syntactic structures used by teachers in English classroom sessions. The data was collected from two federal government universities in Pakistan: Air University Islamabad and the National University of Modern Languages Islamabad. The English classroom sessions of the teachers were audio-recorded and transcribed. The analysis was both quantitative and qualitative. The frequency of wh-movement in the recorded spoken English data was analysed quantitatively. In the qualitative analysis, the transcribed data was analysed syntactically from a minimalist perspective, with the help of parsing rules and figures. The analysed data shows that teachers at undergraduate level use wh-movement in the syntactic structures of the English they use in classroom sessions. They move wh-expressions into other slots through operations such as internal merge and pied-piping. However, the minimalist parametric unit, wh-movement, was found in the sentence structures of the teachers in the delivery of classroom sessions. So, the minimal pairs of sentence structure impact different levels of language.
Three experiments examine the syntactic and semantic processing of idioms in English, French, and Tunisian Arabic and compare this processing cross-linguistically using a priming procedure. Results from Experiments 1 and 2 reveal a syntactic priming effect of idioms in Tunisian Arabic, French, and English, although it was less significant in English. Experiment 3 concerned semantic priming and showed no semantic priming effect: subjects named abstract and concrete nouns with no difference in reaction time. The results of these experiments entail the independence of the syntactic processor, thus adhering to one central aspect of the modular view of language processing. The study ultimately supports the universality of parsing.
Relative clauses, and more generally clauses modifying nouns, have been at the center of a long debate over the last forty years, one that has opposed largely diverging syntactic analyses, compared relevant data and discussed perspectives. The aim of this paper is to contribute to this debate by adding novel experimental data on how these structures are processed in an online reading task. Two eye-tracking experiments were designed to investigate the temporary structural ambiguity that can arise between object relative clauses (object RCs; 'the claim that linguists made is a mistake') and so-called complement clauses of a noun (CCs; 'the claim that linguists made a mistake...') in Italian and English. Although the pattern is complex, the results of both experiments suggest that a reanalysis effect is associated with CCs, showing an initial preference for the object RC structural interpretation. The implications of our results are discussed in relation to competing syntactic analyses of CCs and RCs.
- by Mirta Vernice and +1
- Cognitive Science, Languages and Linguistics, Syntax, Parsing
Identifying indices of effort in post-editing of machine translation can have a number of applications, including estimating machine translation quality and calculating post-editors’ pay rates. Both source-text and machine-output features as well as subjects’ traits are investigated here in view of their impact on cognitive effort, which is measured with eye tracking and a subjective scale borrowed from the field of Educational Psychology. Data is analysed with mixed-effects models, and results indicate that the semantics-based automatic evaluation metric Meteor is significantly correlated with all measures of cognitive effort considered. Smaller effects are also observed for source-text linguistic features. Further insight is provided into the role of the source text in post-editing, with results suggesting that consulting the source text is only associated with how cognitively demanding the task is perceived in the case of those with a low level of proficiency in the source language. Subjects’ working memory capacity was also taken into account and a relationship with post-editing productivity could be noticed. Scaled-up studies into the construct of working memory capacity and the use of eye tracking in models for quality estimation are suggested as future work.
Two aspects of anaphora in Hittite are discussed in the paper. The first is the syntactic means of marking immediate anaphora after first mention. Besides fronting a constituent hosting -a/ma and demonstrative phrases, it is shown that this specific type of anaphora is also marked by the seemingly redundant structure of enclitic pronoun + full NP in its canonical position. It is argued that the parallel syntactic behaviour of all three constructions provides evidence to distinguish some cases of enclitic pronoun + full NP from appositions and to consider them a taxonomically distinct category, clitic doubling.
The second part of the paper deals with non-standard anaphora in relative clauses. It explores the occasional associative anaphoric relation (bridging) between the relative phrase and its correlate from a cross-linguistic perspective. Building upon the work of Huggard and Belyaev, it is shown that this non-standard anaphora provides evidence that Hittite relative sentences are not standard relative sentences; rather, they constitute a separate taxonomic category, correlatives, pace Cinque. Along more general lines, the paper substantiates the language-specific claim of Belyaev contra Cinque that correlatives are a syntactic category distinct from relative clauses.
In this research, we build an initial model for semantic parsing of simple Vietnamese sentences. With such a model, we can analyse simple Vietnamese sentences to determine their semantic structures, which are represented in a form we define ourselves. We address two tasks: first, we build our own taxonomy of Vietnamese nouns and use it to define the feature structures of nouns and verbs; second, to build a Unification-Based Vietnamese Grammar, we define syntactic and semantic unification rules for Vietnamese phrases, clauses and sentences based on Unification-Based Grammar. This Vietnamese grammar has been used to build a semantic parser for single Vietnamese sentences. The semantic parser has been tested experimentally, and precision and recall are both over 84%.
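The core operation behind a unification-based grammar of this kind is the unification of feature structures. The following is a minimal sketch under simplifying assumptions: feature structures are plain nested dictionaries with atomic values, there is no structure sharing (reentrancy), and `None` is reserved to signal failure. The feature names are hypothetical and do not come from the paper's noun taxonomy.

```python
def unify(a, b):
    """Unify two feature structures; return the merged structure,
    or None if they carry conflicting values."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, value in b.items():
            if key in result:
                merged = unify(result[key], value)
                if merged is None:      # clash somewhere below this key
                    return None
                result[key] = merged
            else:
                result[key] = value     # feature only present in b
        return result
    return a if a == b else None        # atomic values must be equal

# Hypothetical structures: a noun entry and a verb's subject requirement.
noun = {"cat": "N", "sem": {"animate": True}}
subject_requirement = {"sem": {"animate": True, "human": False}}
inanimate_requirement = {"sem": {"animate": False}}

compatible = unify(noun, subject_requirement)
clash = unify(noun, inanimate_requirement)
```

Unification succeeds when the two structures agree on every shared feature and then combines their information; a single conflicting value makes the whole unification fail, which is how such a grammar rules out ill-formed combinations.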
This paper presents, very concretely and using French examples, the choices to be made when devising a mechanism for analysing and understanding sentences, as applied to French in particular. The methodology presented is functionalist and operationalist, and resolutely favors an onomasiological approach.
In this paper, we present three techniques for incorporating syntactic metadata in a textual retrieval system. The first technique involves only a syntactic analysis of the query: it generates a different weight for each term of the query, depending on its grammar category in the query phrase. These weights are then used for each term in the retrieval process. The second technique involves a storage optimization of the system's inverted index: the inverted index stores only terms that are subjects or predicates in the document they appear in. Finally, the third technique builds a full syntactic index, meaning that for each term in the term collection, the inverted index stores, besides the term frequency and the inverse document frequency, the grammar category of the term for each of its occurrences in a document.
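The first technique, weighting query terms by grammar category, can be sketched as follows. The abstract does not give the weights or the scoring formula, so the category weights, the default weight and the simple weighted term-count score below are all assumptions made for illustration; the tagged query is also assumed to come from some upstream syntactic analysis.

```python
# Hypothetical weights per grammar category; the paper's values are not given.
CATEGORY_WEIGHT = {"NOUN": 1.0, "VERB": 0.7, "ADJ": 0.4}
DEFAULT_WEIGHT = 0.1  # assumed weight for any other category

def weight_query(tagged_query):
    """Map each query term to a weight chosen by its grammar category."""
    return {
        term: CATEGORY_WEIGHT.get(category, DEFAULT_WEIGHT)
        for term, category in tagged_query
    }

def score(document_term_counts, query_weights):
    """Weighted sum of the document's counts for the query terms."""
    return sum(
        weight * document_term_counts.get(term, 0)
        for term, weight in query_weights.items()
    )

# Assumed output of a syntactic analysis of the query "the fast parser".
tagged_query = [("the", "DET"), ("fast", "ADJ"), ("parser", "NOUN")]
query_weights = weight_query(tagged_query)

document = {"the": 10, "fast": 1, "parser": 3}
document_score = score(document, query_weights)
```

With these assumed weights, three occurrences of the noun contribute more to the score than ten occurrences of the determiner, which is the intended effect of category-dependent weighting.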
In research on syntactic sentence processing, there is an ongoing discussion about the role that working memory plays in processing sentences that are syntactically complex and may have more than one structural representation, such as temporarily ambiguous or garden-path sentences. Some empirical studies maintain that a deterioration in working memory brings with it problems in analyzing structures that require syntactic reanalysis, that is, more complex psycholinguistic processing (Carpenter et al., 1994; Kemper and Kemtes, 1997; King and Just, 1991; MacDonald et al., 1992). Other studies report evidence indicating that the syntactic processing of temporarily ambiguous sentences would not be constrained by a decline in working memory capacity (Caplan and Waters, 1999; Véliz et al., 2011).
This paper presents the preliminary results of an experimental study whose purpose is to observe whether a reduction in working memory capacity due to cognitive aging lowers the efficiency of syntactic processing of ambiguous garden-path (or temporarily ambiguous) sentences. In this research, we attempt to test two central hypotheses: a) working memory may modulate syntactic processing but will not have an across-the-board main effect; that is, syntactic complexity causes specific difficulties beyond general cognitive abilities; b) the greater processing cost of these sentences will be reflected at a specific point in the time course, namely the moment at which the phrase where the garden path arises enters as linguistic input.
The full task consisted of reading sentences divided into phrases at a self-paced rate and then recognizing whether a word presented after each sentence had appeared in the item. The materials featured two types of temporarily ambiguous syntactic structures: 1) with temporal subordinate clauses and 2) with locative subordinate clauses. For each structure, 3 conditions were presented for a word recognition task: 1) word present in the sentence; 2) semantically related word; 3) distant word.
Initially, reading times, word recognition time, and response accuracy were recorded and analyzed. The preliminary results seem to indicate that this instrument succeeds in recording garden-path effects in both types of structure (a novelty for this type of study, which usually measures only one type of ambiguous structure) and that there is a consistent interaction between recognition times by condition and the type of syntactic structure (ambiguous vs. unambiguous), which would help to strengthen the hypothesis about the point at which disambiguation takes place during syntactic processing.
What do perceptually bistable figures, sentences vulnerable to misinterpretation and the Stroop task have in common? Although seemingly disparate, they all contain elements of conflict or ambiguity. Consequently, in order to monitor a fluctuating percept, reinterpret sentence meaning, or say “blue” when the word RED is printed in blue ink, individuals must regulate attention and engage cognitive control. According to the Conflict Monitoring Theory (Botvinick, Braver, Barch, Carter, & Cohen, 2001), the detection of conflict automatically triggers cognitive control mechanisms, which can enhance resolution of subsequent conflict, namely, “conflict adaptation.” If adaptation reflects the recruitment of domain-general processes, then conflict detection in one domain should facilitate conflict resolution in an entirely different domain. We report two novel findings: (i) significant conflict adaptation from a syntactic to a non-syntactic domain and (ii) from a perceptual to a verbal domain, providing strong evidence that adaptation is mediated by domain-general cognitive control.
Eye fixation durations during normal reading correlate with processing difficulty but the specific cognitive mechanisms reflected in these measures are not well understood. This study finds support in German readers' eye fixations for two distinct difficulty metrics: surprisal, which reflects the change in probabilities across syntactic analyses as new words are integrated, and retrieval, which quantifies comprehension difficulty in terms of working memory constraints. We examine the predictions of both metrics using a family of dependency parsers indexed by an upper limit on the number of candidate syntactic analyses they retain at successive words. Surprisal models all fixation measures and regression probability. By contrast, retrieval does not model any measure in serial processing. As more candidate analyses are considered in parallel at each word, retrieval can account for the same measures as surprisal. This pattern suggests an important role for ranked parallelism in theories of sentence comprehension.
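As a rough illustration of how surprisal can depend on the number of analyses a parser retains, the sketch below computes surprisal in bits as the log ratio of the probability mass kept before and after a word, keeping only the k most probable analyses at each step. This is a simplification for illustration and not the authors' implementation: the function name, the pruning scheme and the example probabilities are assumptions.

```python
import math

def beam_surprisal(prev_analyses, curr_analyses, k):
    """Surprisal (bits) of a word, estimated from the k most probable
    analyses before the word and the k most probable after it.

    Each list holds the probabilities of the candidate syntactic
    analyses of the sentence prefix (assumed inputs from some parser).
    """
    prev_mass = sum(sorted(prev_analyses, reverse=True)[:k])
    curr_mass = sum(sorted(curr_analyses, reverse=True)[:k])
    if prev_mass <= 0 or curr_mass <= 0:
        raise ValueError("retained probability mass must be positive")
    return math.log2(prev_mass / curr_mass)

# Hypothetical analysis probabilities before and after integrating a word.
before = [0.5, 0.25]
after = [0.125, 0.125]

serial = beam_surprisal(before, after, k=1)    # only the best analysis kept
parallel = beam_surprisal(before, after, k=2)  # two analyses kept in parallel
```

With one retained analysis the estimate is log2(0.5 / 0.125), and with two it is log2(0.75 / 0.25), so the width of the beam changes the predicted difficulty for the same word.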
We propose a speech comprehension software architecture to represent the flow of the natural processing of auditory sentences. The computational implementation applies wavelet transforms to speech signal encoding and prosodic data extraction, and connectionist models to syntactic parsing and prosodic-semantic mapping.
We present a comparative error analysis of two parsers, MALT and MST, on Telugu Dependency Treebank data. MALT and MST are currently two of the most dominant data-driven dependency parsers. We discuss the performance of both parsers in relation to the Telugu language. We also discuss in detail both the algorithmic issues of the parsers and the language-specific constraints of Telugu. The purpose is to better understand how to help the parsers deal with complex structures, make sense of implicit language-specific cues, and build a more informed treebank.
Timothy Osborne argues that phrase structure grammars (PSGs) postulate unnecessarily complex structures, and that Dependency Grammar (DG) is to be preferred on grounds of simplicity (1:1 word-to-node ratio) and empirical adequacy (capturing the results of constituency tests). In this reply, I argue that, while some of Osborne's criticisms of PSGs are justified, there are both empirical and theoretical problems with his major claims. In particular, his version of DG is too restrictive with respect to certain constituency facts (modified nouns, verbal phrases), and what it gains in simplicity qua number of nodes, it loses in requiring a more complex interface between syntax and other linguistic components (phonology, semantics). I argue that Mirror Theory, a framework that is in a sense intermediate between DG and PSGs, answers Osborne's justified criticisms while not suffering from the problems of his version of DG.
Question Generation (QG) and Question Answering (QA) are among the many challenges in natural language generation and natural language understanding. An automated QG system focuses on the generation of expressive and factoid questions which assist in meetings, customer helplines, domain-specific services, educational institutions, etc. In this paper, the proposed system addresses the generation of factoid or wh-questions from sentences in a corpus consisting of factual, descriptive and unbiased details. We discuss our heuristic algorithm for sentence simplification or pre-processing; the knowledge base extracted in that step is stored in a structured format, which assists further processing. We further discuss the sentence semantic relations that enable us to construct questions following certain recognizable patterns among different sentence entities, followed by the evaluation of the generated questions. We conclude by discussing the applications, future scope and improvements.
The article addresses the frame method in the analysis of noun valence in the Ukrainian language. The method of linguistic frames may solve such important problems as describing word valence in Ukrainian, modeling the syntactic and semantic structure of the language, and determining which dependencies exist between lexical semantics and syntax in Ukrainian.
This paper reports how a rule-based natural language parser uses context knowledge to resolve ambiguities in POS tagging. The parser has only 9 word classes, which are sufficient to make fine-grained distinctions and flexible enough to perform their roles in a handful of constituent classes in syntactic parsing. We classify words based on the traditional tripartite classification of form, function, and distribution (Lyons, 1977). Morphological analysis of form enables the tagger to reduce the number of possible candidate tags for any word to no more than 3, even before parsing. We call it the right context. The words parsed (the left context) provide disambiguation mechanisms with syntactic information that is unavailable to most part-of-speech tagging systems. The POS tagger is highly portable in that there is no need for data creation or preparation and no training is required for any domain. Also its small tag set gives it the flexibility to tailor its support f...
This article examines the effectiveness of a syntactic tree program for analyzing tagmemic structures and understanding linguistic elements. In addition, the study aims to determine, through statistical results, how effectively this software helps university students comprehend syntactic parsing. The writer therefore uses a quantitative method with a quasi-experimental technique. The results show that students taught using this program are better at comprehending syntactic description and analysis, whereas students in the control group have difficulty parsing the linguistic elements in tagmemic structures. The program is therefore proposed for use in teaching syntactic parsing in linguistics classes.
Keywords: syntactic tree program, tagmemic structures, linguistics, statistics, effectiveness.
This dissertation presents the development of a wide-coverage semantic parser capable of handling quantifier scope ambiguities in a novel way. In contrast with traditional approaches that deliver an underspecified representation and focus on enumerating the possible readings “offline” after the end of the syntactic analysis, our parser handles the ambiguities during the derivation using a semantic device known as a generalized Skolem term. This approach combines most of the benefits of the existing methods and addresses their deficiencies in a natural way. Furthermore, this takes place in the context of the grammar itself, without resorting to complex ad hoc mechanisms.
Using the structural priming paradigm, the present study explores predictions made by the implicit prosody hypothesis (IPH) by testing whether an implicit prosodic boundary generated from a silently read sentence influences attachment preference for a novel, subsequently read sentence. Results indicate that such priming does occur, as evidenced by an effect on relative clause attachment. In particular, priming an implicit boundary directly before a relative clause (cued by commas in orthography) encouraged high attachment of that relative clause, although the size of the effect depended somewhat on individual differences in pragmatic/communication skills (as measured by the Autism Spectrum Quotient). Thus, in addition to supporting the basic claims of the IPH, the present study demonstrates the relevance of such individual differences to sentence processing, and that implicit prosodic structure, like syntactic structure, can be primed.
- by Jason Bishop and +1
- Speech Prosody, Autism, Psycholinguistics, Intonation
The paper reconceptualizes Constraint Grammar as a framework where the rules refine the compact representations of local ambiguity while the rule conditions are matched against a string of feature vectors that summarize the compact representations. Both views of the ambiguity are processed with pure finite-state operations. The compact representations are mapped to feature vectors with the aid of a rational power series.
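For readers unfamiliar with Constraint Grammar, the classic view of rules pruning ambiguous readings can be sketched roughly as follows. This is only an illustration of the general idea of a contextual REMOVE rule. It does not implement the paper's finite-state formulation, its feature vectors, or its rational power series. The function name `apply_remove_rule`, the tag names, and the example sentence are all hypothetical.

```python
# Illustrative sketch only: a single Constraint Grammar style REMOVE rule.
# Each token is a "cohort": a word form plus the set of readings still possible.
# All names and tags here are assumptions made for the example.

def apply_remove_rule(cohorts, target, prev_tag):
    """Remove reading `target` from a cohort when the cohort immediately
    to its left has been resolved to exactly `prev_tag`.

    The last remaining reading of a cohort is never removed.
    """
    # Copy the reading sets so the caller's data is left untouched.
    result = [(word, set(tags)) for word, tags in cohorts]
    for i in range(1, len(result)):
        _, prev_readings = result[i - 1]
        _, readings = result[i]
        context_matches = prev_readings == {prev_tag}
        if context_matches and target in readings and len(readings) > 1:
            readings.discard(target)
    return result


# "run" is ambiguous between noun and verb; a preceding determiner
# rules out the verb reading.
sentence = [("the", {"DET"}), ("run", {"NOUN", "VERB"}), ("ended", {"VERB"})]
disambiguated = apply_remove_rule(sentence, target="VERB", prev_tag="DET")
```

The rule only narrows the set of readings and never discards the final one, which mirrors the usual Constraint Grammar convention that every word keeps at least one analysis.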
The Linguistic Annotation Framework (LAF) provides a general, extensible stand-off markup system for corpora. This paper discusses LAF-Fabric, a new tool to analyse LAF resources in general with an extension to process the Hebrew Bible in particular. We first walk through the history of the Hebrew Bible as text database in decennium-wide steps. Then we describe how LAF-Fabric may serve as an analysis tool for this corpus. Finally, we describe three analytic projects/workflows that benefit
Deliverable D5-4 is a prototype English text to Sign Language editing environment which allows input of English text from a file or from the keyboard to be converted into a Discourse Representation Structure (DRS) semantic representation from which a synthetic sign sequence is generated. User intervention by a linguistically aware person is supported to permit modification of the input text
This paper is based on an extant lexicon-grammar of European Portuguese verbal idioms (e.g., deitar mãos à obra, literally 'to throw hands to the work', i.e., 'to start working'). This is a database containing about 2,400 expressions, along with all relevant information on the sentence structure, distributional constraints and transformational properties of these frozen sentences. In this paper, we present a solution to the integration of verbal idioms in a fully-fledged natural language processing system, and a preliminary evaluation using a small, manually annotated corpus.
- by Hyekyung Hwang and +1
- Psychology, Cognitive Science, Psycholinguistics, Speech perception