Combining Dependency Parsing with PP Attachment
Related papers
The benefit of stochastic PP attachment to a rule-based parser
Proceedings of the COLING/ACL on Main conference poster sessions -, 2006
PP attachment disambiguation has often been studied as a benchmark for empirical methods in natural language processing by reducing it to a binary decision problem (between verb and noun attachment) in a particular syntactic configuration. A parser, however, must solve the more general task of deciding between more than two alternatives in many different contexts. We combine the attachment predictions made by a simple model of lexical attraction with a full-fledged parser of German to determine the actual benefit of the subtask to parsing. We show that the combination of data-driven and rule-based components can reduce the number of all parsing errors by 14% and raise the attachment accuracy for dependency parsing of German to an unprecedented 92%.
Leveraging a Semantically Annotated Corpus to Disambiguate Prepositional Phrase Attachment
Accurate parse ranking requires semantic information, since a sentence may have many candidate parses involving common syntactic constructions. In this paper, we propose a probabilistic framework for incorporating distributional semantic information into a maximum entropy parser. Furthermore, to better deal with sparse data, we use a modified version of Latent Dirichlet Allocation to smooth the probability estimates. This LDA model generates pairs of lemmas, representing the two arguments of a semantic relation, and can be trained, in an unsupervised manner, on a corpus annotated with semantic dependencies. To evaluate our framework in isolation from the rest of a parser, we consider the special case of prepositional phrase attachment ambiguity. The results show that our semantically motivated feature is effective in this case, and moreover that the LDA smoothing both produces semantically interpretable topics and improves performance over raw co-occurrence frequencies, demonstrating that it can successfully generalise patterns in the training data.
Resolving prepositional phrase attachment ambiguities in Spanish with a classifier
2011
In this paper we present a classifier that resolves a certain kind of ambiguity in syntactic structure for Spanish, namely, ambiguity as to the point of adjunction of a prepositional phrase in the syntactic structure of a sentence (PP attachment). As a starting point, we used the EsTxala dependency grammar for Spanish, integrated within FreeLing, with an accuracy score of 61% on PP adjunction. Our target is to develop a specialized module for PP attachment, so that the syntactic analyzer combines the dependency grammar's manual rules with statistical information inferred from a classifier. We have evaluated different classifiers and different features to characterize PP-attachment ambiguities. Our best approaches improve the performance of EsTxala by 20 points, but are still far from the performance of unsupervised methods reporting 94% accuracy. We gained insight into the factors governing the disambiguation of PP attachment ambiguities, which will arguably let us build lighter models that can be easily integrated within a general-purpose analyzer such as FreeLing.
An Analysis of Prepositional-Phrase Attachment Disambiguation
International Journal of Computational Linguistics Research, 2018
Prepositional-phrase (PP) attachment ambiguity is a pervasive problem in natural language processing, and resolving it can pose significant challenges for a computer system. In the literature, different approaches have been proposed to address PP-attachment ambiguity, but to the best of our knowledge, there is no published work which surveys such approaches. This survey paper compares the standard methods that attempt to resolve PP-attachment ambiguities in natural language processing. We also provide a taxonomy of various ambiguities, which may arise at different levels during the language-processing task. Two kinds of methods are employed in natural language processing: rule-based methods and statistical approaches.
Attaching Multiple Prepositional Phrases: Generalized Backed-off Estimation
Computing Research Repository, 1997
There has recently been considerable interest in the use of lexically-based statistical techniques to resolve prepositional phrase attachments. To our knowledge, however, these investigations have only considered the problem of attaching the first PP, i.e., in a [V NP PP] configuration. In this paper, we consider one technique which has been successfully applied to this problem, backed-off estimation, and demonstrate how it can be extended to deal with the problem of multiple PP attachment. Multiple PP attachment introduces two related problems: sparser data (since multiple PPs are naturally rarer), and greater syntactic ambiguity (more attachment configurations which must be distinguished). We present an algorithm which solves this problem through re-use of the relatively rich data obtained from first PP training, in resolving subsequent PP attachments.
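For readers unfamiliar with backed-off estimation, the following is a minimal sketch of the single-PP case only, in the style of the scheme usually attributed to Collins and Brooks. It is not the algorithm of the paper above, whose extension to multiple PPs is not reproduced here. The class name `BackedOffAttacher`, the `"N"`/`"V"` labels, the toy training tuples, and the default of noun attachment for unseen prepositions are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def backoff_levels(v, n1, p, n2):
    """Yield lists of contexts, most specific first.

    Every context keeps the preposition p; levels use 3, 2, 1, then 0
    of the remaining words (verb, first noun, second noun).
    """
    words = {"v": v, "n1": n1, "n2": n2}
    for size in (3, 2, 1, 0):
        yield [
            (("p", p),) + tuple((k, words[k]) for k in keys)
            for keys in combinations(("v", "n1", "n2"), size)
        ]


class BackedOffAttacher:
    """Hypothetical helper: estimates P(noun attachment | v, n1, p, n2)."""

    def __init__(self):
        self.total = Counter()  # how often each context was seen
        self.noun = Counter()   # how often it was seen with noun attachment

    def train(self, examples):
        for v, n1, p, n2, label in examples:
            for level in backoff_levels(v, n1, p, n2):
                for ctx in level:
                    self.total[ctx] += 1
                    if label == "N":
                        self.noun[ctx] += 1

    def p_noun(self, v, n1, p, n2):
        # Use the most specific level that has any training evidence.
        for level in backoff_levels(v, n1, p, n2):
            denom = sum(self.total[c] for c in level)
            if denom > 0:
                return sum(self.noun[c] for c in level) / denom
        return 1.0  # assumed default: noun attachment for unseen prepositions

    def attach(self, v, n1, p, n2):
        return "N" if self.p_noun(v, n1, p, n2) >= 0.5 else "V"


# Toy data, invented for illustration only.
model = BackedOffAttacher()
model.train([
    ("fills", "room", "with", "books", "V"),
    ("sees", "room", "with", "books", "N"),
    ("sees", "man", "with", "telescope", "V"),
    ("eats", "pizza", "with", "anchovies", "N"),
])
print(model.attach("fills", "box", "with", "books"))  # backs off to triples
```

The point of the design is that a quadruple never seen in training still receives an estimate from the triples, pairs, or the bare preposition it shares with seen data, which is what makes the approach usable under sparse counts.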
2004
Extracting information automatically from texts for database representation requires previously well-grouped phrases so that entities can be separated adequately. This problem is known as prepositional phrase (PP) attachment disambiguation. Current PP attachment disambiguation systems require an annotated treebank or use an Internet connection to achieve a precision of more than 90%. Unfortunately, these resources are not always available.
Improving prepositional phrase attachment disambiguation using the web as corpus
Progress in Pattern Recognition, Speech and …, 2003
The problem of Prepositional Phrase (PP) attachment disambiguation consists in determining whether a PP is part of a noun phrase, as in "He sees the room with books", or an argument of a verb, as in "He fills the room with books". Volk has proposed two variants of a method that queries an Internet search engine to find the most probable attachment variant. In this paper we apply the latest variant of Volk's method to Spanish with several differences that allow us to attain a better performance close to that of statistical methods using treebanks.
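The general idea of frequency-based attachment can be sketched as a comparison of co-occurrence ratios. The abstract does not give the exact formula, query syntax, or thresholds, so everything below is an assumption: the ratio freq(word, prep, noun2) / freq(word), the function names, and the in-memory `freq` table that stands in for search-engine hit counts (all counts are invented).

```python
def cooc(freq, word, prep, n2):
    """Assumed co-occurrence value: freq(word, prep, n2) / freq(word)."""
    base = freq.get((word,), 0)
    if base == 0:
        return 0.0
    return freq.get((word, prep, n2), 0) / base


def attach_by_cooccurrence(freq, verb, n1, prep, n2):
    """Return "V", "N", or None when there is no evidence either way."""
    v_score = cooc(freq, verb, prep, n2)
    n_score = cooc(freq, n1, prep, n2)
    if v_score == 0.0 and n_score == 0.0:
        return None
    return "V" if v_score >= n_score else "N"


# Invented counts standing in for web hit counts.
freq = {
    ("fills",): 100, ("fills", "with", "books"): 10,
    ("sees",): 200,
    ("room",): 1000, ("room", "with", "books"): 5,
}
print(attach_by_cooccurrence(freq, "fills", "room", "with", "books"))
print(attach_by_cooccurrence(freq, "sees", "room", "with", "books"))
```

Normalizing by the frequency of the head word keeps very common verbs or nouns from winning simply because they occur more often overall.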
Proceedings of the 28th annual meeting on Association for Computational Linguistics -, 1990
This empirical study attempts to find answers to the question of how a natural language (henceforth NL) system could resolve attachment of prepositional phrases (henceforth PPs) by examining naturally occurring PP attachments in typed dialogue. Examination includes testing predictive powers of existing attachment theories against the data. The result of this effort will be an algorithm for interpreting PP attachment.
Experiments with a multilanguage non-projective dependency parser
Proceedings of the Tenth Conference on Computational Natural Language Learning - CoNLL-X '06, 2006
In this presentation, I will look back at 10 years of CoNLL conferences and the state of the art of machine learning of language that is evident from this decade of research. My conclusion, intended to provoke discussion, will be that we currently lack a clear motivation or "mission" to survive as a discipline. I will suggest that a new mission for the field could be found in a renewed interest for theoretical work (which learning algorithms have a bias that matches the properties of language?, what is the psycholinguistic relevance of learner design issues?), in more sophisticated comparative methodology, and in solving the problem of transfer, reusability, and adaptation of learned knowledge.