Enhancing Dependency Analysis by Combining Specific Dependency Parsers

Towards an N-Version Dependency Parser

Lecture Notes in Computer Science, 2010

Maltparser is a contemporary machine-learning-based dependency parsing system that shows great accuracy. However, a Labelled Attachment Score (LAS) of 90% seems to be a de facto limit for such parsers. Since such systems generally cannot be modified, previous work has studied what can be done with the training corpora in order to improve parsing accuracy. High-level techniques, such as controlling sentence length or corpus size, seem useless for this purpose. But low-level techniques, based on an in-depth study of the errors produced by the parser at the word level, seem promising. Prospective low-level studies suggested the development of n-version parsers. Each of these n versions should be able to tackle a specific kind of dependency parsing at the word level, and the combined action of all of them should yield more accurate parses. In this paper we present an extensive study of the usefulness and the expected limits of n-version parsers for improving parsing accuracy. This work has been developed specifically for Spanish using Maltparser.

Improving data driven dependency parsing using clausal information

2010

The paper describes a data driven dependency parsing approach which uses information about the clauses in a sentence to improve the parser performance. The clausal information is added automatically using a partial parser. We demonstrate the experiments on Hindi, a morphologically rich, free-word-order language, using a modified version of MSTParser. We did all the experiments on the ICON 2009 parsing contest data. We achieved an improvement of 0.87% and 0.77% in unlabeled attachment and labeled attachment accuracies respectively over the baseline parsing accuracies.
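
The unlabeled and labeled attachment scores that these abstracts report can be illustrated with a minimal Python sketch. It assumes the usual definitions (UAS counts tokens whose predicted head is correct; LAS additionally requires the correct dependency label) and uses made-up toy arcs; the function name `attachment_scores` and the label strings are illustrative, not taken from any of the papers.

```python
from typing import List, Tuple

# One arc per token: (index of the head token, dependency label).
# Head index 0 stands for the artificial root. This encoding is an assumption.
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) as fractions in [0, 1] for one or more sentences."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted analyses must cover the same tokens")
    if not gold:
        raise ValueError("cannot score an empty analysis")
    n = len(gold)
    # UAS: the head alone must match.
    uas_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    # LAS: head and label must both match.
    las_hits = sum(g == p for g, p in zip(gold, pred))
    return uas_hits / n, las_hits / n


# Toy data: token 3 has the wrong label, token 4 has the wrong head.
gold = [(2, "nsubj"), (0, "root"), (2, "obj"), (3, "nmod")]
pred = [(2, "nsubj"), (0, "root"), (2, "iobj"), (2, "nmod")]
uas, las = attachment_scores(gold, pred)
print(f"UAS = {uas:.2f}, LAS = {las:.2f}")
```

Because a label error costs LAS but not UAS, LAS can never exceed UAS for the same output.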

A Transition-Based Dependency Parser Using a Dynamic Parsing Strategy

2013

We present a novel transition-based, greedy dependency parser which implements a flexible mix of bottom-up and top-down strategies. The new strategy allows the parser to postpone difficult decisions until the relevant information becomes available. The novel parser has a 12% error reduction in unlabeled attachment score over an arc-eager parser, with a slow-down factor of 2.8.

A System for Experiments with Dependency Parsers

2014

In this paper we present a system for experimenting with combinations of dependency parsers. The system supports initial training of different parsing models, creation of parsebank(s) with these models, and different strategies for the construction of ensemble models aimed at improving the output of the individual models by voting. The system employs two algorithms for construction of dependency trees from several parses of the same sentence and several ways for ranking of the arcs in the resulting trees. We have performed experiments with state-of-the-art dependency parsers including MaltParser, MSTParser, TurboParser, and MATEParser, on the data from the Bulgarian treebank -- BulTreeBank. Our best result from these experiments is slightly better than the best result reported in the literature for this language.
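
The idea of combining several parses of the same sentence by voting can be sketched as follows. This is a simplified per-token majority vote, not either of the two tree-construction algorithms the abstract mentions (which it does not detail); in particular, it does not guarantee that the combined arcs form a well-formed tree. The function name `vote_arcs` and the tie-breaking rule are assumptions.

```python
from collections import Counter
from typing import List, Tuple

Arc = Tuple[int, str]  # (head index, dependency label); 0 = artificial root


def vote_arcs(parses: List[List[Arc]]) -> List[Arc]:
    """Pick, for every token, the arc proposed by most parsers.

    Ties are broken in favour of the parser that comes first in `parses`
    (an arbitrary choice made for this sketch).
    """
    if not parses:
        raise ValueError("need at least one parse")
    n = len(parses[0])
    if any(len(p) != n for p in parses):
        raise ValueError("all parses must cover the same tokens")

    combined = []
    for i in range(n):
        counts = Counter(p[i] for p in parses)
        best = max(counts.values())
        # First parser (in list order) whose arc reached the top vote count.
        combined.append(next(p[i] for p in parses if counts[p[i]] == best))
    return combined


# Three toy parses of a three-token sentence; each has at most one error.
parse_a = [(2, "nsubj"), (0, "root"), (2, "obj")]
parse_b = [(2, "nsubj"), (0, "root"), (1, "obj")]
parse_c = [(3, "nsubj"), (0, "root"), (2, "obj")]
voted = vote_arcs([parse_a, parse_b, parse_c])
print(voted)
```

A production system would typically follow the vote with a step that repairs cycles or multiple roots, for example by running a maximum spanning tree algorithm over the vote counts.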

Yara Parser: A Fast and Accurate Dependency Parser

2015

Dependency parsers are among the most crucial tools in natural language processing as they have many important applications in downstream tasks such as information retrieval, machine translation and knowledge acquisition. We introduce the Yara Parser, a fast and accurate open-source dependency parser based on the arc-eager algorithm and beam search. It achieves an unlabeled accuracy of 93.32 on the standard WSJ test set which ranks it among the top dependency parsers. At its fastest, Yara can parse about 4000 sentences per second when in greedy mode (1 beam). When optimizing for accuracy (using 64 beams and Brown cluster features), Yara can parse 45 sentences per second. The parser can be trained on any syntactic dependency treebank and different options are provided in order to make it more flexible and tunable for specific tasks. It is released with the Apache version 2.0 license and can be used for both commercial and academic purposes. The parser can be found at https://github.com/yahoo/YaraParser.

Efficient Parsing of Syntactic and Semantic Dependency Structures

2009

In this paper, we describe our system for the 2009 CoNLL shared task on joint parsing of syntactic and semantic dependency structures in multiple languages. Our system combines and implements efficient parsing techniques to achieve high accuracy as well as very good parsing and training times. For applications of syntactic and semantic parsing, parsing time and memory footprint are very important. We think that system development can also profit from this, since one can perform more experiments in a given time. For the subtask of syntactic dependency parsing, we reached second place with an average accuracy of 85.68, which is only 0.09 points behind the first-ranked system. For this task, our system has the highest accuracy for English with 89.88, for German with 87.48, and for the out-of-domain data with an average of 78.79. The semantic role labeler does not work as well as our parser, and we therefore reached fourth place (ranked by the macro F1 score) in the joint task for syntactic and semantic dependency parsing.

Giving Shape to an N-version Dependency Parser - Improving Dependency Parsing Accuracy for Spanish using Maltparser

2010

Maltparser is a contemporary machine-learning-based dependency parsing system that shows great accuracy. However, a Labelled Attachment Score (LAS) of 90% seems to be a de facto limit for these kinds of parsers. In this paper we present an n-version dependency parser that works as follows: we found that a small set of words is incorrectly parsed more frequently than the rest, so the n-version dependency parser consists of n different parsers trained specifically to parse those difficult words. An algorithm will send each word to each parser, and by combining their output with the action of a general parser we will achieve better overall accuracy. This work has been developed specifically for Spanish using Maltparser.
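
One way to picture the combination step described above is the following Python sketch. It is a guess at the general shape only: the abstract does not specify the routing algorithm, so the rule used here (the specialist's arc replaces the general parser's arc for any token it is responsible for) is an assumption, as are the names `n_version_parse`, `general_parser` and `que_specialist`. The parsers are stubs returning fixed toy arcs rather than real Maltparser models, and the example word is hypothetical rather than one identified in the paper.

```python
from typing import Callable, Dict, List, Tuple

Arc = Tuple[int, str]                      # (head index, label); 0 = root
Parser = Callable[[List[str]], List[Arc]]  # parses a whole sentence


def n_version_parse(words: List[str],
                    general: Parser,
                    specialists: Dict[str, Parser]) -> List[Arc]:
    """Start from the general parse, then let each specialist override
    the arc of the difficult word it was trained for (assumed rule)."""
    arcs = list(general(words))
    cache: Dict[str, List[Arc]] = {}
    for i, word in enumerate(words):
        key = word.lower()
        specialist = specialists.get(key)
        if specialist is None:
            continue  # not a difficult word: keep the general parser's arc
        if key not in cache:
            cache[key] = specialist(words)  # parse once per specialist
        arcs[i] = cache[key][i]
    return arcs


# Stub parsers with hard-coded toy output for a three-word sentence.
def general_parser(words: List[str]) -> List[Arc]:
    return [(0, "root"), (1, "obj"), (1, "ccomp")]


def que_specialist(words: List[str]) -> List[Arc]:
    return [(0, "root"), (3, "mark"), (1, "ccomp")]


sentence = ["Dijo", "que", "vendría"]
result = n_version_parse(sentence, general_parser, {"que": que_specialist})
print(result)
```

Only the arc for "que" comes from the specialist; every other token keeps the general parser's analysis, which is why the specialists need to be accurate only on their own words.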