Prototyping Efficient Natural Language Parsers
Related papers
Prototyping Efficient Natural Language Parsers
2000
We present a technique for the construction of efficient prototypes for natural language parsing based on the compilation of parsing schemata to executable implementations of their corresponding algorithms. Taking a simple description of a schema as input, Java code for the corresponding parsing algorithm is generated, including schema-specific indexing code in order to attain efficiency.
Compiling declarative specifications of parsing algorithms
Database and Expert Systems …, 2007
The parsing schemata formalism allows us to describe parsing algorithms in a simple, declarative way by capturing their fundamental semantics while abstracting low-level detail. In this work, we present a compilation technique allowing the automatic transformation of parsing schemata to efficient executable implementations of their corresponding algorithms. Our technique is general enough to be able to handle all kinds of schemata for context-free grammars, tree adjoining grammars and other grammatical formalisms, providing an extensibility mechanism which allows the user to define custom notational elements.
Method and Means of Development of Context-Free Grammars for Natural Language Parsers
We discuss the requirements for a system that performs the analysis of natural language at the syntactic level. We also present an environment that allows the development of context-free grammars for natural language parsers. The environment was tested on the Spanish language, resulting in the development of a Spanish morphological analyzer. The environment lets the user develop and debug grammars for new languages.
Generation of indexes for compiling efficient parsers from formal specifications
Computer Aided Systems …, 2007
Parsing schemata provide a formal, simple and uniform way to describe, analyze and compare different parsing algorithms. The notion of a parsing schema comes from considering parsing as a deduction process which generates intermediate results called items. An initial set of items is directly obtained from the input sentence, and the parsing process consists of the application of inference rules (called deductive steps) which produce new items from existing ones. Each item contains a piece of information about the sentence’s structure, and a successful parsing process will produce at least one final item containing a full parse tree for the sentence or guaranteeing its existence. Their abstraction of low-level details makes parsing schemata useful to define parsers in a simple and straightforward way. Comparing parsers, or considering aspects such as their correctness and completeness or their computational complexity, also becomes easier if we think in terms of schemata. However, when we want to actually use a parser by running it on a computer, we need to implement it in a programming language, so we have to abandon the high level of abstraction and worry about implementation details that were irrelevant at the schema level. In particular, we study in this article how the source parsing schema should be analysed to decide what kind of indexes need to be generated in order to obtain an efficient parser.
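The deduction process described in this abstract can be illustrated with a minimal sketch. It is not the compiler or the generated code from the paper. It assumes a CYK-style schema over a hypothetical toy grammar in Chomsky normal form, with items of the form (A, i, j) meaning that nonterminal A spans words i to j. For brevity, lexical rules are applied while creating the initial items. The names `LEXICAL`, `BINARY`, `parse`, `by_start` and `by_end` are illustrative assumptions; the two dictionaries keyed by position stand in for the kind of index that lets a deductive step find matching antecedents without scanning the whole chart.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form (not taken from the paper).
LEXICAL = {                 # word -> nonterminals A with a rule A -> word
    "she": {"NP"},
    "fish": {"NP", "V"},
    "eats": {"V"},
}
BINARY = {                  # (B, C) -> nonterminals A with a rule A -> B C
    ("NP", "VP"): {"S"},
    ("V", "NP"): {"VP"},
}


def parse(words, start="S"):
    """Run an agenda-driven deduction; return (chart, final_item_found)."""
    chart = set()                # every item derived so far
    agenda = []                  # items whose consequences are still pending
    by_start = defaultdict(set)  # index: start position -> processed items
    by_end = defaultdict(set)    # index: end position -> processed items

    def add(item):
        # Each item enters the agenda at most once.
        if item not in chart:
            chart.add(item)
            agenda.append(item)

    # Initial items, obtained directly from the input sentence.
    for i, word in enumerate(words):
        for a in LEXICAL.get(word, ()):
            add((a, i, i + 1))

    while agenda:
        item = agenda.pop()
        sym, i, j = item
        by_start[i].add(item)
        by_end[j].add(item)
        # Deductive step with `item` on the left: [B,i,j], [C,j,k] |- [A,i,k]
        for c, _, k in by_start[j]:
            for a in BINARY.get((sym, c), ()):
                add((a, i, k))
        # Deductive step with `item` on the right: [B,h,i], [C,i,j] |- [A,h,j]
        for b, h, _ in by_end[i]:
            for a in BINARY.get((b, sym), ()):
                add((a, h, j))

    # The final item guarantees that a parse of the whole sentence exists.
    return chart, (start, 0, len(words)) in chart
```

Calling `parse("she eats fish".split())` returns the chart together with a flag that is true when the final item (S, 0, 3) was derived. Because each item is indexed only when it is taken off the agenda, every pair of adjacent items is combined once, at the moment the later of the two is processed.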
A generic parser for strings and trees
Computer Science and Information Systems, 2012
In this paper, we propose a twofold generic parser. First, it simulates the behavior of multiple parsing automata. Second, it parses strings drawn from either a context-free grammar, a regular tree grammar, or from both. The proposed parser is based on an approach that defines an extended version of an automaton, called position-parsing automaton (PPA), using concepts from LR and regular tree automata, combined with a newly introduced concept, called state instantiation and transition cloning. It is constructed as a direct mapping from a grammar, represented in an expanded list format. However, PPA is a non-deterministic automaton with a generic bottom-up parsing behavior. Hence, it is efficiently transformed into a reduced one (RBA). The proposed parser is then constructed to simulate the run of the RBA automaton on input strings derived from a respective grammar. Without loss of generality, the proposed parser is used within the framework of pattern matching and code generation. Co...
DPL – a computational method for describing grammars and modelling parsers
1985
constituent-pairs (i.e. words or recognized phrase constructs) and (3) description of constituent surroundings in the form of two-way automata. The compilation of DPL-grammars results in executable code for the corresponding parsers. To ease the modelling of grammars there exists a linguistically oriented programming environment, which contains e.g. a tracing facility for the parsing process, grammar-sensitive lexical maintenance programs, and routines for the interactive graphic display of parse trees and grammar definitions. Translator routines are also available for the transport of compiled code between various LISP dialects. The DPL-compiler and associated tools can be used under INTERLISP and FRANZLISP. This paper focuses on knowledge engineering issues. Linguistic argumentation we have presented in /3/ and /4/. The detailed syntax of DPL with examples can be found in /2/.