Automized Generation of Typed Syntax Trees via XML (original) (raw)

Generator of efficient strongly typed abstract syntax trees in Java

IEE Proceedings - Software, 2005

syntax trees are a very common data-structure in language related tools. For example compilers, interpreters, documentation generators, and syntax-directed editors use them extensively to extract, transform, store and produce information that is key to their functionality. We present a Java back-end for ApiGen, a tool that generates implementations of abstract syntax trees. The generated code is characterized by strong typing combined with a generic interface and maximal sub-term sharing for memory efficiency and fast equality checking. The goal of this tool is to obtain safe and efficient programming interfaces for abstract syntax trees. The contribution of this work is the combination of generating a strongly typed data-structure with maximal sub-term sharing in Java. Practical experience shows that this approach can not only be used for tools that are otherwise manually constructed, but also for internal datastructures in generated tools.

Dual syntax for XML languages

Information Systems, 2008

XML is successful as a machine processable data interchange format, but it is often too verbose for human use. For this reason, many XML languages permit an alternative more legible non-XML syntax. XSLT stylesheets are often used to convert from the XML syntax to the alternative syntax; however, such transformations are not reversible since no general tool exists to automatically parse the alternative syntax back into XML.

CDuce: an XML-centric general-purpose language

2003

We present the functional language ¡ Duce, discuss some design issues, and show its adequacy for working with XML documents. Distinctive features of ¡ Duce are a powerful pattern matching, first class functions, overloaded functions, a very rich type system (arrows, sequences, pairs, records, intersections, unions, differences), precise type inference for patterns and error localization, and a natural interpretation of types as sets of values. We also outline some important implementation issues; in particular, a dispatch algorithm that demonstrates how static type information can be used to obtain very efficient compilation schemas.

XAGra - An XML Dialect for Attribute Grammars

2009

Attribute Grammars (AG) are a powerful and well-known formalism used to create language processors. The metalanguage used to write an AG (to specify a language and its processor) depends on the compiler generator tool chosen. This fact can be an handicap when it is necessary to share or transfer information between language-based systems; this is, we face an interchangeability problem, if we want to reuse the same language specification (the AG) on another development environment. To overcome this interoperability flaw, we present in this paper X AGra-an XML dialect to describe attribute grammars. X AGra was precisely conceived aiming at adapting the output of a visual attribute grammar editor (named VisualLISA) to any compiler generator tool. Based on the formal definition of Attribute Grammar and on the usual requirements for the generation of a language processor, X AGra schema is divided into five main fragments: symbol declarations, attribute declarations, semantic productions (including attribute evaluation rules, contextual conditions, and translation rules), import, and auxiliary functions definitions. In the paper we present those components, but the focus will be on the systematic way we followed to design the XML schema based on the formal definition of AG. To strength the usefulness of X AGra as a universal AG specification, we show at a glance X AGraAl, a tool taking as input an AG written in X AGra, is a Grammar Analyzer and Transformation system that computes dependencies among symbols, various metrics, slices and rebuilds the grammar.

Processing XML for domain specific languages.

2008

XML is a standard and universal language for rep-resenting information. XML processing is supported by two key frameworks: DOM and SAX. SAX is efficient, but leaves the developer to encode much of the processing. This paper introduces a language for expressing XML-based languages via grammars that can be used to process XML documents and synthesize arbitrary values. The language is declarative and shields the developer from SAX implementation details.

Automatic Construction of XML-Based Tools Seen as Meta-Programming

Automated Software Engineering, 2003

This article presents XML-based tools for parser generation and data binding generation. The underlying concept is that of transformation between formal languages, which is a form of meta-programming. We discuss the benefits of such a declarative approach with well-defined semantics: productivity, maintainability, verifiability, performance and safety.

Expressive Power of the Statically Typed Concrete Syntax Trees

2021

The article specifies the definitions of a Concrete Syntax Tree and an Abstract Syntax Tree. The different types of knowledge that are shared between a parser and builder modules in a parsing machine, about the syntax tree building, are discussed. For the building of the syntax tree, various Syntax Structure Construction Commands are presented. They are transmitted from the parser to the builder, depending on the type of tree. Template grammars and a computer program (Parser Generator Profiler) that performs parser tests on their basis are described. The empirical results from the different tests (for different combinations of grammar elements), performed with different types of syntax trees, for different parsers generated by different parser generators, are shown. The measurements are based on different criteria such as the time for the tree building, its traversal time, its destruction time, and the memory used by it.

Maptool — supporting modular syntax development

Lecture Notes in Computer Science, 1996

In building textual translators, implementors often distinguish between a concrete syntax and an abstract syntax. The concrete syntax describes the phrase structure of the input language and the abstract syntax describes a tree structure that can be used as the basis for performing semantic computations. Having two grammars imposes the requirement that there exist a mapping from the concrete syntax to the abstract syntax. The research presented in this paper led to a tool, called Maptool, that is designed to simplify the development of the two grammars. Maptool supports a modular approach to syntax development that mirrors the modularity found in semantic computations. This is done by allowing users to specify each of the syntaxes only partially as long as the sum of the fragments allows deduction of the complete syntaxes.

Gel: A Generic Extensible Language

Domain-Specific Languages, 2009

Both XML and Lisp have demonstrated the utility of generic syntax for expressing tree-structured data. But generic languages do not provide the syntactic richness of custom languages. Generic Extensible Language (Gel) is a rich generic syntax that embodies many of the common syntactic conventions for operators, grouping and lists in widely-used languages. Prefix/infix operators are disambiguated by white-space, so that documents which violate common white-space conventions will not necessarily parse correctly with Gel. With some character replacements and adjusting for mismatch in operator precedence, Gel can extract meaningful structure from typical files in many languages, including Java, Cascading Style Sheets, Smalltalk, and ANTLR grammars. This evaluation shows the expressive power of Gel, not that Gel can be used as a parser for existing languages. Gel is intended to serve as a generic language for creating composable domain-specific languages.

Dynamic Syntax Tree: Implementation Results-2012

Dynamic Syntax Tree: Implementation Results-2012, 2012

In our earlier research[1], we described the Dynamic Syntax Tree method implementation for enhancing the Static Analysis process. After 10+ years of experience, we collected the significant results presented in this paper Keywords-dynamic syntax tree, dynamic analysis , static code analysis, abstract syntax tree, parser, semantic I.