GENOA---a customizable, front-end-retargetable source code analysis framework (original) (raw)

GENOA: a customizable language-and front-end independent code analyzer

1992

Programmers working on large software systems spend a great deal of time examining code and trying to understand it. Code Analysis tools (e.g., cross referencing tools such as CIA and Cscope) can be very helpful in this process. In this paper we describe genoa, an application generator that can produce a whole range of useful code analysis tools. genoa is designed to be language-and front-end independent; it can be interfaced to any front-end for any language that produces an attributed parse tree, simply by writing an interface speci cation.

Automatic Generation of Code Analysis Tools

Proceedings of the 1st International Workshop on Real World Domain Specific Languages

Source code analysis and manipulation tools have become an essential part of software development processes. Automating the development of such tools can heavily reduce development time, effort and cost. This paper proposes a framework for the efficient development of code analysis software. A tool for automatically generating the front end of analysis tools for a given language grammar is proposed. The proposed approach can be applied to any language that can be described using the BNF notation. The proposed framework also provides a domain specific language to concisely express queries on the internal representation generated by the front end. This language tackles the problem of writing complex code in a general purpose programming language in order to retrieve information from the internal representation. The approach has been evaluated through two different realistic usage scenarios applied to a number of different benchmark applications. The front end generator has also been tested for twenty input grammars. In all cases the software generated by the proposed framework functions according to the input grammar while the development time has been reduced on average down to 12% compared to equivalent handwritten implementations. The experimental results give evidence that the use of the proposed framework can heavily reduce the relevant design effort and cost.

Automatic generation of code analysis tools: The CastQL approach

Proceedings of the 1st International Workshop on Real World Domain Specific Languages (RWDSL)

Source code analysis and manipulation tools have become an essential part of software development processes. Automating the development of such tools can heavily reduce development time, effort and cost. This paper proposes a framework for the efficient development of code analysis software. A tool for automatically generating the front end of analysis tools for a given language grammar is proposed. The proposed approach can be applied to any language that can be described using the BNF notation. The proposed framework also provides a domain specific language to concisely express queries on the internal representation generated by the front end. This language tackles the problem of writing complex code in a general purpose programming language in order to retrieve information from the internal representation. The approach has been evaluated through two different realistic usage scenarios applied to a number of different benchmark applications. The front end generator has also been t...

An Extensible System for Source Code Analysis

IEEE Transactions on Software Engineering, 1998

Constructing code analyzers may be costly and error prone if inadequate technologies and tools are used. If they are written in a conventional programming language, for instance, several thousand lines of code may be required even for relatively simple analyses. One way of facilitating the development of code analyzers is to define a very high-level domain-oriented language and implement an application generator that creates the analyzers from the specification of the analyses they are intended to perform. This paper presents a system for developing code analyzers that uses a database to store both a no-loss fine-grained intermediate representation and the results of the analyses. The system uses an algebraic representation, called F p ( ) , as the user-visible intermediate representation. Analyzers are specified in a declarative language, called F p ( ) − l, which enables an analysis to be specified in the form of a traversal of an algebraic expression, with access to, and storage of, the database information the algebraic expression indices. A foreign language interface allows the analyzers to be embedded in C programs. This is useful for implementing the user interface of an analyzer, for example, or to facilitate interoperation of the generated analyzers with pre-existing tools. The paper evaluates the strengths and limitations of the proposed system, and compares it to other related approaches.

Compiler Hacking for Source Code Analysis

Software Quality Journal, 2000

Many activities related to software quality assessment and improvement, such as empirical model construction, data flow analysis, testing or reengineering, rely on static source code analysis as the first and fundamental step for gathering the necessary input information. In the past, two different strategies have been adopted to develop tool suites. There are tools encompassing or implementing the source parse step, where the parser is internal to the toolkit, and is developed and maintained with it. A different approach builds tools on the top of external already-available components such as compilers that output the program abstract syntax tree, or that make it available via an API. This paper discusses techniques, issues and challenges linked to compiler patching or wrapping for analysis purposes. In particular, different approaches for accessing the compiler parsing information are compared, and the techniques used to decouple the parsing front end from the analysis modules are discussed.

GAST: A Generic AST Representation for Language-Independent Source Code Analysis

Enfoque UTE, 2023

Organizations use various programming languages to develop their systems. These aim to take advantage of the most appropriate features of each language for a given domain and require programmers to command different languages and also to face the growing complexity of software development and maintenance. So, they need tools to help them analyze programs to identify relationships between their internal elements, uncover patterns, and calculate quality metrics. However, most tools have limited support for parsing multiple programming languages and high acquisition costs. Therefore, there is a need for new methods to analyze code written in multiple programming languages. This article describes the design of a method to automatically transform the syntax of various programming languages into a universal language with a generic syntax. The function of the generic language is to encapsulate the specificities of each specific language, so that the analysis of programs is facilitated in a single programming syntax and not in multiple syntaxes. The advantage of this approach is that only one analysis engine is required, not multiple code analyzers, to study the programs.

Building Useful Program Analysis Tools Using an Extensible Java Compiler

2012 IEEE 12th International Working Conference on Source Code Analysis and Manipulation, 2012

Large software companies need customized tools to manage their source code. These tools are often built in an ad-hoc fashion, using brittle technologies such as regular expressions and home-grown parsers. Changes in the language cause the tools to break. More importantly, these ad-hoc tools often do not support uncommon-but-valid code code patterns.

Source code analysis: A road map

2007

Abstract The automated and semi-automated analysis of source code has remained a topic of intense research for more than thirty years. During this period, algorithms and techniques for source-code analysis have changed, sometimes dramatically. The abilities of the tools that implement them have also expanded to meet new and diverse challenges. This paper surveys current work on source-code analysis.

Software Architecture of Code Analysis Frameworks Matters: The Frama-C Example

arXiv (Cornell University), 2015

Implementing large software, as software analyzers which aim to be used in industrial settings, requires a well-engineered software architecture in order to ease its daily development and its maintenance process during its lifecycle. If the analyzer is not only a single tool, but an open extensible collaborative framework in which external developers may develop plug-ins collaborating with each other, such a well designed architecture even becomes more important. In this experience report, we explain difficulties of developing and maintaining open extensible collaborative analysis frameworks, through the example of Frama-C , a platform dedicated to the analysis of code written in C. We also present the new upcoming software architecture of Frama-C and how it aims to solve some of these issues.