Dynamic Syntax Tree: Implementation Results-2012

2012, Dynamic Syntax Tree: Implementation Results-2012

Abstract

In our earlier research [1], we described the implementation of the Dynamic Syntax Tree method for enhancing the Static Analysis process. After 10+ years of experience, we have collected the significant results presented in this paper.

Keywords: dynamic syntax tree, dynamic analysis, static code analysis, abstract syntax tree, parser, semantic

Using Static Analysis For IDE’s for Dynamic Languages

Modern IDE’s for languages such as Java exploit the static type system of the language to provide shortcuts and hints to aid programmers with common programming tasks. An example of this help is auto-completion of field and method names when a programmer types a “.” operator. In dynamic languages, such as scripting languages like JavaScript [1] and programming languages like Scheme [3], there is no such static type system that can be exploited in this manner. However, a deeper analysis of the source program may be able to provide equivalent information that allows similar levels of functionality in an IDE for dynamic languages. To enable this, we discuss various uses to which analysis data could be put, and we argue for a layered approach that exploits the same analysis infrastructure across multiple languages. Modern IDE’s provide a range of shortcuts and wizards to ease, and to some extent automate, many tedious aspects of programming. While some of these shortcuts are straightforwar...

Introducing enriched concrete syntax trees

In our earlier research [9], we explored the consistent and systematic application of software metrics. The strong dependency of the applicability of software metrics on the input programming language was recognized as one of the main weaknesses in this field. Introducing the enriched Concrete Syntax Tree (eCST) as an internal and intermediate representation of the source code was a step forward in overcoming this weakness. In this paper we explain the innovation made by introducing eCST and outline ideas for its broader applicability in other fields of software engineering.

ASTLOG: A language for examining abstract syntax trees

Proceedings of the Conference on Domain-Specific …, 1997

We desired a facility for locating and analyzing syntactic artifacts in abstract syntax trees of C/C++ programs, similar to the facility grep or awk provides for locating artifacts at the lexical level. Prolog, with its implicit pattern-matching and backtracking capabilities, is a natural choice for such an application. We have developed a Prolog variant that avoids the overhead of translating the source syntactic structures into the form of a Prolog database; this is crucial to obtaining acceptable performance on large programs. An interpreter for this language has been implemented and used to find various kinds of syntactic bugs and other questionable constructs in real programs like Microsoft SQL Server (450K lines) and Microsoft Word (2M lines) in time comparable to the runtime of the actual compiler. The model in which terms are matched against an implicit current object, rather than simply proven against a database of facts, leads to a distinct "inside-out functional" programming style that is quite unlike typical Prolog, but one that is, in fact, well-suited to the examination of trees. Also, various second-order Prolog set-predicates may be implemented via manipulation of the current object, thus retaining an important feature without entailing that the database be dynamically extensible as the usual implementation does.

Survey on Various Syntax Analyzer Tools

International Journal for Research in Applied Science and Engineering Technology (IJRASET), 2022

Syntax analysis forms the second phase of a compiler. The syntax analyzer takes the stream of tokens produced by the lexical analyzer and parses the source code according to the production rules to detect errors in the code. Its output is a parse tree. Syntax analysis techniques can be classified as top-down parsing and bottom-up parsing. These categories can be further subdivided into recursive descent, LL(1), operator precedence, LR(0), SLR(1), CLR(1) and LALR(1). Various parser generators that can generate parsers of these types, such as Beaver, Tatoo and APG, have been studied and analyzed in this paper. This paper provides an overview of these tools with respect to their working, advantages and features. These syntax analyzer tools can be used for different purposes according to the user's needs.
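As an illustration of the top-down family mentioned above, the following is a minimal recursive-descent sketch for arithmetic expressions. The grammar, the `tokenize` and `parse` names, and the tuple-based tree shape are assumptions chosen for this example; they are not taken from the surveyed tools.

```python
import re

# Hypothetical token pattern: integers, the four operators, parentheses.
TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Split the input into tokens, raising SyntaxError on unknown characters."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(text):
    """Parse with one function per production rule:
         expr   -> term (('+' | '-') term)*
         term   -> factor (('*' | '/') factor)*
         factor -> NUMBER | '(' expr ')'
    Returns a tree of nested tuples (operator, left, right) with int leaves.
    """
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expr():
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    def term():
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, factor())
        return node

    def factor():
        token = advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token.isdigit():
            return int(token)
        if token == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {token!r}")

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

With this sketch, `parse("1 + 2 * 3")` yields `('+', 1, ('*', 2, 3))`, and malformed input such as `"1 +"` raises `SyntaxError`, which corresponds to the error-detection role described above.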

Static Analysis: new emerging algorithms

Are new-generation Web Application Firewalls (ngWAF), new Dynamic Analysis (modern DAST products), RASP and the DevOps fever making Static Analysis (SAST) techniques useless? No, absolutely not. But commercial Static Analysis vendors have to think differently if they want their products to survive. Software solutions performing automatic code analysis are still very important, especially for remediation assistance capabilities or for extracting semantic metadata. These methods gather syntactic information from the source code and/or binaries, and then in general they provide a large set of implied semantics. With the increased focus on dynamic techniques for vulnerability detection and prevention, a problem emerges: modern programming languages are dynamic, the whole code semantics is known only at runtime, and the analysis has to estimate larger relations. We also describe a new algorithm to better counter the risks of dynamic analysis techniques.

Concrete Syntax with Black Box Parsers

The Art, Science, and Engineering of Programming, 2019

Context: Meta programming consists for a large part of matching, analyzing, and transforming syntax trees. Many meta programming systems process abstract syntax trees, but this requires intimate knowledge of the structure of the data type describing the abstract syntax. As a result, meta programming is error-prone, and meta programs are not resilient to evolution of the structure of such ASTs, requiring invasive, fault-prone change to these programs. Inquiry: Concrete syntax patterns alleviate this problem by allowing the meta programmer to match and create syntax trees using the actual syntax of the object language. Systems supporting concrete syntax patterns, however, require a concrete grammar of the object language in their own formalism. Creating such grammars is a costly and error-prone process, especially for realistic languages such as Java and C++. Approach: In this paper we present Concretely, a technique to extend meta programming systems with pluggable concrete syntax patterns, based on external, black box parsers. We illustrate Concretely in the context of Rascal, an open-source meta programming system and language workbench, and show how to reuse existing parsers for Java, JavaScript, and C++. Furthermore, we propose Tympanic, a DSL to declaratively map external AST structures to Rascal's internal data structures. Tympanic allows implementors of Concretely to solve the impedance mismatch between object-oriented class hierarchies in Java and Rascal's algebraic data types. Both the algebraic data type and AST marshalling code is automatically generated. Knowledge: The conceptual architecture of Concretely and Tympanic supports the reuse of pre-existing, external parsers, and their AST representation in meta programming systems that feature concrete syntax patterns for matching and constructing syntax trees. As such this opens up concrete syntax pattern matching for a host of realistic languages for which writing a grammar from scratch is time consuming and error-prone, but for which industry-strength parsers exist in the wild. Grounding: We evaluate Concretely in terms of source lines of code (SLOC), relative to the size of the AST data type and marshalling code. We show that for real programming languages such as C++ and Java, adding support for concrete syntax patterns takes an effort only in the order of dozens of SLOC. Similarly, we evaluate Tympanic in terms of SLOC, showing an order of magnitude of reduction in SLOC compared to manual implementation of the AST data types and marshalling code. Importance: Meta programming has applications in reverse engineering, reengineering, source code analysis, static analysis, software renovation, domain-specific language engineering, and many others. Processing of syntax trees is central to all of these tasks. Concrete syntax patterns improve the practice of constructing meta programs. The combination of Concretely and Tympanic has the potential to make concrete syntax patterns available with very little effort, thereby improving and promoting the application of meta programming in the general software engineering context.

Static Code Analysis

Static Code Analysis tools can reduce the number of bugs in a program and therefore reduce its cost. Many developers do not use these tools, losing a lot of time on manual code analysis (in some cases there is no analysis at all) and a lot of money on the resources needed to do the analysis. In this paper we test and study the results of three static code analysis tools that, while inexpensive, can efficiently remove the most common vulnerabilities in software. It can be difficult to compare tools with different characteristics, but we can get interesting results by testing the tools together.

Static analysis of dynamic scripting languages

2009

Scripting languages, such as PHP, are among the most widely used and fastest growing programming languages, particularly for web applications. Static analysis is an important tool for detecting security flaws, finding bugs, and improving compilation of programs. However, static analysis of scripting languages is difficult due to features found in languages such as PHP. These features include run-time code generation, dynamic weak typing, dynamic aliasing, implicit object and array creation, and overloading of simple operators. We find that as a result, simple analysis techniques such as SSA and def-use chains are not straightforward to use, and that a single unconstrained variable can ruin our analysis. In this paper we describe a static analyser for PHP, and show how classical static analysis techniques can be extended to analyse PHP. In particular our analysis combines alias analysis, type-inference and constant-propagation for PHP, computing results that are essential for other analyses and optimizations. We find that this combination of techniques allows the generation of meaningful and useful results from our static analysis.

Improving Dynamic Code Analysis by Code Abstraction

Electronic proceedings in theoretical computer science, 2021

In this paper, our aim is to propose a model for code abstraction, based on abstract interpretation, allowing us to improve the precision of a recently proposed static analysis by abstract interpretation of dynamic languages. The problem we tackle here is that the analysis may add some spurious code to the string-to-execute abstract value and this code may need some abstract representations in order to make it analyzable. This is precisely what we propose here, where we drive the code abstraction by the analysis we have to perform.

Combined Static and Dynamic Analysis

Electronic Notes in Theoretical Computer Science, 2005

Static analysis is usually faster than dynamic analysis but less precise. Therefore it is often desirable to retain information from static analysis for run-time verification, or to compare the results of both techniques. However, this requires writing two programs, which may not act identically under the same conditions. It would be desirable to share the same generic algorithm by static and dynamic analysis. In JNuke, a framework for static and dynamic analysis of Java programs, this has been achieved. By keeping the architecture of static analysis similar to a virtual machine, the only key difference between abstract interpretation and execution remains the nature of program states. In dynamic analysis, concrete states are available, while in static analysis, sets of (abstract) states are considered. Our new analysis is generic because it can re-use the same algorithm in static analysis and dynamic analysis. This paper describes the architecture of such a generic analysis. To our knowledge, JNuke is the first tool that has achieved this integration, which enables static and dynamic analysis to interact in novel ways.
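The idea of sharing one algorithm between static and dynamic analysis, with only the nature of program states differing, can be sketched as follows. This is not JNuke's architecture or API: the three-address program format, the `Concrete` and `Sign` classes, and the `run` function are hypothetical names invented for this illustration, and the abstract domain is a simple set-of-signs domain chosen for brevity.

```python
class Concrete:
    """Concrete states: ordinary integers, as seen during execution."""

    def const(self, n):
        return n

    def add(self, a, b):
        return a + b

    def mul(self, a, b):
        return a * b


class Sign:
    """Abstract states: sets of signs drawn from {'-', '0', '+'}."""

    ALL = frozenset({"-", "0", "+"})

    def const(self, n):
        return frozenset({"-" if n < 0 else "0" if n == 0 else "+"})

    def add(self, a, b):
        out = set()
        for x in a:
            for y in b:
                if x == "0":
                    out.add(y)
                elif y == "0" or x == y:
                    out.add(x)
                else:
                    out |= self.ALL  # '+' plus '-' may have any sign
        return frozenset(out)

    def mul(self, a, b):
        out = set()
        for x in a:
            for y in b:
                if x == "0" or y == "0":
                    out.add("0")
                else:
                    out.add("+" if x == y else "-")
        return frozenset(out)


def run(program, domain, env):
    """Single evaluator shared by both analyses.

    `program` is a list of (target, op, left, right) instructions where
    operands are variable names (str) or integer literals.
    """
    env = dict(env)

    def value(operand):
        return env[operand] if isinstance(operand, str) else domain.const(operand)

    for target, op, left, right in program:
        a, b = value(left), value(right)
        if op == "+":
            env[target] = domain.add(a, b)
        elif op == "*":
            env[target] = domain.mul(a, b)
        else:
            raise ValueError(f"unknown operator {op!r}")
    return env


# y = x * x; z = y + 1
PROGRAM = [("y", "*", "x", "x"), ("z", "+", "y", 1)]
```

Running `run(PROGRAM, Concrete(), {"x": -3})` executes the program on a concrete state, while `run(PROGRAM, Sign(), {"x": frozenset({"-"})})` runs the same loop over sets of abstract states. The evaluator itself does not change, which is the property the abstract above attributes to a generic analysis.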

References (5)

  1. Moses T., Syman D., Barzanti M. Static Analysis: A Dynamic Syntax Tree Implementation. London, December 2001.
  2. Moses T., Syman D. Static Analysis of Applications written in modern languages. Moldova, 1999. Translated from Russian and published by ResearchGate, 2008.
  3. Huffman D.A. "A method for the construction of minimum-redundancy codes", Proceedings of the I.R.E., September 1952.
  4. Moses T., Syman D., Barzanti M. Binary Analysis: A Dynamic Sandboxing Implementation. London, July 2006.
  5. Parr T.J. (University of Minnesota), Quong R.W. (School of Electrical Engineering, Purdue University). ANTLR: A Predicated LL(k) Parser Generator. July 1995.

Dynamic Syntax Tree: Implementation Results

Dynamic Syntax Tree: Implementation Results, 2016

Updated results of a Dynamic Syntax Tree method implementation for enhancing the Static Analysis process. We collected the most significant results of the last 4 years, presented in this paper.

Keywords: dynamic syntax tree, dynamic analysis, static code analysis, abstract syntax tree, parser, semantic

Static Analysis: a Dynamic Syntax Tree implementation

In our earlier research [1] in the area of Static Analysis of applications written in modern languages, we discussed the lack of accuracy of analysis algorithms based on Abstract Syntax Trees and Concrete Syntax Trees (CST, aka Parser Trees). We also describe the Dynamic Syntax Tree method implementation for enhancing the Static Analysis process.

Expressive Power of the Statically Typed Concrete Syntax Trees

2021

The article specifies the definitions of a Concrete Syntax Tree and an Abstract Syntax Tree. The different types of knowledge that are shared between a parser and builder modules in a parsing machine, about the syntax tree building, are discussed. For the building of the syntax tree, various Syntax Structure Construction Commands are presented. They are transmitted from the parser to the builder, depending on the type of tree. Template grammars and a computer program (Parser Generator Profiler) that performs parser tests on their basis are described. The empirical results from the different tests (for different combinations of grammar elements), performed with different types of syntax trees, for different parsers generated by different parser generators, are shown. The measurements are based on different criteria such as the time for the tree building, its traversal time, its destruction time, and the memory used by it.

Static Analysis of Applications written in modern languages

Most Static Analysis tools are nowadays based on Abstract Syntax Trees or Concrete (aka Parser) Trees. For applications written in modern programming languages, where types and objects are dynamically created, those tools cannot provide accurate analysis results because they are designed for static programming languages only. We also describe the new Dynamic Syntax Tree-based method for enhancing the Static Analysis process.

Dynamic Syntax Tree: Optimized Binary Sandboxing

Dynamic Syntax Tree: Optimized Binary Sandboxing, 2014

Dynamic Syntax Tree (DST) implementations [1] use Binary Sandboxing for enhancing the Static Analysis process. In this paper we present a new Dynamic Binary analysis method for collecting information on ELF, PE and Mach-O executables and dynamic libraries. This information will enrich DST contents during application scanning.

Keywords: dynamic syntax tree, binary analysis, sandbox, dynamic analysis, static code analysis, abstract syntax tree, parser

Static Program Analysis for String Manipulation Languages

Electronic Proceedings in Theoretical Computer Science

In recent years, dynamic languages, such as JavaScript or Python, have been increasingly used in a wide range of fields and applications. Their tricky and misunderstood behaviors pose a hard challenge for static analysis of these programming languages. A key aspect of any dynamic language program is the multiple usage of strings, since they can be implicitly converted to another type value, transformed by string-to-code primitives or used to access an object-property. Unfortunately, string analyses for dynamic languages still lack precision and do not take into account some important string features. Moreover, string obfuscation is very popular in the context of dynamic language malicious code, for example, to hide code information inside strings and then to dynamically transform strings into executable code. In this scenario, more precise string analyses become a necessity. This paper is placed in the context of static string analysis by abstract interpretation and proposes a new semantics for string analysis, placing a first step for handling dynamic languages string features.

STALlion: a simple typed assembly language for static analysis

2004

Typed assembly languages have the goal of providing security guarantees, for example, for the limited use of resources in a host machine or the detection of autoupdate code. This work presents a simple typed assembly language which allows us to perform various kinds of static analysis tasks with the purpose of detecting flaws in the code security. The security policy we use guarantees type and memory safety. Moreover, we can ensure that non-initialized variables are not read, and that there are no out-of-bound array accesses. The language we present, called STALlion, was designed in order to interpret a particular kind of imperative programs, more specifically abstract syntax tree.

Abstract Parsing: Static Analysis of Dynamically Generated String Output Using LR-Parsing Technology

Lecture Notes in Computer Science, 2009

We combine LR(k)-parsing technology and data-flow analysis to analyze, in advance of execution, the documents generated dynamically by a program. Based on the document language's context-free reference grammar and the program's control structure, the analysis predicts how the documents will be generated and parses the predicted documents. Our strategy remembers context-free structure by computing abstract LR-parse stacks. The technique is implemented in Objective Caml and has statically validated a suite of PHP programs that dynamically generate HTML documents.

DyLan: Parser for Dynamic Syntax

2011

This document describes some of the details of the prototype implementation of the Dynamic Syntax (DS) grammar formalism, DyLan. As such, it should be read in conjunction with the documentation (Javadoc) that comes with the implementation, and various papers/books that describe the Dynamic Syntax framework itself (

Structural analysis and visualization of c++ code evolution using syntax trees

2007

We present a method to detect and visualize evolution patterns in C++ source code. Our method consists of three steps. First, we extract an annotated syntax tree (AST) from each version of a given C++ source code. Next, we hash the extracted syntax nodes based on a metric combining structure and type information, and construct matches (correspondences) between similar-hash subtrees. Our technique detects code fragments which have not changed, or changed little, during the software evolution.
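The hash-and-match step can be illustrated with a small sketch. Several simplifications are assumptions of this example and not the paper's method: it uses Python's standard `ast` module on Python source instead of C++, hashes only node types and child structure, and matches subtrees by exact hash equality in place of the paper's similarity metric. The function names are hypothetical.

```python
import ast
import hashlib


def subtree_hashes(source):
    """Return {structural hash: [node type names]} for every subtree.

    The hash combines a node's type with the hashes of its children and
    ignores identifiers, literal values, and source positions, so a
    renamed function keeps its hash.
    """
    table = {}

    def visit(node):
        child_hashes = [visit(child) for child in ast.iter_child_nodes(node)]
        payload = type(node).__name__ + "(" + ",".join(child_hashes) + ")"
        digest = hashlib.sha1(payload.encode("utf-8")).hexdigest()
        table.setdefault(digest, []).append(type(node).__name__)
        return digest

    visit(ast.parse(source))
    return table


def unchanged_subtrees(old_source, new_source):
    """Hashes of subtrees that occur in both versions."""
    return set(subtree_hashes(old_source)) & set(subtree_hashes(new_source))


# Two hypothetical versions: the function is renamed and a new one is added.
OLD = "def f(a):\n    return a + 1\n"
NEW = "def g(b):\n    return b + 1\n\ndef h(x):\n    return x * 2\n"
```

Here the renamed function in `NEW` still matches its counterpart in `OLD`, while the module-level trees differ because `NEW` contains an additional function. Tolerating small edits inside a subtree, as the abstract describes, would need a similarity-preserving hash, which this sketch does not attempt.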