Exploiting traces in static program analysis: better model checking through {{\tt printf}}$$ s (original) (raw)

Loopfrog: A static analyzer for ANSI-C programs

2009 IEEE/ACM …, 2009

Practical software verification is dominated by two major classes of techniques. The first is model checking, which provides total precision, but suffers from the state space explosion problem. The second is abstract interpretation, which is usually much less demanding, but often returns a high number of false positives. We present LOOPFROG, a static analyzer that combines the best of both worlds: the precision of model checking and the performance of abstract interpretation. In contrast to traditional static analyzers, it also provides 'leaping' counterexamples to aid in the diagnosis of errors.

Whole execution traces and their applications

ACM Transactions on Architecture and Code Optimization, 2005

Different types of program profiles (control flow, value, address, and dependence) have been collected and extensively studied by researchers to identify program characteristics that can then be exploited to develop more effective compilers and architectures. Because of the large amounts of profile data produced by realistic program runs, most work has focused on separately collecting and compressing different types of profiles. In this paper, we present a unified representation of profiles called Whole Execution Trace (WET), which includes the complete information contained in each of the above types of traces. Thus, WETs provide a basis for a next-generation software tool that will enable mining of program profiles to identify program characteristics that require understanding of relationships among various types of profiles. The key features of our WET representation are: WET is constructed by labeling a static program representation with profile information such that relevant and related profile information can be directly accessed by analysis algorithms as they traverse the representation; a highly effective two-tier strategy is used to significantly compress the WET; and compression techniques are designed such that they minimally affect the ability to rapidly traverse WET for extracting subsets of information corresponding to individual profile types as well as a combination of profile types. Our experimentation shows that on, an average, execution traces resulting from execution of 647 million statements can be stored in 331 megabytes of storage after compression. The compression factors range from 16 to 83. Moreover the rates at which different types of profiles can be individually or simultaneously extracted are high. We present two applications of WETs, dynamic program slicing and dynamic version matching, which make effective use of multiple kinds of profile information contained in WETs.

Model-Based Trace-Checking

2003

Trace analysis can be a useful way to discover problems in a program under test. Rather than writing a special purpose trace analysis tool, this paper proposes that traces can usefully be analysed by checking them against a formal model using a standard model-checker or else an animator for executable specifications. These techniques are illustrated using a Travel Agent case study implemented in J2EE. We added trace beans to this code that write trace information to a database. The traces are then extracted and converted into a form suitable for analysis by Spin, a popular model-checker, and Pro-B, a model-checker and animator for the B notation. This illustrates the technique, and also the fact that such a system can have a variety of models, in different notations, that capture different features. These experiments have demonstrated that model-based trace-checking is feasible. Future work is focussed on scaling up the approach to larger systems by increasing the level of automation.

Recovering High-Level Conditions from Binary Programs

FM 2016: Formal Methods, 2016

The need to get confidence in binary programs without access to their source code has pushed efforts forward to directly analyze executable programs. However, low-level programs lack high-level structures (such as types, control-flow graph, etc.), preventing the straightforward application of sourcecode analysis techniques. Especially, conditional jumps rely on low-level flag predicates, whereas they often encode high-level "natural" conditions on program variables. Most static analyzers are unable to infer any interesting information from these low-level conditions, leading to serious precision loss compared with source-level analysis. In this paper, we propose template-based recovery, an automatic approach for retrieving high-level predicates from their low-level flag versions. Especially, the technique is sound, efficient, platform-independent and it achieves very high ratio of recovery. This method allows more precise analyses and helps to understand machine encoding of conditionals rather than relying on error-prone human interpretation or (syntactic) pattern-based reasoning.

Framework for instruction-level tracing and analysis of programs

International Conference on Virtual Execution Environments, 2006

Program execution traces provide the most intimate details of a program's dynamic behavior. They can be used for program optimization, failure diagnosis, collecting software metrics like coverage, test prioritization, etc. Two major obstacles to exploiting the full potential of information they provide are: (i) performance overhead while collecting traces, and (ii) significant size of traces even for short execution scenarios. Reducing information output in an execution trace can reduce both performance overhead and the size of traces. However, the applicability of such traces is limited to a particular task. We present a runtime framework with a goal of collecting a complete, machine-and task-independent, user-mode trace of a program's execution that can be re-simulated deterministically with full fidelity down to the instruction level. The framework has reasonable runtime overhead and by using a novel compression scheme, we significantly reduce the size of traces. Our framework enables building a wide variety of tools for understanding program behavior. As examples of the applicability of our framework, we present a program analysis and a data locality profiling tool. Our program analysis tool is a time travel debugger that enables a developer to debug in both forward and backward direction over an execution trace with nearly all information available as in a regular debugging session. Our profiling tool has been used to improve data locality and reduce the dynamic working sets of real world applications.

A Rewriting-based Approach to Trace Analysis

2005

We present a rewriting-based algorithm for efficiently evaluating future time Linear Temporal Logic (LTL) formulae on finite execution traces online. While the standard models of LTL are infinite traces, finite traces appear naturally when testing and/or monitoring real applications that only run for limited time periods. The presented algorithm is implemented in the Maude executable specification language and essentially consists of a set of equations establishing an executable semantics of LTL using a simple formula transforming approach. The algorithm is further improved to build automata on-the-fly from formulae, using memoization. The result is a very efficient and small Maude program that can be used to monitor program executions. We furthermore present an alternative algorithm for synthesizing provably minimal observer finite state machines (or automata) from LTL formulae, which can be used to analyze execution traces without the need for a rewriting system, and can hence be used by observers written in conventional programming languages. The presented work is part of an ambitious runtime verification and monitoring project at NASA Ames, called PATHEXPLORER, and demonstrates that rewriting can be a tractable and attractive means for experimenting and implementing program monitoring logics.

Whole Execution Traces

37th International Symposium on Microarchitecture (MICRO-37'04)

Different types of program profiles (control flow, value, address, and dependence) have been collected and extensively studied by researchers to identify program characteristics that can then be exploited to develop more effective compilers and architectures. Due to the large amounts of profile data produced by realistic program runs, most work has focused on separately collecting and compressing different types of profiles. In this paper we present a unified representation of profiles called Whole Execution Trace (WET) which includes the complete information contained in each of the above types of traces. Thus WETs provide a basis for a next generation software tool that will enable mining of program profiles to identify program characteristics that require understanding of relationships among various types of profiles. The key features of our WET representation are: WET is constructed by labeling a static program representation with profile information such that relavent and related profile information can be directly accessed by analysis algorithms as they traverse the representation; a highly effective two tier strategy is used to significantly compress the WET; and compression techniques are designed such that they do not adversely affect the ability to rapidly traverse WET for extracting subsets of information corresponding to individual profile types as well as a combination of profile types (e.g., in form of dynamic slices of WETs). Our experimentation shows that on an average execution traces resulting from execution of 647 Million statements can be stored in 331 Megabytes of storage after compression. The compression factors range from 16 to 83. Moreover the rates at which different types of profiles can be individually or simultaneously extracted are high.

Automatic inspection of program state in an uncooperative environment

Software: Practice and Experience

The program state is formed by the values that the program manipulates. These values are stored in the stack, in the heap, or in static memory. The ability to inspect the program state is useful as a debugging or as a verification aid. Yet, there exists no general technique to insert inspection points in type-unsafe languages such as C or C++. The difficulty comes from the need to traverse the memory graph in a so-called uncooperative environment. In this dissertation, we propose an automatic technique to deal with this problem. We introduce a static code transformation approach that inserts in a program the instrumentation necessary to report its internal state. Our technique has been implemented in Low Level Virtual Machine (LLVM). It is possible to adjust the granularity of inspection points trading precision for performance. In this paper, we demonstrate how to use inspection points to debug compiler optimizations; to augment benchmarks with verification code; and to visualize data structures.

Creative Commons Attribution License. Runtime Verification Based on Executable Models: On-the-Fly Matching of Timed Traces

2016

Runtime verification is checking whether a system execution satisfies or violates a given correctness property. A procedure that automatically, and typically on the fly, verifies conformance of the sys-tem’s behavior to the specified property is called a monitor. Nowadays, a variety of formalisms are used to express properties on observed behavior of computer systems, and a lot of methods have been proposed to construct monitors. However, it is a frequent situation when advanced formalisms and methods are not needed, because an executable model of the system is available. The original purpose and structure of the model are out of importance; rather what is required is that the system and its model have similar sets of interfaces. In this case, monitoring is carried out as follows. Two “black boxes”, the system and its reference model, are executed in parallel and stimulated with the same input sequences; the monitor dynamically captures their output traces and tries to match them. T...

Rewriting-Based Techniques for Runtime Verification

Automated Software Engineering, 2005

Techniques for efficiently evaluating future time Linear Temporal Logic (abbreviated LTL) formulae on finite execution traces are presented. While the standard models of LTL are infinite traces, finite traces appear naturally when testing and/or monitoring real applications that only run for limited time periods. A finite trace variant of LTL is formally defined, together with an immediate executable semantics which turns out to be quite inefficient if used directly, via rewriting, as a monitoring procedure. Then three algorithms are investigated. First, a simple synthesis algorithm for monitors based on dynamic programming is presented; despite the efficiency of the generated monitors, they unfortunately need to analyze the trace backwards, thus making them unusable in most practical situations. To circumvent this problem, two rewriting-based practical algorithms are further investigated, one using rewriting directly as a means for online monitoring, and the other using rewriting to generate automata-like monitors, called binary transition tree finite state machines (and abbreviated BTT-FSMs). Both rewriting algorithms are implemented in Maude, an executable specification language based on a very efficient implementation of term rewriting. The first rewriting algorithm essentially consists of a set of equations establishing an executable semantics of LTL, using a simple formula transforming approach. This algorithm is further improved to build automata on-the-fly via caching and reuse of rewrites (called memoization), resulting in a very efficient and small Maude program that can be used to monitor program executions. The second rewriting algorithm builds on the first one and synthesizes provably minimal BTT-FSMs from LTL formulae, which can then be used to analyze execution traces online without the need for a rewriting system. The presented work is part of an ambitious runtime verification and monitoring project at NASA Ames, called Path Explorer, and demonstrates that rewriting can be a tractable and attractive means for experimenting and implementing logics for program monitoring.