Framework for instruction-level tracing and analysis of program executions

Proceedings of the 2nd international conference on Virtual execution environments - VEE '06, 2006

Program execution traces provide the most intimate details of a program's dynamic behavior. They can be used for program optimization, failure diagnosis, collecting software metrics like coverage, test prioritization, etc. Two major obstacles to exploiting the full potential of the information they provide are: (i) performance overhead while collecting traces, and (ii) the significant size of traces even for short execution scenarios. Reducing the information output in an execution trace can reduce both the performance overhead and the size of traces. However, the applicability of such traces is limited to a particular task. We present a runtime framework with the goal of collecting a complete, machine- and task-independent, user-mode trace of a program's execution that can be re-simulated deterministically with full fidelity down to the instruction level. The framework has reasonable runtime overhead, and by using a novel compression scheme, we significantly reduce the size of traces. Our framework enables building a wide variety of tools for understanding program behavior. As examples of the applicability of our framework, we present a program analysis tool and a data locality profiling tool. Our program analysis tool is a time travel debugger that enables a developer to debug in both forward and backward directions over an execution trace with nearly all of the information available in a regular debugging session. Our profiling tool has been used to improve data locality and reduce the dynamic working sets of real-world applications.

Whole execution traces and their applications

ACM Transactions on Architecture and Code Optimization, 2005

Different types of program profiles (control flow, value, address, and dependence) have been collected and extensively studied by researchers to identify program characteristics that can then be exploited to develop more effective compilers and architectures. Because of the large amounts of profile data produced by realistic program runs, most work has focused on separately collecting and compressing different types of profiles. In this paper, we present a unified representation of profiles called Whole Execution Trace (WET), which includes the complete information contained in each of the above types of traces. Thus, WETs provide a basis for a next-generation software tool that will enable mining of program profiles to identify program characteristics that require understanding of relationships among various types of profiles. The key features of our WET representation are: WET is constructed by labeling a static program representation with profile information such that relevant and related profile information can be directly accessed by analysis algorithms as they traverse the representation; a highly effective two-tier strategy is used to significantly compress the WET; and compression techniques are designed such that they minimally affect the ability to rapidly traverse WET for extracting subsets of information corresponding to individual profile types as well as a combination of profile types. Our experimentation shows that, on average, execution traces resulting from execution of 647 million statements can be stored in 331 megabytes of storage after compression. The compression factors range from 16 to 83. Moreover, the rates at which different types of profiles can be individually or simultaneously extracted are high. We present two applications of WETs, dynamic program slicing and dynamic version matching, which make effective use of multiple kinds of profile information contained in WETs.

Whole Execution Traces

37th International Symposium on Microarchitecture (MICRO-37'04)

Different types of program profiles (control flow, value, address, and dependence) have been collected and extensively studied by researchers to identify program characteristics that can then be exploited to develop more effective compilers and architectures. Due to the large amounts of profile data produced by realistic program runs, most work has focused on separately collecting and compressing different types of profiles. In this paper, we present a unified representation of profiles called Whole Execution Trace (WET), which includes the complete information contained in each of the above types of traces. Thus, WETs provide a basis for a next-generation software tool that will enable mining of program profiles to identify program characteristics that require understanding of relationships among various types of profiles. The key features of our WET representation are: WET is constructed by labeling a static program representation with profile information such that relevant and related profile information can be directly accessed by analysis algorithms as they traverse the representation; a highly effective two-tier strategy is used to significantly compress the WET; and compression techniques are designed such that they do not adversely affect the ability to rapidly traverse WET for extracting subsets of information corresponding to individual profile types as well as a combination of profile types (e.g., in the form of dynamic slices of WETs). Our experimentation shows that, on average, execution traces resulting from execution of 647 million statements can be stored in 331 megabytes of storage after compression. The compression factors range from 16 to 83. Moreover, the rates at which different types of profiles can be individually or simultaneously extracted are high.

GDBTrace : A Tool for Tracing Program Execution at the Statement Level

2003

It is a well-recognized fact that debugging and tracking down the cause of a software failure are particularly difficult tasks in software development. The complexity of these processes increases with the size and age of the system. Often, it is possible to trace the cause of a failure to a certain routine or file; however, complex statements or convoluted logic may make finding the exact cause extremely difficult. This paper introduces GDBTrace, a statement-level tracing tool developed at the University of Ottawa. Although designed with statement level tracing in mind, this tool is also capable of keeping track of routine calls and variable modifications during program execution. GDBTrace has useful options that enable it to trace only selected parts of a system, thereby improving efficiency and reducing the amount of work required to analyse the trace.

CoMET: Compressing Microcontroller Execution Traces to Assist System Understanding

Recent technology advances have made it possible to retrieve execution traces on microcontrollers. However, even after a short execution time of the embedded program, the collected execution trace contains a huge amount of data. This is due to the cyclic nature of embedded programs. The huge amount of data makes understanding the program behavior extremely difficult and time-consuming. Software engineers need a way to get a quick understanding of execution traces. In this paper, we present an approach based on an improvement of the Sequitur algorithm to compress large execution traces of microcontrollers. By leveraging both cycles and repetitions present in such execution traces, our approach offers a compact and accurate compression of execution traces. This compression may be used by software engineers to understand the behavior of the system, for instance, by identifying the cycles that appear most often in the trace or by comparing different cycles. Our evaluations give two major results. On the one hand, our approach gives a high compression rate on microcontroller execution traces. On the other hand, software engineers mostly agree that the generated outputs (compressions) may help in reviewing and understanding execution traces.
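
To illustrate why cyclic traces compress well, the following is a minimal sketch that folds adjacent repetitions of a block of events into a single (block, count) pair. It is not the Sequitur algorithm or the CoMET improvement described above; it is a much simpler run-length folding, and the names `fold_cycles`, `expand`, and `max_period`, as well as the event labels, are assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Folded = List[Tuple[Tuple[str, ...], int]]  # (repeated block, repetition count)


def fold_cycles(trace: Sequence[str], max_period: int = 8) -> Folded:
    """Greedily fold adjacent repetitions of blocks up to max_period events long."""
    out: Folded = []
    i, n = 0, len(trace)
    while i < n:
        best_period, best_count = 1, 1
        for period in range(1, max_period + 1):
            if i + period > n:
                break
            block = trace[i:i + period]
            count = 1
            # Count how many times the block repeats back to back.
            while trace[i + count * period:i + (count + 1) * period] == block:
                count += 1
            # Keep the candidate that covers the most events.
            if count > 1 and count * period > best_count * best_period:
                best_period, best_count = period, count
        out.append((tuple(trace[i:i + best_period]), best_count))
        i += best_period * best_count
    return out


def expand(folded: Folded) -> List[str]:
    """Inverse of fold_cycles: reproduce the original event sequence."""
    events: List[str] = []
    for block, count in folded:
        events.extend(block * count)
    return events


# Hypothetical trace of a control loop that runs three iterations.
trace = ["init"] + ["read", "compute", "write"] * 3 + ["halt"]
folded = fold_cycles(trace)
```

For this trace the folded form has three entries instead of eleven events, and the repetition count directly shows how often the main cycle ran.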

Software Profiling for Deterministic Replay Debugging of User Code

2006

Significant time is spent by companies in trying to reproduce and fix bugs in their software. The process of testing and debugging can immensely benefit from a tool that supports Deterministic Replay Debugging (DRD). A tool that supports DRD will allow a user to record a program's execution in a log, and to deterministically replay every single instruction executed as part of the application using the log.
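
The record-and-replay idea can be sketched at a much coarser granularity than the instruction-level logging the abstract describes. The sketch below assumes that the only source of nondeterminism is a single input function; the classes `Recorder` and `Replayer` and the toy `program` are hypothetical names introduced for illustration and are not taken from the paper.

```python
import random
from typing import Callable, Iterable, List


class Recorder:
    """Wraps a nondeterministic input source and logs every value it returns."""

    def __init__(self, source: Callable[[], int]) -> None:
        self._source = source
        self.log: List[int] = []

    def read(self) -> int:
        value = self._source()
        self.log.append(value)
        return value


class Replayer:
    """Feeds back the logged values in order instead of consulting the source."""

    def __init__(self, log: Iterable[int]) -> None:
        self._values = iter(log)

    def read(self) -> int:
        return next(self._values)


def program(env) -> List[int]:
    """Toy program whose behavior depends only on the values it reads."""
    state = 0
    history = []
    for _ in range(5):
        state = (state * 31 + env.read()) % 1000
        history.append(state)
    return history


recorder = Recorder(lambda: random.randint(0, 99))
recorded_run = program(recorder)                 # original, nondeterministic run
replayed_run = program(Replayer(recorder.log))   # deterministic re-execution
```

Because the program is deterministic apart from the logged inputs, the replayed run passes through the same intermediate states as the recorded one, which is the property a replay debugger relies on.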

Efficient program execution indexing

2008

Execution indexing uniquely identifies a point in an execution. Desirable execution indices reveal correlations between points in an execution and establish correspondence between points across multiple executions. Therefore, execution indexing is essential for a wide variety of dynamic program analyses. For example, it can be used to organize program profiles, and it can precisely identify the point in a re-execution that corresponds to a given point in an original execution and thus facilitate debugging or dynamic instrumentation. In this paper, we formally define the concept of execution index and propose an indexing scheme based on execution structure and program state. We present a highly optimized online implementation of the technique. We also perform a client study, which targets producing a failure-inducing schedule for a data race by verifying the two alternative happens-before orderings of a racing pair. Indexing is used to precisely locate corresponding points across multiple executions in the presence of non-determinism so that no heavyweight tracing/replay system is needed.

Execution architecture independent program tracing

1991

Due to dramatic increases in microprocessor performance, medium-grain ensemble multiprocessors have become an economical hardware platform on which to solve compute-intensive problems. Unfortunately, the use of these systems to solve such problems is hampered by a lack of understanding about the behavior of parallel programs at all levels of execution: hardware, operating system, and runtime system. The goal of the Parallel Execution Evaluation Testbed project at the University of Colorado is to improve the general understanding about the performance of parallel programs and systems at these levels using trace-driven simulation. In this paper, we discuss the validity of trace-driven simulation of parallel programs, the difficulties of applying this approach to evaluating parallel programs, and a new technique to abstract the logical behavior of the program and capture it in the traces we collect. We describe how this abstract trace information can be used to understand the behavior of parallel systems.

SEAT: A usable trace analysis tool

2005

Understanding the dynamics of a program can be made easier if dynamic analysis techniques are used. However, the extraordinary size of typical execution traces makes exploring the content of traces a tedious task. In this paper, we present a tool called SEAT (software exploration and analysis tool) that implements several operations that can help software engineers understand the content of a large execution trace. Perhaps the most powerful aspect of SEAT is the various filtering techniques it incorporates.

Compressing extended program traces using value predictors

Trace files record the execution behavior of programs for future analysis. Unfortunately, nontrivial program traces tend to be very large and have to be compressed. While good compression schemes exist for traces that capture only the PCs of the executed instructions, these schemes can be ineffective on extended traces that include important additional information such as register values or effective addresses. Our novel, value-prediction-based approach compresses extended traces up to 22.8 times better and about two and a half times as well on average. In addition to the higher compression rate, our lossless single-pass algorithm has a fixed memory requirement and compresses traces faster than other algorithms. It achieves compression rates of up to 6170. This paper describes the design of our compression method and illustrates how value predictors can be used to effectively compress extended program traces.
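
The general idea of predictor-based trace compression can be sketched as follows. This is a minimal illustration and not the algorithm from the paper: it assumes each trace record is a (PC, value) pair, uses only a per-PC last-value predictor and a per-PC stride predictor, and emits a one-character hit code in place of any value that a predictor guesses correctly. The names `compress`, `decompress`, `LAST_HIT`, and `STRIDE_HIT` are hypothetical, and the decompressor is assumed to obtain the PC stream separately.

```python
from typing import Dict, List, Tuple, Union

Record = Tuple[int, int]      # (pc, value), e.g. the effective address of a load
Symbol = Union[str, int]      # a hit code, or the literal value on a miss
State = Dict[int, Tuple[int, int]]  # pc -> (last value, last stride)

LAST_HIT = "L"    # value equals the previous value seen at this PC
STRIDE_HIT = "S"  # value equals previous value plus previous stride


def _update(state: State, pc: int, value: int) -> None:
    """Record the new value and the stride from the previous value at this PC."""
    stride = value - state[pc][0] if pc in state else 0
    state[pc] = (value, stride)


def compress(trace: List[Record]) -> List[Symbol]:
    state: State = {}
    out: List[Symbol] = []
    for pc, value in trace:
        if pc in state and value == state[pc][0]:
            out.append(LAST_HIT)
        elif pc in state and value == state[pc][0] + state[pc][1]:
            out.append(STRIDE_HIT)
        else:
            out.append(value)  # both predictors missed: store the literal
        _update(state, pc, value)
    return out


def decompress(pcs: List[int], symbols: List[Symbol]) -> List[Record]:
    """Mirror the compressor's predictor state to recover the values losslessly."""
    state: State = {}
    trace: List[Record] = []
    for pc, sym in zip(pcs, symbols):
        if sym == LAST_HIT:
            value = state[pc][0]
        elif sym == STRIDE_HIT:
            value = state[pc][0] + state[pc][1]
        else:
            value = sym
        _update(state, pc, value)
        trace.append((pc, value))
    return trace


# Hypothetical trace: one load walking an array, one load rereading a constant.
trace = [(0x10, 100), (0x10, 104), (0x10, 108), (0x10, 112), (0x20, 7), (0x20, 7)]
symbols = compress(trace)
```

The scheme is lossless because the decompressor updates the same predictor tables in the same order as the compressor, so a hit code always resolves to the value that was originally replaced. In a real compressor the symbol stream would then be passed to a general-purpose back end, since long runs of hit codes are highly compressible.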