Static Program Analysis (original) (raw)
Welcome to the web site for the lecture notes on
![]() |
Anders Møller and Michael I. SchwartzbachDepartment of Computer Science, Aarhus University Last revision: June 2024 ![]() ![]() |
---|
Static program analysis is the art of reasoning about the behavior of computer programs without actually running them. This is useful not only in optimizing compilers for producing efficient code but also for automatic error detection and other tools that can help programmers.
As known from Turing and Rice, all interesting properties of the behavior of programs written in common programming languages are mathematically undecidable. This means that automated reasoning of software generally must involve approximation. It is also well known that testing may reveal errors but not show their absence. In contrast, static program analysis can - with the right kind of approximations - check all possible executions of the programs and provide guarantees about their properties. The challenge when developing such analyses is how to ensure high precision and efficiency to be practically useful.
This teaching material concisely presents the essential principles and algorithms for static program analysis. We emphasize a constraint-based approach where suitable constraint systems conceptually divide analysis into a front-end that generates constraints from program code and a back-end that solves the constraints to produce the analysis results. The style of presentation is intended to be precise but not overly formal. The readers are assumed to be familiar with advanced programming language concepts and the basics of compiler construction.
The concepts are explained using a tiny imperative programming language, TIP, which suffices to illustrate the main challenges that arise with mainstream languages.
The lecture notes, slides, implementation, and exercises have been developed since 2008 for our graduate-level course at Aarhus University. We continue to update the material regularly. Suggestions for improvements are welcome!
Contents
Topics covered:
- type analysis
- the unification solver
- lattices and fixpoints
- fixpoint solvers
- dataflow analysis with monotone frameworks, including
- sign analysis
- live variables analysis, available expressions analysis, very busy expressions analysis, and reaching definitions analysis
- initialized variables analysis
- constant propagation analysis
- interval analysis
- widening and narrowing
- path sensitive and relational analysis
- interprocedural analysis
- context-sensitive analysis (incl. call strings, functional approach)
- distributive analysis (IFDS and IDE)
- control flow analysis
- the cubic solver
- pointer analysis, including
- subset-based points-to analysis (Andersen's analysis)
- unification-based points-to analysis (Steensgaard's analysis)
- flow-sensitive points-to analysis
- null pointer analysis, strong/weak updates
- the basics of abstract interpretation, relating formal semantics and static analysis
- collecting semantics
- abstraction and concretization
- soundness, optimality, and completeness
The analyses are expressed using different kinds of constraint systems, each with their own constraint solvers:
- Unification constraints, with an almost-linear union-find algorithm (for type analysis and Steensgaard's analysis)
- Conditional subset constraints, with a cubic-time algorithm (for control flow analysis and Andersen's analysis)
- Monotone constraints over lattices, with variations of fixpoint solvers (primarily for the flow-sensitive analyses)
Solutions to (some of) the exercises are available to teachers.
Coq formalizations and proofs (by Zesen Qian and Oskar Haarklou Veileborg)
Slides
Lecture slides:
- introduction to TIP
- type analysis and the unification solver
- lattices and fixpoint solvers
- examples of flow-sensitive analyses
- interval analysis, widening and narrowing
- path sensitivity and relational analysis
- interprocedural and context sensitive analysis
- distributive analysis (IFDS and IDE)
- control flow analysis and the cubic solver
- pointer analysis
- abstract interpretation
Implementation
Most of the algorithms and analyses have been implemented (in Scala):
the TIP implementation at github (scaladoc)
(developed by Anders Møller and Gianluca Mezzetti, with contributions from Erik Krogh Kristensen, Christian Budde Christensen, Coen De Roover, and Quentin Stievenart)
The open source version omits certain parts that are left for exercises. Solutions for (many of) these programming exercises are available to teachers by request.
See also the C++ implementation, tipc, by Matthew Dwyer.
Plans for future revisions
Our TODO list:
- more about points-to analysis
- more references to literature
- sparse analysis
- more examples of analyses and abstract domains
- more exercises
- open TIP implementation issues and pull requests
- ...
Feel free to contact the authors if you have any questions or comments to the material.
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.