2.6. Taint Analysis Configuration (Clang 22.0.0git documentation)

The Clang Static Analyzer uses taint analysis to detect injection vulnerabilities in code. The backbone of taint analysis in the Clang SA is the TaintPropagation modeling checker. The reports are emitted via the optin.taint.GenericTaint (C, C++) checker. The TaintPropagation checker has a default taint-related configuration. The built-in default settings are defined in code, and they are always in effect. The checker also provides a configuration interface for extending the default settings: the optin.taint.TaintPropagation:Config checker config parameter accepts a configuration file in YAML format. This documentation describes the syntax of the configuration file and gives the informal semantics of the configuration options.

2.6.1. Overview

Taint analysis works by checking for the occurrence of special operations during the symbolic execution of the program. Taint analysis defines sources, sinks, and propagation rules. It identifies errors by detecting a flow of information that originates from a taint source, propagates through the program paths via propagation rules, and reaches a taint sink. What counts as a source, a sink, or a taint-propagating operation is mainly domain-specific knowledge, but the TaintPropagation checker provides some built-in defaults. It is possible to express that a statement sanitizes tainted values by providing a Filters section in the external configuration (see Example configuration file and Filter syntax and semantics). There are no default filters defined in the built-in settings. The checker’s documentation also specifies how to provide a custom taint configuration with command-line options.

2.6.2. Example configuration file

    # The entries that specify arguments use 0-based indexing when specifying
    # input arguments, and -1 is used to denote the return value.

    Filters:
      # Filter functions
      # Taint is sanitized when tainted variables are passed as arguments to filters.

      # Filter function
      #   void cleanse_first_arg(int* arg)
      #
      # Result example:
      #   int x; // x is tainted
      #   cleanse_first_arg(&x); // x is not tainted after the call

    Propagations:
      # Source functions
      # The omission of the SrcArgs key indicates unconditional taint propagation,
      # which is conceptually what a source does.

      # Source function
      #   size_t fread(void *ptr, size_t size, size_t nmemb, FILE * stream)
      #
      # Result example:
      #   FILE* f = fopen("file.txt", "r");
      #   char buf[1024];
      #   size_t read = fread(buf, sizeof(buf[0]), sizeof(buf)/sizeof(buf[0]), f);
      #   // both read and buf are tainted

      # Propagation functions
      # The presence of the SrcArgs key indicates conditional taint propagation,
      # which is conceptually what a propagator does.

      # Propagation function
      #   char *dirname(char *path)
      #
      # Result example:
      #   char* path = read_path();
      #   char* dir = dirname(path);
      #   // dir is tainted if path was tainted

    Sinks:
      # Sink functions
      # If taint reaches any of the arguments specified, a warning is emitted.

      # Sink function
      #   int system(const char* command)
      #
      # Result example:
      #   const char* command = read_command();
      #   system(command); // emit diagnostic if command is tainted

In the example file above, the entries under the Propagations key implement the conceptual sources and propagations, and sinks have their dedicated Sinks key. The user can define operations (function calls) where the tainted values should be cleansed by listing entries under the Filters key. Filters model the sanitization of values done by the programmer, and providing these is key to avoiding false-positive findings.
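The comments in the example describe four functions: one filter, one source, one propagator, and one sink. As a minimal sketch of the corresponding entries, assuming each entry names its function with a `Name` key and that filters and sinks list their argument indexes under an `Args` key (both spellings are assumptions here; `SrcArgs` and `DstArgs` are the keys discussed in this document), the configuration might look like this:

```yaml
# Sketch only: the Name and Args key spellings are assumptions.
# Indexes are 0-based; -1 denotes the return value.

Filters:
  # cleanse_first_arg(&x) removes taint from its first argument.
  - Name: cleanse_first_arg
    Args: [0]

Propagations:
  # fread is a source: there is no SrcArgs key, so the buffer (0)
  # and the return value (-1) become tainted unconditionally.
  - Name: fread
    DstArgs: [0, -1]

  # dirname is a propagator: the return value (-1) becomes tainted
  # only if the path argument (0) is tainted.
  - Name: dirname
    SrcArgs: [0]
    DstArgs: [-1]

Sinks:
  # Emit a diagnostic if the command string (0) is tainted.
  - Name: system
    Args: [0]
```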

2.6.3. Configuration file syntax and semantics

The configuration file should have valid YAML syntax.

The configuration file can have the following top-level keys:

Under the Filters key, the user can specify a list of operations that remove taint (see Filter syntax and semantics for details).

Under the Propagations key, the user can specify a list of operations that introduce and propagate taint (see Propagation syntax and semantics for details). The user marks an entry as a taint source by omitting the SrcArgs key, while propagators specify it. The lack of the SrcArgs key means unconditional propagation, which is how sources are modeled. For propagators, if any of the source arguments (specified by indexes in SrcArgs) is tainted, then all of the destination arguments (specified by indexes in DstArgs) also become tainted.

Under the Sinks key, the user can specify a list of operations where the checker should emit a bug report if tainted data reaches it (see Sink syntax and semantics for details).
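To make the difference between a source and a propagator concrete, the sketch below models two hypothetical functions, `read_packet` and `concat_into`. Neither is a real API, and the `Name` key spelling is an assumption; only `SrcArgs` and `DstArgs` are taken from the text above.

```yaml
Propagations:
  # Hypothetical source (not a real API):
  #   int read_packet(int fd, char *out)
  # No SrcArgs key, so out (1) and the return value (-1) are
  # tainted unconditionally after every call.
  - Name: read_packet        # Name key spelling is an assumption
    DstArgs: [1, -1]

  # Hypothetical propagator (not a real API):
  #   void concat_into(char *dst, const char *a, const char *b)
  # If a (1) or b (2) is tainted, dst (0) becomes tainted.
  - Name: concat_into
    SrcArgs: [1, 2]
    DstArgs: [0]
```

With the second entry, tainting either input is enough to taint the destination, which matches the any-to-all rule described for propagations.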

2.6.3.1. Filter syntax and semantics

An entry under Filters is a YAML object with the following mandatory keys:

The following keys are optional:
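For illustration, a filter entry might look like the sketch below. It assumes the keys are spelled `Name` and `Args`, and that an optional `Scope` key holds the prefix of the function's fully qualified name; the function `sanitize::escape_shell` is hypothetical.

```yaml
Filters:
  # Hypothetical C++ function (not a real API):
  #   namespace sanitize { void escape_shell(char *buf, size_t len); }
  # After the call, buf (index 0) is treated as untainted.
  - Name: escape_shell       # assumed key: the function name
    Scope: "sanitize::"      # assumed key: qualified-name prefix
    Args: [0]                # assumed key: indexes of cleansed arguments
```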

2.6.3.2. Propagation syntax and semantics

An entry under Propagations is a YAML object with the following mandatory keys:

The following keys are optional:
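As a sketch of a propagation entry for a variadic function, the fragment below assumes a `Name` key plus two optional keys, `VariadicType` and `VariadicIndex`, that describe how the variadic arguments take part in propagation. These key spellings and their values are assumptions, and `parse_fields` is a hypothetical function.

```yaml
Propagations:
  # Hypothetical variadic function (not a real API):
  #   int parse_fields(const char *input, const char *format, ...)
  # If input (0) is tainted, every variadic argument from index 2
  # onward is treated as a tainted destination.
  - Name: parse_fields       # assumed key: the function name
    SrcArgs: [0]
    VariadicType: Dst        # assumed key; assumed values: None, Src, Dst
    VariadicIndex: 2         # assumed key: index of the first variadic argument
```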

2.6.3.3. Sink syntax and semantics

An entry under Sinks is a YAML object with the following mandatory keys:

The following keys are optional:
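A sink entry could be sketched along the same lines. The fragment below assumes the keys `Name` and `Args`, and an optional `Scope` key that can also carry a class name so that member functions match; `Database::run_query` is hypothetical.

```yaml
Sinks:
  # Hypothetical member function (not a real API):
  #   struct Database { int run_query(const char *sql, int flags); };
  # Report when sql (index 0) is tainted; flags (1) is not checked.
  - Name: run_query          # assumed key: the function name
    Scope: "Database::"      # assumed key: restricts matching to this class
    Args: [0]                # assumed key: indexes of checked arguments
```

Listing only the arguments that are dangerous when attacker-controlled keeps the sink from reporting on harmless tainted values.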