Fuzzing

Binaryen has built-in fuzzing and reducing capabilities. They can be used on Binaryen itself or on other compilers, VMs, and toolchains.

Fuzzing

Binaryen's wasm-opt tool has the --translate-to-fuzz / -ttf option. When set, wasm-opt treats the input as a stream of arbitrary bytes and converts it into a valid wasm module. That is, the input acts like a random seed to a deterministic random number generator, except that instead of numbers it generates wasm modules.

In other words, you can give wasm-opt -ttf any input file with any contents, and it will create a wasm file. You can then save it (using -o) and run that in another tool. For example, you can run a fuzzing script that generates a random string, feeds that to wasm-opt -ttf, and runs a VM on that output.
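The fuzzing script described above can be sketched roughly as follows. This is a minimal illustration, not a script shipped with Binaryen: the helper names (make_random_input, ttf_command, fuzz_once), the 4096-byte input size, and the "my-vm" command are all assumptions made for the example. Only the wasm-opt -ttf and -o options come from the text.

```python
import os
import subprocess


def make_random_input(path, size=4096):
    """Write `size` random bytes to `path`; the bytes act as the fuzzer's seed."""
    with open(path, "wb") as f:
        f.write(os.urandom(size))
    return path


def ttf_command(input_path, output_path, wasm_opt="wasm-opt"):
    """Command line that turns arbitrary bytes into a valid wasm module."""
    return [wasm_opt, input_path, "-ttf", "-o", output_path]


def fuzz_once(vm_command, workdir, wasm_opt="wasm-opt", run=subprocess.run):
    """Generate one random module in `workdir` and run a VM on it.

    `vm_command` is a hypothetical list such as ["my-vm"]; the path of the
    generated wasm file is appended to it. Returns whatever `run` returns
    for the VM invocation.
    """
    input_path = make_random_input(os.path.join(workdir, "input.dat"))
    wasm_path = os.path.join(workdir, "out.wasm")
    # Fail loudly if wasm-opt itself cannot produce the module.
    run(ttf_command(input_path, wasm_path, wasm_opt), check=True)
    return run(vm_command + [wasm_path])
```

Calling fuzz_once(["my-vm"], "/tmp/fuzz") in a loop, and saving input.dat whenever the VM misbehaves, gives a basic fuzzer. Because the translation is deterministic, keeping the input bytes should be enough to regenerate the same module later with the same wasm-opt build.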

For fuzzing of Binaryen itself, the following options are useful:

These two options (--fuzz-exec and --fuzz-passes, as used in the example below) are not strictly necessary, but can greatly improve execution times, as a single invocation can do a full random module generation + optimization + binary test. For example,

wasm-opt input.dat -ttf --fuzz-exec --fuzz-passes -O3

Even on a fairly low-powered machine this lets afl-fuzz do hundreds of iterations per second.
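When not using afl-fuzz, the same invocation can be driven from a small loop. The sketch below is only an illustration: find_failures and self_check_command are hypothetical helpers, and it assumes that a nonzero exit status (or a timeout) from wasm-opt indicates a potential bug worth inspecting.

```python
import subprocess


def self_check_command(input_path, wasm_opt="wasm-opt"):
    """The invocation from the text: generate, optimize, and compare in one run."""
    return [wasm_opt, input_path, "-ttf", "--fuzz-exec", "--fuzz-passes", "-O3"]


def find_failures(input_paths, wasm_opt="wasm-opt", run=subprocess.run, timeout=60):
    """Return the inputs for which the self-check looks suspicious.

    Assumption: a nonzero exit status or a timeout is treated as a
    potential bug; the 60 second limit is an arbitrary choice.
    """
    failures = []
    for path in input_paths:
        try:
            proc = run(self_check_command(path, wasm_opt),
                       capture_output=True, timeout=timeout)
        except subprocess.TimeoutExpired:
            failures.append(path)
            continue
        if proc.returncode != 0:
            failures.append(path)
    return failures
```

The `run` parameter exists only so the loop can be exercised without a wasm-opt binary; in normal use the default subprocess.run is sufficient.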

Generated Wasm File Properties

The output wasms from -ttf are guaranteed not to hang, as they have built-in hang instrumentation. They may trap, though. The JS wrapper code will catch those traps and print them.

Helper scripts

Local fuzzing

fuzz_opt.py is a very useful script that runs

Just running

$ python scripts/fuzz_opt.py

will run the script, which will continue to run until it finds a possible bug.

For maximum throughput, run scripts/fuzz_opt.py, together with its related binaries, from a fast drive or, alternatively, from a ramdrive. Running on spinning disks is about an order of magnitude slower.

This script will use existing wasm files as the basis for fuzzing (mutating and expanding upon them), which is good if that set of files represents realistic content. By default the script will use all testcases in the test suite as such initial content, with a priority given to files modified in the last 30 days. You can also put wasm files in the ./fuzz/ directory and it will likewise treat them as high priority initial content, which is useful when you have some local files you want to especially fuzz.
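To make the idea of prioritizing recently modified files concrete, here is one way such a selection could work. This is a sketch under stated assumptions, not the logic fuzz_opt.py actually uses: the function names and the recent_bias probability are hypothetical, and only the 30-day window comes from the text.

```python
import os
import time

RECENT_DAYS = 30  # window mentioned in the text


def split_by_recency(paths, now=None, days=RECENT_DAYS):
    """Split `paths` into (recent, older) lists by file modification time."""
    now = time.time() if now is None else now
    cutoff = now - days * 24 * 60 * 60
    recent, older = [], []
    for path in paths:
        (recent if os.path.getmtime(path) >= cutoff else older).append(path)
    return recent, older


def pick_initial_content(paths, rng, recent_bias=0.5, now=None):
    """Pick one file from a non-empty list, favoring recently modified ones.

    With probability `recent_bias` (a hypothetical tuning knob) the file is
    drawn from the recent group; otherwise from the older group. If one
    group is empty, the other is used.
    """
    recent, older = split_by_recency(paths, now)
    if recent and (not older or rng.random() < recent_bias):
        return rng.choice(recent)
    return rng.choice(older)
```

Since recently modified tests are usually a small fraction of the suite, drawing from them half the time would give each of them a much higher chance of being picked than any individual older file.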

ClusterFuzz

Binaryen has scripts for ClusterFuzz integration. See bundle_clusterfuzz.py.

Reducing

A complementary feature is reducing: taking an existing interesting testcase and reducing it to as small a testcase as possible while keeping it interesting. Binaryen's wasm-reduce tool can do that, using something like

bin/wasm-reduce start.wasm "--command=checker-command test.wasm" -t test.wasm -w work.wasm

This takes an input wasm and a bunch of options:

wasm-reduce works by trying all sorts of changes to the file that shrink it, and if a change is valid (keeps it "interesting", i.e., same result on the command) then we keep it and continue from there.
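The keep-it-if-still-interesting loop can be illustrated with a toy reducer over a plain list. This is a simplified sketch of the general technique (greedy chunk removal), not wasm-reduce's actual algorithm; reduce_items and the is_interesting predicate are hypothetical names, with the predicate standing in for running the checker command.

```python
def reduce_items(items, is_interesting):
    """Greedily remove chunks of `items` while `is_interesting` stays true.

    Starts with large chunks and halves the chunk size after each pass,
    so big removals are attempted before fine-grained ones.
    """
    assert is_interesting(items), "the starting testcase must be interesting"
    current = list(items)
    chunk = max(1, len(current) // 2)
    while chunk >= 1:
        i = 0
        while i < len(current):
            candidate = current[:i] + current[i + chunk:]
            if is_interesting(candidate):
                # Still interesting: keep the smaller testcase and retry
                # at the same index, which now holds new content.
                current = candidate
            else:
                # The removal changed the result: keep this chunk, move on.
                i += chunk
        chunk //= 2
    return current
```

For example, with items = list(range(20)) and a predicate that requires both 3 and 17 to be present, reduce_items returns [3, 17]. Each call to the predicate corresponds to one run of the checker command, which is why the cost of that command dominates reduction time.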

Reduction can be a slow process, because we need to check every change by running the command - so if the command takes 5 seconds, it may take that long to shrink by a single byte (!). wasm-reduce tries to get around that by taking advantage of the Binaryen optimizer: it will interleave "destructive reduction" (removing code, breaking code in ways that might alter program behavior) with "pass reduction" (running Binaryen optimizer passes, which should not alter program behavior). For example, destructive removal of a condition to an if might let an optimization pass remove one arm of the if, which can be much faster than removing all the parts of the arm one by one.

Setting up dependencies

third_party/setup.py can automatically install the necessary dependencies, such as the SpiderMonkey JS shell (mozjs), the V8 JS shell (d8), and WABT, in third_party/.

./third_party/setup.py [mozjs|v8|wabt|all]

This also helps when fuzzing on a ramdrive (requires about 300 MB):

./third_party/setup.py all
cp -r build/ scripts/ test/ third_party/ /path/to/ramdrive
cd /path/to/ramdrive
./scripts/fuzz_opt.py --binaryen-bin build/bin