Developer’s Guide — Emscripten 4.0.9-git (dev) documentation

This article provides information relevant to people who want to contribute to Emscripten. We welcome contributions from anyone who is interested in helping out!

Tip

The information will be less relevant if you’re just using Emscripten, but may still be of interest.

Setting up

For contributing to core Emscripten code, such as emcc.py, you don’t need to build any binaries as emcc.py is in Python, and the core JS generation is in JavaScript. You do still need binaries for LLVM and Binaryen, which you can get using the emsdk:

emsdk install tot
emsdk activate tot

This will install the latest “tip-of-tree” binaries needed to run Emscripten. You can use these emsdk-provided binaries with a git checkout of the Emscripten repository. To do this, you can either edit your local .emscripten config file, or set EM_CONFIG=/path/to/emsdk/.emscripten in your environment.
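As a minimal sketch of the EM_CONFIG option, the shell function below exports the variable for the current shell session. The function name use_emsdk_config is hypothetical, and the emsdk location is whatever directory you cloned emsdk into.

```shell
# Hypothetical helper (not part of emsdk): point an Emscripten git checkout
# at the config file that emsdk generated.
# $1 is the directory where emsdk is checked out.
use_emsdk_config() {
  emsdk_dir="${1:-}"
  if [ -z "$emsdk_dir" ]; then
    echo "usage: use_emsdk_config /path/to/emsdk" >&2
    return 1
  fi
  # Export the config path for this shell session, as described above.
  export EM_CONFIG="$emsdk_dir/.emscripten"
  echo "EM_CONFIG=$EM_CONFIG"
}
```

After calling use_emsdk_config /path/to/emsdk, running emcc from your Emscripten checkout in the same shell should pick up the emsdk-provided LLVM and Binaryen binaries.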

If you do want to contribute to LLVM or Binaryen, or to test modifications to them, you can build them from source.

Repositories and branches of interest

The Emscripten main repository is https://github.com/emscripten-core/emscripten.

Aside from the Emscripten repo, the other codebases of interest are LLVM and Binaryen, which Emscripten invokes and which have their own repos.

Submitting patches

Patches should be submitted as pull requests in the normal way on GitHub.

When submitting patches, please:

Code reviews

One of the core developers will review a pull request before merging it. If several days pass without any comments on your PR, please comment on the PR, which will ping them. (If that happens, sorry! Sometimes things get missed.)

Compiler overview

The Emscripten Compiler Frontend (emcc) is a Python script that manages the entire compilation process:

Emscripten Test Suite

Emscripten has a comprehensive test suite, which covers virtually all Emscripten functionality. These tests are run on CI automatically when you create a pull request, and they should all pass. If you run into trouble with a test failure you can’t fix, please let the developers know.

Bisecting

If you find a regression, bisection is often the fastest way to figure out what went wrong. This is true not just for finding an actual regression in Emscripten but also when your project stopped working after an upgrade and you need to determine whether the cause is an Emscripten regression or something else. The rest of this section covers bisection on Emscripten itself. It is hopefully useful both for people using Emscripten and for Emscripten developers.

If you have a large bisection range - for example, that covers more than one version of Emscripten - then you probably have changes across multiple repos (Emscripten, LLVM, and Binaryen). In that case the easiest and fastest thing is to bisect using emsdk builds. Each step of the bisection will download a build produced by the emscripten releases builders. Using this approach you don’t need to compile anything yourself, so it can be very fast!

To do this, you need a basic understanding of Emscripten’s release process. The key idea is that:

emsdk install HASH

can install an arbitrary build of Emscripten from any point in the past (assuming the build succeeded). Each build is identified by a hash (a long string of numbers and characters), which is a hash of a commit in the releases repo. The mapping of Emscripten release numbers to such hashes is tracked by emscripten-releases-tags.json in the emsdk repo.

With that background, the bisection process would look like this:

  1. Find the hashes to bisect between. You may already know them if you found the problem on tot builds. If instead you only know Emscripten version numbers, use emscripten-releases-tags.json to find the hashes.
  2. Using those hashes, do a normal git bisect on the emscripten-releases repo.
  3. In each step of the bisection, download the binary build for the current commit hash (in the emscripten-releases repo that you are bisecting on) using emsdk install HASH. Then test your code and do git bisect good or git bisect bad accordingly, and keep bisecting until you find the first bad commit.
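Step 3 can be wrapped in a small shell function. The following is only a sketch under stated assumptions: the names bisect_step and EMSDK_BIN are hypothetical, it assumes emsdk activate accepts the same HASH as emsdk install, and it assumes your shell is already set up for emsdk so that the test command picks up the activated build. The return codes follow the git bisect run convention (0 for good, 1 for bad, 125 to skip a commit that cannot be tested).

```shell
# Sketch of a single bisection step. The names bisect_step and EMSDK_BIN are
# hypothetical and are not part of emsdk or Emscripten.
EMSDK_BIN="${EMSDK_BIN:-emsdk}"   # path to the emsdk script; assumed to be on PATH

bisect_step() {
  if [ "$#" -lt 2 ]; then
    echo "usage: bisect_step HASH TEST_COMMAND [ARGS...]" >&2
    return 128   # codes above 127 make `git bisect run` abort
  fi
  hash="$1"
  shift
  # Download the prebuilt binaries for this emscripten-releases commit.
  # 125 is the code `git bisect run` treats as "cannot test, skip".
  "$EMSDK_BIN" install "$hash" || return 125
  # Assumption: activate accepts the same HASH that install does.
  "$EMSDK_BIN" activate "$hash" || return 125
  # Run the caller's test: success means "good", any failure means "bad".
  if "$@"; then
    return 0
  fi
  return 1
}
```

For example, inside an emscripten-releases checkout that is being bisected, bisect_step "$(git rev-parse HEAD)" ./check.sh (where check.sh is your own test script, a hypothetical name) returns 0 when the build passes. You can then mark the commit with git bisect good or git bisect bad by hand, or call the function from a wrapper script passed to git bisect run.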

The first bad commit is a single change in the releases repo. That commit will generally update a single sub-repo (Emscripten, LLVM, or Binaryen) to add one or more new changes. Often that list will be very short or even a single commit, and you can see which actual commit caused the problem. When filing a bug, mentioning such a bisection result can greatly speed things up (even if that commit contains multiple changes).

If that commit contains multiple changes then you can optionally bisect further on the specific repo (as all the changes will normally be in just one of them, with the others kept fixed). Doing this will require rebuilding locally, which was not needed in the main bisection described in this section.

Working with C structs and defines

If you change the layout of C structs or modify C defines that are used in JavaScript library files, you may need to modify tools/struct_info.json. Any time that file is modified or a struct layout is changed, you will need to run ./tools/gen_struct_info.py to re-generate the information used by JavaScript. Note that you need to run both ./tools/gen_struct_info.py and ./tools/gen_struct_info.py --wasm64.

The test_gen_struct_info test will fail if you forget to do this.
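Because both invocations are needed, it can help to run them together. The sketch below does that from the root of an Emscripten checkout; the function name regen_struct_info and its optional directory argument are hypothetical conveniences, not part of the Emscripten tools.

```shell
# Hypothetical helper: regenerate struct info for both wasm32 and wasm64.
# $1 is the root of the Emscripten checkout (defaults to the current directory).
regen_struct_info() {
  emscripten_root="${1:-.}"
  (
    # Run from the checkout root so the relative ./tools paths resolve.
    cd "$emscripten_root" || exit 1
    # Stop at the first failure so a broken run is not masked by the second.
    ./tools/gen_struct_info.py &&
    ./tools/gen_struct_info.py --wasm64
  )
}
```

Calling regen_struct_info with no arguments from the checkout root runs both commands in order and returns a non-zero status if either one fails.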

See also