Conventions — WebAssembly 2.0 + exception handling (Draft 2025-03-20)

The binary format for WebAssembly modules is a dense linear encoding of their abstract syntax.[1]

The format is defined by an attribute grammar whose only terminal symbols are bytes. A byte sequence is a well-formed encoding of a module if and only if it is generated by the grammar.

Each production of this grammar has exactly one synthesized attribute: the abstract syntax that the respective byte sequence encodes. Thus, the attribute grammar implicitly defines a decoding function (i.e., a parsing function for the binary format).

With a few exceptions, the binary grammar closely mirrors the grammar of the abstract syntax.

Note

Some phrases of abstract syntax have multiple possible encodings in the binary format. For example, numbers may be encoded as if they had optional leading zeros. Implementations of decoders must support all possible alternatives; implementations of encoders can pick any allowed encoding.
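To make the point about alternative encodings concrete, the following is a minimal sketch of a decoder for unsigned 32-bit integers. It assumes the unsigned LEB128 variable-length encoding, which this section does not itself define; the function name `decode_u32` and its `(value, next_pos)` return convention are hypothetical choices for illustration, not part of the specification.

```python
from typing import Tuple


def decode_u32(data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode an unsigned LEB128 u32 starting at data[pos].

    Hypothetical helper. Returns (value, next_pos) and raises
    ValueError on malformed input.
    """
    result = 0
    for i in range(5):  # a u32 occupies at most ceil(32 / 7) = 5 bytes
        if pos + i >= len(data):
            raise ValueError("unexpected end of input")
        byte = data[pos + i]
        # The low 7 bits carry payload, least significant group first.
        result |= (byte & 0x7F) << (7 * i)
        if byte & 0x80 == 0:  # high bit clear: this is the last byte
            if result >= 1 << 32:
                raise ValueError("integer too large for u32")
            return result, pos + i + 1
    raise ValueError("integer representation too long")


# The same abstract number, encoded without and with padding.
short_form = decode_u32(bytes([0x03]))
padded_form = decode_u32(bytes([0x83, 0x00]))
```

Under these assumptions, `short_form` and `padded_form` carry the same value 3 and differ only in how many bytes were consumed. A decoder has to accept both, while an encoder is free to emit either.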

The recommended extension for files containing WebAssembly modules in binary format is “\(\mathtt{.wasm}\)” and the recommended Media Type is “\(\mathtt{application/wasm}\)”.

Grammar

The following conventions are adopted in defining grammar rules for the binary format. They mirror the conventions used for abstract syntax. In order to distinguish symbols of the binary syntax from symbols of the abstract syntax, \(\mathtt{typewriter}\) font is adopted for the former.

Note

For example, the binary grammar for number types is given as follows:

\[\begin{split}\begin{array}{llcll@{\qquad\qquad}l} {} & \href{../binary/types.html#binary-numtype}{\mathtt{numtype}} &::=& \mathtt{0x7F} &\Rightarrow& \href{../syntax/types.html#syntax-valtype}{\mathsf{i32}} \\ &&|& \mathtt{0x7E} &\Rightarrow& \href{../syntax/types.html#syntax-valtype}{\mathsf{i64}} \\ &&|& \mathtt{0x7D} &\Rightarrow& \href{../syntax/types.html#syntax-valtype}{\mathsf{f32}} \\ &&|& \mathtt{0x7C} &\Rightarrow& \href{../syntax/types.html#syntax-valtype}{\mathsf{f64}} \\ \end{array}\end{split}\]

Consequently, the byte \(\mathtt{0x7F}\) encodes the type \(\href{../syntax/types.html#syntax-valtype}{\mathsf{i32}}\), \(\mathtt{0x7E}\) encodes the type \(\href{../syntax/types.html#syntax-valtype}{\mathsf{i64}}\), and so forth. No other byte value is allowed as the encoding of a number type.
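Read as a decoding function, the number type production amounts to a four-entry lookup table. The sketch below is one possible rendering; the names `NUMTYPES` and `decode_numtype`, and the use of strings to stand for abstract syntax, are hypothetical.

```python
from typing import Tuple

# One entry per alternative of the production: byte -> abstract number type.
NUMTYPES = {0x7F: "i32", 0x7E: "i64", 0x7D: "f32", 0x7C: "f64"}


def decode_numtype(data: bytes, pos: int = 0) -> Tuple[str, int]:
    """Decode one number type at data[pos]; return (numtype, next_pos)."""
    if pos >= len(data):
        raise ValueError("unexpected end of input")
    byte = data[pos]
    if byte not in NUMTYPES:
        # No other byte value is generated by the grammar.
        raise ValueError(f"malformed number type: 0x{byte:02X}")
    return NUMTYPES[byte], pos + 1
```

Rejecting every byte outside the table corresponds to the rule that a byte sequence is well formed only if the grammar generates it.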

The binary grammar for limits is defined as follows:

\[\begin{split}\begin{array}{llclll} {} & \href{../binary/types.html#binary-limits}{\mathtt{limits}} &::=& \mathtt{0x00}~~n{:}\href{../binary/values.html#binary-int}{\mathtt{u32}} &\Rightarrow& \{ \href{../syntax/types.html#syntax-limits}{\mathsf{min}}~n, \href{../syntax/types.html#syntax-limits}{\mathsf{max}}~\epsilon \} \\ &&|& \mathtt{0x01}~~n{:}\href{../binary/values.html#binary-int}{\mathtt{u32}}~~m{:}\href{../binary/values.html#binary-int}{\mathtt{u32}} &\Rightarrow& \{ \href{../syntax/types.html#syntax-limits}{\mathsf{min}}~n, \href{../syntax/types.html#syntax-limits}{\mathsf{max}}~m \} \\ \end{array}\end{split}\]

That is, a limits pair is encoded as either the byte \(\mathtt{0x00}\) followed by the encoding of a \(\href{../syntax/values.html#syntax-int}{\mathit{u32}}\) value, or the byte \(\mathtt{0x01}\) followed by two such encodings. The variables \(n\) and \(m\) name the attributes of the respective \(\href{../binary/values.html#binary-int}{\mathtt{u32}}\) nonterminals, which in this case are the actual unsigned integers those decode into. The attribute of the complete production then is the abstract syntax for the limit, expressed in terms of the former values.
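The limits production can be sketched as a decoder in which each bound variable becomes a local variable and the synthesized attribute becomes the return value. The code assumes that u32 values use unsigned LEB128 encoding; `decode_u32`, `decode_limits`, and the dictionary standing in for the abstract limits record are hypothetical names chosen for this illustration, with `None` playing the role of \(\epsilon\).

```python
from typing import Optional, Dict, Tuple


def decode_u32(data: bytes, pos: int) -> Tuple[int, int]:
    """Hypothetical helper: decode an unsigned LEB128 u32 at data[pos]."""
    result = 0
    for i in range(5):  # at most ceil(32 / 7) = 5 bytes
        if pos + i >= len(data):
            raise ValueError("unexpected end of input")
        byte = data[pos + i]
        result |= (byte & 0x7F) << (7 * i)
        if byte & 0x80 == 0:
            if result >= 1 << 32:
                raise ValueError("integer too large for u32")
            return result, pos + i + 1
    raise ValueError("integer representation too long")


def decode_limits(data: bytes, pos: int = 0) -> Tuple[Dict[str, Optional[int]], int]:
    """Decode limits at data[pos]; return (limits, next_pos)."""
    if pos >= len(data):
        raise ValueError("unexpected end of input")
    flag = data[pos]
    if flag == 0x00:
        # First alternative: 0x00 n:u32  =>  {min n, max epsilon}
        n, pos = decode_u32(data, pos + 1)
        return {"min": n, "max": None}, pos
    if flag == 0x01:
        # Second alternative: 0x01 n:u32 m:u32  =>  {min n, max m}
        n, pos = decode_u32(data, pos + 1)
        m, pos = decode_u32(data, pos)
        return {"min": n, "max": m}, pos
    raise ValueError(f"malformed limits flag: 0x{flag:02X}")
```

For instance, `decode_limits(bytes([0x00, 0x01]))` yields limits with minimum 1 and no maximum, and `decode_limits(bytes([0x01, 0x01, 0x80, 0x01]))` yields minimum 1 and maximum 128.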

Auxiliary Notation

When dealing with binary encodings the following notation is also used:

Vectors

Vectors are encoded with their \(\href{../binary/values.html#binary-int}{\mathtt{u32}}\) length followed by the encoding of their element sequence.

\[\begin{split}\begin{array}{llclll@{\qquad\qquad}l} {} & \href{../binary/conventions.html#binary-vec}{\mathtt{vec}}(\mathtt{B}) &::=& n{:}\href{../binary/values.html#binary-int}{\mathtt{u32}}~~(x{:}\mathtt{B})^n &\Rightarrow& x^n \\ \end{array}\end{split}\]
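Because the vector production is parameterized over the element grammar \(\mathtt{B}\), a decoder for it can take the element decoder as an argument. The following sketch assumes unsigned LEB128 encoding for the length; `decode_u32`, `decode_vec`, and the `decode_elem` callback signature are hypothetical and serve only to illustrate the shape of the rule.

```python
from typing import Callable, List, Tuple, TypeVar

T = TypeVar("T")


def decode_u32(data: bytes, pos: int) -> Tuple[int, int]:
    """Hypothetical helper: decode an unsigned LEB128 u32 at data[pos]."""
    result = 0
    for i in range(5):  # at most ceil(32 / 7) = 5 bytes
        if pos + i >= len(data):
            raise ValueError("unexpected end of input")
        byte = data[pos + i]
        result |= (byte & 0x7F) << (7 * i)
        if byte & 0x80 == 0:
            if result >= 1 << 32:
                raise ValueError("integer too large for u32")
            return result, pos + i + 1
    raise ValueError("integer representation too long")


def decode_vec(
    data: bytes,
    pos: int,
    decode_elem: Callable[[bytes, int], Tuple[T, int]],
) -> Tuple[List[T], int]:
    """Decode vec(B): a u32 length n followed by exactly n encodings of B."""
    n, pos = decode_u32(data, pos)
    elems: List[T] = []
    for _ in range(n):
        x, pos = decode_elem(data, pos)  # x:B, repeated n times
        elems.append(x)
    return elems, pos


# A vector of two u32 values, 5 and 128, using decode_u32 as the element decoder.
values, end = decode_vec(bytes([0x02, 0x05, 0x80, 0x01]), 0, decode_u32)
```

Passing the element decoder as a parameter mirrors how \(\mathtt{vec}(\mathtt{B})\) is instantiated with different nonterminals elsewhere in the grammar. If fewer than \(n\) elements are present, the element decoder runs off the end of the input and the sketch raises an error.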