Skip to content

Compiling to WebAssembly with Binaryen

pannous edited this page Feb 7, 2018 · 40 revisions

This page explains how you can write a compiler to WebAssembly using Binaryen. First, let's get some FAQs out of the way.

Why compile to WebAssembly?

WebAssembly is a cross-browser standard for executable code. By compiling to it, you can run your code on the web, without plugins.

Why compile to WebAssembly using Binaryen?

There are already a few ways to compile to WebAssembly, and more will probably appear. Different approaches can have different benefits and tradeoffs. Specifically, Binaryen aims to be

  • Simple,
  • Fast,
  • Still generate pretty good code despite the first two.

Binaryen keeps things simple and fast by using a WebAssembly AST as its main internal IR. This makes sense for two main reasons:

  • We know our target is WebAssembly, and we don't need to think about other targets - that's a job for WebAssembly VMs! Here we don't need extra abstraction layers for numerous targets.
  • Obviously multiple IRs can help in optimization - that's another reason most compilers have them (in particular, SSA-based IRs). However, this matters less for us, for 3 reasons:
  • First, WebAssembly VMs in browsers will use a powerful SSA-based IR to optimize your code anyhow, because WebAssembly is a portable target, higher-level than what compilers like LLVM generally emit. For that reason, standard compiler optimizations matter less for us than in general.
  • Second, WebAssembly needs a bunch of special optimizations - it's designed to be compact and efficient for transmission, and as a result, looks somewhat different than typical assembly languages. So we need a WebAssembly AST anyhow to generate good code, and that's what we have.
  • Finally, it is becoming more and more clear that really good performance depends on language-specific optimizations, and not just the standard ones. That's why languages like Swift and Rust have their own IRs and optimizers before they feed into LLVM and optimize there. The same model can work with Binaryen: perform your language-specific optimizations first on your own IR that you already have, then feed that into Binaryen. As a result, you'll still have those language-specific optimizations.

Therefore even with a single IR in Binaryen we should be able to emit fairly good code. And with just one IR, you can avoid a substantial amount of overhead that most compilers have.

In addition, Binaryen's IR is designed to be lightweight and fast:

  • Small data structures (e.g., nodes do not refer to their parents)
  • Multithreaded optimization (simple in part because of those small data structures with minimal cross-references)
  • Arena allocation, which is both fast and also encourages relevant content to be contiguous in memory
  • Minimize allocation, instead prefer to reuse memory

Note that we said Binaryen uses WebAssembly as its main IR. Binaryen already has optional support for a more CFG-style input IR as well, for convenience; others may follow, but a fast one-IR path will remain.

What do I need to have in order to use Binaryen to compile to WebAssembly?

Binaryen expects input that is generally compatible with being compiled to WebAssembly. Specifically, you should be able to emit data for it in the following form:

  • Basic types are i32, i64, f32, f64 (no i8 or i16, and no i57 or anything else).
  • Mathematical operations are everything in the WebAssembly spec: add, sub, mul, ctz, popclt, etc. etc., read the spec for details.
  • "Native" control flow is structured - blocks, ifs, loops, etc. - and you can pass that in if you have it. If not, you can provide an arbitrary control flow graph, which Binaryen will "reloop" into structured control flow for you (more details later down).
  • WebAssembly IR is an AST, which means it can have nested expressions. You can pass that in if you have it. If, instead, you have a more "flat" representation of simple instructions in lists in basic blocks, then you can pass that in as well, just create Blocks and fill them with such simple instructions. That will use more local variables, but the Binaryen optimizer can optimize those out.
  • WebAssembly is not an SSA-based IR. Instead, you define an arbitrary number of local variables and can read and write to them without limit. You can pass that in if you have it. If instead you have SSA, then that should work almost trivially, just provide each SSA variable as a WebAssembly local variable (obviously you will have a lot per function, but the Binaryen optimizer can help there). For phis, the CFG support Binaryen has accepts code on branches, so you can replace a phi at a block with setting of local variables on the branch edges leading to it, which should also be almost trivial.
  • WebAssembly defines no built-in ABI. All it provides is the ability to define functions and call them (without thinking about the ABI - you perform a call and it just happens, i.e., the native call stack is managed for you, including arguments, local variables, etc.), and a linear memory that you can use however you want. You can use that linear memory to implement a user stack, for example by deciding that address 4 is the location of the stack pointer (4 might make sense as the first non-0 aligned address), and emitting code to use that stack however you want. For example, you can see which local variables need to be on the user stack (if their address is taken, as local variables are like registers, they are not in linear memory), and allocate and free space for them in function call prologues and epilogues and so forth. Some languages may not need such a stack, of course, it's entirely up to you what you do with linear memory.

Technical details

The most direct way to use Binaryen is to use the C++ API, which is what Binaryen is written in. Alternatively, there is a C API. The C API is smaller and simpler, and probably easier to use, so these docs will focus on that. It is also more likely to remain stable over time.

C++ API

The headers have comments in them explaining things. You can also see the code in the various tools, e.g., to see how to write code to binary, look at wasm-as.cpp. The implemenation of the C API may also help, in binaryen-c.cpp. For a complete example of a compiler, see the asm2wasm tool in Binaryen, which compiles asm.js to WebAssembly.

C API

The C API is in a single header here. That contains everything you need together with the libbinaryen library which is built in lib/.

There is a hello world test which is a good starting point. There is also a kitchen-sink test as well, which should cover practically all the API.

When compiling to Binaryen, you'll probably do something like what you see in those examples, which follows this pattern:

  • Create a Module. A Module represents a WebAssembly module. It will contain code and data.
  • Create functions.
  • Each function needs a type: what type parameters it receives, and what type it returns.
  • The code in the function. You can create expressions using the creator functions like BinaryenGetLocal which creates an AST node that reads a local. For details on what the AST nodes are, see the wasm spec and Binaryen's core wasm.h header.
  • When creating nodes, you'll need to specific opcode types. Opcodes are numbers that are generated by e.g. BinaryenAbs() for Abs (absolute value). These are function calls so that you don't need to use the C header if you don't want to. Their values are fixed so you can cache the output into a variable, if you want, for performance.
  • You'll also create literal values for constants, using e.g. BinaryenLiteralInt32() to get a literal representing a number of type i32. Literals are passed around by value, they are very small structs.
  • You can then create the function, passing it the type and the expression which will be its body.
  • Create imports and exports: what the Module receives from the outside, and what it provides. See the wasm spec for more details.
  • Set up memory: you can define static data that will be part of the Module, and will reside in the memory the Module sees when it runs.
  • When you have a full Module, you can perform operations on it, like
  • Validate it, to check for any errors. It's helpful to validate during development and debugging, and we don't do it automatically (as it can take noticeable time), so except in stable production builds it's best to validate as soon as the module is complete, and before saving it or optimizing it, etc.
  • Print it for debugging purposes.
  • Write the module into a buffer. That binary data is the final WebAssembly code, that you can run in a browser.
  • Modules must be cleaned up when you are finished with them, using BinaryenModuleDispose. That frees the Module and everything on it (i.e. the Module is the only thing you need to worry about memory management for).

For a complete example of a compiler using the C API, look at the mir2wasm project, which compiles Rust into WebAssembly.

CFG API

As mentioned earlier, Binaryen's native IR is an AST, but it can also receive input in an arbitrary control flow graph, which is a very common case. Binaryen can then "reloop" that code into structured control flow. This works for any CFG, even irreducible ones.

To do so, use the CFG/Relooper part of the C API, as follows:

  • Create a Relooper instance with RelooperCreate().
  • Create basic blocks with RelooperAddBlock(). The contents of a basic block is a Binaryen expression! In other words, it can be anything in the WebAssembly AST. And that content is not touched when relooping, we just add necessary things around it (loops, ifs, etc.). The one thing you should avoid is control flow in the bodies of blocks - it's actually ok to have internal control flow if you want that, but if you branch outside, that can't work. (However, you probably don't want that anyhow since you're using the relooper to generate control flow for you!) In the simple case where you have just straightline code in your basic blocks, you can create an AST Block (using BinaryenBlock()) with the list of instructions on it.
  • Create branches between blocks, using RelooperAddBranch(). A branch goes from one block to another, if a condition occurs. The condition is a regular Binaryen expression, which should be an i32 (for example, you might create an expression that compares a local to a specific value, and the branch would be taken if that is true). The condition can also be NULL, and the branches out of a block should always contain exactly one such branch, which functions as the default: if no other condition is true, we take that route.
  • You can also put code on a branch. The code happens as the branch is taken, before we reach the target. This is useful for implementing phis, if your compiler has them.
  • Finally, after creating your blocks and branches, just call RelooperRenderAndDispose(). That will reloop the code and return a Binaryen expression which contains all those basic blocks and branches, properly structured. (This call will also dispose of the Relooper instance, and you don't need to worry about memory management.)

Note that the relooper output is not optimized by default: you will see redundant blocks and so forth. If you optimize your code (using BinaryenModuleOptimize()) then you should see nice control flow.

See binaryen-c.h and src/CFG/Relooper.h for more technical details, and the test suite for concrete examples. For a full compiler, see the mir2wasm project mentioned earlier, which uses the CFG API in addition to the C API.

Debugging C API usage using tracing

The C API (including the CFG API) have a tracing option that prints out C API commands for each command you issue. The result is a runnable program that does the same things you were doing when you ran the trace. This lets you easily generate a standalone testcase from your compiler, without a dependency on your compiler itself, it will just do the same Binaryen C API calls that you did.

See BinaryenSetAPITracing in binaryen-c.h for more details. There is also an example of this in test/example/c-api-kitchen-sink.c, and check.py checks that tracing outputs a proper program for that by both matching against the known correct output, and also building and running it.

Running the generated WebAssembly

The end result of compilation is a WebAssembly binary. You can run that in a browser, giving it its imports, and receiving and calling its exports. That's all there is to it.

However, you may also want to use the Emscripten compiler infrastructure. Emscripten lets you "link" with JavaScript libraries to do useful things, like render WebGL, run a browser main loop, handle a filesystem, provide bindings to JS so it's easy to call into your compiled code, etc., as well as a full set of compiled libraries like libc and so forth. To use that, you need to provide Emscripten with your WebAssembly file as well as a metadata file, and call emcc. TODO: There are a few minor details to be finished on the Emscripten side for this to just work, but this is basically what already happens with the LLVM WebAssembly backend and Emscripten; we just need to generalize it a little.

Current limitations

The Binaryen optimizer's output code quality is still improving as we speak. Some core optimizations are done (local coloring/coalescing, local sinking/treeification, etc.) but others not (GVN, LICM, etc.).

Help is welcome!

Clone this wiki locally