/var/blog

by Marshall Pierce

Category: Software

Rust and WebAssembly With Turtle

In this post, I'll walk through a few of the highlights of getting Turtle, a Rust library for creating animated drawings, to run in the browser with WebAssembly.

Rust has recently gained built-in support for compiling to WebAssembly. While it's currently considered experimental and has some rough edges, it's not too early to start seeing what it can do with some demos. After an experiment running rust-base64 via WebAssembly with surprisingly good performance, I saw this tweet suggesting that someone port Turtle to run in the browser, and figured I'd give it a try (see PR). Turtle is a more interesting port than rust-base64's low-level functionality: it's larger, and it depends on concepts that have no clear web analog, like forking processes and desktop windows. You can run it yourself with any recent browser; just follow these instructions.

Why Rust + WebAssembly?

See the official WebAssembly site for more detail, but in essence WebAssembly is an escape hatch from the misery of JavaScript as the sole language runtime available in browsers. Even though it's already possible to transpile to JavaScript with tools like emscripten or languages like TypeScript, that doesn't solve problems like JavaScript's lack of proper numeric types. WebAssembly provides a much better target for web-deployed code than JavaScript: it's compact, fast to parse, and JIT compiler friendly.

WebAssembly is not direct competition for JavaScript; it's more akin to JVM bytecode: a compact, low-level compilation target. That makes it a great fit for Rust, which has a minimal runtime and no garbage collector.

Turtle

Turtle is a library for making little single-purpose applications that draw graphics à la the Logo language. See below for a simple program (the snowflake example in the Turtle repo) that's most of the way through its drawing logic.

Turtle's existing architecture

Turtle is designed to be friendly to beginner programmers, so unlike most graphics-related code, it's not based around the concept of an explicit render loop. Instead, Turtle programs simply take over the main thread and run straightforward-looking code like this (which draws a circle):

for _ in 0..360 {
    // Move forward three steps
    turtle.forward(3.0);
    // Rotate to the right (clockwise) by 1 degree
    turtle.right(1.0);
}

The small pauses needed for animation are implicit, and programmers can use local variables in a natural way to manage state. This has some implications for running in the browser, as we'll see.

Compiling to wasm32

As with any wasm project, the first step is to get the toolchain:

rustup update
rustup target add wasm32-unknown-unknown --toolchain nightly

Hypothetically, this would be all that's needed to compile the example turtle programs:

cargo +nightly build --target=wasm32-unknown-unknown --examples

However, there were naturally some dependencies that didn't play nice with the wasm32 target, mainly involving Piston, the graphics library used internally. Fortunately, Piston's various crates are nicely broken down to separate the platform-specific parts from platform-agnostic things like math helper functions, so introducing a canvas feature to control Piston dependencies (among other things) was all that was needed. That brings us to:

cargo +nightly build --no-default-features --features=canvas --target=wasm32-unknown-unknown --release --examples
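
To give a rough idea of what a feature like canvas does in practice, here's an illustrative sketch of feature-gated backends. The module and function names are hypothetical, not Turtle's actual internals; the point is just that cfg attributes keep the desktop-only dependencies out of the wasm32 build.

#[cfg(not(feature = "canvas"))]
mod desktop_backend {
    // Default build: this is where piston_window and other desktop-only
    // crates would be pulled in.
    pub fn init_renderer() {
        // set up a desktop window here
    }
}

#[cfg(feature = "canvas")]
mod canvas_backend {
    // canvas build: only platform-agnostic Piston pieces plus a
    // pixel-buffer renderer are compiled in.
    pub fn init_renderer() {
        // no desktop window; drawing goes into an RGBA buffer instead
    }
}

#[cfg(not(feature = "canvas"))]
pub use self::desktop_backend::init_renderer;
#[cfg(feature = "canvas")]
pub use self::canvas_backend::init_renderer;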

Control flow

The current architecture of a Turtle program running on a desktop uses two processes: one to run the user's logic, and another to handle rendering. When the program starts, it first forks a render process, then passes control flow to the user's logic in the parent process. When code like turtle.forward() runs, that ends up passing drawing commands (and other communication) via stdin/stdout to the child rendering process, which uses those to draw into a Piston window. The two-process model allows graphics rendering to be done by the main thread (which is required on macOS) while still allowing users to simply write a main() function and not worry about how to run their code in another thread.
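
To make that concrete, here's a hypothetical sketch of sending a drawing command from the parent process to the child renderer. The command type and JSON layout are illustrative, not Turtle's actual protocol.

use std::io::Write;
use std::process::ChildStdin;

// Hypothetical command type; the real protocol has many more variants.
#[derive(serde::Serialize)]
enum DrawingCommand {
    MoveForward { distance: f64 },
    Rotate { degrees: f64 },
}

fn send_command(renderer_stdin: &mut ChildStdin, cmd: &DrawingCommand) -> std::io::Result<()> {
    // One JSON document per line keeps the stream easy to parse on the other end.
    let json = serde_json::to_string(cmd).expect("drawing commands always serialize");
    writeln!(renderer_stdin, "{}", json)
}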

Turtle code doesn't let go of control flow until the drawing has completed. This isn't a problem when Turtle code is running in one process and feeding commands to another process to be rendered to the screen, but browser-resident animation really wants code that runs a little bit at a time in an event loop via requestAnimationFrame(). The browser doesn't update the page until control flow for your code has exited. This has some upsides for browsers (no concerns about thread safety when all code is time-sliced into one thread, for instance), but in Turtle's case it means that running in the main thread would update the screen exactly once: when the drawing was complete.

We do have one trick up our sleeve, though: Web Workers. These allow simple parallelism, but there is no good way to share memory between a worker and the main thread. (Aha, you say! What about SharedArrayBuffer? Nope, it's been disabled due to Spectre.) What we can do is send messages to the main thread with postMessage(). This way, we can create a worker for Turtle, let it take over that thread entirely, and have it periodically post messages to the main thread when it wants to update the UI.

Drawing pixels

Let's consider a very simple Turtle program where the user wishes to draw a line. This boils down to turtle.forward(10) or equivalent. However, this isn't going to get all 10 pixels drawn at once, because the turtle has a speed. So, it will calculate how long it should take to draw 10 pixels, and draw intermediate versions as time goes on. The same applies when drawing a polygon with a fill: the fill is partially applied at each point in the animation as the edges of the polygon are drawn. As parts of the drawing accumulate while the animation continues, the Renderer keeps track of all of them and redraws them with its Piston Graphics backend every frame.
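
As a rough sketch of the idea (a hypothetical helper, not Turtle's actual animation code), the intermediate endpoint for a forward() call can be found by interpolating on elapsed time against the total duration implied by the turtle's speed:

use std::time::Duration;

// Where should the line end for this frame, given how far through the
// animation we are?
fn intermediate_endpoint(
    start: (f64, f64),
    end: (f64, f64),
    elapsed: Duration,
    total: Duration,
) -> (f64, f64) {
    // Clamp to 1.0 so the line never overshoots its final position.
    let t = (elapsed.as_secs_f64() / total.as_secs_f64()).min(1.0);
    (
        start.0 + (end.0 - start.0) * t,
        start.1 + (end.1 - start.1) * t,
    )
}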

Naturally, the existing graphics stack, which uses piston_window to draw into a desktop app's window, isn't just going to seamlessly start running in the browser and render into a <canvas>. So, where should the line be drawn around what is desktop-specific (and will therefore have to be reimplemented for the web)?

One option would be to reimplement the high-level turtle interface (forward(), etc.) in terms of Canvas drawing calls like beginPath(). While it certainly can be done, it would require reimplementing a lot of code that animates, manages partially-completed polygons, draws the turtle itself, etc. The messages sent to the main thread would likely be a JS array of closures which, when executed in the appropriate context, would update the <canvas> in the web page. Alternatively, the canvas commands to run could be expressed as Objects with various fields. This saves us from worrying about low-level pixel wrangling, but at the cost of creating two parallel implementations of some nontrivial logic, one of which is quite difficult to test or reason about in isolation. It also means that as the drawing gets more complex, the list of commands to run grows as well, so performance decays over time. Finally, it involves many small allocations (and consequently more load on the JS GC).

Another approach would be to implement the Piston Graphics trait in terms of the Canvas API. This entails a lot less code duplication because things like animation, drawing the turtle "head" triangle, etc. can all be reused, but it still suffers from the same performance decay over time, and it requires a fair amount of logic to be built in JS, which I was trying to avoid. It's definitely much more practical than the previous option, though.

The approach I chose was to implement Graphics to write to an RGBA pixel buffer, which is then easy to display with a <canvas>. This needs a minimal amount of JS, and has a heavy, but consistent, allocation and message-passing cost. Each frame requires one (large) allocation to copy the entire pixel buffer, which is then sent to the main thread and used to update the <canvas>. Obviously there are various ways to optimize this (only sending changed sub-sequences of pixels, etc.), but for a first step, sending the entire buffer was functional and reliable. For Turtle's needs, Graphics only has to draw triangles with a solid fill, which is pretty straightforward.
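
The buffer itself is just a flat slice of bytes, four per pixel, so writing a solid color boils down to index arithmetic. A minimal sketch (a hypothetical helper, not the actual Graphics implementation):

// Write one RGBA pixel into a row-major buffer that is `width` pixels wide.
fn set_pixel(buffer: &mut [u8], width: usize, x: usize, y: usize, rgba: [u8; 4]) {
    // Each pixel occupies 4 consecutive bytes: R, G, B, A.
    let offset = (y * width + x) * 4;
    buffer[offset..offset + 4].copy_from_slice(&rgba);
}

Filling a triangle is then a matter of deciding which (x, y) pairs fall inside it and calling something like this for each one.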

Passing control flow to Turtle

Getting from main() to Turtle logic is actually a little complicated even when running on a desktop, as mentioned above. It is undesirable to have users deal with running turtle code in a different thread, so instead, while Turtle is initializing, it does its best to fork at the right time. If users do nontrivial initialization of their own (e.g. parsing command line arguments), they have to be careful to let Turtle fork at the right point.

When running as wasm inside a Worker, initialization is convoluted in a different way. When writing a main() function for a normal OS, all the context you might want is available via environment variables, command line parameters, or of course making syscalls, but there's nothing like that for wasm unless you build it yourself. While you can declare a start function for a wasm module, that doesn't solve our problem since this wasm code depends on knowledge from its environment, like how many pixels are in the canvas it will be drawing to.

While I could have doubled down on "call Turtle::new() at the right point and hope that it will figure it out" by having some wasm-only logic that called into the JS runtime to figure out what it needed, a small macro seemed like a tidier solution. The macro uses conditional compilation to do the right thing when it's compiled for the desktop or for wasm, with straightforward, global-free code in each case.

Before:

fn main() {
    // Must start by calling Turtle::new()
    // Secretly forks a child process and takes over stdout
    let mut turtle = Turtle::new();
    // write your drawing code here
}

After:

run_turtle!(|mut turtle| {
    // write your drawing code here
});

It's harder for users to screw up, and it opens the way to putting the Turtle logic in its own thread (instead of forking) when running on a desktop. This would allow users to use stdout on the desktop (currently claimed for communication with the child process), and the underpinnings could simply use shared memory instead of doing message passing via JSON over stdin/stdout. And, of course, it lets us get the right wasm-only initialization code in there too!
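
To give a flavor of the shape of this (a simplified sketch, not the macro as it appears in the PR), the conditional compilation might look roughly like the following, with the wasm branch emitting the entry point shown just below:

#[macro_export]
macro_rules! run_turtle {
    ($f:expr) => {
        #[cfg(not(feature = "canvas"))]
        fn main() {
            // Desktop: Turtle::new() still does its process setup, but the
            // macro controls exactly when that happens.
            let turtle = $crate::Turtle::new();
            ($f)(turtle);
        }

        // With the canvas feature enabled, the macro would instead emit the
        // wasm entry point (web_turtle_start, shown below), which receives
        // the pixel buffer pointer and the canvas dimensions.
    };
}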

Anyway, with that in place, initializing the Worker that hosts the wasm is simple enough (see worker.js). Following Geoffroy Couprie's canvas example, we simply allocate a buffer with 4 bytes (RGBA) per pixel and call this function provided by the run_turtle! macro to kick things off:

#[allow(dead_code)]
#[cfg(feature = "canvas")]
#[no_mangle]
pub extern "C" fn web_turtle_start(pointer: *mut u8, width: usize, height: usize) {
    turtle::start_web(pointer, width, height, $f);
}

When a new frame has been rendered, the Rust logic calls into JS, which copies the contents of the pixel buffer and sends it to the main thread, which simply drops it into the <canvas>.

const pixelArray = new Uint8ClampedArray(wasmTurtle.memory.buffer, pointer, byteSize);
const copy = Uint8ClampedArray.from(pixelArray);

postMessage({
    'type':   'updateCanvas',
    'pixels': copy
});
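
On the Rust side, reaching that JS is just another extern declaration, much like the PRNG hook in the next section. A hypothetical sketch (the imported function's name here is illustrative, not necessarily what the PR uses):

extern "C" {
    // Provided by the JS environment when the wasm module is instantiated;
    // it signals that the pixel buffer now holds a fresh frame.
    fn frame_rendered();
}

fn notify_frame_done() {
    // Calls across the wasm/JS boundary are unsafe from Rust's point of view.
    unsafe { frame_rendered() }
}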

Random number generation

When Rust is running on a normal OS, it uses /dev/random or equivalent internally to seed a PRNG. However, WebAssembly isn't an OS, so it doesn't offer a source of random numbers (or files, for that matter, in the same way that x86 asm doesn't have files). In practice, this means that Turtle's use of rand::thread_rng() dies with an "unreachable code" error.

Fortunately, implementing our own Rng is pretty straightforward: the only required method is next_u32(), and that much we can do with JavaScript and the browser's PRNG (32-bit ints are pretty sane in JS; major int weirdness starts at 2^53). First, we'll need a JS function to make a random number between 0 and u32::max_value(), using the normal random range idiom:

web_prng: () => {
    const min = 0;
    const max = 4294967295; // u32 max

    return Math.floor(Math.random() * (max - min + 1)) + min;
}

By making that function available to the wasm module when it's initialized, we can then declare it and call it from Rust:

extern "C" {
    fn web_prng() -> u32;
}

And then use it in our Rng implementation:

pub struct WebRng;

impl ::rand::Rng for WebRng {
    fn next_u32(&mut self) -> u32 {
        unsafe {
            web_prng()
        }
    }
}

All that remained was to adjust how Turtle exposed random numbers so that it would use the appropriate Rng implementation on a desktop OS vs on wasm.
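
A simplified sketch of what that selection could look like (the function name and exact shape are illustrative, not the code from the PR):

#[cfg(not(feature = "canvas"))]
pub fn turtle_rng() -> impl ::rand::Rng {
    // Desktop: the OS-seeded thread-local generator is fine.
    ::rand::thread_rng()
}

#[cfg(feature = "canvas")]
pub fn turtle_rng() -> impl ::rand::Rng {
    // wasm: fall back to the browser-backed implementation above.
    WebRng
}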

Conclusion

Hopefully this has demonstrated that getting Rust code running in the browser via wasm is pretty achievable even for projects that aren't a drop-in fit for the browser's runtime model.

There were other problems to solve (like measuring time, or being able to println for debugging), but this post is already pretty long, so I'll refer you to the pull request if you want all the details.

Developers Debate Unimportant Things

The other day I read a disappointing screed by a determined developer. The specific text is unimportant, but I'm sure you've read plenty just like it: claims about technology X that the author doesn't like, while simultaneously promoting technology Y that they do like as a technological panacea. Now you know why the specific article is irrelevant -- it could be Sass vs Less, or CoffeeScript vs Dart, or Maven vs Gradle, or K&R vs BSD curly brace placement, and any of those hypothetical articles could be extracted from another with find and replace.

Consider the case of a hypothetical team with a background in technology X that starts a new project in (or migrates to) technology Y. The new work goes well, and the team is enthused about Y. The question is, though, how important is Y specifically to the success of the project? Of course, there are cases where technology choice can be a huge factor. Trying to write firmware for an embedded device with an 8K ROM in C++ template metaprogramming (it's Turing-complete, so why not?) is probably not going to go well. I'll set aside these hyperbolic cases, though, as they're rare and easily avoided with a little googling. Let's go back to our hypothetical team and consider the contributions of the following aspects of their successful project:

  • Choosing Y instead of sticking with X or choosing another option Z
  • The architectural cleanliness that comes from an at least partially blank slate
  • Having team members who care about improvement
  • Having team members who know that improvement is possible
  • Engineering leadership's trust in the team's decisions on what to do
  • The organizational health needed to work around a departure from the business-as-usual schedule
  • The corporate political bonhomie needed to allow the engineering organization to innovate rather than take the safe road of continuing with what they already have

I assert that in the common case, the success or failure of a project generally has little to do with the specific technology used. In one way, this is not a radical assertion: the importance of organizational structure, team culture, and other such intangibles has been known for decades. (This is what management is all about, after all.) My point is that even though technology usually isn't the important part, developers argue as if it is. Developers are generally detail oriented, and many (though not all) are also equipped with a surfeit of opinions. Consider topics like compile time vs runtime type checking, Java vs Scala, tabs vs spaces... We can quantify these topics, so we can bring facts (or at least aesthetics) to bear as we bicker over relative merits. My own experience is, of course, only anecdotal evidence, but across the projects I've been involved in both as a regular employee and as a consultant, the choice of technology hasn't been nearly as important to overall productivity as the team's engineering maturity, organizational health, etc. In conversation with fellow technologists, I've found I'm not the only one.

It's understandable that we might tend to attach too much importance to the things that we know well and have some control over, so where do we go from here? I definitely don't want to write off the value of a good ol' my-type-inference-is-better-than-your-type-inference debate; we all learn a lot from the exchange of ideas. Instead, I think that when we debate such things we have to keep in mind that it's only a small part of success, and that ultimately the specific technology probably doesn't matter much. If you have an opportunity to adopt your technology-du-jour because your team has the political will and organizational freedom to do so, you will have necessarily already succeeded at the hard part: being part of an organization that allows for success in the first place. On the other hand, if you're struggling to get momentum adopting a technology that you're confident will help your project, consider that the tech isn't your problem: it's that you're the only one clamoring for improvement.