An overview of the concepts in async / await in Rust

16 March 2021 · 1009 words

Async / await in Rust promises to simplify concurrent code, and to allow a large number of concurrent tasks to be scheduled at the same time, with less overhead than the same number of OS threads would require.

In general, async / await lets you write code that avoids "callback hell", in favor of a linear style similar to blocking code while still letting other tasks progress during awaits.

Any minimal async / await code in Rust is made up of at least the following pieces:

An executor
A Future with an internal state machine

But before we dig into those, let's see what operating system threads are all about.

Operating System Threads

Rust has a really good concurrency story. For me it's probably the feature I like the most. Rust prevents you from making a lot of mistakes, mistakes that are easy to make in most other imperative programming languages.

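Each new thread allocates a stack. Its size is OS-defined for the main thread, while threads spawned through Rust's standard library currently default to 2 MiB. On Linux, and other operating systems with virtual memory, the program won't actually use up physical memory even though the full stack is "allocated" to the thread when it starts; pages are only backed by physical memory once they are touched. That way you can start thousands of threads without trouble.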
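To make the stack reservation concrete, here is a minimal sketch that spawns a batch of OS threads with an explicitly requested stack size through std::thread::Builder. The function name sum_on_threads, the thread count and the 256 KiB figure are my own choices for illustration, not recommendations.

```rust
use std::thread;

/// Spawns `count` OS threads, each with an explicitly requested stack size,
/// and returns the sum of the values the threads hand back.
fn sum_on_threads(count: u64, stack_bytes: usize) -> u64 {
    let handles: Vec<thread::JoinHandle<u64>> = (0..count)
        .map(|i| {
            thread::Builder::new()
                .stack_size(stack_bytes) // address space requested for this thread's stack
                .spawn(move || i * 2)
                .expect("the OS refused to create another thread")
        })
        .collect();

    // Joining waits for each thread and collects its return value.
    handles
        .into_iter()
        .map(|handle| handle.join().expect("a worker thread panicked"))
        .sum()
}

fn main() {
    // 200 threads with 256 KiB of stack each (both numbers are arbitrary).
    let total = sum_on_threads(200, 256 * 1024);
    println!("sum computed across 200 threads: {total}");
}
```

The platform may round the requested stack size up to its own minimum, so treat the number as a request rather than a guarantee.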

The problems start when the OS needs to context switch between many threads: each switch goes through the kernel, and a lot of CPU time is spent on signalling between the kernel and the program rather than on useful work.

Enter async / await.

Async / await

Since async / await tasks don't require OS signalling to context switch, the theory is that you can have a lot more of them, and they can make your program execute faster than a program with the same number of operating system threads.

The two most important parts of the async / await story are the executor, and the state machine wrapped in a future.

A future is a computation that will finish and yield a result at some later, unknown time. Futures don't have to be executed in an async / await context; they are a tool that is available even in the absence of async / await.

An executor is the thing that takes the state machine futures and drives them to completion.
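To show what "driving to completion" means, here is a minimal single-future executor built only from the standard library. It is a sketch and not how tokio works internally: block_on and ThreadWaker are names I picked, and a real executor also schedules many tasks and integrates with I/O.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Waker that unparks the thread that is blocked inside `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Polls `fut` on the current thread until it is ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    // Futures must be pinned before they can be polled.
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);

    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            // Nothing to do until someone calls `wake`, so sleep.
            Poll::Pending => thread::park(),
        }
    }
}

fn main() {
    let answer = block_on(async { 1 + 2 });
    println!("the future resolved to {answer}");
}
```

The loop is the whole executor: poll, and if the future is not ready, sleep until its waker says it is worth polling again.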

Executors

Rust purposefully doesn't include an async / await executor in the standard library. There might be executors that are good for small embedded devices, and other executors that are good for large servers.

For the sake of argument, I will pick the very popular tokio executor and describe how it works at a high level.

Tokio in multi-threaded mode starts, by default, as many worker threads as there are logical cores on the given machine. On my computer that would be 8 logical cores, which translates to 8 threads.
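If you are curious what that number is on your own machine, the standard library can report it. This sketch only shows the standard library's view of the core count; logical_cores is a hypothetical helper name, and I am not claiming this is the exact code path tokio uses to size its worker pool.

```rust
use std::thread;

/// Number of logical cores visible to this process, falling back to 1
/// when the platform cannot report it.
fn logical_cores() -> usize {
    thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1)
}

fn main() {
    println!("logical cores: {}", logical_cores());
}
```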

The "low" number of threads means that if there are 8 currently running tasks, one in each thread, and all of them are blocking (e.g. stupidly calling std::thread::sleep), then it would not be possible for any other tasks to make progress and my hypothetical web server would be frozen.

That's why Alice Ryhl, one of the maintainers of tokio, suggests as a rule of thumb that a task should not execute for more than 100 microseconds between two .awaits.

If a task needs to run for a longer time, then it should be spawned using e.g. tokio::task::spawn_blocking, which runs the closure on a thread from tokio's separate pool for blocking work, or you can just run the code on an OS thread directly using std::thread::spawn.

Futures and state machines

An async function looks something like this:

async fn my_task() -> Option<String> {
    // more code
    None // placeholder result so the snippet compiles
}

async fn is just sugar for transforming the return type into a Future. The concrete type behind it is anonymous: it is generated by the compiler and can't be named by anyone else.

The resulting high level code transform (ignoring the anonymous type) would be something like:

use std::future::Future;

fn my_task() -> impl Future<Output = Option<String>> {
    // A state machine that when driven by the executor
    // finally yields an Option<String>.
    async { None } // placeholder body so the sketch compiles
}

When compiling an async function, the Rust compiler will build a state machine of the states the function can be in during its asynchronous life cycle. The two obvious states are Initialized and Complete, but between those any number of states for different code branches can be generated.
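To get a feel for such a state machine, here is a hand-written stand-in for a hypothetical async fn greet(name: String) -> String with a single await point. The enum, its state names and the helper trace_polls are my own illustration; the code the compiler really generates is more involved and not something you can see or name.

```rust
use std::future::Future;
use std::mem;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// One variant per state the hypothetical async function can be in.
enum Greet {
    Initialized { name: String },
    Suspended { name: String }, // parked at the single await point
    Complete,
}

impl Future for Greet {
    type Output = String;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<String> {
        // Take the current state out, leaving `Complete` behind for now.
        match mem::replace(&mut *self, Greet::Complete) {
            Greet::Initialized { name } => {
                // Pretend the awaited operation is not ready on the first poll.
                *self = Greet::Suspended { name };
                cx.waker().wake_by_ref(); // ask to be polled again
                Poll::Pending
            }
            Greet::Suspended { name } => Poll::Ready(format!("hello, {name}")),
            Greet::Complete => panic!("future polled after completion"),
        }
    }
}

/// Waker that does nothing; enough for polling in a plain loop.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls `task` in a loop and records what each poll returned.
fn trace_polls(mut task: Greet) -> Vec<String> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut trace = Vec::new();
    loop {
        match Pin::new(&mut task).poll(&mut cx) {
            Poll::Pending => trace.push(String::from("pending")),
            Poll::Ready(greeting) => {
                trace.push(greeting);
                return trace;
            }
        }
    }
}

fn main() {
    let task = Greet::Initialized { name: String::from("ferris") };
    println!("{:?}", trace_polls(task));
}
```

Note how the local variable name has to live inside the enum variants: that is the "stack" of the function, stored in the future itself.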

As soon as you .await the future returned by the async function, or enqueue it as a new task using the regular non-blocking tokio::spawn function, the executor will start moving the state machine forward. In Rust's async / await model, this is called polling.

Important note about Futures in Rust

Unlike many other programming languages that have types named Future, Rust does NOT start executing a Future automatically. The Future must be driven forward by something. The most common "something" in async / await is the Executor. The easiest way to start driving a future is to .await it from other async code: just do my_task().await.
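The laziness is easy to observe. In this sketch, my_task takes a flag (a parameter I added for the demonstration) and sets it as its first step; observe_laziness and NoopWake are hypothetical helpers that poll the future by hand, standing in for an executor.

```rust
use std::cell::Cell;
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; enough for a single manual poll.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Sets the flag as soon as its body actually starts running.
async fn my_task(started: &Cell<bool>) -> Option<String> {
    started.set(true);
    Some(String::from("finished"))
}

/// Returns (flag before the first poll, flag after it, the task's output).
fn observe_laziness() -> (bool, bool, Option<String>) {
    let started = Cell::new(false);

    // Calling the async fn only builds the future; the body has not run yet.
    let mut fut = Box::pin(my_task(&started));
    let before = started.get();

    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let output = match fut.as_mut().poll(&mut cx) {
        Poll::Ready(value) => value,
        // Not expected here: this task has no await points.
        Poll::Pending => None,
    };

    (before, started.get(), output)
}

fn main() {
    let (before, after, output) = observe_laziness();
    println!("started before poll: {before}, after poll: {after}, output: {output:?}");
}
```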

What about the stack?

Regular functions require a stack. In async / await, the Rust compiler will generate a state machine that is capable of storing the correct amount of data by encoding every possible combination of stack allocations in the form of an enum. Interestingly, this includes the entire call graph of the async function. That is why, just like recursive types are currently not possible without some hand waving in Rust, a recursive async fn is not possible either, unless you Box the recursive call. Boxing the call places its state on the heap, in essence turning it into its own private little "stack" segment allocation.
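Here is a small sketch of the boxing trick. Instead of an async fn, the function returns a boxed future explicitly, so every level of recursion gets its own heap allocation. sum_to, poll_once and NoopWake are names I made up for the example, and the single manual poll stands in for an executor.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Recursive async computation of 0 + 1 + ... + n.
/// Returning a boxed future gives the recursive call a known, fixed size.
fn sum_to(n: u64) -> Pin<Box<dyn Future<Output = u64>>> {
    Box::pin(async move {
        if n == 0 {
            0
        } else {
            n + sum_to(n - 1).await
        }
    })
}

/// Waker that does nothing; enough for a single manual poll.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future once; enough here because `sum_to` never actually waits.
fn poll_once<T>(mut fut: Pin<Box<dyn Future<Output = T>>>) -> Option<T> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(value) => Some(value),
        Poll::Pending => None,
    }
}

fn main() {
    println!("{:?}", poll_once(sum_to(4)));
}
```

Without the Box, the future for sum_to would have to contain the future for sum_to, which would have to contain yet another one, and the compiler could never compute its size.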

When to use async / await

Do use async / await when

You have to support many concurrent tasks.
You have tasks that call out to non-blocking async functions, e.g. HTTP calls.
You have an interactive system that gets continuous input while serving output.
You can "guarantee" that your code executes for at most 100 microseconds between .awaits.

Do NOT use async / await when

You need to compute something complex / blocking as fast as possible.
You only have a few well-known tasks that are independent.

Alternatives

Unless you fulfill the criteria for async / await, use something else:

Raw operating system threads via std::thread::spawn.
A data-parallelism library like rayon.
I'm sure there are more options out there.