JavaScript Mutexes
Adding locks to our threads
In my previous post, I introduced shared memory to JavaScript workers. Sharing memory ended up introducing data races. We fixed them using atomics, but atomics can only get us so far. At some point, we’ll need more than just a simple add, subtract, load or exchange. We’ll need to lock for longer, more complex logic.
Mutexes (mutual exclusion locks) allow us to do just that. They’re one of the fundamental threading primitives in most languages. A mutex lets us “lock” on a memory location and then later unlock. And they’d let us synchronize entire blocks of code.
So, let’s just pull out a mutex and…
Um, JavaScript doesn’t have mutexes1. Instead, they have three little functions: Atomics.wait, Atomics.waitAsync, and Atomics.notify. The waitAsync function will give a promise2 so that it can be used in the main thread. The wait function just blocks, so it cannot be used in the main thread. For this article, I’m just going to focus on wait for simplicity. If you need to use waitAsync, you’ll just need to translate over to promises.
Okay, so we have functions to wait and notify on a memory address. But, those aren’t mutexes. So, how does this help us?
Well, it turns out that wait and notify form the foundation of a futex - which is an even lower-level primitive that we can use to make a mutex. A futex lets us wait on an address if the address stores a specific value. It also lets us change the value at that address, and then signal to one or more waiters that the value has changed and they should stop waiting. It’s basically a way to wait on a condition and control the signal that the condition changed.
And this was the last piece we needed. With futexes and atomic operations, we can build up any sort of synchronization primitive - whether it be mutexes, semaphores, condition variables, barriers, wait groups, or anything else. They end up being really powerful - but are also tricky to use.
To make matters worse, JavaScript doesn’t really have good multi-threaded debuggers, sanitizers, or analysis tools. The whole “threading in JavaScript” is really new, which means developing our own primitives inside JavaScript will be pretty tricky.
So, we won’t develop them in JavaScript. We’ll develop them in a language with really good tooling for making your own primitives that people normally give you, and then translate the result into JavaScript.
Out of all the languages to do it in, the one I found is the best at having those types of tools is C. Why? Because every operating system provides futexes, and a lot of developers in C make their own libraries (including “standard” libraries), and a lot of developers in C make tools to improve C, and there’s a lot more papers about making these types of primitives that assume either C or C-style C++. So, it’s our best shot at the initial design and validation prior to a translation.
That said, I’m not going to use a lot of the scary features in C. I’m not going to use malloc or free or macros. I won’t use the Win32 API (just the linux libraries) and I won’t use inline assembly. I am going to use pointers, but those are just memory addresses (basically the index into our shared memory buffer that something lives).
So, let’s get started.
Building our Mutex
For this mutex, I did a lot of looking into a lot of implementations people put online3. I did find some back and forth between the articles where some people were trying to show someone else’s implementation was slow or incorrect in a pathological edge case (like missing the lock some 2^64-1 times in a row). Eventually, I landed on a design with the following lock steps:
A single integer is stored at the mutex address
The mutex address is what is passed to the lock/unlock functions
The integer starts at 0 which signifies “unlocked”
When locking, a compare exchange is done to try to turn from 0 to 1 (locked)
If the compare exchange succeeds, then the lock is obtained
Otherwise, we enter a loop
In the loop, we do a compare exchange to set the value to 2 (contested)
We then wait on the integer’s memory address until it changes from 2 to something else
Then we try to acquire the lock and loop again if needed
For unlocking, I do the following:
Subtract 1 from the atomic value. If I end with 0, then no one was waiting and I return
Otherwise, someone was waiting, so I signal that I’m done and wake up 1 thread
The C code I ended up with looks like the following:
#include <linux/futex.h>
#include <sys/syscall.h>
#include <stdio.h>
#include <errno.h>
typedef int Mutex;
void futex_wait(int* addr, int waitWhen);
void futex_wake(int* addr, int amount);
int compare_exchange(int* addr, int* expected, int desired);
int fetch_sub(int* addr, int val);
void store(int* addr, int val);
// GCC, linux
void lock(Mutex* mux) {
int cur = 0;
// do a compare exchange (update if the value at the address matches our expected)
// cur is our "expected" value
// this always sets cur to what was at the address
// it returns true if the exchange happened
if (compare_exchange(mux, &cur, 1)) {
// we got the lock!
return
}
do {
// indicate contention
if (cur != 2) {
compare_exchange(mux, &cur, 2);
}
futex_wait(mux, 2);
cur = 0;
// if we had contention, assume there are other threads when we lock
} while (!compare_exchange(mux, &cur, 2);
// we got the lock
}
void unlock(Mutex* mux) {
if (fetch_sub(mux, 1) != 1) {
// always indicate we're unlocked
// contended threads will indicate that there is contention
store(mux, 0);
futex_wake(mux, 1); // only wake 1 thread
}
}
void futex_wait(int* addr, int waitWhen) {
// we put it in a loop since the OS may cause spurious wake ups
do {
if (syscall(SYS_futex, addr, FUTEX_WAIT, 2, 0, 0, 0) == -1) {
if (errno == EAGAIN) return; // value changed, no wait needed
perror(); abort(); // something went wrong!
}
} while (__atomic_load_n(addr, __ATOMIC_SEQ_CST) == waitWhen);
}
int compare_exchange() {
return __atomic_compare_exchange_n(
mux,
&cur,
1,
false,
__ATOMIC_SEQ_CST,
__ATOMIC_SEQ_CST
);
}
int fetch_sub(int* addr, int val) {
return __atomic_fetch_sub(ptr, val, __ATOMIC_SEQ_CST);
}
void store(int* addr, int val) {
__atomic_store_n(addr, val, __ATOMIC_SEQ_CST);
}
void futex_wake(int* addr, int amt) {
if (syscall(SYS_futex, addr, FUTEX_WAKE, amt, 0, 0, 0) != -1) {
perror(); abort(); // something went wrong
}
}That’s, a lot of code. But most of it is fairly straightforward. We have some wrapper methods to hide verbose function names or functions with large parameter lists. We’re using a lot of atomics (which are wrapped), and we’re doing a futex wait/wake (wake being the C equivalent of notify).
The main takeaway is that we do 0 for unlocked, 1 for locked, and 2 for contended. We then have logic split between the locker and unlocker to maintain that differentiation.
As far as how it works, I’ve ran the above code (with some modifications for namespacing and additional features like timeouts) thousands of times on multiple machines, both with and without thread sensitization turned on. That’s not to say it’s “perfect” or “bug-free” - only that it works well enough for me, so that’s what we’ll translate over.
Translating to JavaScript
Translating the C code isn’t all that difficult. Our pointers do become a “memory array + offset”, and we use the Atomics methods rather than syscalls. It’s fairly straightforward, though there are some differences in the C and JavaScript atomic APIs. That said, it’s not all that complicate.
function mutex(memory, offset) {
return {
lock: () => {
let expected = 0
let cur = Atomics.compareExchange(memory, offset, expected, 1)
if (cur === expected) {
// got the lock
return
}
while(true) {
if (cur !== 2) {
expected = cur
Atomics.compareExchange(memory, offset, expected, 2)
}
Atomics.wait(memory, offset, 2)
expected = 0
cur = Atomics.compareExchange(memory, offset, expected, 2)
if (cur === 0) {
// got the lock
return
}
}
},
unlock: () => {
if (Atomics.sub(memory, offset, 1) !== 1) {
Atomics.store(memory, offset, 0)
Atomics.notify(memory, offset, 1)
}
}
}
}The biggest difference now is that instead of passing only one memory address to our workers, we need to pass two: one for the working data, and another for the mutex location. Our worker code now looks like the following:
importScripts('mutex.js') // load our mutex code
let memory = new Int32Array(new SharedArrayBuffer(1024))
let offset = 0
let muxOffset = 1 // must be different than offset!
onmessage = (e) => {
if (e.data.type === 'init') {
memory = new Int32Array(e.data.memory) // use the memory buffer were given
offset = e.data.offset
muxOffset = e.data.muxOffset
// don't respond
}
else if (e.data.type === 'run') {
const mux = mutex(memory, muxOffset)
let lastRead = 0
for (let i = 0; i < 200; ++i) {
mux.lock()
try {
memory.set([memory.at(offset) + 1], offset)
// we can do more instructions now!
lastRead = memory.at(offset)
} finally {
// unlock (in finally so we always run it)
mux.unlock()
}
}
postMessage({final: lastRead})
}
}Our worker has been updated! Time to update our runner so we pass both offsets.
/// our config
const memory = new SharedArrayBuffer(1024)
const arr = new Int32Array(memory)
const muxOffset = 1
const offset = 2
const numRunners = 4
const runners = []
for (let i = 0; i < numRunners; ++i) {
runners.push(run(memory, muxOffset, offset))
}
await Promise.all(runners)
console.log("Memory data: ", arr.at(offset))
function run(memory, muxOffset, offset) {
const worker = new Worker('mutex-worker.js')
const wait = new Promise((resolve) => {
worker.onmessage = (e) => {
resolve(e.data)
}
})
worker.postMessage({type: 'init', offset, muxOffset, memory})
return wait
}If we run that code, we see our expected final result of 800!
Of course, there is more to multi-threading. However, this should be enough to get you started. You’ve seen how to synchronize, and how to take C code and convert it over to JavaScript. With this, you should be able to turn my other C threading primitives (semaphores, condition variables, barriers) over to JavaScript.
When I next post about JavaScript threads, I’ll talk about some of the other edge cases you’ll hit (deadlocks, starvation, and logic race conditions).
They do have the Web Locks API - but that’s more for Indexed DB and Local Storage locks. It’s based around string names rather than addresses in a shared array buffer. Not saying you can’t use it - ever hammer can be used not-as-intended. Just that it’s outside the topic of this series as we’re focused on the APIs intended for workers and shared memory - and those APIs are the Atomics interface added with everything else.
It actually gives an object with two fields, a boolean indicating if there is a promise and a promise. This is due to wait only waiting conditionally, more on that later.
Futexes are Tricky; Basics of futexes; Mutexes and Condition Variables using Futexes; Landing a new syscall: What is futex? and so many more that I lost the links to

