Multi Threading with WebAssembly

Published: Sat May 23 2020

In this post I will show an example of how to use threading with WebAssembly.


JavaScript is a single-threaded language, but modern browsers support parallel execution through Web Workers. In WASM applications we can take advantage of this by running each C++ thread in its own web worker. The resulting experience is very similar to regular C++ threading. In this post I will show how to build a spreadsheet application that calculates the sum of each column in a separate thread.

[Screenshot of the demo application]

Clicking the button below the grid calls a C++ WASM function that launches 4 threads, one per column. Once the sum of each column has been calculated, the sum of the sums is returned to the JavaScript application.

Let’s take a look at the code on both the JavaScript side and the C++ side in the following sections.


In JavaScript (TypeScript) I have defined the following object model for the spreadsheet.

    export class Spreadsheet {
      rows: Array<Row>;
    }

    export class Row {
      cells: Array<Cell>;

      constructor(public rowIndex: number, public columnCount: number) {
        this.cells = [];
        // Create one empty cell per column in this row.
        for (let j = 0; j < this.columnCount; j++) {
          this.cells.push(new Cell(j, this.rowIndex));
        }
      }
    }

    export class Cell {
      cellValue: string = null;

      constructor(public columnIndex: number, public rowIndex: number) { }
    }

This data model is populated from user input in the spreadsheet, but this is unimportant for the WebAssembly side of things.

To calculate the column sums, I have to pass an instance of this model to C++. Since this is a complex JavaScript object graph, we need a way to represent it in C++. In my example I am using MessagePack to serialize the object into a binary format that can be read on the C++ side. Check out my article about MessagePack and WASM for more details.


On the C++ side I have created structs that act as counterparts to the JavaScript objects. See the definitions below:

    #include <msgpack.hpp>
    #include <string>
    #include <vector>

    // MSGPACK_DEFINE_MAP (de)serializes each struct as a map keyed by member name.
    struct Cell {
        MSGPACK_DEFINE_MAP(cellValue);
        std::string cellValue;
    };

    struct Row {
        MSGPACK_DEFINE_MAP(cells);
        std::vector<Cell> cells;
    };

    struct Spreadsheet {
        MSGPACK_DEFINE_MAP(rows);
        std::vector<Row> rows;
    };

    // Holds the input and the result for one worker thread.
    class SpreadsheetCalculations {
    private:
        Spreadsheet spreadsheet;  // each instance keeps its own copy

    public:
        long sum = 0;
        int columnIndex;

        SpreadsheetCalculations(Spreadsheet spreadsheet, int columnIndex) {
            this->spreadsheet = spreadsheet;
            this->columnIndex = columnIndex;
        }

        // Adds up the values in columnIndex across all rows.
        void calculate_column_sum() {
            for (std::size_t rowIndex = 0; rowIndex < spreadsheet.rows.size(); rowIndex++) {
                sum += std::stol(spreadsheet.rows[rowIndex].cells[columnIndex].cellValue);
            }
        }
    };

Please note, I am in no way a C++ expert, so I am always happy to receive suggestions for improvements here.
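One possible refinement concerns the parsing step. std::stol throws std::invalid_argument when a cell holds no number and std::out_of_range when the value does not fit in a long, and as far as I know Emscripten does not catch C++ exceptions unless you opt in at build time. The following is a minimal sketch of a non-throwing alternative. The helper name parse_cell_or_zero is hypothetical and not part of the demo, and treating blank or invalid cells as 0 is my assumption, not the demo's behavior.

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>

// Hypothetical helper (not in the original demo): parses a cell as a long
// and returns 0 for blank, non-numeric, or out-of-range text.
long parse_cell_or_zero(const std::string& text) {
    const char* begin = text.c_str();
    char* end = nullptr;
    errno = 0;
    long value = std::strtol(begin, &end, 10);

    const bool noDigits = (end == begin);       // nothing could be parsed
    const bool trailingJunk = (*end != '\0');   // e.g. "12abc"
    const bool outOfRange = (errno == ERANGE);  // value does not fit in a long

    if (noDigits || trailingJunk || outOfRange) {
        return 0;
    }
    return value;
}
```

With a helper like this, calculate_column_sum could call parse_cell_or_zero(cellValue) instead of std::stol(cellValue), so a single empty cell would not abort the whole calculation.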

Next let’s look at the rest of the C++ application:

    #include <pthread.h>
    #include <cstdio>
    #include <cstring>

    // Thread entry point: runs the column sum for one SpreadsheetCalculations.
    void *thread_func(void *data_struct) {
        SpreadsheetCalculations *data = static_cast<SpreadsheetCalculations *>(data_struct);
        data->calculate_column_sum();
        return data_struct;
    }

    pthread_t create_thread(SpreadsheetCalculations *spreadsheet) {
        pthread_t t;
        // pthread functions return the error code instead of setting errno.
        int err = pthread_create(&t, NULL, thread_func, spreadsheet);
        if (err) {
            fprintf(stderr, "Thread failed: %s\n", strerror(err));
            return -1;
        }
        return t;
    }

    bool wait_for_thread(pthread_t thread) {
        int err = pthread_join(thread, NULL);
        if (err) {
            fprintf(stderr, "Thread join failed: %s\n", strerror(err));
            return false;
        }
        return true;
    }

    extern "C" long calculate_sums(char *expr, int bufferSize) {
        // Deserialize the MessagePack buffer passed in from JavaScript.
        msgpack::object_handle oh = msgpack::unpack(expr, bufferSize);
        msgpack::object obj = oh.get();
        Spreadsheet spreadsheet;
        obj.convert(spreadsheet);

        // One calculation object and one thread per column.
        SpreadsheetCalculations *data1 = new SpreadsheetCalculations(spreadsheet, 0);
        pthread_t t1 = create_thread(data1);
        SpreadsheetCalculations *data2 = new SpreadsheetCalculations(spreadsheet, 1);
        pthread_t t2 = create_thread(data2);
        SpreadsheetCalculations *data3 = new SpreadsheetCalculations(spreadsheet, 2);
        pthread_t t3 = create_thread(data3);
        SpreadsheetCalculations *data4 = new SpreadsheetCalculations(spreadsheet, 3);
        pthread_t t4 = create_thread(data4);

        // Block until all four threads have finished.
        wait_for_thread(t1);
        wait_for_thread(t2);
        wait_for_thread(t3);
        wait_for_thread(t4);

        // Use long (not int) so the total matches the return type.
        long sum = data1->sum + data2->sum + data3->sum + data4->sum;

        delete data1;
        delete data2;
        delete data3;
        delete data4;
        return sum;
    }

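Here I am spawning 4 threads using the POSIX threads API (pthread_create and pthread_join). The size of the worker pool has to be specified as a build setting. Emscripten creates that many web workers up front, so each thread can start right away. In my case I am setting the thread pool size to 4, which gives me one ready worker for each of the 4 threads. Spawning more threads than the pool holds is risky here: calculate_sums blocks in pthread_join, and as far as I understand a worker created on demand cannot start until the calling thread returns to the browser's event loop.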
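The same fan-out and join pattern can also be written with std::thread and a loop instead of four hand-written pthread calls. The sketch below is an alternative I am suggesting, not the code from the demo. It works on a plain grid of numbers rather than the deserialized Spreadsheet, and the names Grid, sum_column and sum_all_columns are made up for illustration. My understanding is that Emscripten builds std::thread on top of its pthread support, so the same build settings would apply.

```cpp
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Plain grid of numbers standing in for the deserialized Spreadsheet.
using Grid = std::vector<std::vector<long>>;

// Sums one column into *out. Each thread writes to its own slot,
// so no locking is needed.
void sum_column(const Grid& grid, std::size_t column, long* out) {
    long total = 0;
    for (const std::vector<long>& row : grid) {
        total += row[column];
    }
    *out = total;
}

// Launches one thread per column, joins them all, and returns the sum of the sums.
long sum_all_columns(const Grid& grid, std::size_t columnCount) {
    // Sized up front so the pointers handed to the threads stay valid.
    std::vector<long> columnSums(columnCount, 0);
    std::vector<std::thread> workers;
    workers.reserve(columnCount);

    for (std::size_t c = 0; c < columnCount; c++) {
        workers.emplace_back(sum_column, std::cref(grid), c, &columnSums[c]);
    }
    for (std::thread& worker : workers) {
        worker.join();
    }

    long total = 0;
    for (long columnSum : columnSums) {
        total += columnSum;
    }
    return total;
}
```

For a grid with the rows {1, 2, 3, 4} and {5, 6, 7, 8}, sum_all_columns(grid, 4) adds the column sums 6, 8, 10 and 12. In a WASM build, columnCount would have to stay within the configured thread pool size for the reason described above.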

In Emscripten you can opt into threading and specify the thread pool size by passing the settings -s USE_PTHREADS=1 and -s PTHREAD_POOL_SIZE=4 to emcc.
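To make the settings concrete, here is a small sketch that shows a possible emcc invocation in a comment, plus a function that reports whether the code was actually compiled with thread support. The file names in the command are placeholders, and the function name threading_build_mode is hypothetical. I am relying on the __EMSCRIPTEN_PTHREADS__ macro, which, as I understand the Emscripten documentation, is defined when pthreads are enabled.

```cpp
// Possible build command (file names are placeholders, not from the demo):
//   emcc -s USE_PTHREADS=1 -s PTHREAD_POOL_SIZE=4 main.cpp -o wasm-module.js

#include <string>

// Hypothetical helper: reports how this translation unit was compiled.
std::string threading_build_mode() {
#if defined(__EMSCRIPTEN_PTHREADS__)
    return "emscripten-pthreads";         // built with USE_PTHREADS=1
#elif defined(__EMSCRIPTEN__)
    return "emscripten-single-threaded";  // Emscripten, but threading not enabled
#else
    return "native";                      // regular desktop compiler
#endif
}
```

Logging this value once at startup is a cheap way to confirm that the threading setting reached the compiler, which is easy to get wrong when compile and link steps are configured separately.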

Running in the Browser

The threads are executed in web workers in the browser at runtime. If you open DevTools in Chrome, you can see a memory profile for each web worker under the Memory tab. It’s worth noting that running these web workers adds a little to the overall JavaScript payload, because each worker loads its own copy of the wasm-module.js file along with a worker-specific JavaScript file. In the context of a modern JavaScript application, though, the extra size is modest.

I should also point out that WebAssembly threading has spotty browser support at the time of writing. It depends on SharedArrayBuffer, which not every browser enables by default. So far I have only been able to run it in recent versions of Chrome.
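Given the uneven support, one option is to check for threading at runtime and fall back to a single-threaded sum when it is missing. The sketch below assumes Emscripten's emscripten_has_threading_support function from emscripten/threading.h, which, as I understand it, reports whether the browser and the build can spawn threads. The names threads_available, sum_grid_sequential and sum_with_fallback are hypothetical, and the threaded implementation is passed in as a parameter so the sketch stays self-contained.

```cpp
#include <functional>
#include <vector>

#ifdef __EMSCRIPTEN__
#include <emscripten/threading.h>
#endif

using Grid = std::vector<std::vector<long>>;

// Returns true when it should be safe to spawn threads.
bool threads_available() {
#ifdef __EMSCRIPTEN__
    return emscripten_has_threading_support() != 0;
#else
    return true;  // native builds: assume the platform provides threads
#endif
}

// Single-threaded fallback: adds up every value in the grid.
long sum_grid_sequential(const Grid& grid) {
    long total = 0;
    for (const std::vector<long>& row : grid) {
        for (long value : row) {
            total += value;
        }
    }
    return total;
}

// Uses the threaded implementation when threads are available,
// otherwise falls back to the sequential one.
long sum_with_fallback(const Grid& grid,
                       const std::function<long(const Grid&)>& threadedSum) {
    if (threads_available()) {
        return threadedSum(grid);
    }
    return sum_grid_sequential(grid);
}
```

Both paths return the same total, so the JavaScript side would not need to know which one ran.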


The full source for this example is available on GitHub.

Follow me on Twitter @MoreTechStories