Sharing Complex Types Between C++ WebAssembly and JavaScript

Published: 05/05/2020

One of the challenges when integrating C++ WASM with JavaScript is dealing with complex types in method arguments and return types. In this article I will show an experimental solution where I use MessagePack to serialize objects between C++ and JavaScript.

In a previous article I showed how to use web components to pass data from C++ to JavaScript as serialized JSON strings, but that approach assumes you already have a web component to work with. In this article I will show an alternative approach where complex objects are passed “as is” in both directions as MessagePack messages.

MessagePack

MessagePack is a binary serialization format for encoding and decoding data. The format is highly optimized, but even better, it’s also language agnostic. Lately I have been experimenting with WASM and MessagePack separately, but this weekend it dawned on me that it might make sense to combine the two as a solution for data flow between C++ and JavaScript.
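To make the format a little more concrete, here is a minimal sketch of an encoder for a tiny subset of MessagePack: small non-negative integers, short strings, and small maps. The function name encodeTiny is hypothetical and exists only for illustration; it is not part of any library, and a real project would use a full implementation instead.

```javascript
// Minimal MessagePack encoder for illustration only.
// Supports: non-negative integers below 2^16, strings of up to 31 UTF-8
// bytes, and plain objects with up to 15 keys. Everything else throws.
function encodeTiny(value) {
  if (Number.isInteger(value) && value >= 0 && value < 0x10000) {
    if (value < 0x80) return [value];                  // positive fixint
    if (value < 0x100) return [0xcc, value];           // uint 8
    return [0xcd, value >> 8, value & 0xff];           // uint 16, big-endian
  }
  if (typeof value === "string") {
    const utf8 = Array.from(new TextEncoder().encode(value));
    if (utf8.length > 31) throw new RangeError("string too long for fixstr");
    return [0xa0 | utf8.length, ...utf8];              // fixstr
  }
  if (value !== null && typeof value === "object" && !Array.isArray(value)) {
    const entries = Object.entries(value);
    if (entries.length > 15) throw new RangeError("too many keys for fixmap");
    const out = [0x80 | entries.length];               // fixmap
    for (const [key, item] of entries) {
      out.push(...encodeTiny(key), ...encodeTiny(item));
    }
    return out;
  }
  throw new TypeError("unsupported value in this sketch");
}

const packedZip = Uint8Array.from(encodeTiny({ zip: 12345 }));
const packedHex = Array.from(packedZip, (b) => b.toString(16).padStart(2, "0")).join(" ");
console.log(packedHex); // 81 a3 7a 69 70 cd 30 39
console.log(packedZip.length, JSON.stringify({ zip: 12345 }).length); // 8 13
```

The same object takes 8 bytes here versus 13 characters as JSON, and the result is already a plain byte array.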

WASM Memory Management

Memory management in WASM can be a bit tricky, especially if you are approaching this from the JavaScript side. C++ developers are already very skilled with manual memory management, so the learning curve may be less steep if you are a C++ developer. I haven’t done any serious C++ development in almost 20 years, so I had to relearn a bunch of stuff to pull this off. I am still not 100% confident in my C++ skills, so I welcome any feedback you might have.

By default you can’t pass complex types to WASM functions since there is no automatic conversion between complex JavaScript types and C++ counterparts. Instead we have to manually write data into shared memory at the byte level, which is slightly complicated for most of us.

From the JavaScript side, WASM memory is exposed as a set of typed numeric array views of various bit widths (8, 16, 32, 64) over the same underlying bytes. Since the arrays are numeric, it’s pretty easy to share numbers between C++ and JavaScript. Numbers are primitives that need no further encoding since the data structure is already numeric. I suspect this is why most WASM tutorials you find online show how to pass an array of numbers to C++ from JavaScript. This is an important concept to start with, but how do you pass non-numeric data? After all, it’s pretty rare to have an interface that deals exclusively in numbers.
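The idea of several numeric views over one block of memory can be sketched without a compiled module. The example below uses a standalone WebAssembly.Memory as a stand-in for the module's memory; the variable names heapU8 and heap32 are my own and only mimic the spirit of Emscripten's HEAPU8 and HEAP32.

```javascript
// One WebAssembly page is 64 KiB; this stands in for a module's memory.
const demoMemory = new WebAssembly.Memory({ initial: 1 });

// Two views over the same underlying bytes.
const heapU8 = new Uint8Array(demoMemory.buffer);
const heap32 = new Int32Array(demoMemory.buffer);

// Write one 32-bit integer at byte offset 8 (index 2 in the 32-bit view).
const byteOffset = 8;
heap32[byteOffset >> 2] = 0x01020304;

// The same value seen as four individual bytes through the 8-bit view.
const asBytes = Array.from(heapU8.subarray(byteOffset, byteOffset + 4));
console.log(asBytes); // [4, 3, 2, 1] on little-endian platforms
```

Nothing is copied between the two views. A pointer returned from C++ is simply a byte offset into this memory, which is why it can be used directly as an index into the 8-bit view.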

This is where MessagePack comes into the picture. Since MessagePack messages are encoded as a sequence of bytes, we can serialize an arbitrary object and write it directly into the typed WASM memory arrays. On the receiving end we do the reverse by deserializing the byte values to language-specific custom types. Luckily this process works in both directions, JavaScript to C++ and C++ to JavaScript.
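The write-then-read flow described above can be sketched in isolation. This is only an illustration under stated assumptions: allocate is a hypothetical bump allocator standing in for the module's malloc, writeBytes and readBytes are names I made up, and the message bytes are the MessagePack encoding of { zip: 12345 } written out by hand rather than produced by a library.

```javascript
// Stand-in for a module's linear memory and its byte view.
const linearMemory = new WebAssembly.Memory({ initial: 1 });
const heapBytes = new Uint8Array(linearMemory.buffer);

// Hypothetical bump allocator standing in for malloc; it never frees.
let nextFree = 16;
function allocate(byteLength) {
  if (nextFree + byteLength > heapBytes.length) throw new RangeError("out of memory");
  const ptr = nextFree;
  nextFree += byteLength;
  return ptr;
}

// Copy serialized bytes into memory and report where they landed.
function writeBytes(bytes) {
  const ptr = allocate(bytes.length);
  heapBytes.set(bytes, ptr);
  return { ptr, length: bytes.length };
}

// Copy bytes back out. slice() makes an independent copy, so the result
// stays valid even if the memory region is reused afterwards.
function readBytes(ptr, length) {
  return heapBytes.slice(ptr, ptr + length);
}

// MessagePack encoding of { zip: 12345 }, written out by hand.
const message = Uint8Array.of(0x81, 0xa3, 0x7a, 0x69, 0x70, 0xcd, 0x30, 0x39);
const written = writeBytes(message);
const roundTripped = readBytes(written.ptr, written.length);
console.log(written.ptr, written.length); // 16 8
```

The receiver only needs two numbers, a pointer and a length, to find the message. Those two numbers are what actually cross the function boundary.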

MessagePack is language agnostic, but we still need to bring in language-specific implementations in both JavaScript and C++. I think there are a few options to choose from, but I decided to go with msgpack5 in JavaScript and msgpack-c in C++. On the JavaScript side this means including a script on the page, but in C++ you have to actually compile the external msgpack-c library in with Emscripten. Luckily this was pretty straightforward, but you can refer to my demo repo for instructions.

Code

Let’s take a look at the code! In the following samples I will show how to wire up complex types in both directions.

Returning Complex Object from C++ to JavaScript

C++
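    #include <msgpack.hpp>
    #include <string>

    struct Address {
        std::string firstName;
        std::string lastName;
        int zip;
        std::string city;
        std::string state;
        std::string street;

        // Serialize as a map so property names survive the round trip.
        MSGPACK_DEFINE_MAP(firstName, lastName, zip, city, street, state);
    };

    // Returns a heap buffer that the caller must release with free().
    extern "C" char* get_address(unsigned long* size) {
        Address address;
        address.firstName = "Joe";
        address.lastName = "Smith";
        address.zip = 12345;
        address.state = "NY";
        address.city = "Test Town";
        address.street = "Test St. 123";

        msgpack::sbuffer sbuf;
        msgpack::pack(sbuf, address);

        *size = sbuf.size();
        // release() hands ownership of the buffer to the caller. Returning
        // sbuf.data() would dangle once sbuf is destroyed at the end of scope.
        return sbuf.release();
    }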

In this case I am serializing an Address object and sending it back to JavaScript. Notice the MSGPACK_DEFINE_MAP part. This is necessary to retain property names during deserialization.

Next let’s look at the JavaScript side.

    getAddress() {
        const get_address = Module.cwrap("get_address", "number", ["number"]);
        const msgpack = msgpack5();

        // Out-parameter for the message length. unsigned long is 4 bytes
        // on the default 32-bit WASM target, so read it as "i32".
        const sizePtr = Module._malloc(4);
        const addressPtr = get_address(sizePtr);
        const size = Module.getValue(sizePtr, "i32");

        // Copy the bytes out of WASM memory before releasing it.
        const addressData = Module.HEAPU8.slice(addressPtr, addressPtr + size);

        // Free the pointers themselves, not the size value.
        Module._free(sizePtr);
        Module._free(addressPtr);

        return msgpack.decode(addressData);
    }

The decoded address is a JavaScript representation of the original C++ object that can be used just like any other JavaScript object.

Passing Complex Object from JavaScript to C++

In this example I will show how to pass an object with two properties from JavaScript to C++. Basically this is a primitive calculator function in C++ that returns the sum of the two properties.

JavaScript

Here is the object:

{ operand1: 100, operand2: 1000 }

Below you can see the JavaScript method.

    add(expression) {
        const msgpack = msgpack5();
        const encoded = msgpack.encode(expression);

        // Allocate WASM memory and copy the encoded bytes into it.
        // HEAPU8 is indexed in bytes, so the pointer is used as the offset.
        const ptr = Module._malloc(encoded.length);
        Module.HEAPU8.set(encoded, ptr);

        // add_numbers takes two arguments: the pointer and the byte length.
        const add_numbers = Module.cwrap("add_numbers", "number", ["number", "number"]);
        const result = add_numbers(ptr, encoded.length);

        Module._free(ptr);
        return result;
    }

This is essentially the same process in reverse. An object with two properties is encoded and written into shared memory before being consumed by the C++ code below.

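    // Assumed definition: the original listing does not show Expression,
    // so this mirrors the JavaScript object above.
    struct Expression {
        int operand1;
        int operand2;
        MSGPACK_DEFINE_MAP(operand1, operand2);
    };

    extern "C" long add_numbers(char* expr, int bufferSize) {
        msgpack::object_handle oh = msgpack::unpack(expr, bufferSize);
        msgpack::object obj = oh.get();
        Expression expression;
        obj.convert(expression);
        return expression.operand1 + expression.operand2;
    }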

The C++ code reads out the two properties and returns the sum after adding them.
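One detail worth guarding on the JavaScript side is that the allocation gets released even if the call into C++ throws. Below is a minimal sketch of a helper for that, assuming a module object that exposes _malloc, _free and HEAPU8 the way the samples in this article use them. The names withHeapBytes and createFakeModule are hypothetical, and the fake module exists only so the sketch runs without a compiled WASM build.

```javascript
// Hypothetical helper: copy bytes into the module's heap, run a callback with
// the pointer and length, and always release the allocation afterwards.
function withHeapBytes(mod, bytes, callback) {
  const ptr = mod._malloc(bytes.length);
  try {
    // Look up HEAPU8 after the allocation: in builds that allow memory
    // growth, the view can be replaced when the heap grows.
    mod.HEAPU8.set(bytes, ptr);
    return callback(ptr, bytes.length);
  } finally {
    mod._free(ptr);
  }
}

// Fake module used only to exercise the helper in this sketch.
function createFakeModule() {
  const live = new Set(); // pointers that have been allocated but not freed
  let next = 8;
  return {
    HEAPU8: new Uint8Array(256),
    live,
    _malloc(size) {
      const ptr = next;
      next += size;
      live.add(ptr);
      return ptr;
    },
    _free(ptr) {
      live.delete(ptr);
    },
  };
}

const fake = createFakeModule();
const total = withHeapBytes(fake, Uint8Array.of(1, 2, 3), (ptr, length) => {
  let sum = 0;
  for (let i = 0; i < length; i++) sum += fake.HEAPU8[ptr + i];
  return sum;
});
console.log(total, fake.live.size); // 6 0
```

With a real module, the callback would be the wrapped C++ function, for example (ptr, length) => add_numbers(ptr, length).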

Sample Code

As always, my demo project is available on GitHub.

Follow me on Twitter @MoreTechStories