Tag Archives: WebAssembly

We shipped FinalizationRegistry in Workers: why you should never use it

2025-06-11 Ketan Gupta

Post Syndicated from Ketan Gupta original https://blog.cloudflare.com/we-shipped-finalizationregistry-in-workers-why-you-should-never-use-it/

We’ve recently added support for the FinalizationRegistry API in Cloudflare Workers. This API allows developers to request a callback when a JavaScript object is garbage-collected, a feature that can be particularly relevant for managing external resources, such as memory allocated by WebAssembly (Wasm). However, despite its availability, our general advice is: avoid using it directly in most scenarios.

Our decision to add FinalizationRegistry — while still cautioning against using it — opens up a bigger conversation: how memory management works when JavaScript and WebAssembly share the same runtime. This is becoming more common in high-performance web apps, and getting it wrong can lead to memory leaks, out-of-memory errors, and performance issues, especially in resource-constrained environments like Cloudflare Workers.

In this post, we’ll look at how JavaScript and Wasm handle memory differently, why that difference matters, and what FinalizationRegistry is actually useful for. We’ll also explain its limitations, particularly around timing and predictability, walk through why we decided to support it, and how we’ve made it safer to use. Finally, we’ll talk about how newer JavaScript language features offer a more reliable and structured approach to solving these problems.

Memory management 101

JavaScript

JavaScript relies on automatic memory management through a process called garbage collection. This means developers do not need to worry about freeing allocated memory, or lifetimes. The garbage collector identifies and reclaims memory occupied by objects that are no longer needed by the program (that is, garbage). This helps prevent memory leaks and simplifies memory management for developers.

function greet() {
  let name = "Alice";         // String is allocated in memory
  console.log("Hello, " + name);
}                             // 'name' goes out of scope

greet();
// JavaScript automatically frees allocated memory at some point in future

WebAssembly

WebAssembly (Wasm) is an assembly-like instruction format designed to run high-performance applications on the web. While it initially gained prominence in web browsers, Wasm is also highly effective on the server side. At Cloudflare, we leverage Wasm to enable users to run code written in a variety of programming languages, such as Rust and Python, directly within our V8 isolates, offering both performance and versatility.

Wasm runtimes are designed to be simple stack machines, and lack built-in garbage collectors. This necessitates manual memory management (allocation and deallocation of memory used by Wasm code), making it an ideal compilation target for languages like Rust and C++ that handle their own memory.

Wasm modules operate on linear memory: a resizable block of raw bytes, which JavaScript views as an ArrayBuffer. This memory is organized in 64 KB pages, and its initial size is defined when the module is compiled or loaded. Wasm code interacts with this memory using 32-bit offsets — integer values functioning as direct pointers that specify a byte offset from the start of its linear memory. This direct memory access model is crucial for Wasm’s high performance. The host environment (which in Cloudflare Workers is JavaScript) also shares this ArrayBuffer, reading and writing (often via TypedArrays) to enable vital data exchange between Wasm and JavaScript.

A core Wasm design is its secure sandbox. This confines Wasm code strictly to its own linear memory and explicitly declared imports from the host, preventing unauthorized memory access or system calls. Direct interaction with JavaScript objects is blocked; communication occurs through numeric values, function references, or operations on the shared ArrayBuffer. This strong isolation is vital for security, ensuring Wasm modules don’t interfere with the host or other application components, which is especially important in multi-tenant environments like Cloudflare Workers.

Bridging WebAssembly memory with JavaScript often involves writing low-level “glue” code to convert raw byte arrays from Wasm into usable JavaScript types. Doing this manually for every function or data structure is both tedious and error-prone. Fortunately, tools like wasm-bindgen and Emscripten (Embind) handle this interop automatically, generating the binding code needed to pass data cleanly between the two environments. We use these same tools under the hood — wasm-bindgen for Rust-based workers-rs projects, and Emscripten for Python Workers — to simplify integration and let developers focus on application logic rather than memory translation.

Interoperability

High-performance web apps often use JavaScript for interactive UIs and data fetching, while WebAssembly handles demanding operations like media processing and complex calculations for significant performance gains, allowing developers to maximize efficiency. Given the difference in memory management models, developers need to be careful when using WebAssembly memory in JavaScript.

For this example, we’ll use Rust to compile a WebAssembly module manually. Rust is a popular choice for WebAssembly because it offers precise control over memory and easy Wasm compilation using standard toolchains.

Rust

Here we have two simple functions. make_buffer creates a string and returns a raw pointer back to JavaScript. The function intentionally “forgets” the memory allocated so that it doesn’t get cleaned up after the function returns. free_buffer, on the other hand, expects the initial string reference handed back and frees the memory.

// Allocate a fresh byte buffer and hand the raw pointer + length to JS.
// *We intentionally “forget” the Vec so Rust will not free it right away;
//   JS now owns it and must call `free_buffer` later.*
#[no_mangle]
pub extern "C" fn make_buffer(out_len: *mut usize) -> *mut u8 {
    let mut data = b"Hello from Rust".to_vec();
    let ptr = data.as_mut_ptr();
    let len  = data.len();

    unsafe { *out_len = len };

    std::mem::forget(data);
    return ptr;
}

/// Counterpart that **must** be called by JS to avoid a leak.
#[no_mangle]
pub unsafe extern "C" fn free_buffer(ptr: *mut u8, len: usize) {
    let _ = Vec::from_raw_parts(ptr, len, len);
}

JavaScript

Back in JavaScript land, we’ll call these Wasm functions and output them using console.log. This is a common pattern in Wasm-based applications since WebAssembly doesn’t have direct access to Web APIs, and rely on a JavaScript “glue” to interface with the outer world in order to do anything useful.

const { instance } = await WebAssembly.instantiate(WasmBytes, {});

const { memory, make_buffer, free_buffer } = instance.exports;

//  Use the Rust functions
const lenPtr = 0;                 // scratch word in Wasm memory
const ptr = make_buffer(lenPtr);

const len = new DataView(memory.buffer).getUint32(lenPtr, true);
const data = new Uint8Array(memory.buffer, ptr, len);

console.log(new TextDecoder().decode(data)); // “Hello from Rust”

free_buffer(ptr, len); // free_buffer must be called to prevent memory leaks

You can find all code samples along with setup instructions here.

As you can see, working with Wasm memory from JavaScript requires care, as it introduces the risk of memory leaks if allocated memory isn’t properly released. JavaScript developers are often unfamiliar with manual memory management, and it’s easy to forget returning memory to WebAssembly after use. This can become especially tricky when Wasm-allocated data is passed into JavaScript libraries, making ownership and lifetime harder to track.

While occasional leaks may not cause immediate issues, over time they can lead to increased memory usage and degrade performance, particularly in memory-constrained environments like Cloudflare Workers.

FinalizationRegistry

FinalizationRegistry, introduced as part of the TC-39 WeakRef proposal, is a JavaScript API which lets you run “finalizers” (aka cleanup callbacks) when an object gets garbage-collected. Let’s look at a simple example to demonstrate the API:

const my_registry = new FinalizationRegistry((obj) => { console.log("Cleaned up: " + obj); });

{
  let temporary = { key: "value" };
  // Register this object in our FinalizationRegistry -- the second argument,
  // "temporary", will be passed to our callback as its obj parameter
  my_registry.register(temporary, "temporary");
}

// At some point in the future when temporary object gets garbage collected, we'll see "Cleaned up: temporary" in our logs.

Let’s see how we can use this API in our Wasm-based application:

const { instance } = await WebAssembly.instantiate(WasmBytes, {});

const { memory, make_buffer, free_buffer } = instance.exports;

// FinalizationRegistry would be responsible for returning memory back to Wasm
const cleanupFr = new FinalizationRegistry(({ ptr, len }) => {
  free_buffer(ptr, len);
});

//  Use the Rust functions
const lenPtr = 0;                 // scratch word in Wasm memory
const ptr = make_buffer(lenPtr);

const len = new DataView(memory.buffer).getUint32(lenPtr, true);
const data = new Uint8Array(memory.buffer, ptr, len);

// Register the data buffer in our FinalizationRegistry so that it gets cleaned up automatically
cleanupFr.register(data, { ptr, len });

console.log(new TextDecoder().decode(data));   // → “Hello from Rust”

// No need to manually call free_buffer, FinalizationRegistry will do this for us

We can use a FinalizationRegistry to manage any object borrowed from WebAssembly by registering it with a finalizer that calls the appropriate free function. This is the same approach used by wasm-bindgen. It shifts the burden of manual cleanup away from the JavaScript developer and delegates it to the JavaScript garbage collector. However, in practice, things aren’t quite that simple.

Inherent issues with FinalizationRegistry

There is a fundamental issue with FinalizationRegistry: garbage collection is non-deterministic, and may clean up your unused memory at some arbitrary point in the future. In some cases, garbage collection might not even run and your “finalizers” will never be triggered.

This is part of its documentation as well:

“A conforming JavaScript implementation, even one that does garbage collection, is not required to call cleanup callbacks. When and whether it does so is entirely down to the implementation of the JavaScript engine. When a registered object is reclaimed, any cleanup callbacks for it may be called then, or some time later, or not at all.”

Even Emscripten mentions this in their documentation: “… finalizers are not guaranteed to be called, and even if they are, there are no guarantees about their timing or order of execution, which makes them unsuitable for general RAII-style resource management.”

Given their non-deterministic nature, developers seldom use finalizers for any essential program logic. Treat them as a last-ditch safety net, not as a primary cleanup mechanism — explicit, deterministic teardown logic is almost always safer, faster, and easier to reason about.

Enabling FinalizationRegistry in Workers

Given its non-deterministic nature and limited early adoption, we initially disabled the FinalizationRegistry API in our runtime. However, as usage of Wasm-based Workers grew — particularly among high-traffic customers — we began to see new demands emerge. One such customer was running an extremely high requests per second (RPS) workload using WebAssembly, and needed tight control over memory to sustain massive traffic spikes without degradation. This highlighted a gap in our memory management capabilities, especially in cases where manual cleanup wasn’t always feasible or reliable. As a result, we re-evaluated our stance and began exploring the challenges and trade-offs of enabling FinalizationRegistry within the Workers environment, despite its known limitations.

Preventing footguns with safe defaults

Because this API could be misused and cause unpredictable results for our customers, we’ve added a few safeguards. Most importantly, cleanup callbacks are run without an active async context, which means they cannot perform any I/O. This includes sending events to a tail Worker, logging metrics, or making fetch requests.

While this might sound limiting, it’s very intentional. Finalization callbacks are meant for cleanup — especially for releasing WebAssembly memory — not for triggering side effects. If we allowed I/O here, developers might (accidentally) rely on finalizers to perform critical logic that depends on when garbage collection happens. That timing is non-deterministic and outside your control, which could lead to flaky, hard-to-debug behavior.

We don’t have full control over when V8’s garbage collector performs cleanup, but V8 does let us nudge the timing of finalizer execution. Like Node and Deno, Workers queue FinalizationRegistry jobs only after the microtask queue has drained, so each cleanup batch slips into the quiet slots between I/O phases of the event loop.

Security concerns

The Cloudflare Workers runtime is specifically engineered to prevent side-channel attacks in a multi-tenant environment. Prior to enabling the FinalizationRegistry API, we did a thorough analysis to assess its impact on our security model and determine the necessity of additional safeguards. The non-deterministic nature of FinalizationRegistry raised concerns about potential information leaks leading to Spectre-like vulnerabilities, particularly regarding the possibility of exploiting the garbage collector (GC) as a confused deputy or using it to create a timer.

GC as confused deputy

One concern was whether the garbage collector (GC) could act as a confused deputy — a security antipattern where a privileged component is tricked into misusing its authority on behalf of untrusted code. In theory, a clever attacker could try to exploit the GC’s ability to access internal object lifetimes and memory behavior in order to infer or manipulate sensitive information across isolation boundaries.

However, our analysis indicated that the V8 GC is effectively contained and not exposed to confused deputy risks within the runtime. This is attributed to our existing threat models and security measures, such as the isolation of user code, where the V8 Isolate serves as the primary security boundary. Furthermore, even though FinalizationRegistry involves some internal GC mechanics, the callbacks themselves execute in the same isolate that registered them — never across isolates — ensuring isolation remains intact.

GC as timer

We also evaluated the possibility of using FinalizationRegistry as a high-resolution timing mechanism — a common vector in side-channel attacks like Spectre. The concern here is that an attacker could schedule object finalization in a way that indirectly leaks information via the timing of callbacks.

In practice, though, the resolution of such a “GC timer” is low and highly variable, offering poor reliability for side-channel attacks. Additionally, we control when finalizer callbacks are scheduled — delaying them until after the microtask queue has drained — giving us an extra layer of control to limit timing precision and reduce risk.

Following a review with our security research team, we determined that our existing security model is sufficient to support this API.

Predictable cleanups?

JavaScript’s Explicit Resource Management proposal introduces a deterministic approach to handle resources needing manual cleanup, such as file handles, network connections, or database sessions. Drawing inspiration from constructs like C#’s using and Python’s with, this proposal introduces the using and await using syntax. This new syntax guarantees that objects adhering to a specific cleanup protocol are automatically disposed of when they are no longer within their scope.

Let’s look at a simple example to understand it a bit better.

class MyResource {
  [Symbol.dispose]() {
    console.log("Resource cleaned up!");
  }

  use() {
    console.log("Using the resource...");
  }
}

{
  using res = new MyResource();
  res.use();
} // When this block ends, Symbol.dispose is called automatically (and deterministically).

The proposal also includes additional features that offer finer control over when dispose methods are called. But at a high level, it provides a much-needed, deterministic way to manage resource cleanup. Let’s now update our earlier WebAssembly-based example to take advantage of this new mechanism instead of relying on FinalizationRegistry:

const { instance } = await WebAssembly.instantiate(WasmBytes, {});
const { memory, make_buffer, free_buffer } = instance.exports;

class WasmBuffer {
  constructor(ptr, len) {
    this.ptr = ptr;
    this.len = len;
  }

  [Symbol.dispose]() {
    free_buffer(this.ptr, this.len);
  }
}

{
  const lenPtr = 0;
  const ptr = make_buffer(lenPtr);
  const len = new DataView(memory.buffer).getUint32(lenPtr, true);

  using buf = new WasmBuffer(ptr, len);

  const data = new Uint8Array(memory.buffer, ptr, len);
  console.log(new TextDecoder().decode(data));  // → “Hello from Rust”
} // Symbol.dispose or free_buffer gets called deterministically here

Explicit Resource Management provides a more dependable way to clean up resources than FinalizationRegistry, as it runs cleanup logic — such as calling free_buffer in WasmBuffer via [Symbol.dispose]() and the using syntax — deterministically, rather than relying on the garbage collector’s unpredictable timing. This makes it a more reliable choice for managing critical resources, especially memory.

Future

Emscripten already makes use of Explicit Resource Management for handling Wasm memory, using FinalizationRegistry as a last resort, while wasm-bindgen supports it in experimental mode. The proposal has seen growing adoption across the ecosystem and was recently conditionally advanced to Stage 4 in the TC39 process, meaning it’ll soon officially be part of the JavaScript language standard. This reflects a broader shift toward more predictable and structured memory cleanup in WebAssembly applications.

We recently added support for this feature in Cloudflare Workers as well, enabling developers to take advantage of deterministic resource cleanup in edge environments. As support for the feature matures, it’s likely to become a standard practice for managing linear memory safely and reliably.

FinalizationRegistry: still not dead yet?

Explicit Resource Management brings much-needed structure and predictability to resource cleanup in WebAssembly and JavaScript interop applications, but it doesn’t make FinalizationRegistry obsolete. There are still important use cases, particularly when a Wasm-allocated object’s lifecycle is out of your hands or when explicit disposal isn’t practical. In scenarios involving third-party libraries, dynamic lifecycles, or integration layers that don’t follow using patterns, FinalizationRegistry remains a valuable fallback to prevent memory leaks.

Looking ahead, a hybrid approach will likely become the standard in Wasm-JavaScript applications. Developers can use ERM for deterministic cleanup of Wasm memory and other resources, while relying on FinalizationRegistry as a safety net when full control isn’t possible. Together, they offer a more reliable and flexible foundation for managing memory across the JavaScript and WebAssembly boundary.

Ready to try it yourself? Deploy a WebAssembly-powered Worker and experiment with memory management — start building with Cloudflare Workers today.

Bringing Python to Workers using Pyodide and WebAssembly

2024-04-02 Hood Chatham

Post Syndicated from Hood Chatham original https://blog.cloudflare.com/python-workers

Starting today, in open beta, you can now write Cloudflare Workers in Python.

This new support for Python is different from how Workers have historically supported languages beyond JavaScript — in this case, we have directly integrated a Python implementation into workerd, the open-source Workers runtime. All bindings, including bindings to Vectorize, Workers AI, R2, Durable Objects, and more are supported on day one. Python Workers can import a subset of popular Python packages including FastAPI, Langchain, Numpy and more. There are no extra build steps or external toolchains.

To do this, we’ve had to push the bounds of all of our systems, from the runtime itself, to our deployment system, to the contents of the Worker bundle that is published across our network. You can read the docs, and start using it today.

We want to use this post to pull back the curtain on the internal lifecycle of a Python Worker, share what we’ve learned in the process, and highlight where we’re going next.

Beyond “Just compile to WebAssembly”

Cloudflare Workers have supported WebAssembly since 2018 — each Worker is a V8 isolate, powered by the same JavaScript engine as the Chrome web browser. In principle, it’s been possible for years to write Workers in any language — including Python — so long as it first compiles to WebAssembly or to JavaScript.

In practice, just because something is possible doesn’t mean it’s simple. And just because “hello world” works doesn’t mean you can reliably build an application. Building full applications requires supporting an ecosystem of packages that developers are used to building with. For a platform to truly support a programming language, it’s necessary to go much further than showing how to compile code using external toolchains.

Python Workers are different from what we’ve done in the past. It’s early, and still in beta, but we think it shows what providing first-class support for programming languages beyond JavaScript can look like on Workers.

The lifecycle of a Python Worker

With Pyodide now built into workerd, you can write a Worker like this:

from js import Response

async def on_fetch(request, env):
    return Response.new("Hello world!")

…with a wrangler.toml file that points to a .py file:

name = "hello-world-python-worker"
main = "src/entry.py"
compatibility_date = "2024-03-18"

…and when you run npx wrangler@latest dev, the Workers runtime will:

Determine which version of Pyodide is required, based on your compatibility date
Create an isolate for your Worker, and automatically inject Pyodide
Serve your Python code using Pyodide

This all happens under the hood — no extra toolchain or precompilation steps needed. The Python execution environment is provided for you, mirroring how Workers written in JavaScript already work.

A Python interpreter built into the Workers runtime

Just as JavaScript has many engines, Python has many implementations that can execute Python code. CPython is the reference implementation of Python. If you’ve used Python before, this is almost certainly what you’ve used, and is commonly referred to as just “Python”.

Pyodide is a port of CPython to WebAssembly. It interprets Python code, without any need to precompile the Python code itself to any other format. It runs in a web browser — check out this REPL. It is true to the CPython that Python developers know and expect, providing most of the Python Standard Library. It provides a foreign function interface (FFI) to JavaScript, allowing you to call JavaScript APIs directly from Python — more on this below. It provides popular open-source packages, and can import pure Python packages directly from PyPI.

Pyodide struck us as the perfect fit for Workers. It is designed to allow the core interpreter and each native Python module to be built as separate WebAssembly modules, dynamically linked at runtime. This allows the code footprint for these modules to be shared among all Workers running on the same machine, rather than requiring each Worker to bring its own copy. This is essential to making WebAssembly work well in the Workers environment, where we often run thousands of Workers per machine — we need Workers using the same programming language to share their runtime code footprint. Running thousands of Workers on every machine is what makes it possible for us to deploy every application in every location at a reasonable price.

Just like with JavaScript Workers, with Python Workers we provide the runtime for you:

Pyodide is currently the exception — most languages that target WebAssembly do not yet support dynamic linking, so each application ends up bringing its own copy of its language runtime. We hope to see more languages support dynamic linking in the future, so that we can more effectively bring them to Workers.

How Pyodide works

Pyodide executes Python code in WebAssembly, which is a sandboxed environment, separated from the host runtime. Unlike running native code, all operations outside of pure computation (such as file reads) must be provided by a runtime environment, then imported by the WebAssembly module.

LLVM provides three target triples for WebAssembly:

wasm32-unknown-unknown – this backend provides no C standard library or system call interface; to support this backend, we would need to manually rewrite every system or library call to make use of imports we would define ourselves in the runtime.
wasm32-wasi – WASI is a standardized system interface, and defines a standard set of imports that are implemented in WASI runtimes such as wasmtime.
wasm32-unknown-emscripten – Like WASI, Emscripten defines the imports that a WebAssembly program needs to execute, but also outputs an accompanying JavaScript library that implements these imported functions.

Pyodide uses Emscripten, and provides three things:

A distribution of the CPython interpreter, compiled using Emscripten
A foreign function interface (FFI) between Python and JavaScript
A set of third-party Python packages, compiled using Emscripten’s compiler to WebAssembly.

Of these targets, only Emscripten currently supports dynamic linking, which, as we noted above, is essential to providing a shared language runtime for Python that is shared across isolates. Emscripten does this by providing implementations of dlopen and dlsym, which use the accompanying JavaScript library to modify the WebAssembly program’s table to link additional WebAssembly-compiled modules at runtime. WASI does not yet support the dlopen/dlsym dynamic linking abstractions used by CPython.

Pyodide and the magic of foreign function interfaces (FFI)

You might have noticed that in our Hello World Python Worker, we import Response from the js module:

from js import Response

async def on_fetch(request, env):
    return Response.new("Hello world!")

Why is that?

Most Workers are written in JavaScript, and most of our engineering effort on the Workers runtime goes into improving JavaScript Workers. There is a risk in adding a second language that it might never reach feature parity with the first language and always be a second class citizen. Pyodide’s foreign function interface (FFI) is critical to avoiding this by providing access to all JavaScript functionality from Python. This can be used by the Worker author directly, and it is also used to make packages like FastAPI and Langchain work out-of-the-box, as we’ll show later in this post.

An FFI is a system for calling functions in one language that are implemented in another language. In most cases, an FFI is defined by a “higher-level” language in order to call functions implemented in a systems language, often C. Python’s ctypes module is such a system. These sorts of foreign function interfaces are often difficult to use because of the nature of C APIs.

Pyodide’s foreign function interface is an interface between Python and JavaScript, which are two high level object-oriented languages with a lot of design similarities. When passed from one language to another, immutable types such as strings and numbers are transparently translated. All mutable objects are wrapped in an appropriate proxy.

When a JavaScript object is passed into Python, Pyodide determines which JavaScript protocols the object supports and dynamically constructs an appropriate Python class that implements the corresponding Python protocols. For example, if the JavaScript object supports the JavaScript iteration protocol then the proxy will support the Python iteration protocol. If the JavaScript object is a Promise or other thenable, the Python object will be an awaitable.

from js import JSON

js_array = JSON.parse("[1,2,3]")

for entry in js_array:
   print(entry)

The lifecycle of a request to a Python Worker makes use of Pyodide’s FFI, wrapping the incoming JavaScript Request object in a JsProxy object that is accessible in your Python code. It then converts the value returned by the Python Worker’s handler into a JavaScript Response object that can be delivered back to the client:

Why dynamic linking is essential, and static linking isn’t enough

Python comes with a C FFI, and many Python packages use this FFI to import native libraries. These libraries are typically written in C, so they must first be compiled down to WebAssembly in order to work on the Workers runtime. As we noted above, Pyodide is built with Emscripten, which overrides Python’s C FFI — any time a package tries to load a native library, it is instead loaded from a WebAssembly module that is provided by the Workers runtime. Dynamic linking is what makes this possible — it is what lets us override Python’s C FFI, allowing Pyodide to support many Python packages that have native library dependencies.

Dynamic linking is “pay as you go”, while static linking is “pay upfront” — if code is statically linked into your binary, it must be loaded upfront in order for the binary to run, even if this code is never used.

Dynamic linking enables the Workers runtime to share the underlying WebAssembly modules of packages across different Workers that are running on the same machine.

We won’t go too much into detail on how dynamic linking works in Emscripten, but the main takeaway is that the Emscripten runtime fetches WebAssembly modules from a filesystem abstraction provided in JavaScript. For each Worker, we generate a filesystem at runtime, whose structure mimics a Python distribution that has the Worker’s dependencies installed, but whose underlying files are shared between Workers. This makes it possible to share Python and WebAssembly files between multiple Workers that import the same dependency. Today, we’re able to share these files across Workers, but copy them into each new isolate. We think we can go even further, by employing copy-on-write techniques to share the underlying resource across many Workers.

Supporting Server and Client libraries

Python has a wide variety of popular HTTP client libraries, including httpx, urllib3, requests and more. Unfortunately, none of them work out of the box in Pyodide. Adding support for these has been one of the longest running user requests for the Pyodide project. The Python HTTP client libraries all work with raw sockets, and the browser security model and CORS do not allow this, so we needed another way to make them work in the Workers runtime.

Async Client libraries

For libraries that can make requests asynchronously, including aiohttp and httpx, we can use the Fetch API to make requests. We do this by patching the library, instructing it to use the Fetch API from JavaScript — taking advantage of Pyodide’s FFI. The httpx patch ends up quite simple —fewer than 100 lines of code. Simplified even further, it looks like this:

from js import Headers, Request, fetch

def py_request_to_js_request(py_request):
    js_headers = Headers.new(py_request.headers)
    return Request.new(py_request.url, method=py_request.method, headers=js_headers)

def js_response_to_py_response(js_response):
  ... # omitted

async def do_request(py_request):
  js_request = py_request_to_js_request(py_request)
    js_response = await fetch(js_request)
    py_response = js_response_to_py_response(js_response)
    return py_response

Synchronous Client libraries

Another challenge in supporting Python HTTP client libraries is that many Python APIs are synchronous. For these libraries, we cannot use the fetch API directly because it is asynchronous.

Thankfully, Joe Marshall recently landed a contribution to urllib3 that adds Pyodide support in web browsers by:

Checking if blocking with Atomics.wait() is possible
a. If so, start a fetch worker thread
b. Delegate the fetch operation to the worker thread and serialize the response into a SharedArrayBuffer
c. In the Python thread, use Atomics.wait to block for the response in the SharedArrayBuffer
If Atomics.wait() doesn’t work, fall back to a synchronous XMLHttpRequest

Despite this, today Cloudflare Workers do not support worker threads or synchronous XMLHttpRequest, so neither of these two approaches will work in Python Workers. We do not support synchronous requests today, but there is a way forward…

WebAssembly Stack Switching

There is an approach which will allow us to support synchronous requests. WebAssembly has a stage 3 proposal adding support for stack switching, which v8 has an implementation of. Pyodide contributors have been working on adding support for stack switching to Pyodide since September of 2022, and it is almost ready.

With this support, Pyodide exposes a function called run_sync which can block for completion of an awaitable:

from pyodide.ffi import run_sync

def sync_fetch(py_request):
   js_request = py_request_to_js_request(py_request)
   js_response  = run_sync(fetch(js_request))
   return js_response_to_py_response(js_response)

FastAPI and Python’s Asynchronous Server Gateway Interface

FastAPI is one of the most popular libraries for defining Python servers. FastAPI applications use a protocol called the Asynchronous Server Gateway Interface (ASGI). This means that FastAPI never reads from or writes to a socket itself. An ASGI application expects to be hooked up to an ASGI server, typically uvicorn. The ASGI server handles all of the raw sockets on the application’s behalf.

Conveniently for us, this means that FastAPI works in Cloudflare Workers without any patches or changes to FastAPI itself. We simply need to replace uvicorn with an appropriate ASGI server that can run within a Worker. Our initial implementation lives here, in the fork of Pyodide that we maintain. We hope to add a more comprehensive feature set, add test coverage, and then upstream this implementation into Pyodide.

You can try this yourself by cloning cloudflare/python-workers-examples, and running npx wrangler@latest dev in the directory of the FastAPI example.

Importing Python Packages

Python Workers support a subset of Python packages, which are provided directly by Pyodide, including numpy, httpx, FastAPI, Langchain, and more. This ensures compatibility with the Pyodide runtime by pinning package versions to Pyodide versions, and allows Pyodide to patch internal implementations, as we showed above in the case of httpx.

To import a package, simply add it to your requirements.txt file, without adding a version number. A specific version of the package is provided directly by Pyodide. Today, you can use packages in local development, and in the coming weeks, you will be able to deploy Workers that define dependencies in a requirements.txt file. Later in this post, we’ll show how we’re thinking about managing new versions of Pyodide and packages.

We maintain our own fork of Pyodide, which allows us to provide patches specific to the Workers runtime, and to quickly expand our support for packages in Python Workers, while also committing to upstreaming our changes back to Pyodide, so that the whole ecosystem of developers can benefit.

Python packages are often big and memory hungry though, and they can do a lot of work at import time. How can we ensure that you can bring in the packages you need, while mitigating long cold start times?

Making cold starts faster with memory snapshots

In the example at the start of this post, in local development, we mentioned injecting Pyodide into your Worker. Pyodide itself is 6.4MB — and Python packages can also be quite large.

If we simply shoved Pyodide into your Worker and uploaded it to Cloudflare, that’d be quite a large Worker to load into a new isolate — cold starts would be slow. On a fast computer with a good network connection, Pyodide takes about two seconds to initialize in a web browser, one second of network time and one second of cpu time. It wouldn’t be acceptable to initialize it every time you update your code for every isolate your Worker runs in across Cloudflare’s network.

Instead, when you run npx wrangler@latest deploy, the following happens:

Wrangler uploads your Python code and your requirements.txt file to the Workers API
We send your Python code, and your requirements.txt file to the Workers runtime to be validated
We create a new isolate for your Worker, and automatically inject Pyodide plus any packages you’ve specified in your requirements.txt file.
We scan the Worker’s code for import statements, execute them, and then take a snapshot of the Worker’s WebAssembly linear memory. Effectively, we perform the expensive work of importing packages at deploy time, rather than at runtime.
We deploy this snapshot alongside your Worker’s Python code to Cloudflare’s network.
Just like a JavaScript Worker, we execute the Worker’s top-level scope.

When a request comes in to your Worker, we load this snapshot and use it to bootstrap your Worker in an isolate, avoiding expensive initialization time:

This takes cold starts for a basic Python Worker down to below 1 second. We’re not yet satisfied with this though. We’re confident that we can drive this down much, much further. How? By reusing memory snapshots.

Reusing Memory Snapshots

When you upload a Python Worker, we generate a single memory snapshot of the Worker’s top-level imports, including both Pyodide and any dependencies. This snapshot is specific to your Worker. It can’t be shared, even though most of its contents are the same as other Python Workers.

Instead, we can create a single, shared snapshot ahead of time, and preload it into a pool of “pre-warmed” isolates. These isolates would already have the Pyodide runtime loaded and ready — making a Python Worker work just like a JavaScript Worker. In both cases, the underlying interpreter and execution environment is provided by the Workers runtime, and available on-demand without delay. The only difference is that with Python, the interpreter runs in WebAssembly, within the Worker.

Snapshots are a common pattern across runtimes and execution environments. Node.js uses V8 snapshots to speed up startup time. You can take snapshots of Firecracker microVMs and resume execution in a different process. There’s lots more we can do here — not just for Python Workers, but for Workers written in JavaScript as well, caching snapshots of compiled code from top-level scope and the state of the isolate itself. Workers are so fast and efficient that to-date we haven’t had to take snapshots in this way, but we think there are still big performance gains to be had.

This is our biggest lever towards driving cold start times down over the rest of 2024.

Future proofing compatibility with Pyodide versions and Compatibility Dates

When you deploy a Worker to Cloudflare, you expect it to keep running indefinitely, even if you never update it again. There are Workers deployed in 2018 that are still running just fine in production.

We achieve this using Compatibility Dates and Compatibility Flags, which provide explicit opt-in mechanisms for new behavior and potentially backwards-incompatible changes, without impacting existing Workers.

This works in part because it mirrors how the Internet and web browsers work. You publish a web page with some JavaScript, and rightly expect it to work forever. Web browsers and Cloudflare Workers have the same type of commitment of stability to developers.

There is a challenge with Python though — both Pyodide and CPython are versioned. Updated versions are published regularly and can contain breaking changes. And Pyodide provides a set of built-in packages, each with a pinned version number. This presents a question — how should we allow you to update your Worker to a newer version of Pyodide?

The answer is Compatibility Dates and Compatibility Flags.

A new version of Python is released every year in August, and a new version of Pyodide is released six (6) months later. When this new version of Pyodide is published, we will add it to Workers by gating it behind a Compatibility Flag, which is only enabled after a specified Compatibility Date. This lets us continually provide updates, without risk of breaking changes, extending the commitment we’ve made for JavaScript to Python.

Each Python release has a five (5) year support window. Once this support window has passed for a given version of Python, security patches are no longer applied, making this version unsafe to rely on. To mitigate this risk, while still trying to hold as true as possible to our commitment of stability and long-term support, after five years any Python Worker still on a Python release that is outside of the support window will be automatically moved forward to the next oldest Python release. Python is a mature and stable language, so we expect that in most cases, your Python Worker will continue running without issue. But we recommend updating the compatibility date of your Worker regularly, to stay within the support window.

In between Python releases, we also expect to update and add additional Python packages, using the same opt-in mechanism. A Compatibility Flag will be a combination of the Python version and the release date of a set of packages. For example, python_3.17_packages_2025_03_01.

How bindings work in Python Workers

We mentioned earlier that Pyodide provides a foreign function interface (FFI) to JavaScript — meaning that you can directly use JavaScript objects, methods, functions and more, directly from Python.

This means that from day one, all binding APIs to other Cloudflare resources are supported in Cloudflare Workers. The env object that is provided by handlers in Python Workers is a JavaScript object that Pyodide provides a proxy API to, handling type translations across languages automatically.

For example, to write to and read from a KV namespace from a Python Worker, you would write:

from js import Response

async def on_fetch(request, env):
    await env.FOO.put("bar", "baz")
    bar = await env.FOO.get("bar")
    return Response.new(bar) # returns "baz"

This works for Web APIs too — see how Response is imported from the js module? You can import any global from JavaScript this way.

Get this JavaScript out of my Python!

You’re probably reading this post because you want to write Python instead of JavaScript. from js import Response just isn’t Pythonic. We know — and we have actually tackled this challenge before for another language (Rust). And we think we can do this even better for Python.

We launched workers-rs in 2021 to make it possible to write Workers in Rust. For each JavaScript API in Workers, we, alongside open-source contributors, have written bindings that expose a more idiomatic Rust API.

We plan to do the same for Python Workers — starting with the bindings to Workers AI and Vectorize. But while workers-rs requires that you use and update an external dependency, the APIs we provide with Python Workers will be built into the Workers runtime directly. Just update your compatibility date, and get the latest, most Pythonic APIs.

This is about more than just making bindings to resources on Cloudflare more Pythonic though — it’s about compatibility with the ecosystem.

Similar to how we recently converted workers-rs to use types from the http crate, which makes it easy to use the axum crate for routing, we aim to do the same for Python Workers. For example, the Python standard library provides a raw socket API, which many Python packages depend on. Workers already provides connect(), a JavaScript API for working with raw sockets. We see ways to provide at least a subset of the Python standard library’s socket API in Workers, enabling a broader set of Python packages to work on Workers, with less of a need for patches.

But ultimately, we hope to kick start an effort to create a standardized serverless API for Python. One that is easy to use for any Python developer and offers the same capabilities as JavaScript.

We’re just getting started with Python Workers

Providing true support for a new programming language is a big investment that goes far beyond making “hello world” work. We chose Python very intentionally — it’s the second most popular programming language after JavaScript — and we are committed to continuing to improve performance and widen our support for Python packages.

We’re grateful to the Pyodide maintainers and the broader Python community — and we’d love to hear from you. Drop into the Python Workers channel in the Cloudflare Developers Discord, or start a discussion on Github about what you’d like to see next and which Python packages you’d like us to support.

Wasm core dumps and debugging Rust in Cloudflare Workers

2023-08-14 Sven Sauleau

Post Syndicated from Sven Sauleau original http://blog.cloudflare.com/wasm-coredumps/

Wasm core dumps and debugging Rust in Cloudflare Workers

A clear sign of maturing for any new programming language or environment is how easy and efficient debugging them is. Programming, like any other complex task, involves various challenges and potential pitfalls. Logic errors, off-by-ones, null pointer dereferences, and memory leaks are some examples of things that can make software developers desperate if they can't pinpoint and fix these issues quickly as part of their workflows and tools.

WebAssembly (Wasm) is a binary instruction format designed to be a portable and efficient target for the compilation of high-level languages like Rust, C, C++, and others. In recent years, it has gained significant traction for building high-performance applications in web and serverless environments.

Cloudflare Workers has had first-party support for Rust and Wasm for quite some time. We've been using this powerful combination to bootstrap and build some of our most recent services, like D1, Constellation, and Signed Exchanges, to name a few.

Using tools like Wrangler, our command-line tool for building with Cloudflare developer products, makes streaming real-time logs from our applications running remotely easy. Still, to be honest, debugging Rust and Wasm with Cloudflare Workers involves a lot of the good old time-consuming and nerve-wracking printf'ing strategy.

What if there’s a better way? This blog is about enabling and using Wasm core dumps and how you can easily debug Rust in Cloudflare Workers.

What are core dumps?

In computing, a core dump consists of the recorded state of the working memory of a computer program at a specific time, generally when the program has crashed or otherwise terminated abnormally. They also add things like the processor registers, stack pointer, program counter, and other information that may be relevant to fully understanding why the program crashed.

In most cases, depending on the system’s configuration, core dumps are usually initiated by the operating system in response to a program crash. You can then use a debugger like gdb to examine what happened and hopefully determine the cause of a crash. gdb allows you to run the executable to try to replicate the crash in a more controlled environment, inspecting the variables, and much more. The Windows' equivalent of a core dump is a minidump. Other mature languages that are interpreted, like Python, or languages that run inside a virtual machine, like Java, also have their ways of generating core dumps for post-mortem analysis.

Core dumps are particularly useful for post-mortem debugging, determining the conditions that lead to a failure after it has occurred.

WebAssembly core dumps

WebAssembly has had a proposal for implementing core dumps in discussion for a while. It's a work-in-progress experimental specification, but it provides basic support for the main ideas of post-mortem debugging, including using the DWARF (debugging with attributed record formats) debug format, the same that Linux and gdb use. Some of the most popular Wasm runtimes, like Wasmtime and Wasmer, have experimental flags that you can enable and start playing with Wasm core dumps today.

If you run Wasmtime or Wasmer with the flag:

--coredump-on-trap=/path/to/coredump/file

The core dump file will be emitted at that location path if a crash happens. You can then use tools like wasmgdb to inspect the file and debug the crash.

But let's dig into how the core dumps are generated in WebAssembly, and what’s inside them.

How are Wasm core dumps generated

(and what’s inside them)

When WebAssembly terminates execution due to abnormal behavior, we say that it entered a trap. With Rust, examples of operations that can trap are accessing out-of-bounds addresses or a division by zero arithmetic call. You can read about the security model of WebAssembly to learn more about traps.

The core dump specification plugs into the trap workflow. When WebAssembly crashes and enters a trap, core dumping support kicks in and starts unwinding the call stack gathering debugging information. For each frame in the stack, it collects the function parameters and the values stored in locals and in the stack, along with binary offsets that help us map to exact locations in the source code. Finally, it snapshots the memory and captures information like the tables and the global variables.

DWARF is used by many mature languages like C, C++, Rust, Java, or Go. By emitting DWARF information into the binary at compile time a debugger can provide information such as the source name and the line number where the exception occurred, function and argument names, and more. Without DWARF, the core dumps would be just pure assembly code without any contextual information or metadata related to the source code that generated it before compilation, and they would be much harder to debug.

WebAssembly uses a (lighter) version of DWARF that maps functions, or a module and local variables, to their names in the source code (you can read about the WebAssembly name section for more information), and naturally core dumps use this information.

All this information for debugging is then bundled together and saved to the file, the core dump file.

The core dump structure has multiple sections, but the most important are:

General information about the process;
The threads and their stack frames (note that WebAssembly is single threaded in Cloudflare Workers);
A snapshot of the WebAssembly linear memory or only the relevant regions;
Optionally, other sections like globals, data, or table.

Here’s the thread definition from the core dump specification:

corestack   ::= customsec(thread-info vec(frame))
thread-info ::= 0x0 thread-name:name ...
frame       ::= 0x0 ... funcidx:u32 codeoffset:u32 locals:vec(value)
                stack:vec(value)

A thread is a custom section called corestack. A corestack section contains the thread name and a vector (or array) of frames. Each frame contains the function index in the WebAssembly module (funcidx), the code offset relative to the function's start (codeoffset), the list of locals, and the list of values in the stack.

Values are defined as follows:

value ::= 0x01       => ∅
        | 0x7F n:i32 => n
        | 0x7E n:i64 => n
        | 0x7D n:f32 => n
        | 0x7C n:f64 => n

At the time of this writing these are the possible numbers types in a value. Again, we wanted to describe the basics; you should track the full specification to get more detail or find information about future changes. WebAssembly core dump support is in its early stages of specification and implementation, things will get better, things might change.

This is all great news. Unfortunately, however, the Cloudflare Workers runtime doesn’t support WebAssembly core dumps yet. There is no technical impediment to adding this feature to workerd; after all, it's based on V8, but since it powers a critical part of our production infrastructure and products, we tend to be conservative when it comes to adding specifications or standards that are still considered experimental and still going through the definitions phase.

So, how do we get core Wasm dumps in Cloudflare Workers today?

Polyfilling

Polyfilling means using userland code to provide modern functionality in older environments that do not natively support it. Polyfills are widely popular in the JavaScript community and the browser environment; they've been used extensively to address issues where browser vendors still didn't catch up with the latest standards, or when they implement the same features in different ways, or address cases where old browsers can never support a new standard.

Meet wasm-coredump-rewriter, a tool that you can use to rewrite a Wasm module and inject the core dump runtime functionality in the binary. This runtime code will catch most traps (exceptions in host functions are not yet catched and memory violation not by default) and generate a standard core dump file. To some degree, this is similar to how Binaryen's Asyncify works.

Let’s look at code and see how this works. He’s some simple pseudo code:

export function entry(v1, v2) {
    return addTwo(v1, v2)
}

function addTwo(v1, v2) {
  res = v1 + v2;
  throw "something went wrong";

  return res
}

An imaginary compiler could take that source and generate the following Wasm binary code:

  (func $entry (param i32 i32) (result i32)
    (local.get 0)
    (local.get 1)
    (call $addTwo)
  )

  (func $addTwo (param i32 i32) (result i32)
    (local.get 0)
    (local.get 1)
    (i32.add)
    (unreachable) ;; something went wrong
  )

  (export "entry" (func $entry))

“;;” is used to denote a comment.

entry() is the Wasm function exported to the host. In an environment like the browser, JavaScript (being the host) can call entry().

Irrelevant parts of the code have been snipped for brevity, but this is what the Wasm code will look like after wasm-coredump-rewriter rewrites it:

  (func $entry (type 0) (param i32 i32) (result i32)
    ...
    local.get 0
    local.get 1
    call $addTwo ;; see the addTwo function bellow
    global.get 2 ;; is unwinding?
    if  ;; label = @1
      i32.const x ;; code offset
      i32.const 0 ;; function index
      i32.const 2 ;; local_count
      call $coredump/start_frame
      local.get 0
      call $coredump/add_i32_local
      local.get 1
      call $coredump/add_i32_local
      ...
      call $coredump/write_coredump
      unreachable
    end)

  (func $addTwo (type 0) (param i32 i32) (result i32)
    local.get 0
    local.get 1
    i32.add
    ;; the unreachable instruction was here before
    call $coredump/unreachable_shim
    i32.const 1 ;; funcidx
    i32.const 2 ;; local_count
    call $coredump/start_frame
    local.get 0
    call $coredump/add_i32_local
    local.get 1
    call $coredump/add_i32_local
    ...
    return)

  (export "entry" (func $entry))

As you can see, a few things changed:

The (unreachable) instruction in addTwo() was replaced by a call to $coredump/unreachable_shim which starts the unwinding process. Then, the location and debugging data is captured, and the function returns normally to the entry() caller.
Code has been added after the addTwo() call instruction in entry() that detects if we have an unwinding process in progress or not. If we do, then it also captures the local debugging data, writes the core dump file and then, finally, moves to the unconditional trap unreachable.

In short, we unwind until the host function entry() gets destroyed by calling unreachable.

Let’s go over the runtime functions that we inject for more clarity, stay with us:

$coredump/start_frame(funcidx, local_count) starts a new frame in the coredump.
$coredump/add_*_local(value) captures the values of function arguments and in locals (currently capturing values from the stack isn’t implemented.)
$coredump/write_coredump is used at the end and writes the core dump in memory. We take advantage of the first 1 KiB of the Wasm linear memory, which is unused, to store our core dump.

A diagram is worth a thousand words:

Wait, what’s this about the first 1 KiB of the memory being unused, you ask? Well, it turns out that most WebAssembly linters and tools, including Emscripten and WebAssembly’s LLVM don’t use the first 1 KiB of memory. Rust and Zig also use LLVM, but they changed the default. This isn’t pretty, but the hugely popular Asyncify polyfill relies on the same trick, so there’s reasonable support until we find a better way.

But we digress, let’s continue. After the crash, the host, typically JavaScript in the browser, can now catch the exception and extract the core dump from the Wasm instance’s memory:

try {
    wasmInstance.exports.someExportedFunction();
} catch(err) {
    const image = new Uint8Array(wasmInstance.exports.memory.buffer);
    writeFile("coredump." + Date.now(), image);
}

If you're curious about the actual details of the core dump implementation, you can find the source code here. It was written in AssemblyScript, a TypeScript-like language for WebAssembly.

This is how we use the polyfilling technique to implement Wasm core dumps when the runtime doesn’t support them yet. Interestingly, some Wasm runtimes, being optimizing compilers, are likely to make debugging more difficult because function arguments, locals, or functions themselves can be optimized away. Polyfilling or rewriting the binary could actually preserve more source-level information for debugging.

You might be asking what about performance? We did some testing and found that the impact is negligible; the cost-benefit of being able to debug our crashes is positive. Also, you can easily turn wasm core dumps on or off for specific builds or environments; deciding when you need them is up to you.

Debugging from a core dump

We now know how to generate a core dump, but how do we use it to diagnose and debug a software crash?

Similarly to gdb (GNU Project Debugger) on Linux, wasmgdb is the tool you can use to parse and make sense of core dumps in WebAssembly; it understands the file structure, uses DWARF to provide naming and contextual information, and offers interactive commands to navigate the data. To exemplify how it works, wasmgdb has a demo of a Rust application that deliberately crashes; we will use it.

Let's imagine that our Wasm program crashed, wrote a core dump file, and we want to debug it.

$ wasmgdb source-program.wasm /path/to/coredump
wasmgdb>

When you fire wasmgdb, you enter a REPL (Read-Eval-Print Loop) interface, and you can start typing commands. The tool tries to mimic the gdb command syntax; you can find the list here.

Let's examine the backtrace using the bt command:

wasmgdb> bt
#18     000137 as __rust_start_panic () at library/panic_abort/src/lib.rs
#17     000129 as rust_panic () at library/std/src/panicking.rs
#16     000128 as rust_panic_with_hook () at library/std/src/panicking.rs
#15     000117 as {closure#0} () at library/std/src/panicking.rs
#14     000116 as __rust_end_short_backtrace<std::panicking::begin_panic_handler::{closure_env#0}, !> () at library/std/src/sys_common/backtrace.rs
#13     000123 as begin_panic_handler () at library/std/src/panicking.rs
#12     000194 as panic_fmt () at library/core/src/panicking.rs
#11     000198 as panic () at library/core/src/panicking.rs
#10     000012 as calculate (value=0x03000000) at src/main.rs
#9      000011 as process_thing (thing=0x2cff0f00) at src/main.rs
#8      000010 as main () at src/main.rs
#7      000008 as call_once<fn(), ()> (???=0x01000000, ???=0x00000000) at /rustc/b833ad56f46a0bbe0e8729512812a161e7dae28a/library/core/src/ops/function.rs
#6      000020 as __rust_begin_short_backtrace<fn(), ()> (f=0x01000000) at /rustc/b833ad56f46a0bbe0e8729512812a161e7dae28a/library/std/src/sys_common/backtrace.rs
#5      000016 as {closure#0}<()> () at /rustc/b833ad56f46a0bbe0e8729512812a161e7dae28a/library/std/src/rt.rs
#4      000077 as lang_start_internal () at library/std/src/rt.rs
#3      000015 as lang_start<()> (main=0x01000000, argc=0x00000000, argv=0x00000000, sigpipe=0x00620000) at /rustc/b833ad56f46a0bbe0e8729512812a161e7dae28a/library/std/src/rt.rs
#2      000013 as __original_main () at <directory not found>/<file not found>
#1      000005 as _start () at <directory not found>/<file not found>
#0      000264 as _start.command_export at <no location>

Each line represents a frame from the program's call stack; see frame #3:

#3      000015 as lang_start<()> (main=0x01000000, argc=0x00000000, argv=0x00000000, sigpipe=0x00620000) at /rustc/b833ad56f46a0bbe0e8729512812a161e7dae28a/library/std/src/rt.rs

You can read the funcidx, function name, arguments names and values and source location are all present. Let's select frame #9 now and inspect the locals, which include the function arguments:

wasmgdb> f 9
000011 as process_thing (thing=0x2cff0f00) at src/main.rs
wasmgdb> info locals
thing: *MyThing = 0xfff1c

Let’s use the p command to inspect the content of the thing argument:

wasmgdb> p (*thing)
thing (0xfff2c): MyThing = {
    value (0xfff2c): usize = 0x00000003
}

You can also use the p command to inspect the value of the variable, which can be useful for nested structures:

wasmgdb> p (*thing)->value
value (0xfff2c): usize = 0x00000003

And you can use p to inspect memory addresses. Let’s point at 0xfff2c, the start of the MyThing structure, and inspect:

wasmgdb> p (MyThing) 0xfff2c
0xfff2c (0xfff2c): MyThing = {
    value (0xfff2c): usize = 0x00000003
}

All this information in every step of the stack is very helpful to determine the cause of a crash. In our test case, if you look at frame #10, we triggered an integer overflow. Once you get comfortable walking through wasmgdb and using its commands to inspect the data, debugging core dumps will be another powerful skill under your belt.

Tidying up everything in Cloudflare Workers

We learned about core dumps and how they work, and we know how to make Cloudflare Workers generate them using the wasm-coredump-rewriter polyfill, but how does all this work in practice end to end?

We've been dogfooding the technique described in this blog at Cloudflare for a while now. Wasm core dumps have been invaluable in helping us debug Rust-based services running on top of Cloudflare Workers like D1, Privacy Edge, AMP, or Constellation.

Today we're open-sourcing the Wasm Coredump Service and enabling anyone to deploy it. This service collects the Wasm core dumps originating from your projects and applications when they crash, parses them, prints an exception with the stack information in the logs, and can optionally store the full core dump in a file in an R2 bucket (which you can then use with wasmgdb) or send the exception to Sentry.

We use a service binding to facilitate the communication between your application Worker and the Coredump service Worker. A Service binding allows you to send HTTP requests to another Worker without those requests going over the Internet, thus avoiding network latency or having to deal with authentication. Here’s a diagram of how it works:

Using it is as simple as npm/yarn installing @cloudflare/wasm-coredump, configuring a few options, and then adding a few lines of code to your other applications running in Cloudflare Workers, in the exception handling logic:

import shim, { getMemory, wasmModule } from "../build/worker/shim.mjs"

const timeoutSecs = 20;

async function fetch(request, env, ctx) {
    try {
        // see https://github.com/rustwasm/wasm-bindgen/issues/2724.
        return await Promise.race([
            shim.fetch(request, env, ctx),
            new Promise((r, e) => setTimeout(() => e("timeout"), timeoutSecs * 1000))
        ]);
    } catch (err) {
      const memory = getMemory();
      const coredumpService = env.COREDUMP_SERVICE;
      await recordCoredump({ memory, wasmModule, request, coredumpService });
      throw err;
    }
}

The ../build/worker/shim.mjs import comes from the worker-build tool, from the workers-rs packages and is automatically generated when wrangler builds your Rust-based Cloudflare Workers project. If the Wasm throws an exception, we catch it, extract the core dump from memory, and send it to our Core dump service.

You might have noticed that we race the workers-rs shim.fetch() entry point with another Promise to generate a timeout exception if the Rust code doesn't respond earlier. This is because currently, wasm-bindgen, which generates the glue between the JavaScript and Rust land, used by workers-rs, has an issue where a Promise might not be rejected if Rust panics asynchronously (leading to the Worker runtime killing the worker with “Error: The script will never generate a response”.). This can block the wasm-coredump code and make the core dump generation flaky.

We are working to improve this, but in the meantime, make sure to adjust timeoutSecs to something slightly bigger than the typical response time of your application.

Here’s an example of a Wasm core dump exception in Sentry:

You can find a working example, the Sentry and R2 configuration options, and more details in the @cloudflare/wasm-coredump GitHub repository.

Too big to fail

It's worth mentioning one corner case of this debugging technique and the solution: sometimes your codebase is so big that adding core dump and DWARF debugging information might result in a Wasm binary that is too big to fit in a Cloudflare Worker. Well, worry not; we have a solution for that too.

Fortunately the DWARF for WebAssembly specification also supports external DWARF files. To make this work, we have a tool called debuginfo-split that you can add to the build command in the wrangler.toml configuration:

command = "... && debuginfo-split ./build/worker/index.wasm"

What this does is it strips the debugging information from the Wasm binary, and writes it to a new separate file called debug-{UUID}.wasm. You then need to upload this file to the same R2 bucket used by the Wasm Coredump Service (you can automate this as part of your CI or build scripts). The same UUID is also injected into the main Wasm binary; this allows us to correlate the Wasm binary with its corresponding DWARF debugging information. Problem solved.

Binaries without DWARF information can be significantly smaller. Here’s our example:

4.5 MiB	debug-63372dbe-41e6-447d-9c2e-e37b98e4c656.wasm
313 KiB	build/worker/index.wasm

Final words

We hope you enjoyed reading this blog as much as we did writing it and that it can help you take your Wasm debugging journeys, using Cloudflare Workers or not, to another level.

Note that while the examples used here were around using Rust and WebAssembly because that's a common pattern, you can use the same techniques if you're compiling WebAssembly from other languages like C or C++.

Also, note that the WebAssembly core dump standard is a hot topic, and its implementations and adoption are evolving quickly. We will continue improving the wasm-coredump-rewriter, debuginfo-split, and wasmgdb tools and the wasm-coredump service. More and more runtimes, including V8, will eventually support core dumps natively, thus eliminating the need to use polyfills, and the tooling, in general, will get better; that's a certainty. For now, we present you with a solution that works today, and we have strong incentives to keep supporting it.

As usual, you can talk to us on our Developers Discord or the Community forum or open issues or PRs in our GitHub repositories; the team will be listening.

Use the language of your choice with Pages Functions via WebAssembly

2023-03-24 Carmen Popoviciu

Post Syndicated from Carmen Popoviciu original https://blog.cloudflare.com/pages-functions-with-webassembly/

Use the language of your choice with Pages Functions via WebAssembly

On the Cloudflare Developer Platform, we understand that building any application is a unique experience for every developer. We know that in the developer ecosystem there are a plethora of tools to choose from and as a developer you have preferences and needs. We don’t believe there are “right” or “wrong” tools to use in development and want to ensure a good developer experience no matter your choices. We believe in meeting you where you are.

When Pages Functions moved to Generally Available in November of last year, we knew it was the key that unlocks a variety of use cases – namely full-stack applications! However, we still felt we could do more to provide the flexibility for you to build what you want and how you want.

That’s why today we’re opening the doors to developers who want to build their server side applications with something other than JavaScript. We’re excited to announce WebAssembly support for Pages Functions projects!

WebAssembly (or Wasm) is a low-level assembly-like language that can run with near-native performance. It provides programming languages such as C/C++, C# or Rust with a compilation target, enabling them to run alongside JavaScript. Primarily designed to run on the web (though not exclusively), WebAssembly opens up exciting opportunities for applications to run on the web platform, both on the client and the server, that up until now couldn’t have done so.

With Pages Functions being Workers “under the hood” and Workers having Wasm module support for quite some time, it is only natural that Pages provides a similar experience for our users as well. While not all use cases are a good fit for Wasm, there are many that are. Our goal with adding Wasm support is enabling those use cases and expanding the boundaries of what Functions can build.

Using WebAssembly in Pages Functions

WebAssembly in Pages Functions works very similar to how it does today in Workers — we read wasm files as WebAssembly modules, ready for you to import and use directly from within your Functions. In short, like this:

// functions/api/distance-between.js

import wasmModule from "../../pkg/distance.wasm";

export async function onRequest({ request }) {
  const moduleInstance = await WebAssembly.instantiate(wasmModule);
  const distance = await moduleInstance.exports.distance_between();

  return new Response(distance);
}

Let’s briefly unpack the code snippet above to highlight some things that are important to understand.

import wasmModule from "../../pkg/distance.wasm";

Pages makes no assumptions as to how the binary .wasm files you want to import were compiled. In our example above, distance.wasm can be a file you compiled yourself out of code you wrote, or equally, a file provided in a third-party library’s distribution. The only thing Pages cares about is that distance.wasm is a compiled binary Wasm module file.

The result of that import is a WebAssembly.Module object, which you can then instantiate:

const moduleInstance = await WebAssembly.instantiate(wasmModule);

Once the WebAssembly.Instance object is created, you can start using whatever features your Wasm module exports, inside your Functions code:

const distance = await moduleInstance.exports.distance_between();

More modules, more fun!

Apart from Wasm modules, this work unlocks support for two other module types that you can import within your Functions code: text and binary. These are not standardized modules, but can be very handy if you need to import raw text blobs (such as HTML files) as a string:

// functions/my-function.js
import html from "404.html";

export async function onRequest() {
  return new Response(html,{
    headers: { "Content-Type": "text/html" }
  });
}

or raw data blobs (such as images) as an ArrayBuffer.

// functions/my-function.js
import image from "../hearts.png.bin";

export async function onRequest() {
  return new Response(image,{
    headers: { "Content-Type": "image/png" }
  });
}

The distance between us on the surface of Earth

Let’s take a look at a live example to see it all in action! We’ve built a small demo app that walks you through an example of Functions with WebAssembly end-to-end. You can check out the code of our demo application available on GitHub.

The application computes the distance in kilometers on the surface of Earth between your current location (based on the geo coordinates of the incoming request) and any other point on the globe, each time you click on the globe’s surface.

The code that performs the actual high-performance distance calculation is written in Rust, and is a slight adaptation of the example provided in the Rust cookbook:

fn distance_between(from_latitude_degrees: f64, from_longitude_degrees: f64, to_latitude_degrees: f64, to_longitude_degrees: f64) -> f64 {
    let earth_radius_kilometer = 6371.0_f64;

    let from_latitude = from_latitude_degrees.to_radians();
    let to_latitude = to_latitude_degrees.to_radians();

    let delta_latitude = (from_latitude_degrees - to_latitude_degrees).to_radians();
    let delta_longitude = (from_longitude_degrees - to_longitude_degrees).to_radians();

    let central_angle_inner = (delta_latitude / 2.0).sin().powi(2)
        + from_latitude.cos() * to_latitude.cos() * (delta_longitude / 2.0).sin().powi(2);
    let central_angle = 2.0 * central_angle_inner.sqrt().asin();

    let distance = earth_radius_kilometer * central_angle;
    
    return distance;
}

We have a Rust playground experiment available here, in case you want to play around with this code snippet in particular.

To use the distance_between() Rust function in Pages Functions, we first compile the code to WebAssembly using wasm-pack:

##
# generate the `pkg` folder which will contain the wasm binary
##
wasm-pack build

Then, we import the generated .wasm artifact from inside our distance-between.js Pages Function. Now, each time you click on the globe surface, a request to /api/distance-between is made, which will trigger the distance_between() function to execute. Once computed, the distance value is returned by our Function, back to the client, which proceeds to display the value to the user.

We want to point out that this application could have been built entirely in JavaScript, however, we equally wanted to show just how simple it is to build it with Rust. The decision to use Rust was motivated by a few factors. First, the tooling ecosystem for building and working with Rust-generated WebAssembly is quite mature, well documented, and easy to get started with. Second, the Rust docs are a fantastic resource if you are new to Rust or to Rust with WebAssembly! If you are looking for a step-by-step tutorial on how to generate and set up a Rust and WebAssembly project, we highly recommend checking out Rust’s official WebAssembly Book.

We hope it gives you a solid starting point in exploring what is possible with Wasm on Pages Functions, and inspires you to create some powerful applications of your own. Head over to our docs to get started today!

Running Zig with WASI on Cloudflare Workers

2022-08-01 Daniel Harper

Post Syndicated from Daniel Harper original https://blog.cloudflare.com/running-zig-with-wasi-on-cloudflare-workers/

Running Zig with WASI on Cloudflare Workers

After the recent announcement regarding WASI support in Workers, I decided to see what it would take to get code written in Zig to run as a Worker, and it turned out to be trivial. This post documents the process I followed as a new user of Zig. It’s so exciting to see how Cloudflare Workers is a polyglot platform allowing you to write programs in the language you love, or the language you’re learning!

Hello, World!

I’m not a Zig expert by any means, and to keep things entirely honest I’ve only just started looking into the language, but we all have to start somewhere. So, if my Zig code isn’t perfect please bear with me. My goal was to build a real, small program using Zig and deploy it on Cloudflare Workers. And to see how fast I can go from a blank screen to production code.

My goal for this wasn’t ambitious, just read some text from stdin and print it to stdout with line numbers, like running cat -n. But it does show just how easy the Workers paradigm is. This Zig program works identically on the command-line on my laptop and as an HTTP API deployed on Cloudflare Workers.

Here’s my code. It reads a line from stdin and outputs the same line prefixed with a line number. It terminates when there’s no more input.

const std = @import("std");

pub fn main() anyerror!void {
	// setup allocator
	var gpa = std.heap.GeneralPurposeAllocator(.{}){};
	defer std.debug.assert(!gpa.deinit());
	const allocator = gpa.allocator();

	// setup streams
	const stdout = std.io.getStdOut().writer();
	const in = std.io.getStdIn();
	var reader = std.io.bufferedReader(in.reader()).reader();

	var counter: u32 = 1;

	// read input line by line
	while (try reader.readUntilDelimiterOrEofAlloc(allocator, '\n', std.math.maxInt(usize))) |line| {
    	    defer allocator.free(line);
    	    try stdout.print("{d}\t{s}\n", .{counter, line});
    	    counter = counter + 1;
	}
}

To build Zig code, you create a build.zig file that defines how to build your project. For this trivial case I just opted to build an executable from the sources

const std = @import("std");

pub fn build(b: *std.build.Builder) void {
	const target = b.standardTargetOptions(.{});
	const mode = b.standardReleaseOptions();

	const exe = b.addExecutable("print-with-line-numbers", "src/main.zig");
	exe.setTarget(target);
	exe.setBuildMode(mode);
	exe.install();
}

By running zig build the compiler will run and output a binary under zig-out/bin

$ zig build

$ ls zig-out/bin
print-with-line-numbers

$ echo "Hello\nWorld" | ./zig-out/bin/print-with-line-numbers
1    Hello
2    World

WASI

The next step is to get this running on Workers, but first I need to compile it into WASM with WASI support.

Thankfully, this comes out of the box with recent versions of Zig, so you can just tell the compiler to build your executable using the wasm32-wasi target, which will produce a file that can be run on any WASI-compatible WebAssembly runtime, such as wasmtime.

This same .wasm file can be run in wasmtime and deployed directly to Cloudflare Workers. This makes building, testing and deploying seamless.

$ zig build -Dtarget=wasm32-wasi

$ ls zig-out/bin
print-with-line-numbers.wasm

$ echo "Hello\nWorld" | wasmtime ./zig-out/bin/print-with-line-numbers.wasm
1    Hello
2    World

Zig on Workers

With our binary ready to go, the last piece is to get it running on Cloudflare Workers using wrangler2. That is as simple as publishing the .wasm file on workers.dev. If you don’t have a workers.dev account, you can follow the tutorial on our getting started guide that will get you from code to deployment within minutes!

In fact, once I signed up for my account, all I needed to do was complete the first two steps, install wrangler and login.

$ npx wrangler@wasm login
Attempting to login via OAuth...
Opening a link in your default browser: https://dash.cloudflare.com/oauth2/auth
Successfully logged in.

Then, I ran the following command to publish my worker:

$ npx wrangler@wasm publish --name print-with-line-numbers --compatibility-date=2022-07-07 zig-out/bin/print-with-line-numbers.wasm
Uploaded print-with-line-numbers (3.04 sec)
Published print-with-line-numbers (6.28 sec)
  print-with-line-numbers.workers.dev

With that step completed, the worker is ready to run and can be invoked by calling the URL printed from the output above.

echo "Hello\nWorld" | curl https://print-with-line-numbers.workers.dev -X POST --data-binary @-
1    Hello
2    World

Success!

Conclusion

What impressed me the most here was just how easy this process was.

First, I had a binary compiled for the architecture of my laptop, then I compiled the code into WebAssembly by just passing a flag to the compiler, and finally I had this running on workers without having to change any code.

Granted, this program was not very complicated and does not do anything other than read from STDIN and write to STDOUT, but it gives me confidence of what is possible, especially as technology like WASI matures.

Announcing support for WASI on Cloudflare Workers

2022-07-07 Ben Yule

Post Syndicated from Ben Yule original https://blog.cloudflare.com/announcing-wasi-on-workers/

Announcing support for WASI on Cloudflare Workers

Today, we are announcing experimental support for WASI (the WebAssembly System Interface) on Cloudflare Workers and support within wrangler2 to make it a joy to work with. We continue to be incredibly excited about the entire WebAssembly ecosystem and are eager to adopt the standards as they are developed.

A Quick Primer on WebAssembly

So what is WASI anyway? To understand WASI, and why we’re excited about it, it’s worth a quick recap of WebAssembly, and the ecosystem around it.

WebAssembly promised us a future in which code written in compiled languages could be compiled to a common binary format and run in a secure sandbox, at near native speeds. While WebAssembly was designed with the browser in mind, the model rapidly extended to server-side platforms such as Cloudflare Workers (which has supported WebAssembly since 2017).

WebAssembly was originally designed to run alongside Javascript, and requires developers to interface directly with Javascript in order to access the world outside the sandbox. To put it another way, WebAssembly does not provide any standard interface for I/O tasks such as interacting with files, accessing the network, or reading the system clock. This means if you want to respond to an event from the outside world, it’s up to the developer to handle that event in JavaScript, and directly call functions exported from the WebAssembly module. Similarly, if you want to perform I/O from within WebAssembly, you need to implement that logic in Javascript and import it into the WebAssembly module.

Custom toolchains such as Emscripten or libraries such as wasm-bindgen have emerged to make this easier, but they are language specific and add a tremendous amount of complexity and bloat. We’ve even built our own library, workers-rs, using wasm-bindgen that attempts to make writing applications in Rust feel native within a Worker – but this has proven not only difficult to maintain, but requires developers to write code that is Workers specific, and is not portable outside the Workers ecosystem.

We need more.

The WebAssembly System Interface (WASI)

WASI aims to provide a standard interface that any language compiling to WebAssembly can target. You can read the original post by Lin Clark here, which gives an excellent introduction – code cartoons and all. In a nutshell, Lin describes WebAssembly as an assembly language for a ‘conceptual machine’, whereas WASI is a systems interface for a ‘conceptual operating system.’

This standardization of the system interface has paved the way for existing toolchains to cross-compile existing codebases to the wasm32-wasi target. A tremendous amount of progress has already been made, specifically within Clang/LLVM via the wasi-sdk and Rust toolchains. These toolchains leverage a version of Libc, which provides POSIX standard API calls, that is built on top of WASI ‘system calls.’ There are even basic implementations in more fringe toolchains such as TinyGo and SwiftWasm.

Practically speaking, this means that you can now write applications that not only interoperate with any WebAssembly runtime implementing the standard, but also any POSIX compliant system! This means the exact same ‘Hello World!’ that runs on your local Linux/Mac/Windows WSL machine.

Show me the code

WASI sounds great, but does it actually make my life easier? You tell us. Let’s run through an example of how this would work in practice.

First, let’s generate a basic Rust “Hello, world!” application, compile, and run it.

$ cargo new hello_world
$ cd ./hello_world
$ cargo build --release
   Compiling hello_world v0.1.0 (/Users/benyule/hello_world)
    Finished release [optimized] target(s) in 0.28s
$ ./target/release/hello_world
Hello, world!

It doesn’t get much simpler than this. You’ll notice we only define a main() function followed by a println to stdout.

fn main() {
    println!("Hello, world!");
}

Now, let’s take the exact same program and compile against the wasm32-wasi target, and run it in an ‘off the shelf’ wasm runtime such as Wasmtime.

$ cargo build --target wasm32-wasi --release
$ wasmtime target/wasm32-wasi/release/hello_world.wasm

Hello, world!

Neat! The same code compiles and runs in multiple POSIX environments.

Finally, let’s take the binary we just generated for Wasmtime, but instead publish it to Workers using Wrangler2.

$ npx wrangler@wasm dev target/wasm32-wasi/release/hello_world.wasm
$ curl http://localhost:8787/

Hello, world!

Unsurprisingly, it works! The same code is compatible in multiple POSIX environments and the same binary is compatible across multiple WASM runtimes.

Running your CLI apps in the cloud

The attentive reader may notice that we played a small trick with the HTTP request made via cURL. In this example, we actually stream stdin and stdout to/frome the Worker using the HTTP request and response body respectively. This pattern enables some really interesting use cases, specifically, programs designed to run on the command line can be deployed as ‘services’ to the cloud.

‘Hexyl’ is an example that works completely out of the box. Here, we ‘cat’ a binary file on our local machine and ‘pipe’ the output to curl, which will then POST that output to our service and stream the result back. Following the steps we used to compile our ‘Hello World!’, we can compile hexyl.

$ git clone [email protected]:sharkdp/hexyl.git
$ cd ./hexyl
$ cargo build --target wasm32-wasi --release

And without further modification we were able to take a real-world program and create something we can now run or deploy. Again, let’s tell wrangler2 to preview hexyl, but this time give it some input.

$ npx wrangler@wasm dev target/wasm32-wasi/release/hexyl.wasm
$ echo "Hello, world\!" | curl -X POST --data-binary @- http://localhost:8787

┌────────┬─────────────────────────┬─────────────────────────┬────────┬────────┐
│00000000│ 48 65 6c 6c 6f 20 77 6f ┊ 72 6c 64 21 0a          │Hello wo┊rld!_   │
└────────┴─────────────────────────┴─────────────────────────┴────────┴────────┘

Give it a try yourself by hitting https://hexyl.examples.workers.dev.

echo "Hello world\!" | curl https://hexyl.examples.workers.dev/ -X POST --data-binary @- --output -

A more useful example, but requires a bit more work, would be to deploy a utility such as swc (swc.rs), to the cloud and use it as an on demand JavaScript/TypeScript transpilation service. Here, we have a few extra steps to ensure that the compiled output is as small as possible, but it otherwise runs out-of-the-box. Those steps are detailed in https://github.com/zebp/wasi-example-swc, but for now let’s gloss over that and interact with the hosted example.

$ echo "const x = (x, y) => x * y;" | curl -X POST --data-binary @- https://swc-wasi.examples.workers.dev/ --output -

var x=function(a,b){return a*b}

Finally, we can also do the same with C/C++, but requires a little more lifting to get our Makefile right. Here we show an example of compiling zstd and uploading it as a streaming compression service.

https://github.com/zebp/wasi-example-zstd

$ echo "Hello world\!" | curl https://zstd.examples.workers.dev/ -s -X POST --data-binary @- | file -

What if I want to use WASI from within a JavaScript Worker?

Wrangler can make it really easy to deploy code without having to worry about the Workers ecosystem, but in some cases you may actually want to invoke your WASI based WASM module from Javascript. This can be achieved with the following simple boilerplate. An updated README will be kept at https://github.com/cloudflare/workers-wasi.

import { WASI } from "@cloudflare/workers-wasi";
import demoWasm from "./demo.wasm";

export default {
  async fetch(request, _env, ctx) {
    // Creates a TransformStream we can use to pipe our stdout to our response body.
    const stdout = new TransformStream();
    const wasi = new WASI({
      args: [],
      stdin: request.body,
      stdout: stdout.writable,
    });

    // Instantiate our WASM with our demo module and our configured WASI import.
    const instance = new WebAssembly.Instance(demoWasm, {
      wasi_snapshot_preview1: wasi.wasiImport,
    });

    // Keep our worker alive until the WASM has finished executing.
    ctx.waitUntil(wasi.start(instance));

    // Finally, let's reply with the WASM's output.
    return new Response(stdout.readable);
  },
};

Now with our JavaScript boilerplate and wasm, we can easily deploy our worker with Wrangler’s WASM feature.

$ npx wrangler publish
Total Upload: 473.89 KiB / gzip: 163.79 KiB
Uploaded wasi-javascript (2.75 sec)
Published wasi-javascript (0.30 sec)
  wasi-javascript.zeb.workers.dev

Back to the future

For those of you who have been around for the better part of the past couple of decades, you may notice this looks very similar to RFC3875, better known as CGI (The Common Gateway Interface). While our example here certainly does not conform to the specification, you can imagine how this can be extended to turn the stdin of a basic ‘command line’ application into a full-blown http handler.

We are thrilled to learn where developers take this from here. Share what you build with us on Discord or Twitter!

Native Rust support on Cloudflare Workers

2021-09-09 Steve Manuel

Post Syndicated from Steve Manuel original https://blog.cloudflare.com/workers-rust-sdk/

Native Rust support on Cloudflare Workers

You can now write Cloudflare Workers in 100% Rust, no JavaScript required. Try it out: https://github.com/cloudflare/workers-rs

Cloudflare Workers has long supported the building blocks to run many languages using WebAssembly. However, there has always been a challenging “trampoline” step required to allow languages like Rust to talk to JavaScript APIs such as fetch().

In addition to the sizable amount of boilerplate needed, lots of “off the shelf” bindings between languages don’t include support for Cloudflare APIs such as KV and Durable Objects. What we wanted was a way to write a Worker in idiomatic Rust, quickly, and without needing knowledge of the host JavaScript environment. While we had a nice “starter” template that made it easy enough to pull in some Rust libraries and use them from JavaScript, the barrier was still too high if your goal was to write a full program in Rust and ship it to our edge.

Not anymore!

Introducing the worker crate, available on GitHub and crates.io, which makes Rust developers feel right at home on the Workers platform by running code inside the V8 WebAssembly engine. In the snippet below, you can see how the worker crate does all the heavy lifting by providing Rustacean-friendly Workers APIs.

use worker::*;

#[event(fetch)]
pub async fn main(req: Request, env: Env) -> Result<Response> {
    console_log!(
        "{} {}, located at: {:?}, within: {}",
        req.method().to_string(),
        req.path(),
        req.cf().coordinates().unwrap_or_default(),
        req.cf().region().unwrap_or("unknown region".into())
    );

    if !matches!(req.method(), Method::Post) {
        return Response::error("Method Not Allowed", 405);
    }

    if let Some(file) = req.form_data().await?.get("file") {
        return match file {
            FormEntry::File(buf) => {
                Response::ok(&format!("size = {}", buf.bytes().await?.len()))
            }
            _ => Response::error("`file` part of POST form must be a file", 400),
        };
    }

    Response::error("Bad Request", 400)
}

Get your own Worker in Rust started with a single command:

# see installation instructions for our `wrangler` CLI at https://github.com/cloudflare/wrangler
# (requires v1.19.2 or higher)
$ wrangler generate --type=rust my-project

We’ve stripped away all the glue code, provided an ergonomic HTTP framework, and baked in what you need to build small scripts or full-fledged Workers apps in Rust. You’ll find fetch, a router, easy-to-use HTTP functionality, Workers KV stores and Durable Objects, secrets, and environment variables too. It’s all open source, and we’d love your feedback!

Why are we doing this?

Cloudflare Workers is on a mission to simplify the developer experience. When we took a hard look at the previous experience writing non-JavaScript Workers, we knew we could do better. Rust happens to be a great language for us to kick-start our mission: it has first-class support for WebAssembly, and a wonderful, growing ecosystem. Tools like wasm-bindgen, libraries like web-sys, and Rust’s powerful macro system gave us a significant starting-off point. Plus, Rust’s popularity is growing rapidly, and if our own use of Rust at Cloudflare is any indication, there is no question that Rust is staking its claim as a must-have in the developer toolbox.

So give it a try, leave some feedback, even open a PR! By the way, we’re always on the lookout for great people to join us, and we are hiring for many open roles (including Rust engineers!) — take a look.

Let’s build a Cloudflare Worker with WebAssembly and Haskell

2020-10-06 Cristhian Motoche

Post Syndicated from Cristhian Motoche original https://blog.cloudflare.com/cloudflare-worker-with-webassembly-and-haskell/

Let's build a Cloudflare Worker with WebAssembly and Haskell

This is a guest post by Cristhian Motoche of Stack Builders.

At Stack Builders, we believe that Haskell’s system of expressive static types offers many benefits to the software industry and the world-wide community that depends on our services. In order to fully realize these benefits, it is necessary to have proper training and access to an ecosystem that allows for reliable deployment of services. In exploring the tools that help us run our systems based on Haskell, our developer Cristhian Motoche has created a tutorial that shows how to compile Haskell to WebAssembly using Asterius for deployment on Cloudflare.

What is a Cloudflare Worker?

Cloudflare Workers is a serverless platform that allows us to run our code on the edge of the Cloudflare infrastructure. It’s built on Google V8, so it’s possible to write functionalities in JavaScript or any other language that targets WebAssembly.

WebAssembly is a portable binary instruction format that can be executed fast in a memory-safe sandboxed environment. For this reason, it’s especially useful for tasks that need to perform resource-demanding and self-contained operations.

Why use Haskell to target WebAssembly?

Haskell is a pure functional languages that can target WebAssembly. As such, It helps developers break down complex tasks into small functions that can later be composed to do complex tasks. Additionally, it’s statically typed and has type inference, so it will complain if there are type errors at compile time. Because of that and much more, Haskell is a good source language for targeting WebAssembly.

From Haskell to WebAssembly

We’ll use Asterius to target WebAssembly from Haskell. It’s a well documented tool that is updated often and supports a lot of Haskell features.

First, as suggested in the documentation, we’ll use podman to pull the Asterius prebuilt container from Docker hub. In this tutorial, we will use Asterius version 200617, which works with GHC 8.8.

podman run -it --rm -v $(pwd):/workspace -w /workspace terrorjack/asterius:200617

Now we’ll create a Haskell module called fact.hs file that will export a pure function:

module Factorial (fact) where

fact :: Int -> Int
fact n = go n 1
  where
    go 0 acc = acc
    go n acc = go (n - 1) (n*acc)

foreign export javascript "fact" fact :: Int -> Int

In this module, we define a pure function called fact, optimized with tail recursion and exported using the Asterius JavaScript FFI, so that it can be called when a WebAssembly module is instantiated in JavaScript.

Next, we’ll create a JavaScript file called fact_node.mjs that contains the following code:

import * as rts from "./rts.mjs";
import module from "./fact.wasm.mjs";
import req from "./fact.req.mjs";

async function handleModule(m) {
  const i = await rts.newAsteriusInstance(Object.assign(req, {module: m}));
  const result = await i.exports.fact(5);
  console.log(result);
}

module.then(handleModule);

This code imports rts.mjs (common runtime), WebAssembly loaders, and the required parameters for the Asterius instance. It creates a new Asterius instance, it calls the exported function fact with the input 5, and prints out the result.

You probably have noted that fact is done asynchronously. This happens with any exported function by Asterius, even if it’s a pure function.

Next, we’ll compile this code using the Asterius command line interface (CLI) ahc-link, and we’ll run the JavaScript code in Node:

ahc-link \
  --input-hs fact.hs \
  --no-main \
  --export-function=fact \
  --run \
  --input-mjs fact_node.mjs \
  --output-dir=node

This command takes fact.hs as a Haskell input file, specifies that no main function is exported, and exports the fact function. Additionally, it takes fact_node.mjs as the entry JavaScript file that replaces the generated file by default, and it places the generated code in a directory called node.

Running the ahc-link command from above will print the following output in the console:

[INFO] Compiling fact.hs to WebAssembly
...
[INFO] Running node/fact.mjs
120

As you can see, the result is executed in node and it prints out the result of fact in the console.

Push your code to Cloudflare Workers

Now we’ll set everything up for deploying our code to Cloudflare Workers.

First, let’s add a metadata.json file with the following content:

{
  "body_part": "script",
  "bindings": [
    {
      "type": "wasm_module",
      "name": "WASM",
      "part": "wasm"
    }
  ]
}

This file is needed to specify the wasm_module binding. The name value corresponds to the global variable to access the WebAssembly module from your Worker code. In our example, it’s going to have the name WASM.

Our next step is to define the main point of the Workers script.

import * as rts from "./rts.mjs";
import fact from "./fact.req.mjs";

async function handleFact(param) {
  const i = await rts.newAsteriusInstance(Object.assign(fact, { module: WASM }));
  return await i.exports.fact(param);
}

async function handleRequest(req) {
  if (req.method == "POST") {
    const data = await req.formData();
    const param = parseInt(data.get("param"));
    if (param) {
      const resp = await handleFact(param);
      return new Response(resp, {status: 200});
    } else {
      return new Response(
        "Expecting 'param' in request to be an integer",
        {status: 400},
      );
    }
  }
  return new Response("Method not allowed", {status: 405});
}

addEventListener("fetch", event => {
  event.respondWith(handleRequest(event.request))
})

There are a few interesting things to point out in this code:

We import rts.mjs and fact.req.mjs to load the exported functions from our WebAssembly module.
handleFact is an asynchronous function that creates an instance of Asterius with the global WASM module, as a Workers global variable, and calls the exported function fact with some input.
handleRequest handles the request of the Worker. It expects a POST request, with a parameter called param in the request body. If param is a number, it calls handleFact to respond with the result of fact.
Using the Service Workers API, we listen to the fetch event that will respond with the result of handleRequest.

We need to build and bundle our code in a single JavaScript file, because Workers only accepts one script per worker. Fortunately, Asterius comes with Parcel.js, which will bundle all the necessary code in a single JavaScript file.

ahc-link \
  --input-hs fact.hs \
  --no-main \
  --export-function=fact \
  --input-mjs fact_cfw.mjs \
  --bundle \
  --browser \
  --output-dir worker

ahc-link will generate some files inside a directory called worker. For our Workers, we’re only interested in the JavaScript file (fact.js) and the WebAssembly module (fact.wasm). Now, it’s time to submit both of them to Workers. We can do this with the provided REST API.

Make sure you have an account id ($CF_ACCOUNT_ID), a name for your script ($SCRIPT_NAME), and an API Token ($CF_API_TOKEN):

cd worker
curl -X PUT "https://api.cloudflare.com/client/v4/accounts/$CF_ACCOUNT_ID/workers/scripts/$SCRIPT_NAME" \
     -H  "Authorization: Bearer $CF_API_TOKEN" \
     -F "[email protected];type=application/json" \
     -F "[email protected];type=application/javascript" \
     -F "[email protected];type=application/wasm"

Now, visit the Workers UI, where you can use the editor to view, edit, and test the script. Also, you can enable it to on a workers.dev subdomain ($CFW_SUBDOMAIN); in that case, you could then simply:

curl -X POST $CFW_SUBDOMAIN \
       -H 'Content-Type: application/x-www-form-urlencoded' \
       --data 'param=5'

Beyond a simple Haskell file

So far, we’ve created a WebAssembly module that exports a pure Haskell function we ran in Workers. However, we can also create and build a Cabal project using Asterius ahc-cabal CLI, and then use ahc-dist to compile it to WebAssembly.

First, let’s create the project:

ahc-cabal init -m -p cabal-cfw-example

Then, let’s add some dependencies to our cabal project. The cabal file will look like this:

cabal-version:       2.4
name:                cabal-cfw-example
version:             0.1.0.0
license:             NONE

executable cabal-cfw-example
  ghc-options: -optl--export-function=handleReq
  main-is:             Main.hs
  build-depends:
    base,
    bytestring,
    aeson >=1.5 && < 1.6,
    text
  default-language:    Haskell2010

It’s a simple cabal file, except for the -optl--export-function=handleReq ghc flag. This is necessary when exporting a function from a cabal project.

In this example, we’ll define a simple User record, and we’ll define its instance automatically using Template Haskell!

{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE TemplateHaskell   #-}

module Main where

import           Asterius.Types
import           Control.Monad
import           Data.Aeson                 hiding (Object)
import qualified Data.Aeson                 as A
import           Data.Aeson.TH
import qualified Data.ByteString.Lazy.Char8 as B8
import           Data.Text


main :: IO ()
main = putStrLn "CFW Cabal"

data User =
  User
    { name :: Text
    , age  :: Int
    }

$(deriveJSON defaultOptions 'User)

NOTE: It’s not necessary to create a Cabal project for this example, because the prebuilt container comes with a lot of prebuilt packages (aesona included). Nevertheless, it will help us show the potential of ahc-cabal and ahc-dist.

Next, we’ll define handleReq, which we’ll export using JavaScript FFI just like we did before.

handleReq :: JSString -> JSString -> IO JSObject
handleReq method rawBody =
  case fromJSString method of
    "POST" ->
      let eitherUser :: Either String User
          eitherUser = eitherDecode (B8.pack $ fromJSString rawBody)
       in case eitherUser of
            Right _  -> js_new_response (toJSString "Success!") 200
            Left err -> js_new_response (toJSString err) 400
    _ -> js_new_response (toJSString "Not a valid method") 405

foreign export javascript "handleReq" handleReq :: JSString -> JSString -> IO JSObject

foreign import javascript "new Response($1, {\"status\": $2})"
  js_new_response :: JSString -> Int -> IO JSObject

This time, we define js_new_response, a Haskell function that creates a JavaScript object, to create a Response. handleReq takes two string parameters from JavaScript and it uses them to prepare a response.

Now let’s build and install the binary in the current directory:

ahc-cabal new-install --installdir . --overwrite-policy=always

This will generate a binary for our executable, called cabal-cfw-example. We’re going to use ahc-dist to take that binary and target WebAssembly:

ahc-dist --input-exe cabal-cfw-example --export-function=handleReq --no-main --input-mjs cabal_cfw_example.mjs --bundle --browser

cabal_cfw_example.mjs contains the following code:

import * as rts from "./rts.mjs";
import cabal_cfw_example from "./cabal_cfw_example.req.mjs";

async function handleRequest(req) {
  const i = await rts.newAsteriusInstance(Object.assign(cabal_cfw_example, { module: WASM }));
  const body = await req.text();
  return await i.exports.handleReq(req.method, body);
}

addEventListener("fetch", event => {
  event.respondWith(handleRequest(event.request))
});

Finally, we can deploy our code to Workers by defining a metadata.json file and uploading the script and the WebAssembly module using Workers API as we did before.

Caveats

Workers limits your JavaScript and WebAssembly in file size. Therefore, you need to be careful with any dependencies you add.

Conclusion

Stack Builders builds better software for better living through technologies like expressive static types. We used Asterius to compile Haskell to WebAssembly and deployed it to Cloudflare Workers using the Workers API. Asterius supports a lot of Haskell features (e.g. Template Haskell) and it provides an easy-to-use JavaScript FFI to interact with JavaScript. Additionally, it provides prebuilt containers that contain a lot of Haskell packages, so you can start writing a script right away.

Following this approach, we can write functional type-safe code in Haskell, target it to WebAssembly, and publish it to Workers, which runs on the edge of the Cloudflare infrastructure.

For more content check our blogs and tutorials!

Memory management 101

JavaScript

WebAssembly

Interoperability

Rust

JavaScript

FinalizationRegistry

Inherent issues with FinalizationRegistry

Enabling FinalizationRegistry in Workers

Preventing footguns with safe defaults

Security concerns

GC as confused deputy

GC as timer

Predictable cleanups?

Future

FinalizationRegistry: still not dead yet?

Beyond “Just compile to WebAssembly”

The lifecycle of a Python Worker

A Python interpreter built into the Workers runtime

How Pyodide works

Pyodide and the magic of foreign function interfaces (FFI)

Why dynamic linking is essential, and static linking isn’t enough

Supporting Server and Client libraries

Async Client libraries

Synchronous Client libraries

WebAssembly Stack Switching

FastAPI and Python’s Asynchronous Server Gateway Interface

Importing Python Packages

Making cold starts faster with memory snapshots

Reusing Memory Snapshots

Future proofing compatibility with Pyodide versions and Compatibility Dates

How bindings work in Python Workers

Get this JavaScript out of my Python!

We’re just getting started with Python Workers

What are core dumps?

WebAssembly core dumps

How are Wasm core dumps generated

Polyfilling

Debugging from a core dump

Tidying up everything in Cloudflare Workers

Too big to fail

Final words

Using WebAssembly in Pages Functions

More modules, more fun!

The distance between us on the surface of Earth

Hello, World!

WASI

Zig on Workers

Conclusion

A Quick Primer on WebAssembly

The WebAssembly System Interface (WASI)

Show me the code

Running your CLI apps in the cloud

What if I want to use WASI from within a JavaScript Worker?

Back to the future

Why are we doing this?

What is a Cloudflare Worker?

Why use Haskell to target WebAssembly?

From Haskell to WebAssembly

Push your code to Cloudflare Workers

Beyond a simple Haskell file

Caveats

Conclusion

The collective thoughts of the interwebz