Tag Archives: WinterCG

More Node.js APIs in Cloudflare Workers — Streams, Path, StringDecoder

Post Syndicated from James M Snell original http://blog.cloudflare.com/workers-node-js-apis-stream-path/

More Node.js APIs in Cloudflare Workers — Streams, Path, StringDecoder

More Node.js APIs in Cloudflare Workers — Streams, Path, StringDecoder

Today we are announcing support for three additional APIs from Node.js in Cloudflare Workers. This increases compatibility with the existing ecosystem of open source npm packages, allowing you to use your preferred libraries in Workers, even if they depend on APIs from Node.js.

We recently added support for AsyncLocalStorage, EventEmitter, Buffer, assert and parts of util. Today, we are adding support for:

We are also sharing a preview of a new module type, available in the open-source Workers runtime, that mirrors a Node.js environment more closely by making some APIs available as globals, and allowing imports without the node: specifier prefix.

You can start using these APIs today, in the open-source runtime that powers Cloudflare Workers, in local development, and when you deploy your Worker. Get started by enabling the nodejs_compat compatibility flag for your Worker.

Stream

The Node.js streams API is the original API for working with streaming data in JavaScript that predates the WHATWG ReadableStream standard. Now, a full implementation of Node.js streams (based directly on the official implementation provided by the Node.js project) is available within the Workers runtime.

Let's start with a quick example:

import {
  Readable,
  Transform,
} from 'node:stream';

import {
  text,
} from 'node:stream/consumers';

import {
  pipeline,
} from 'node:stream/promises';

// A Node.js-style Transform that converts data to uppercase
// and appends a newline to the end of the output.
class MyTransform extends Transform {
  constructor() {
    super({ encoding: 'utf8' });
  }
  _transform(chunk, _, cb) {
    this.push(chunk.toString().toUpperCase());
    cb();
  }
  _flush(cb) {
    this.push('\n');
    cb();
  }
}

export default {
  async fetch() {
    const chunks = [
      "hello ",
      "from ",
      "the ",
      "wonderful ",
      "world ",
      "of ",
      "node.js ",
      "streams!"
    ];

    function nextChunk(readable) {
      readable.push(chunks.shift());
      if (chunks.length === 0) readable.push(null);
      else queueMicrotask(() => nextChunk(readable));
    }

    // A Node.js-style Readable that emits chunks from the
    // array...
    const readable = new Readable({
      encoding: 'utf8',
      read() { nextChunk(readable); }
    });

    const transform = new MyTransform();
    await pipeline(readable, transform);
    return new Response(await text(transform));
  }
};

In this example we create two Node.js stream objects: one stream.Readable and one stream.Transform. The stream.Readable simply emits a sequence of individual strings, piped through the stream.Transform which converts those to uppercase and appends a new-line as a final chunk.

The example is straightforward and illustrates the basic operation of the Node.js API. For anyone already familiar with using standard WHATWG streams in Workers the pattern here should be recognizable.

The Node.js streams API is used by countless numbers of modules published on npm. Now that the Node.js streams API is available in Workers, many packages that depend on it can be used in your Workers. For example, the split2 module is a simple utility that can break a stream of data up and reassemble it so that every line is a distinct chunk. While simple, the module is downloaded over 13 million times each week and has over a thousand direct dependents on npm (and many more indirect dependents). Previously it was not possible to use split2 within Workers without also pulling in a large and complicated polyfill implementation of streams along with it. Now split2 can be used directly within Workers with no modifications and no additional polyfills. This reduces the size and complexity of your Worker by thousands of lines.

import {
  PassThrough,
} from 'node:stream';

import { default as split2 } from 'split2';

const enc = new TextEncoder();

export default {
  async fetch() {
    const pt = new PassThrough();
    const readable = pt.pipe(split2());

    pt.end('hello\nfrom\nthe\nwonderful\nworld\nof\nnode.js\nstreams!');
    for await (const chunk of readable) {
      console.log(chunk);
    }

    return new Response("ok");
  }
};

Path

The Node.js Path API provides utilities for working with file and directory paths. For example:

import path from "node:path"
path.join('/foo', 'bar', 'baz/asdf', 'quux', '..');

// Returns: '/foo/bar/baz/asdf'

Note that in the Workers implementation of path, the path.win32 variants of the path API are not implemented, and will throw an exception.

StringDecoder

The Node.js StringDecoder API is a simple legacy utility that predates the WHATWG standard TextEncoder/TextDecoder API and serves roughly the same purpose. It is used by Node.js' stream API implementation as well as a number of popular npm modules for the purpose of decoding UTF-8, UTF-16, Latin1, Base64, and Hex encoded data.

import { StringDecoder } from 'node:string_decoder';
const decoder = new StringDecoder('utf8');

const cent = Buffer.from([0xC2, 0xA2]);
console.log(decoder.write(cent));

const euro = Buffer.from([0xE2, 0x82, 0xAC]);
console.log(decoder.write(euro)); 

In the vast majority of cases, your Worker should just keep on using the standard TextEncoder/TextDecoder APIs, but the StringDecoder is available directly for workers to use now without relying on polyfills.

Node.js Compat Modules

One Worker can already be a bundle of multiple assets. This allows a single Worker to be made up of multiple individual ESM modules, CommonJS modules, JSON, text, and binary data files.

Soon there will be a new type of module that can be included in a Worker bundles: the NodeJsCompatModule.

A NodeJsCompatModule is designed to emulate the Node.js environment as much as possible. Within these modules, common Node.js global variables such as process, Buffer, and even __filename will be available. More importantly, it is possible to require() our Node.js core API implementations without using the node: specifier prefix. This maximizes compatibility with existing NPM packages that depend on globals from Node.js being present, or don’t import Node.js APIs using the node: specifier prefix.

Support for this new module type has landed in the open source workerd runtime, with deeper integration with Wrangler coming soon.

What’s next

We’re adding support for more Node.js APIs each month, and as we introduce new APIs, they will be added under the nodejs_compat compatibility flag — no need to take any action or update your compatibility date.

Have an NPM package that you wish worked on Workers, or an API you’d like to be able to use? Join the Cloudflare Developers Discord and tell us what you’re building, and what you’d like to see next.

Node.js compatibility for Cloudflare Workers – starting with Async Context Tracking, EventEmitter, Buffer, assert, and util

Post Syndicated from James M Snell original https://blog.cloudflare.com/workers-node-js-asynclocalstorage/

Node.js compatibility for Cloudflare Workers – starting with Async Context Tracking, EventEmitter, Buffer, assert, and util

Node.js compatibility for Cloudflare Workers – starting with Async Context Tracking, EventEmitter, Buffer, assert, and util

Over the coming months, Cloudflare Workers will start to roll out built-in compatibility with Node.js core APIs as part of an effort to support increased compatibility across JavaScript runtimes.

We are happy to announce today that the first of these Node.js APIs – AsyncLocalStorage, EventEmitter, Buffer, assert, and parts of util – are now available for use. These APIs are provided directly by the open-source Cloudflare Workers runtime, with no need to bundle polyfill implementations into your own code.

These new APIs are available today — start using them by enabling the nodejs_compat compatibility flag in your Workers.

Async Context Tracking with the AsyncLocalStorage API

The AsyncLocalStorage API provides a way to track context across asynchronous operations. It allows you to pass a value through your program, even across multiple layers of asynchronous code, without having to pass a context value between operations.

Consider an example where we want to add debug logging that works through multiple layers of an application, where each log contains the ID of the current request. Without AsyncLocalStorage, it would be necessary to explicitly pass the request ID down through every function call that might invoke the logging function:

function logWithId(id, state) {
  console.log(`${id} - ${state}`);
}

function doSomething(id) {
  // We don't actually use id for anything in this function!
  // It's only here because logWithId needs it.
  logWithId(id, "doing something");
  setTimeout(() => doSomethingElse(id), 10);
}

function doSomethingElse(id) {
  logWithId(id, "doing something else");
}

let idSeq = 0;

export default {
  async fetch(req) {
    const id = idSeq++;
    doSomething(id);
    logWithId(id, 'complete');
    return new Response("ok");
  }
}

While this approach works, it can be cumbersome to coordinate correctly, especially as the complexity of an application grows. Using AsyncLocalStorage this becomes significantly easier by eliminating the need to explicitly pass the context around. Our application functions (doSomething and doSomethingElse in this case) never need to know about the request ID at all while the logWithId function does exactly what we need it to:

import { AsyncLocalStorage } from 'node:async_hooks';

const requestId = new AsyncLocalStorage();

function logWithId(state) {
  console.log(`${requestId.getStore()} - ${state}`);
}

function doSomething() {
  logWithId("doing something");
  setTimeout(() => doSomethingElse(), 10);
}

function doSomethingElse() {
  logWithId("doing something else");
}

let idSeq = 0;

export default {
  async fetch(req) {
    return requestId.run(idSeq++, () => {
      doSomething();
      logWithId('complete');
      return new Response("ok");
    });
  }
}

With the nodejs_compat compatibility flag enabled, import statements are used to access specific APIs. The Workers implementation of these APIs requires the use of the node: specifier prefix that was introduced recently in Node.js (e.g. node:async_hooks, node:events, etc)

We implement a subset of the AsyncLocalStorage API in order to keep things as simple as possible. Specifically, we’ve chosen not to support the enterWith() and disable() APIs that are found in Node.js implementation simply because they make async context tracking more brittle and error prone.

Conceptually, at any given moment within a worker, there is a current “Asynchronous Context Frame”, which consists of a map of storage cells, each holding a store value for a specific AsyncLocalStorage instance. Calling asyncLocalStorage.run(...) causes a new frame to be created, inheriting the storage cells of the current frame, but using the newly provided store value for the cell associated with asyncLocalStorage.

const als1 = new AsyncLocalStorage();
const als2 = new AsyncLocalStorage();

// Code here runs in the root frame. There are two storage cells,
// one for als1, and one for als2. The store value for each is
// undefined.

als1.run(123, () => {
  // als1.run(...) creates a new frame (1). The store value for als1
  // is set to 123, the store value for als2 is still undefined.
  // This new frame is set to "current".

  als2.run(321, () => {
    // als2.run(...) creates another new frame (2). The store value
    // for als1 is still 123, the store value for als2 is set to 321.
    // This new frame is set to "current".
    console.log(als1.getStore(), als2.getStore());
  });

  // Frame (1) is restored as the current. The store value for als1
  // is still 123, but the store value for als2 is undefined again.
});

// The root frame is restored as the current. The store values for
// both als1 and als2 are both undefined again.

Whenever an asynchronous operation is initiated in JavaScript, for example, creating a new JavaScript promise, scheduling a timer, etc, the current frame is captured and associated with that operation, allowing the store values at the moment the operation was initialized to be propagated and restored as needed.

const als = new AsyncLocalStorage();
const p1 = als.run(123, () => {
  return promise.resolve(1).then(() => console.log(als.getStore());
});

const p2 = promise.resolve(1); 
const p3 = als.run(321, () => {
  return p2.then(() => console.log(als.getStore()); // prints 321
});

als.run('ABC', () => setInterval(() => {
  // prints "ABC" to the console once a second…
  setInterval(() => console.log(als.getStore(), 1000);
});

als.run('XYZ', () => queueMicrotask(() => {
  console.log(als.getStore());  // prints "XYZ"
}));

Note that for unhandled promise rejections, the “unhandledrejection” event will automatically propagate the context that is associated with the promise that was rejected. This behavior is different from other types of events emitted by EventTarget implementations, which will propagate whichever frame is current when the event is emitted.

const asyncLocalStorage = new AsyncLocalStorage();

asyncLocalStorage.run(123, () => Promise.reject('boom'));
asyncLocalStorage.run(321, () => Promise.reject('boom2'));

addEventListener('unhandledrejection', (event) => {
  // prints 123 for the first unhandled rejection ('boom'), and
  // 321 for the second unhandled rejection ('boom2')
  console.log(asyncLocalStorage.getStore());
});

Workers can use the AsyncLocalStorage.snapshot() method to create their own objects that capture and propagate the context:

const asyncLocalStorage = new AsyncLocalStorage();

class MyResource {
  #runInAsyncFrame = AsyncLocalStorage.snapshot();

  doSomething(...args) {
    return this.#runInAsyncFrame((...args) => {
      console.log(asyncLocalStorage.getStore());
    }, ...args);
  }
}

const resource1 = asyncLocalStorage.run(123, () => new MyResource());
const resource2 = asyncLocalStorage.run(321, () => new MyResource());

resource1.doSomething();  // prints 123
resource2.doSomething();  // prints 321

For more, refer to the Node.js documentation about the AsyncLocalStorage API.

There is currently an effort underway to add a new AsyncContext mechanism (inspired by AsyncLocalStorage) to the JavaScript language itself. While it is still early days for the TC-39 proposal, there is good reason to expect it to progress through the committee. Once it does, we look forward to being able to make it available in the Cloudflare Workers platform. We expect our implementation of AsyncLocalStorage to be compatible with this new API.

The proposal for AsyncContext provides an excellent set of examples and description of the motivation of why async context tracking is useful.

Events with EventEmitter

The EventEmitter API is one of the most fundamental Node.js APIs and is critical to supporting many other higher level APIs, including streams, crypto, net, and more. An EventEmitter is an object that emits named events that cause listeners to be called.

import { EventEmitter } from 'node:events';

const emitter = new EventEmitter();
emitter.on('hello', (...args) => {
  console.log(...args);
});

emitter.emit('hello', 1, 2, 3);

The implementation in the Workers runtime fully supports the entire Node.js EventEmitter API including the captureRejections option that allows improved handling of async functions as event handlers:

const emitter = new EventEmitter({ captureRejections: true });
emitter.on('hello', async (...args) => {
  throw new Error('boom');
});
emitter.on('error', (err) => {
  // the async promise rejection is emitted here!
});

Please refer to the Node.js documentation for more details on the use of the EventEmitter API: https://nodejs.org/dist/latest-v19.x/docs/api/events.html#events.

Buffer

The Buffer API in Node.js predates the introduction of the standard TypedArray and DataView APIs in JavaScript by many years and has persisted as one of the most commonly used Node.js APIs for manipulating binary data. Today, every Buffer instance extends from the standard Uint8Array class but adds a range of unique capabilities such as built-in base64 and hex encoding/decoding, byte-order manipulation, and encoding-aware substring searching.

import { Buffer } from 'node:buffer';

const buf = Buffer.from('hello world', 'utf8');

console.log(buf.toString('hex'));
// Prints: 68656c6c6f20776f726c64
console.log(buf.toString('base64'));
// Prints: aGVsbG8gd29ybGQ=

Because a Buffer extends from Uint8Array, it can be used in any workers API that currently accepts Uint8Array, such as creating a new Response:

const response = new Response(Buffer.from("hello world"));

Or interacting with streams:

const writable = getWritableStreamSomehow();
const writer = writable.getWriter();
writer.write(Buffer.from("hello world"));

Please refer to the Node.js documentation for more details on the use of the Buffer API: https://nodejs.org/dist/latest-v19.x/docs/api/buffer.html.

Assertions

The assert module in Node.js provides a number of useful assertions that are useful when building tests.

import {
  strictEqual,
  deepStrictEqual,
  ok,
  doesNotReject,
} from 'node:assert';

strictEqual(1, 1); // ok!
strictEqual(1, "1"); // fails! throws AssertionError

deepStrictEqual({ a: { b: 1 }}, { a: { b: 1 }});// ok!
deepStrictEqual({ a: { b: 1 }}, { a: { b: 2 }});// fails! throws AssertionError

ok(true); // ok!
ok(false); // fails! throws AssertionError

await doesNotReject(async () => {}); // ok!
await doesNotReject(async () => { throw new Error('boom') }); // fails! throws AssertionError

In the Workers implementation of assert, all assertions run in what Node.js calls the “strict assertion mode“, which means that non-strict methods behave like their corresponding strict methods. For instance, deepEqual() will behave like deepStrictEqual().

Please refer to the Node.js documentation for more details on the use of the assertion API: https://nodejs.org/dist/latest-v19.x/docs/api/assert.html.

Promisify/Callbackify

The promisify and callbackify APIs in Node.js provide a means of bridging between a Promise-based programming model and a callback-based model.

The promisify method allows taking a Node.js-style callback function and converting it into a Promise-returning async function:

import { promisify } from 'node:util';

function foo(args, callback) {
  try {
    callback(null, 1);
  } catch (err) {
    // Errors are emitted to the callback via the first argument.
    callback(err);
  }
}

const promisifiedFoo = promisify(foo);
await promisifiedFoo(args);

Similarly, callbackify converts a Promise-returning async function into a Node.js-style callback function:

import { callbackify } from 'node:util';

async function foo(args) {
  throw new Error('boom');
}

const callbackifiedFoo = callbackify(foo);

callbackifiedFoo(args, (err, value) => {
  if (err) throw err;
});

Together these utilities make it easy to properly handle all of the generally tricky nuances involved with properly bridging between callbacks and promises.

Please refer to the Node.js documentation for more information on how to use these APIs: https://nodejs.org/dist/latest-v19.x/docs/api/util.html#utilcallbackifyoriginal, https://nodejs.org/dist/latest-v19.x/docs/api/util.html#utilpromisifyoriginal.

Type brand-checking with util.types

The util.types API provides a reliable and generally more efficient way of checking that values are instances of various built-in types.

import { types } from 'node:util';

types.isAnyArrayBuffer(new ArrayBuffer());  // Returns true
types.isAnyArrayBuffer(new SharedArrayBuffer());  // Returns true
types.isArrayBufferView(new Int8Array());  // true
types.isArrayBufferView(Buffer.from('hello world')); // true
types.isArrayBufferView(new DataView(new ArrayBuffer(16)));  // true
types.isArrayBufferView(new ArrayBuffer());  // false
function foo() {
  types.isArgumentsObject(arguments);  // Returns true
}
types.isAsyncFunction(function foo() {});  // Returns false
types.isAsyncFunction(async function foo() {});  // Returns true
// .. and so on

Please refer to the Node.js documentation for more information on how to use the type check APIs: https://nodejs.org/dist/latest-v19.x/docs/api/util.html#utiltypes. The workers implementation currently does not provide implementations of the util.types.isExternal(), util.types.isProxy(), util.types.isKeyObject(), or util.type.isWebAssemblyCompiledModule() APIs.

What’s next

Keep your eyes open for more Node.js core APIs coming to Cloudflare Workers soon! We currently have implementations of the string decoder, streams and crypto APIs in active development. These will be introduced into the workers runtime incrementally over time and any worker using the nodejs_compat compatibility flag will automatically pick up the new modules as they are added.