Four stable kernels released

2025-09-25 jake

Post Syndicated from jake original https://lwn.net/Articles/1039533/

The 6.16.9, 6.12.49, 6.6.108, and 6.1.154 stable kernels have been released.
As usual, they all contain important fixes throughout the kernel tree.

Security updates for Thursday

2025-09-25 jake

Post Syndicated from jake original https://lwn.net/Articles/1039528/

Security updates have been issued by AlmaLinux (grub2 and kernel), Debian (chromium and libxslt), Fedora (chromium, expat, libssh, and webkitgtk), Oracle (avahi, firefox, ImageMagick, kernel, libtpms, and mysql), Red Hat (kernel), SUSE (bird3, expat, kernel, and tiff), and Ubuntu (dpkg, gnuplot, linux, linux-aws, linux-aws-5.15, linux-gcp, linux-gcp-5.15, linux-gke, linux-gkeop, linux-hwe-5.15, linux-ibm, linux-ibm-5.15, linux-intel-iotg, linux-intel-iotg-5.15, linux-lowlatency, linux-lowlatency-hwe-5.15, linux-nvidia, linux-nvidia-tegra, linux-nvidia-tegra-5.15, linux-oracle, linux-raspi, linux-riscv-5.15, linux-xilinx-zynqmp, linux, linux-aws, linux-gcp, linux-gcp-6.14, linux-oracle, linux-realtime, linux-riscv, linux-riscv-6.14, linux-aws-fips, linux-fips, linux-gcp-fips, linux-azure, linux-azure-fips, linux-ibm, linux-ibm-6.8, linux-intel-iot-realtime, linux-realtime, linux-oem-6.14, linux-oracle-5.15, linux-realtime-6.14, and python-eventlet).

PostgreSQL 18 released

2025-09-25 jzb

Post Syndicated from jzb original https://lwn.net/Articles/1039483/

Version 18 of the PostgreSQL database has been released. Notable
improvements in this release include “skip scan” lookups for
multicolumn B-tree indexes, virtual generated columns, better text
processing, OAuth authentication, and a new asynchronous I/O (AIO)
subsystem to improve performance:

AIO lets PostgreSQL issue multiple I/O requests concurrently instead
of waiting for each to finish in sequence. This expands existing
readahead and improves overall throughput. AIO operations supported in
PostgreSQL 18 include sequential scans, bitmap heap scans, and
vacuum. Benchmarking has demonstrated performance gains of up to 3x in
certain scenarios.

There are, of course, many other improvements and changes; see the
release notes for full details.

Cloudflare’s developer platform keeps getting better, faster, and more powerful. Here’s everything that’s new.

2025-09-25 Brendan Irvine-Broque

Post Syndicated from Brendan Irvine-Broque original https://blog.cloudflare.com/cloudflare-developer-platform-keeps-getting-better-faster-and-more-powerful/

When you build on Cloudflare, we consider it our job to do the heavy lifting for you. That’s been true since we introduced Cloudflare Workers in 2017, when we first provided a runtime for you where you could just focus on building.

That commitment is still true today, and many of today’s announcements are focused on just that — removing friction where possible to free you up to build something great.

There are only so many blog posts we can write (and that you can read)! We have been busy on a much longer list of new improvements, and many of them we’ve been rolling out consistently over the course of the year. Today’s announcement breaks down all the new capabilities in detail, in one single post. The features being released today include:

Use more APIs from Node.js — including node:fs and node:https
Use models from different providers in AI Search (formerly AutoRAG)
Deploy larger container instances and more concurrent instances to our Containers platform
Run 30 concurrent headless web browsers (previously 10), via the Browser Rendering API
Use the Playwright browser automation library with the Browser Rendering API — now fully supported and GA
Use 4 vCPUs (prev 2) and 20GB of disk (prev 8GB) with Workers Builds — now GA
Connect to production services and resources from local development with Remote Bindings — now GA
R2 Infrequent Access GA – lower-cost storage class for backups, logs, and long-tail content
Resize, clip and reformat video files on-demand with Media Transformations — now GA

Alongside that, we’re constantly adding new building blocks, to make sure you have all the tools you need to build what you set out to. Those launches (that also went out today, but require a bit more explanation) include:

Connect to Postgres databases running on Planetscale
Send transactional emails via the new Cloudflare Email Service
Run distributed SQL queries with the new Cloudflare Data Platform
Deploy your own AI vibe coding platform to Cloudflare with VibeSDK

AI Search (formerly AutoRAG) — now with More Models To Choose From

AutoRAG is now AI Search! The new name marks a new and bigger mission: to make world-class search infrastructure available to every developer and business. AI Search is no longer just about retrieval for LLM apps: it’s about giving you a fast, flexible index for your content that is ready to power any AI experience. With recent additions like NLWeb support, we are expanding beyond simple retrieval to provide a foundation for top quality search experiences that are open and built for the future of the web.

With AI Search you can now use models from different providers like OpenAI and Anthropic. Last month during AI Week we announced BYO Provider Keys for AI Gateway. That capability now extends to AI Search. By attaching your keys to the AI Gateway linked to your AI Search instance, you can use many more models for both embedding and inference.

Once configured, your AI Search instance will be able to reference models available through your AI Gateway when making a /ai-search request:

export default {
  async fetch(request, env) {
    
    // Query your AI Search instance with a natural language question to an OpenAI model
    const result = await env.AI.autorag("my-ai-search").aiSearch({
      query: "What's new for Cloudflare Birthday Week?",
      model: "openai/gpt-5"
    });

    // Return only the generated answer as plain text
    return new Response(result.response, {
      headers: { "Content-Type": "text/plain" },
    });
  },
};

In the coming weeks we will also roll out updates to align the APIs with the new name. The existing APIs will continue to be supported for the time being. Stay tuned to the AI Search Changelog and Discord for more updates!

Connect to production services and resources from local development with Remote Bindings — now GA

Remote bindings for local development are generally available, supported in Wrangler v4.37.0, the Cloudflare Vite plugin, and the @cloudflare/vitest-pool-workers package. Remote bindings are bindings that are configured to connect to a deployed resource on your Cloudflare account instead of the locally simulated resource.

For example, here’s how you can instruct Wrangler or Vite to route all requests to env.MY_BUCKET to the real, deployed R2 bucket instead of a locally simulated one:

{
  "name": "my-worker",
  "compatibility_date": "2025-09-25",

  "r2_buckets": [
    {
      "bucket_name": "my-bucket",
      "binding": "MY_BUCKET",
      "remote": true
    },
  ],
}

With the above configuration, all requests to env.MY_BUCKET will be proxied to the remote resource, but the Worker code will still execute locally. This means you get all the benefits of local development like faster execution times – without having to seed local databases with data.

You can pair remote bindings with environments, so that you can use staging data during local development and leave production data untouched.

For example, here’s how you could configure Wrangler or Vite to send all requests to env.MY_BUCKET to staging-storage-bucket when you run wrangler dev --env staging (or CLOUDFLARE_ENV=staging vite dev if using Vite):

{
  "name": "my-worker",
  "compatibility_date": "2025-09-25",

  "env": {
    "staging": {
      "r2_buckets": [
        {
          "binding": "MY_BUCKET",
          "bucket_name": "staging-storage-bucket",
          "remote": true
        }
      ]
    },
    "production": {
      "r2_buckets": [
        {
          "binding": "MY_BUCKET",
          "bucket_name": "production-storage-bucket" 
        }
      ]
    }
  }
}

More Node.js APIs and packages “just work” on Workers

Over the past year, we have been hard at work to make Workers more compatible with Node.js packages and APIs.

Several weeks ago, we shared how node:http and node:https APIs are now supported on Workers. This means that you can run Express and Koa.js backends on Workers with only a few additional lines of code:

import { httpServerHandler } from 'cloudflare:node';
import express from 'express';

const app = express();

app.get('/', (req, res) => {
  res.json({ message: 'Express.js running on Cloudflare Workers!' });
});

app.listen(3000);
export default httpServerHandler({ port: 3000 });

And there’s much, much more. You can now:

Read and write temporary files in Workers, using node:fs
Do DNS lookups using 1.1.1.1 with node:dns
Use node:net and node:tls for first class Socket support
Use common hashing libraries with node:crypto
Access environment variables in a Node-like fashion on process.env

Read our full recap of the last year’s Node.js-related changes for all the details.

With these changes, Workers become even more powerful and easier to adopt, regardless of where you’re coming from. The APIs that you are familiar with are there, and more packages you need will just work.

Larger Container instances, more concurrent instances

Cloudflare Containers now has higher limits on concurrent instances and an upcoming new, larger instance type.

Previously you could run 50 instances of the dev instance type or 25 instances of the basic instance type concurrently. Now you can run concurrent containers with up to 400 GiB of memory, 100 vCPUs, and 2 TB of disk. This allows you to run up to 1000 dev instances or 400 basic instances concurrently. Enterprise customers can push far beyond these limits — contact us if you need more. If you are using Containers to power your app and it goes viral, you’ll have the ability to scale on Cloudflare.
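
The instance counts above follow from dividing the account-wide limits by the footprint of each instance type. Here is a rough check of that arithmetic. The per-instance figures are assumptions that this post does not state: dev is taken as 256 MiB of memory, 1/16 vCPU, and 2 GB of disk, and basic as 1 GiB, 1/4 vCPU, and 4 GB. The 2 TB disk limit is treated as 2,000 GB.

```javascript
// Account-wide concurrency limits quoted in the post (2 TB treated as 2000 GB).
const limits = { memoryGiB: 400, vcpu: 100, diskGB: 2000 };

// ASSUMED per-instance footprints; these numbers are not given in the post.
const instanceTypes = {
  dev: { memoryGiB: 0.25, vcpu: 1 / 16, diskGB: 2 },
  basic: { memoryGiB: 1, vcpu: 1 / 4, diskGB: 4 },
};

// The scarcest resource decides how many instances fit under the limits.
function maxConcurrentInstances(limits, type) {
  return Math.floor(
    Math.min(
      limits.memoryGiB / type.memoryGiB,
      limits.vcpu / type.vcpu,
      limits.diskGB / type.diskGB
    )
  );
}

console.log(maxConcurrentInstances(limits, instanceTypes.dev));   // 1000
console.log(maxConcurrentInstances(limits, instanceTypes.basic)); // 400
```

Under these assumptions, disk is the binding limit for dev instances, while memory and vCPU are the binding limits for basic instances.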

Cloudflare Containers also now has a new instance type coming soon — standard-2 which includes 8 GiB of memory, 1 vCPU, and 12 GB of disk. This new instance type is an ideal default for workloads that need more resources, from AI Sandboxes to data processing jobs.

Workers Builds provides more disk and CPU — and is now GA

Last Birthday Week, we announced the launch of our integrated CI/CD pipeline, Workers Builds, in open beta. We also gave you a detailed look into how we built this system on our Workers platform using Containers, Durable Objects, Hyperdrive, Workers Logs, and Smart Placement.

This year, we are excited to announce that Workers Builds is now Generally Available. Here’s what’s new:

Increased disk space for all plans: We’ve increased the disk size from 8 GB to 20 GB for both free and paid plans, giving you more space for your projects and dependencies
More compute for paid plans: We’ve doubled the CPU power for paid plans from 2 vCPU to 4 vCPU, making your builds significantly faster
Faster single-core and multi-core performance: To ensure consistent, high performance builds, we now run your builds on the fastest available CPUs at the time your build runs

Haven’t used Workers Builds yet? You can try it by connecting a Git repository to an existing Worker, or try it out on a fresh new project by clicking any Deploy to Cloudflare button, such as the one that deploys a blog built with Astro to your Cloudflare account.

A more consistent look and feel for the Cloudflare dashboard

Durable Objects, R2, and Workers now all have a more consistent look with the rest of our developer platform. As you explore these pages you’ll find that things load faster, feel smoother, and are easier to use.

Across storage products, you can now customize the table that lists the resources on your account, choose which data you want to see, sort by any column, and hide columns you don’t need. In the Workers and Pages dashboard, we’ve reduced clutter and have modernized the design to make it faster for you to get the data you need.

And when you create a new Pipeline or a Hyperdrive configuration, you’ll find a new interface that helps you get started and guides you through each step.

This work is ongoing, and we’re excited to continue improving with the help of your feedback, so keep it coming!

Resize, clip and reformat video files on-demand with Media Transformations — now GA

In March 2025 we announced Media Transformations in open beta, which brings the magic of Image transformations to short-form video files — including video files stored outside of Cloudflare. Since then, we have increased input and output limits, and added support for audio-only extraction. Media Transformations is now generally available.

Media Transformations is ideal if you have a large existing volume of short videos, such as generative AI output, e-commerce product videos, social media clips, or short marketing content. Content like this should be fetched from your existing storage like R2 or S3 directly, optimized by Cloudflare quickly, and delivered efficiently as small MP4 files or used to extract still images and audio.

https://example.com/cdn-cgi/media/<OPTIONS>/<SOURCE-VIDEO>

EXAMPLE, RESIZE:




EXAMPLE, STILL THUMBNAIL:
https://example.com/cdn-cgi/media/mode=frame,time=3s,width=120,height=120,fit=cover/https://pub-d9fcbc1abcd244c1821f38b99017347f.r2.dev/aus-mobile.mp4
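
Because the options are plain comma-separated key=value pairs in the URL path, these URLs are easy to generate in code. The helper below is a minimal sketch: `buildMediaUrl` is a hypothetical name and not part of any Cloudflare SDK. It reproduces the still-thumbnail example above from an options object.

```javascript
// Build a URL of the form https://<zone>/cdn-cgi/media/<OPTIONS>/<SOURCE-VIDEO>.
// `buildMediaUrl` is a hypothetical helper written for illustration only.
function buildMediaUrl(zone, options, sourceVideo) {
  // Options are serialized as comma-separated key=value pairs,
  // in the order in which they appear in the object.
  const optionString = Object.entries(options)
    .map(([key, value]) => `${key}=${value}`)
    .join(",");
  return `https://${zone}/cdn-cgi/media/${optionString}/${sourceVideo}`;
}

const thumbnailUrl = buildMediaUrl(
  "example.com",
  { mode: "frame", time: "3s", width: 120, height: 120, fit: "cover" },
  "https://pub-d9fcbc1abcd244c1821f38b99017347f.r2.dev/aus-mobile.mp4"
);

console.log(thumbnailUrl);
```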

Media Transformations includes a free tier available to all customers and is included with Media Platform subscriptions. Check out the transform videos documentation for all the latest, then enable transformations for your zone today!

Infrequent Access in R2 is now GA

R2 Infrequent Access is now generally available. Last year, we introduced the Infrequent Access storage class designed for data that doesn’t need to be accessed frequently. It’s a great fit for use cases including long-tail user content, logs, or data backups.

Since launch, Infrequent Access has been proven in production by our customers running these types of workloads at scale. The results confirmed our goal: a storage class that reduces storage costs while maintaining performance and durability.

Pricing is simple. You pay less on data storage, while data retrievals are billed per GB to reflect the additional compute required to serve data from underlying storage optimized for less frequent access. And as with all of R2, there are no egress fees, so you don’t pay for the bandwidth to move data out.

Here’s how you can upload an object to R2 infrequent access class via Workers:

export default {
  async fetch(request, env) {

    // Upload the incoming request body to R2 in Infrequent Access class
    await env.MY_BUCKET.put("my-object", request.body, {
      storageClass: "InfrequentAccess",
    });

    return new Response("Object uploaded to Infrequent Access!", {
      headers: { "Content-Type": "text/plain" },
    });
  },
};

You can also monitor your Infrequent Access vs. Standard storage usage directly in your R2 dashboard for each bucket. Get started with R2 today!

Playwright in Browser Rendering is now GA

We’re excited to announce three updates to Browser Rendering:

Our support for Playwright is now Generally Available, giving developers the stability and confidence to run critical browser tasks.
We’re introducing support for Stagehand, enabling developers to build AI agents using natural language, powered by Cloudflare Workers AI.
Finally, to help developers scale, we are tripling limits for paid plans, with more increases to come.

The browser is no longer only used by humans. AI agents need to be able to reliably navigate browsers in the same way a human would, whether that’s booking flights, filling in customer info, or scraping structured data. Playwright gives AI agents the ability to interact with web pages and perform complex tasks on behalf of humans. However, running browsers at scale is a significant infrastructure challenge. Cloudflare Browser Rendering solves this by providing headless browsers on-demand. With Playwright support now Generally Available and synced with the latest version, v1.55, customers have a production-ready foundation to build reliable, scalable applications on.

To help AI agents better navigate the web, we’re introducing support for Stagehand, an open source browser automation framework. Rather than dictating exact steps or specifying selectors, Stagehand enables developers to build more reliably and flexibly by combining code with natural-language instructions powered by AI. This makes it possible for AI agents to navigate and adapt if a website changes – just like a human would.

To get started with Playwright and Stagehand, check our changelog with code examples and more.

Partnering to make full-stack fast: deploy PlanetScale databases directly from Workers

2025-09-25 Matt Silverlock

Post Syndicated from Matt Silverlock original https://blog.cloudflare.com/planetscale-postgres-workers/

We’re not burying the lede on this one: you can now connect Cloudflare Workers to your PlanetScale databases directly and ship full-stack applications backed by Postgres or MySQL.

We’ve teamed up with PlanetScale because we wanted to partner with a database provider that we could confidently recommend to our users: one that shares our obsession with performance, reliability and developer experience. These are all critical factors for any development team building a serious application.

Now, when connecting to PlanetScale databases, your connections are automatically configured for optimal performance with Hyperdrive, ensuring that you have the fastest access from your Workers to your databases, regardless of where your Workers are running.

Building full-stack

As Workers has matured into a full-stack platform, we’ve introduced more options to facilitate your connectivity to data. With Workers KV, we made it easy to store configuration and cache unstructured data on the edge. With D1 and Durable Objects, we made it possible to build multi-tenant apps with simple, isolated SQL databases. And with Hyperdrive, we made connecting to external databases fast and scalable from Workers.

Today, we’re introducing a new choice for building on Cloudflare: Postgres and MySQL PlanetScale databases, directly accessible from within the Cloudflare dashboard. Link your Cloudflare and PlanetScale accounts, stop manually copying API keys back-and-forth, and connect Workers to any of your PlanetScale databases (production or otherwise!).

Connect to a PlanetScale database, no figuring things out on your own

Postgres and MySQL are the most popular options for building applications, and with good reason. Many large companies (like Cloudflare!) have built and scaled on these databases, creating a robust ecosystem. And you may want to have access to the power, familiarity, and functionality that these databases provide.

Importantly, all of this builds on Hyperdrive, our distributed connection pooler and query caching infrastructure. Hyperdrive keeps connections to your databases warm to avoid incurring latency penalties for every new request, reduces the CPU load on your database by managing a connection pool, and can cache the results of your most frequent queries, removing load from your database altogether. Given that about 80% of queries for a typical transactional database are read-only, this can be substantial — we’ve observed this in reality!

No more copying credentials around

Starting today, you can connect to your PlanetScale databases from the Cloudflare dashboard in just a few clicks. Connecting is now secure by default with a one-click password rotation option, without needing to copy and manage credentials back and forth. A Hyperdrive configuration will be created for your PlanetScale database, providing you with the optimal setup to start building on Workers.

And the experience spans both Cloudflare and PlanetScale dashboards: you can also create and view attached Hyperdrive configurations for your databases from the PlanetScale dashboard.

By automatically integrating with Hyperdrive, your PlanetScale databases are optimally configured for access from Workers. When you connect your database via Hyperdrive, Hyperdrive’s Placement system automatically determines the location of the database and places its pool of database connections in Cloudflare data centers with the lowest possible latency.

When one of your Workers connects to your Hyperdrive configuration for your PlanetScale database, Hyperdrive will ensure the fastest access to your database by eliminating the unnecessary roundtrips included in a typical database connection setup. Hyperdrive will resolve connection setup within the Hyperdrive client and use existing connections from the pool to quickly serve your queries. Better yet, Hyperdrive allows you to cache your query results in case you need to scale for high-read workloads.

This is a peek under the hood of how Hyperdrive makes access to PlanetScale as fast as possible. We’ve previously blogged about Hyperdrive’s technical underpinnings — it’s worth a read. And with this integration with Hyperdrive, you can easily connect to your databases across different Workers applications or environments, without having to reconfigure your credentials. All in all, a perfect match.

Get started with PlanetScale and Workers

With this partnership, we’re making it trivially easy to build on Workers with PlanetScale. Want to build a new application on Workers that connects to your existing PlanetScale cluster? With just a few clicks, you can create a globally deployed app that can query your database, cache your hottest queries, and keep your database connections warmed for fast access from Workers.

Connect directly to your PlanetScale MySQL or Postgres databases from the Cloudflare dashboard, for optimal configuration with Hyperdrive.

To get started, you can:

Head to the Cloudflare dashboard and connect your PlanetScale account
… or head to PlanetScale and connect your Cloudflare account
… and then deploy a Worker

Review the Hyperdrive docs and/or the PlanetScale docs to learn more about how to connect Workers to PlanetScale and start shipping.

R2 SQL: a deep dive into our new distributed query engine

2025-09-25 Yevgen Safronov

Post Syndicated from Yevgen Safronov original https://blog.cloudflare.com/r2-sql-deep-dive/

How do you run SQL queries over petabytes of data… without a server?

We have an answer for that: R2 SQL, a serverless query engine that can sift through enormous datasets and return results in seconds.

This post details the architecture and techniques that make this possible. We’ll walk through our Query Planner, which uses R2 Data Catalog to prune terabytes of data before reading a single byte, and explain how we distribute the work across Cloudflare’s global network, Workers and R2 for massively parallel execution.

From catalog to query

During Developer Week 2025, we launched R2 Data Catalog, a managed Apache Iceberg catalog built directly into your Cloudflare R2 bucket. Iceberg is an open table format that provides critical database features like transactions and schema evolution for petabyte-scale object storage. It gives you a reliable catalog of your data, but it doesn’t provide a way to query it.

Until now, reading your R2 Data Catalog required setting up a separate service like Apache Spark or Trino. Operating these engines at scale is not easy: you need to provision clusters, manage resource usage, and be responsible for their availability, none of which contributes to the primary goal of getting value from your data.

R2 SQL removes that step entirely. It’s a serverless query engine that executes retrieval SQL queries against your Iceberg tables, right where your data lives.

Designing a query engine for petabytes

Object storage is fundamentally different from a traditional database’s storage. A database is structured by design; R2 is an ocean of objects, where a single logical table can be composed of potentially millions of individual files, large and small, with more arriving every second.

Apache Iceberg provides a powerful layer of logical organization on top of this reality. It works by managing the table’s state as an immutable series of snapshots, creating a reliable, structured view of the table by manipulating lightweight metadata files instead of rewriting the data files themselves.

However, this logical structure doesn’t change the underlying physical challenge: an efficient query engine must still find the specific data it needs within that vast collection of files, and this requires overcoming two major technical hurdles:

The I/O problem: A core challenge for query efficiency is minimizing the amount of data read from storage. A brute-force approach of reading every object is simply not viable. The primary goal is to read only the data that is absolutely necessary.

The Compute problem: The amount of data that does need to be read can still be enormous. We need a way to give the right amount of compute power to a query, which might be massive, for just a few seconds, and then scale it down to zero instantly to avoid waste.

Our architecture for R2 SQL is designed to solve these two problems with a two-phase approach: a Query Planner that uses metadata to intelligently prune the search space, and a Query Execution system that distributes the work across Cloudflare’s global network to process the data in parallel.

Query Planner

The most efficient way to process data is to avoid reading it in the first place. This is the core strategy of the R2 SQL Query Planner. Instead of exhaustively scanning every file, the planner makes use of the metadata structure provided by R2 Data Catalog to prune the search space, that is, to avoid reading huge swathes of data irrelevant to a query.

This is a top-down investigation where the planner navigates the hierarchy of Iceberg metadata layers, using stats at each level to build a fast plan, specifying exactly which byte ranges the query engine needs to read.

What do we mean by “stats”?

When we say the planner uses “stats” we are referring to summary metadata that Iceberg stores about the contents of the data files. These statistics create a coarse map of the data, allowing the planner to make decisions about which files to read, and which to ignore, without opening them.

There are two primary levels of statistics the planner uses for pruning:

Partition-level stats: Stored in the Iceberg manifest list, these stats describe the range of partition values for all the data in a given Iceberg manifest file. For a partition on day(event_timestamp), this would be the earliest and latest day present in the files tracked by that manifest.

Column-level stats: Stored in the manifest files, these are more granular stats about each individual data file. Data files in R2 Data Catalog are stored in the Apache Parquet format. For every column of a Parquet file, the manifest stores key information like:

The minimum and maximum values. If a query asks for http_status = 500, and a file’s stats show its http_status column has a min of 200 and a max of 404, that entire file can be skipped.
A count of null values. This allows the planner to skip files when a query specifically looks for non-null values (e.g., WHERE error_code IS NOT NULL) and the file’s metadata reports that all values for error_code are null.
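
The two checks above can be sketched as a small predicate. This is an illustration only, not R2 SQL's implementation: `canSkipFile`, the stats field names (`min`, `max`, `nullCount`, `valueCount`), and the predicate shape are all hypothetical.

```javascript
// Decide whether a data file can be skipped using only its column stats.
// All names and object shapes here are hypothetical, for illustration.
function canSkipFile(columnStats, predicate) {
  const stats = columnStats[predicate.column];
  if (!stats) return false; // no stats for this column: the file must be read

  switch (predicate.op) {
    case "=":
      // The value lies outside [min, max], so no row can match.
      return predicate.value < stats.min || predicate.value > stats.max;
    case "IS NOT NULL":
      // Every value in the column is null, so no row can match.
      return stats.nullCount === stats.valueCount;
    default:
      return false; // unknown operator: be conservative and read the file
  }
}

const fileStats = {
  http_status: { min: 200, max: 404, nullCount: 0, valueCount: 1000 },
  error_code: { nullCount: 1000, valueCount: 1000 },
};

console.log(canSkipFile(fileStats, { column: "http_status", op: "=", value: 500 })); // true
console.log(canSkipFile(fileStats, { column: "http_status", op: "=", value: 404 })); // false
console.log(canSkipFile(fileStats, { column: "error_code", op: "IS NOT NULL" }));    // true
```

Note that the check is one-sided: a `false` result only means the file might contain matching rows, so it still has to be read.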

Now, let’s see how the planner uses these stats as it walks through the metadata layers.

Pruning the search space

The pruning process is a top-down investigation that happens in four main steps:

1. Table metadata and the current snapshot

The planner begins by asking the catalog for the location of the current table metadata. This is a JSON file containing the table’s current schema, partition specs, and a log of all historical snapshots. The planner then fetches the latest snapshot to work with.

2. Manifest list and partition pruning

The current snapshot points to a single Iceberg manifest list. The planner reads this file and uses the partition-level stats for each entry to perform the first, most powerful pruning step, discarding any manifests whose partition value ranges don’t satisfy the query. For a table partitioned by day(event_timestamp), the planner can use the min/max values in the manifest list to immediately discard any manifests that don’t contain data for the days relevant to the query.

3. Manifests and file-level pruning

For the remaining manifests, the planner reads each one to get a list of the actual Parquet data files. These manifest files contain more granular, column-level stats for each individual data file they track. This allows for a second pruning step, discarding entire data files that cannot possibly contain rows matching the query’s filters.

4. File row-group pruning

Finally, for the specific data files that are still candidates, the Query Planner uses statistics stored inside each Parquet file’s footer to skip over entire row groups.

The result of this multi-layer pruning is a precise list of Parquet files, and of row groups within those Parquet files. These become the query work units that are dispatched to the Query Execution system for processing.
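
To make the layering concrete, here is a minimal sketch of the three pruning levels applied to in-memory metadata. It assumes a table partitioned by day and a query filtering on a day range plus `http_status = <value>`. The object shapes and the names `planWorkUnits` and `mayContain` are hypothetical and do not mirror the actual Iceberg or Parquet metadata layout.

```javascript
// True when `value` could appear in a column whose stats are [min, max].
const mayContain = (stats, value) => value >= stats.min && value <= stats.max;

// Walk manifests -> data files -> row groups, pruning at each level.
// All object shapes are hypothetical, chosen only for illustration.
function planWorkUnits(manifestList, query) {
  const workUnits = [];
  for (const manifest of manifestList) {
    // Level 1: partition pruning using the manifest's day range.
    // ISO dates (YYYY-MM-DD) compare correctly as plain strings.
    if (manifest.maxDay < query.fromDay || manifest.minDay > query.toDay) continue;

    for (const file of manifest.files) {
      // Level 2: file-level pruning using column stats from the manifest.
      if (!mayContain(file.httpStatus, query.httpStatus)) continue;

      // Level 3: row-group pruning using stats from the Parquet footer.
      file.rowGroups.forEach((rowGroup, index) => {
        if (mayContain(rowGroup.httpStatus, query.httpStatus)) {
          workUnits.push({ path: file.path, rowGroup: index });
        }
      });
    }
  }
  return workUnits;
}

const manifestList = [
  {
    minDay: "2025-09-01", maxDay: "2025-09-10",
    files: [{ path: "old.parquet", httpStatus: { min: 200, max: 500 },
              rowGroups: [{ httpStatus: { min: 200, max: 500 } }] }],
  },
  {
    minDay: "2025-09-20", maxDay: "2025-09-25",
    files: [
      { path: "ok.parquet", httpStatus: { min: 200, max: 404 },
        rowGroups: [{ httpStatus: { min: 200, max: 404 } }] },
      { path: "errors.parquet", httpStatus: { min: 200, max: 503 },
        rowGroups: [{ httpStatus: { min: 200, max: 302 } },
                    { httpStatus: { min: 404, max: 503 } }] },
    ],
  },
];

const units = planWorkUnits(manifestList, {
  fromDay: "2025-09-24", toDay: "2025-09-25", httpStatus: 500,
});
console.log(units); // [ { path: 'errors.parquet', rowGroup: 1 } ]
```

In this example the first manifest is discarded without opening any of its files, and only one row group out of four survives as a work unit.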

The Planning pipeline

In R2 SQL, the multi-layer pruning we’ve described so far isn’t a monolithic process. For a table with millions of files, the metadata can be too large to process before starting any real work. Waiting for a complete plan would introduce significant latency.

Instead, R2 SQL treats planning and execution together as a concurrent pipeline. The planner’s job is to produce a stream of work units for the executor to consume as soon as they are available.

The planner’s investigation begins with two fetches to get a map of the table’s structure: one for the table’s snapshot and another for the manifest list.

Starting execution as early as possible

From that point on, the query is processed in a streaming fashion. As the Query Planner reads through the manifest files and subsequently the data files they point to and prunes them, it immediately emits any matching data files/row groups as work units to the execution queue.

This pipeline structure ensures the compute nodes can begin the expensive work of data I/O almost instantly, long before the planner has finished its full investigation.
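
A generator is one simple way to illustrate this producer/consumer shape. The sketch below is single-threaded and only shows the interleaving of planning and execution. The real system runs the two concurrently across machines, and every name here (`streamWorkUnits`, `executeAll`, the metadata shape) is hypothetical.

```javascript
const events = []; // records the order in which planning and execution happen

// Planner: lazily yields each matching file as soon as it is discovered,
// instead of building the complete plan up front.
function* streamWorkUnits(manifests, fileMayMatch) {
  for (const manifest of manifests) {
    events.push(`read manifest ${manifest.name}`);
    for (const file of manifest.files) {
      if (fileMayMatch(file)) yield file;
    }
  }
}

// Executor: consumes work units as they arrive.
function executeAll(workUnits) {
  for (const unit of workUnits) {
    events.push(`scan ${unit.path}`);
  }
}

const manifests = [
  { name: "m1", files: [{ path: "a.parquet", maxStatus: 500 },
                        { path: "b.parquet", maxStatus: 404 }] },
  { name: "m2", files: [{ path: "c.parquet", maxStatus: 503 }] },
];

executeAll(streamWorkUnits(manifests, (file) => file.maxStatus >= 500));
console.log(events);
// [ 'read manifest m1', 'scan a.parquet', 'read manifest m2', 'scan c.parquet' ]
```

The first file is scanned before the second manifest has even been read, which is the property the pipeline relies on.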

On top of this pipeline model, the planner adds a crucial optimization: deliberate ordering. The manifest files are not streamed in an arbitrary sequence. Instead, the planner processes them in an order matching the query’s ORDER BY clause, guided by the metadata stats. This ensures that the data most likely to contain the desired results is processed first.

These two concepts work together to address query latency from both ends of the query pipeline.

The streamed planning pipeline lets us start crunching data as soon as possible, minimizing the delay before the first byte is processed. At the other end of the pipeline, the deliberate ordering of that work lets us finish early by finding a definitive result without scanning the entire dataset.

The next section explains the mechanics behind this “finish early” strategy.

Stopping early: how to finish without reading everything

Thanks to the Query Planner streaming work units in an order matching the ORDER BY clause, the Query Execution system first processes the data that is most likely to be in the final result set.

This prioritization happens at two levels of the metadata hierarchy:

Manifest ordering: The planner first inspects the manifest list. Using the partition stats for each manifest (e.g., the latest timestamp in that group of files), it decides which entire manifest files to stream first.

Parquet file ordering: As it reads each manifest, it then uses the more granular column-level stats to decide the processing order of the individual Parquet files within that manifest.

This ensures a constantly prioritized stream of work units is sent to the execution engine. This prioritized stream is what allows us to stop the query early.

For instance, with a query like … ORDER BY timestamp DESC LIMIT 5, as the execution engine processes work units and sends back results, the planner does two things concurrently:

It maintains a bounded heap of the best 5 results seen so far, constantly comparing new results to the oldest timestamp in the heap.

It keeps a “high-water mark” on the stream itself. Thanks to the metadata, it always knows the absolute latest timestamp of any data file that has not yet been processed.

The planner is constantly comparing the state of the heap to the water mark of the remaining stream. The moment the oldest timestamp in our Top 5 heap is newer than the high-water mark of the remaining stream, the entire query can be stopped.

At that point, we can prove no remaining work unit could possibly contain a result that would make it into the top 5. The pipeline is halted, and a complete, correct result is returned to the user, often after reading only a fraction of the potentially matching data.
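The stopping rule above can be sketched in a few lines of C++. This is only an illustration for a query of the form ORDER BY timestamp DESC LIMIT k: the names WorkUnit, TopKResult, and TopKNewest are hypothetical, the work units are processed one at a time rather than concurrently, and nothing here is taken from the actual R2 SQL implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical work unit: one data file's rows plus the newest timestamp
// recorded for it in the table metadata.
struct WorkUnit {
  int64_t max_timestamp;            // known from metadata before reading
  std::vector<int64_t> timestamps;  // row values, only seen when processed
};

struct TopKResult {
  std::vector<int64_t> rows;  // newest first
  std::size_t units_read;     // how many work units were actually processed
};

// Sketch of "ORDER BY timestamp DESC LIMIT k" with early termination.
TopKResult TopKNewest(std::vector<WorkUnit> units, std::size_t k) {
  assert(k > 0);
  // Deliberate ordering: units most likely to hold the newest rows go first.
  std::sort(units.begin(), units.end(),
            [](const WorkUnit& a, const WorkUnit& b) {
              return a.max_timestamp > b.max_timestamp;
            });
  // Min-heap of the best k rows so far; top() is the oldest of them.
  std::priority_queue<int64_t, std::vector<int64_t>, std::greater<int64_t>> heap;
  std::size_t units_read = 0;
  for (const WorkUnit& unit : units) {
    // Because units are sorted, this unit's max_timestamp is the high-water
    // mark of everything that has not been processed yet.
    if (heap.size() == k && heap.top() > unit.max_timestamp) break;
    for (int64_t ts : unit.timestamps) {
      if (heap.size() < k) {
        heap.push(ts);
      } else if (ts > heap.top()) {
        heap.pop();
        heap.push(ts);
      }
    }
    ++units_read;
  }
  TopKResult result{{}, units_read};
  while (!heap.empty()) {
    result.rows.push_back(heap.top());
    heap.pop();
  }
  std::reverse(result.rows.begin(), result.rows.end());  // newest first
  return result;
}
```

With four work units whose newest timestamps are 100, 97, 95, and 40, a LIMIT 3 query in this sketch stops after the first unit if that unit alone already yields three rows newer than 97.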

Currently, R2 SQL supports ordering on columns that are part of the table’s partition key only. This is a limitation we are working on lifting in the future.

Architecture

Query Execution

Query Planner streams the query work in bite-sized pieces called row groups. A single Parquet file usually contains multiple row groups, but most of the time only a few of them contain relevant data. Splitting query work into row groups allows R2 SQL to only read small parts of potentially multi-GB Parquet files.

The server that receives the user’s request and performs query planning assumes the role of query coordinator. It distributes the work across query workers and aggregates results before returning them to the user.

Cloudflare’s network is vast, and many servers can be in maintenance at the same time. The query coordinator contacts Cloudflare’s internal API to make sure only healthy, fully functioning servers are picked for query execution. Connections between coordinator and query worker go through Cloudflare Argo Smart Routing to ensure fast, reliable connectivity.

Servers that receive query execution requests from the coordinator assume the role of query workers. Query workers serve as a point of horizontal scalability in R2 SQL. With a higher number of query workers, R2 SQL can process queries faster by distributing the work among many servers. That’s especially true for queries covering large numbers of files.

Both the coordinator and query workers run on Cloudflare’s distributed network, ensuring R2 SQL has plenty of compute power and I/O throughput to handle analytical workloads.

Each query worker receives a batch of row groups from the coordinator as well as an SQL query to run on them. Additionally, the coordinator sends serialized metadata about the Parquet files containing the row groups. Thanks to that, query workers know the exact byte offsets where each row group is located in the Parquet file without the need to read this information from R2.

Apache DataFusion

Internally, each query worker uses Apache DataFusion to run SQL queries against row groups. DataFusion is an open-source analytical query engine written in Rust. It is built around the concept of partitions. A query is split into multiple concurrent independent streams, each working on its own partition of data.

Partitions in DataFusion are similar to partitions in Iceberg, but serve a different purpose. In Iceberg, partitions are a way to physically organize data on object storage. In DataFusion, partitions organize in-memory data for query processing. While logically they are similar – rows grouped together based on some logic – in practice, a partition in Iceberg doesn’t always correspond to a partition in DataFusion.

DataFusion partitions map perfectly to the R2 SQL query worker’s data model because each row group can be considered its own independent partition. Thanks to that, each row group is processed in parallel.

At the same time, since row groups usually contain at least 1000 rows, R2 SQL benefits from vectorized execution. Each DataFusion partition stream can execute the SQL query on multiple rows in one go, amortizing the overhead of query interpretation.

There are two ends of the spectrum when it comes to query execution: processing all rows sequentially in one big batch and processing each individual row in parallel. Sequential processing creates a so-called “tight loop”, which is usually more CPU cache friendly. In addition to that, we can significantly reduce interpretation overhead, as processing a large number of rows at a time in batches means that we go through the query plan less often. Completely parallel processing doesn’t allow us to do these things, but makes use of multiple CPU cores to finish the query faster.

DataFusion’s architecture allows us to achieve a balance on this scale, reaping benefits from both ends. For each data partition, we gain better CPU cache locality and amortized interpretation overhead. At the same time, since many partitions are processed in parallel, we distribute the workload between multiple CPUs, cutting the execution time further.

In addition to the smart query execution model, DataFusion also provides first-class Parquet support.

As a file format, Parquet has multiple optimizations designed specifically for query engines. Parquet is a column-based format, meaning that each column is physically separated from others. This separation allows better compression ratios, but it also allows the query engine to read columns selectively. If the query only ever uses five columns, we can read just those five and skip the remaining fifty. This massively reduces the amount of data we need to read from R2 and the CPU time spent on decompression.

DataFusion does exactly that. Using R2 ranged reads, it is able to read parts of the Parquet files containing the requested columns, skipping the rest.

DataFusion’s optimizer also allows us to push down any filters to the lowest levels of the query plan. In other words, we can apply filters right as we are reading values from Parquet files. This allows us to skip materialization of results we know for sure won’t be returned to the user, cutting the query execution time further.

Returning query results

Once the query worker finishes computing results, it returns them to the coordinator through the gRPC protocol.

R2 SQL uses Apache Arrow for internal representation of query results. Arrow is an in-memory format that efficiently represents arrays of structured data. It is also used by DataFusion during query execution to represent partitions of data.

In addition to being an in-memory format, Arrow also defines the Arrow IPC serialization format. Arrow IPC isn’t designed for long-term storage of the data, but for inter-process communication, which is exactly what query workers and the coordinator do over the network. The query worker serializes all the results into the Arrow IPC format and embeds them into the gRPC response. The coordinator in turn deserializes results and can return to working on Arrow arrays.

Future plans

While R2 SQL is currently quite good at executing filter queries, we also plan to rapidly add new capabilities over the coming months. This includes, but is not limited to, adding:

Support for complex aggregations in a distributed and scalable fashion;
Tools to help provide visibility in query execution to help developers improve performance;
Support for many of the configuration options Apache Iceberg supports.

In addition to that, we have plans to improve our developer experience by allowing users to query their R2 Data Catalogs using R2 SQL from the Cloudflare Dashboard.

Given Cloudflare’s distributed compute, network capabilities, and ecosystem of developer tools, we have the opportunity to build something truly unique here. We are exploring different kinds of indexes to make R2 SQL queries even faster and provide more functionality such as full text search, geospatial queries, and more.

Try it now!

It’s early days for R2 SQL, but we’re excited for users to get their hands on it. R2 SQL is available in open beta today! Head over to our getting started guide to learn how to create an end-to-end data pipeline that processes and delivers events to an R2 Data Catalog table, which can then be queried with R2 SQL.

We’re excited to see what you build! Come share your feedback with us on our Developer Discord.

Safe in the sandbox: security hardening for Cloudflare Workers

2025-09-25 Erik Corry

Post Syndicated from Erik Corry original https://blog.cloudflare.com/safe-in-the-sandbox-security-hardening-for-cloudflare-workers/

As a serverless cloud provider, we run your code on our globally distributed infrastructure. Being able to run customer code on our network means that anyone can take advantage of our global presence and low latency. Workers isn’t just efficient, though; we also make it simple for our users. In short: You write code. We handle the rest.

Part of ‘handling the rest’ is making Workers as secure as possible. We have previously written about our security architecture. Making Workers secure is an interesting problem because the whole point of Workers is that we are running third party code on our hardware. This is one of the hardest security problems there is: any attacker has the full power of a programming language running on the victim’s system available when they are crafting their attacks.

This is why we are constantly updating and improving the Workers Runtime to take advantage of the latest improvements in both hardware and software. This post shares some of the latest work we have been doing to keep Workers secure.

Some background first: Workers is built around the V8 JavaScript runtime, originally developed for Chromium-based browsers like Chrome. This gives us a head start, because V8 was forged in an adversarial environment, where it has always been under intense attack and scrutiny. Like Workers, Chromium is built to run adversarial code safely. That’s why V8 is constantly being tested against the best fuzzers and sanitizers, and over the years, it has been hardened with new technologies like Oilpan/cppgc and improved static analysis.

We use V8 in a slightly different way, though, so we will be describing in this post how we have been making some changes to V8 to improve security in our use case.

Hardware-assisted security improvements from Memory Protection Keys

Modern CPUs from Intel, AMD, and ARM have support for memory protection keys, sometimes called PKU, Protection Keys for Userspace. This is a great security feature which increases the power of virtual memory and memory protection.

Traditionally, the memory protection features of the CPU in your PC or phone were mainly used to protect the kernel and to protect different processes from each other. Within each process, all threads had access to the same memory. Memory protection keys allow us to prevent specific threads from accessing memory regions they shouldn’t have access to.

V8 already uses memory protection keys for the JIT compilers. The JIT compilers for a language like JavaScript generate optimized, specialized versions of your code as it runs. Typically, the compiler is running on its own thread, and needs to be able to write data to the code area in order to install its optimized code. However, the compiler thread doesn’t need to be able to run this code. The regular execution thread, on the other hand, needs to be able to run, but not modify, the optimized code. Memory protection keys offer a way to give each thread the permissions it needs, but no more. And the V8 team in the Chromium project certainly aren’t standing still. They describe some of their future plans for memory protection keys here.

In Workers, we have some different requirements than Chromium. The security architecture for Workers uses V8 isolates to separate different scripts that are running on our servers. (In addition, we have extra mitigations to harden the system against Spectre attacks). If V8 is working as intended, this should be enough, but we believe in defense in depth: multiple, overlapping layers of security controls.

That’s why we have deployed internal modifications to V8 to use memory protection keys to isolate the isolates from each other. There are up to 15 different keys available on a modern x64 CPU and a few are used for other purposes in V8, so we have about 12 to work with. We give each isolate a random key which is used to protect its V8 heap data, the memory area containing the JavaScript objects a script creates as it runs. This means security bugs that might previously have allowed an attacker to read data from a different isolate would now hit a hardware trap in 92% of cases. (Assuming 12 keys, 92% is about 11/12.)

The illustration shows an attacker attempting to read from a different isolate. Most of the time this is detected by the mismatched memory protection key, which kills their script and notifies us, so we can investigate and remediate. The red arrow represents the case where the attacker got lucky by hitting an isolate with the same memory protection key, represented by the isolates having the same colors.

However, we can further improve on a 92% protection rate. In the last part of this blog post we’ll explain how we can lift that to 100% for a particular common scenario. But first, let’s look at a software hardening feature in V8 that we are taking advantage of.

The V8 sandbox, a software-based security boundary

Over the past few years, V8 has been gaining another defense in depth feature: the V8 sandbox. (Not to be confused with the layer 2 sandbox which Workers have been using since the beginning.) The V8 sandbox has been a multi-year project that has been gaining maturity for a while. The sandbox project stems from the observation that many V8 security vulnerabilities start by corrupting objects in the V8 heap memory. Attackers then leverage this corruption to reach other parts of the process, giving them the opportunity to escalate and gain more access to the victim’s browser, or even the entire system.

V8’s sandbox project is an ambitious software security mitigation that aims to thwart that escalation: to make it impossible for the attacker to progress from a corruption on the V8 heap to a compromise of the rest of the process. This means, among other things, removing all pointers from the heap. But first, let’s explain in as simple terms as possible, what a memory corruption attack is.

Memory corruption attacks

A memory corruption attack tricks a program into misusing its own memory. Computer memory is just a store of integers, where each integer is stored in a location. The locations each have an address, which is also just a number. Programs interpret the data in these locations in different ways, such as text, pixels, or pointers. Pointers are addresses that identify a different memory location, so they act as a sort of arrow that points to some other piece of data.

Here’s a concrete example, which uses a buffer overflow. This is a form of attack that was historically common and relatively simple to understand: Imagine a program has a small buffer (like a 16-character text field) followed immediately by an 8-byte pointer to some ordinary data. An attacker might send the program a 24-character string, causing a “buffer overflow.” Because of a vulnerability in the program, the first 16 characters fill the intended buffer, but the remaining 8 characters spill over and overwrite the adjacent pointer.

Figure: See below for how such an attack would now be thwarted.

Now the pointer has been redirected to point at sensitive data of the attacker’s choosing, rather than the normal data it was originally meant to access. When the program tries to use what it believes is its normal pointer, it’s actually accessing sensitive data chosen by the attacker.
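To make the layout concrete, here is a small, memory-safe simulation of that scenario in C++. It models the 16-byte buffer and the adjacent 8-byte pointer as one 24-byte array, so nothing is actually corrupted; SimulatedMemory, VulnerableWrite, and CheckedWrite are hypothetical names invented for this sketch.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

constexpr std::size_t kBufferSize = 16;  // the 16-character text field
constexpr std::size_t kPointerSize = 8;  // the adjacent 8-byte pointer

// A simulated slice of process memory: the buffer, immediately followed by
// the pointer. Using one array keeps the demonstration itself well-defined.
struct SimulatedMemory {
  std::array<unsigned char, kBufferSize + kPointerSize> bytes{};

  void SetPointer(uint64_t value) {
    std::memcpy(bytes.data() + kBufferSize, &value, kPointerSize);
  }
  uint64_t GetPointer() const {
    uint64_t value = 0;
    std::memcpy(&value, bytes.data() + kBufferSize, kPointerSize);
    return value;
  }
};

// Models the vulnerability: the copy is not limited to the 16-byte buffer,
// so any extra input bytes land on top of the pointer.
void VulnerableWrite(SimulatedMemory& mem, const std::string& input) {
  assert(input.size() <= mem.bytes.size());  // keep the simulation in bounds
  std::memcpy(mem.bytes.data(), input.data(), input.size());
}

// Models the fix: never copy more than the buffer can hold.
void CheckedWrite(SimulatedMemory& mem, const std::string& input) {
  std::memcpy(mem.bytes.data(), input.data(),
              std::min(input.size(), kBufferSize));
}
```

Writing 16 'A' characters followed by 8 'B' characters through VulnerableWrite leaves the pointer holding 0x4242424242424242, the attacker's bytes, while CheckedWrite leaves it untouched.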

This type of attack works in steps: first create a small confusion (like the buffer overflow), then use that confusion to create bigger problems, eventually gaining access to data or capabilities the attacker shouldn’t have. The attacker can eventually use the misdirection to either steal information or plant malicious data that the program will treat as legitimate.

This was a somewhat abstract description of memory corruption attacks using a buffer overflow, one of the simpler techniques. For some much more detailed and recent examples, see this description from Google, or this breakdown of a V8 vulnerability.

Compressed pointers in V8

Many attacks are based on corrupting pointers, so ideally we would remove all pointers from the memory of the program. Since an object-oriented language’s heap is absolutely full of pointers, that would seem, on its face, to be a hopeless task, but it is enabled by an earlier development. Since 2020, V8 has offered the option of saving memory by using compressed pointers. This means that, on a 64-bit system, the heap uses only 32 bit offsets, relative to a base address. This limits the total heap to at most 4 GiB, a limitation that is acceptable for a browser, and also fine for individual scripts running in a V8 isolate on Cloudflare Workers.

Figure: An artificial object with various fields, showing how the layout differs in a compressed vs. an uncompressed heap. The boxes are 64 bits wide.

If the whole of the heap is in a single 4 GiB area aligned to a 4 GiB boundary, then the upper 32 bits of all pointers will be the same, and we don’t need to store them in every pointer field in every object. In the diagram we can see that the object pointers all start with 0x12345678, which is therefore redundant and doesn’t need to be stored. This means that object pointer fields and integer fields can be reduced from 64 to 32 bits.

We still need 64 bit fields for some fields like double precision floats and for the sandbox offsets of buffers, which are typically used by the script for input and output data. See below for details.

Integers in an uncompressed heap are stored in the high 32 bits of a 64 bit field. In the compressed heap, the top 31 bits of a 32 bit field are used. In both cases the lowest bit is set to 0 to indicate integers (as opposed to pointers or offsets).
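The integer encoding described above can be sketched as follows. The function names are made up for this illustration, and the code only mirrors the bit layout stated in the text (payload in the upper bits, lowest bit 0); it is not V8's actual source.

```cpp
#include <cassert>
#include <cstdint>

// Compressed heap: a 31-bit signed integer lives in the top 31 bits of a
// 32-bit field, and the lowest bit is 0 to mark the field as an integer.
uint32_t TagSmallInteger32(int32_t value) {
  assert(value >= -(1 << 30) && value < (1 << 30));  // must fit in 31 bits
  return static_cast<uint32_t>(value) << 1;
}

int32_t UntagSmallInteger32(uint32_t field) {
  assert((field & 1u) == 0);  // lowest bit 0 means "integer"
  // Arithmetic right shift restores the sign (guaranteed since C++20, and
  // what mainstream compilers did before that).
  return static_cast<int32_t>(field) >> 1;
}

// Uncompressed heap: the integer lives in the high 32 bits of a 64-bit
// field, so the low 32 bits (including the tag bit) are all 0.
uint64_t TagSmallInteger64(int32_t value) {
  return static_cast<uint64_t>(static_cast<uint32_t>(value)) << 32;
}

int32_t UntagSmallInteger64(uint64_t field) {
  assert((field & 1u) == 0);
  return static_cast<int32_t>(static_cast<uint32_t>(field >> 32));
}

// In both layouts the same test distinguishes integers from pointers/offsets.
bool IsInteger(uint64_t field) { return (field & 1u) == 0; }
```

For example, the integer 5 is stored as 10 in a compressed field and as 0x500000000 in an uncompressed one.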

Conceptually, we have two methods for compressing and decompressing, using a base address that is divisible by 4 GiB:

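#include <cstdint>
uintptr_t base;  // Start of the pointer cage; must be divisible by 4 GiB.
// Decompress a 32 bit offset to a 64 bit pointer by adding a base address.
void* Decompress(uint32_t offset) { return reinterpret_cast<void*>(base + offset); }
// Compress a 64 bit pointer to a 32 bit offset by discarding the high bits.
uint32_t Compress(void* pointer) { return static_cast<uint32_t>(reinterpret_cast<uintptr_t>(pointer) & 0xffffffff); }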

This pointer compression feature, originally primarily designed to save memory, can be used as the basis of a sandbox.

From compressed pointers to the sandbox

The biggest 32-bit unsigned integer is about 4 billion (2^32 - 1), so the Decompress() function cannot generate any pointer that is outside the range [base, base + 4 GiB). You could say the pointers are trapped in this area, so it is sometimes called the pointer cage. V8 can reserve 4 GiB of virtual address space for the pointer cage so that only V8 objects appear in this range. By eliminating all pointers from this range, and following some other strict rules, V8 can contain any memory corruption by an attacker to this cage. Even if an attacker corrupts a 32 bit offset within the cage, it is still only a 32 bit offset and can only be used to create new pointers that are still trapped within the pointer cage.

Figure: The buffer overflow attack from earlier no longer works because only the attacker’s own data is available in the pointer cage.

To construct the sandbox, we take the 4 GiB pointer cage and add another 4 GiB for buffers and other data structures to make the 8 GiB sandbox. This is why the buffer offsets above are 33 bits, so they can reach buffers in the second half of the sandbox (40 bits in Chromium with larger sandboxes). V8 stores these buffer offsets in the high 33 bits and shifts down by 31 bits before use, in case an attacker corrupted the low bits.
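A short sketch may help show why storing the offset in the high 33 bits matters. The helper names below are hypothetical; the only facts taken from the text are the 8 GiB sandbox size and the shift by 31 bits.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t kGiB = uint64_t{1} << 30;
constexpr uint64_t kSandboxSize = 8 * kGiB;  // 2^33 bytes
constexpr int kOffsetShift = 31;             // 64 - 33

// Store a buffer offset in the high 33 bits of a 64-bit field.
uint64_t EncodeBufferOffset(uint64_t offset) {
  assert(offset < kSandboxSize);
  return offset << kOffsetShift;
}

// Shifting down by 31 bits leaves at most 33 significant bits, so the
// decoded offset is below 8 GiB no matter what the field contains.
uint64_t DecodeBufferOffset(uint64_t field) { return field >> kOffsetShift; }

// The resulting address therefore always lies inside the sandbox.
uint64_t BufferAddress(uint64_t sandbox_base, uint64_t field) {
  return sandbox_base + DecodeBufferOffset(field);
}
```

Even a field an attacker has overwritten with all ones decodes to 8 GiB minus one byte, which is still inside the sandbox, and corrupting only the low 31 bits does not change the decoded offset at all.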

Cloudflare Workers have made use of compressed pointers in V8 for a while, but for us to get the full power of the sandbox we had to make some changes. Until recently, all isolates in a process had to be in one single sandbox if you were using the sandboxed configuration of V8. This would have limited the total size of all V8 heaps to be less than 4 GiB, far too little for our architecture, which relies on serving thousands of scripts at once.

That’s why we commissioned Igalia to add isolate groups to V8. Each isolate group has its own sandbox and can have 1 or more isolates within it. Building on this change we have been able to start using the sandbox, eliminating a whole class of potential security issues in one stroke. Although we can place multiple isolates in the same sandbox, we are currently only putting a single isolate in each sandbox.

Figure: The layout of the sandbox. In the sandbox there can be more than one isolate, but all their heap pages must be in the pointer cage: the first 4 GiB of the sandbox. Instead of pointers between the objects, we use 32 bit offsets. The offsets for the buffers are 33 bits, so they can reach the whole sandbox, but not outside it.

Virtual memory isn’t infinite, there’s a lot going on in a Linux process

At this point, we were not quite done, though. Each sandbox reserves 8 GiB of space in the virtual memory map of the process, and it must be 4 GiB aligned for efficiency. It uses much less physical memory, but the sandbox mechanism requires this much virtual space for its security properties. This presents us with a problem, since a Linux process ‘only’ has 128 TiB of virtual address space in a 4-level page table (another 128 TiB are reserved for the kernel, not available to user space).

At Cloudflare, we want to run Workers as efficiently as possible to keep costs and prices down, and to offer a generous free tier. That means that on each machine we have so many isolates running (one per sandbox) that it becomes hard to place them all in a 128 TiB space.

Knowing this, we have to place the sandboxes carefully in memory. Unfortunately, the Linux syscall, mmap, does not allow us to specify the alignment of an allocation unless you can guess a free location to request. To get an 8 GiB area that is 4 GiB aligned, we have to ask for 12 GiB, then find the aligned 8 GiB area that must exist within that, and return the unused (hatched) edges to the OS:
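The arithmetic behind this over-allocate-and-trim approach can be sketched without making any system calls. TrimPlan and PlanAlignedSandbox are hypothetical names; in a real program the start address would presumably come from mmap, and the head and tail would be handed back with munmap.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t kGiB = uint64_t{1} << 30;
constexpr uint64_t kSandboxSize = 8 * kGiB;
constexpr uint64_t kAlignment = 4 * kGiB;
constexpr uint64_t kReservationSize = kSandboxSize + kAlignment;  // 12 GiB

// Describes how a 12 GiB reservation is carved up.
struct TrimPlan {
  uint64_t sandbox_start;  // 4 GiB aligned start of the 8 GiB sandbox
  uint64_t head_bytes;     // unused bytes before the sandbox
  uint64_t tail_bytes;     // unused bytes after the sandbox
};

// Given the (arbitrarily aligned) start of a 12 GiB reservation, find the
// aligned 8 GiB area inside it and the two edges that can be returned.
TrimPlan PlanAlignedSandbox(uint64_t reservation_start) {
  TrimPlan plan;
  // Round up to the next multiple of 4 GiB.
  plan.sandbox_start =
      (reservation_start + kAlignment - 1) & ~(kAlignment - 1);
  plan.head_bytes = plan.sandbox_start - reservation_start;
  plan.tail_bytes = kReservationSize - plan.head_bytes - kSandboxSize;
  // The two edges always add up to the 4 GiB of slack that was requested.
  assert(plan.head_bytes + plan.tail_bytes == kAlignment);
  return plan;
}
```

Because the head is always smaller than 4 GiB, the aligned 8 GiB area is guaranteed to fit inside the 12 GiB reservation wherever the kernel happens to place it.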

If we allow the Linux kernel to place sandboxes randomly, we end up with a layout like this with gaps. Especially after running for a while, there can be both 8 GiB and 4 GiB gaps between sandboxes:

Sadly, because of our 12 GiB alignment trick, we can’t even make use of the 8 GiB gaps. If we ask the OS for 12 GiB, it will never give us a gap like the 8 GiB gap between the green and blue sandboxes above. In addition, there are a host of other things going on in the virtual address space of a Linux process: the malloc implementation may want to grab pages at particular addresses, the executable and libraries are mapped at a random location by ASLR, and V8 has allocations outside the sandbox.

The latest generation of x64 CPUs supports a much bigger address space, which solves both problems, and Linux kernels are able to make use of the extra bits with five level page tables. A process has to opt into this, which is done by a single mmap call suggesting an address outside the 47 bit area. The reason this needs an opt-in is that some programs can’t cope with such high addresses. Curiously, V8 is one of them.

This isn’t hard to fix in V8, but not all of our fleet has been upgraded yet to have the necessary hardware. So for now, we need a solution that works with the existing hardware. We have modified V8 to be able to grab huge memory areas and then use mprotect syscalls to create tightly packed 8 GiB spaces for sandboxes, bypassing the inflexible mmap API.

Putting it all together

Taking control of the sandbox placement like this actually gives us a security benefit, but first we need to describe a particular threat model.

We assume for the purposes of this threat model that an attacker has an arbitrary way to corrupt data within the sandbox. This is historically the first step in many V8 exploits. So much so that there is a special tier in Google’s V8 bug bounty program where you may assume you have this ability to corrupt memory, and they will pay out if you can leverage that to a more serious exploit.

However, we assume that the attacker does not have the ability to execute arbitrary machine code. If they did, they could disable memory protection keys. Having access to the in-sandbox memory only gives the attacker access to their own data. So the attacker must attempt to escalate, by corrupting data inside the sandbox to access data outside the sandbox.

You will recall that the compressed, sandboxed V8 heap only contains 32 bit offsets. Therefore, no corruption there can reach outside the pointer cage. But there are also arrays in the sandbox — vectors of data with a given size that can be accessed with an index. In our threat model, the attacker can modify the sizes recorded for those arrays and the indexes used to access elements in the arrays. That means an attacker could potentially turn an array in the sandbox into a tool for accessing memory incorrectly. For this reason, the V8 sandbox normally has guard regions around it: These are 32 GiB virtual address ranges that have no virtual-to-physical address mappings. This helps guard against the worst case scenario: Indexing an array where the elements are 8 bytes in size (e.g. an array of double precision floats) using a maximal 32 bit index. Such an access could reach a distance of up to 32 GiB outside the sandbox: 8 times the maximal 32 bit index of four billion.
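The 32 GiB figure follows directly from the index width and element size. As a quick check, with hypothetical helper names:

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t kGiB = uint64_t{1} << 30;
constexpr uint64_t kIndexCount = uint64_t{1} << 32;  // distinct 32-bit indexes

// Upper bound, in bytes, on how far past the start of an array an access
// can land when the attacker controls a 32-bit index.
uint64_t MaxReachBytes(uint64_t element_size) {
  assert(element_size > 0 && element_size <= 8);
  return element_size * kIndexCount;
}

// A guard region is big enough if no such access can jump over it.
bool GuardRegionCovers(uint64_t element_size, uint64_t guard_bytes) {
  return MaxReachBytes(element_size) <= guard_bytes;
}
```

For 8-byte elements this gives 8 × 2^32 bytes = 2^35 bytes = 32 GiB, so a 32 GiB guard region covers the worst case, while a 16 GiB one would not.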

We want such accesses to trigger an alarm, rather than letting an attacker access nearby memory. This happens automatically with guard regions, but we don’t have space for conventional 32 GiB guard regions around every sandbox.

Instead of using conventional guard regions, we can make use of memory protection keys. By carefully controlling which isolate group uses which key, we can ensure that no sandbox within 32 GiB has the same protection key. Essentially, the sandboxes are acting as each other’s guard regions, protected by memory protection keys. Now we only need a wasted 32 GiB guard region at the start and end of the huge packed sandbox areas.

With the new sandbox layout, we use strictly rotating memory protection keys. Because we are not using randomly chosen memory protection keys, for this threat model the 92% problem described above disappears. Any in-sandbox security issue is unable to reach a sandbox with the same memory protection key. In the diagram, we show that there is no memory within 32 GiB of a given sandbox that has the same memory protection key. Any attempt to access memory within 32 GiB of a sandbox will trigger an alarm, just like it would with unmapped guard regions.
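One way to picture the rotating assignment is the sketch below. It assumes tightly packed 8 GiB sandbox slots and 12 usable keys numbered 0 to 11; the slot numbering, the key numbering, and the function names are assumptions made for illustration, not details from the modified V8.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr uint64_t kGiB = uint64_t{1} << 30;
constexpr uint64_t kSandboxSize = 8 * kGiB;
constexpr uint64_t kGuardDistance = 32 * kGiB;
constexpr std::size_t kNumKeys = 12;  // "about 12" keys, per the text

// Strict rotation: neighbouring slots never share a key.
int KeyForSlot(std::size_t slot) { return static_cast<int>(slot % kNumKeys); }

// Bytes between the end of slot a and the start of slot b in a packed area.
uint64_t GapBetweenSlots(std::size_t a, std::size_t b) {
  assert(a < b);
  return static_cast<uint64_t>(b - a - 1) * kSandboxSize;
}

// Checks that any two sandboxes sharing a key are at least 32 GiB apart,
// so every sandbox in between acts as a guard region for them.
bool RotationIsSafe(std::size_t slot_count) {
  for (std::size_t a = 0; a < slot_count; ++a) {
    for (std::size_t b = a + 1; b < slot_count; ++b) {
      if (KeyForSlot(a) == KeyForSlot(b) &&
          GapBetweenSlots(a, b) < kGuardDistance) {
        return false;
      }
    }
  }
  return true;
}
```

Under these assumptions, the nearest sandbox with the same key is 12 slots away, leaving 11 × 8 GiB = 88 GiB of differently keyed memory in between, comfortably more than the 32 GiB an out-of-bounds array access can reach.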

The future

In a way, this whole blog post is about things our customers don’t need to do. They don’t need to upgrade their server software to get the latest patches; we do that for them. They don’t need to worry whether they are using the most secure or efficient configuration. So there’s no call to action here, except perhaps to sleep easy.

However, if you find work like this interesting, and especially if you have experience with the implementation of V8 or similar language runtimes, then you should consider coming to work for us. We are recruiting both in the US and in Europe. It’s a great place to work, and Cloudflare is going from strength to strength.

Every Cloudflare feature, available to everyone

2025-09-25 Dane Knecht

Post Syndicated from Dane Knecht original https://blog.cloudflare.com/enterprise-grade-features-for-all/

Over the next year Cloudflare will make nearly every feature we offer available to any customer who wants to buy and use it regardless of whether they are an enterprise account. No need to pick up a phone and talk to a sales team member. No requirement to find time with a solutions engineer in our team to turn on a feature. No contract necessary. We believe that if you want to use something we offer, you should just be able to buy it.

Today’s launch starts by bringing Single Sign-On (SSO) into our dashboard out of our enterprise plan and making it available to any user. That capability is the first of many. We will be sharing updates over the next few months as more and more features become available for purchase on any plan.

We are also making a commitment to ensuring that all future releases will follow this model. The goal is not to restrict new tools to the enterprise tier for some amount of time before making them widely available. We believe helping build a better Internet means making sure the best tools are available to anyone who needs them.

Enterprise grade for everyone

It’s not enough to build the best tools on the web. At Cloudflare our mission is to help build a better Internet and that means making the tools we build accessible. We believe the best way to make the Internet faster and more secure is to put powerful features into the hands of as many people as possible.

We first launched an Enterprise tier years ago when larger customers came to us looking to scale their usage of Cloudflare in new ways. They needed procurement options beyond a credit card, like invoices, custom contracts, and dedicated support. This offering was a necessary and important step to bring the benefits of our network and tools to large organizations with complex needs.

This created an unintended side effect in how we shipped products. Some of our most powerful and innovative features were launched within an enterprise-only tier. This created a gap, a two-tiered system where some of the most advanced features were reserved only for the largest companies.

It also created a divergence in our product development. Features built for our self-service customers had to be incredibly simple and intuitive from day-one. Features designated “enterprise-only” didn’t always face that same pressure to scale – we could instead rely on our solutions teams or partners to help set up and support.

It’s time to fix that. Starting today, we are doing away with the concept of “enterprise-only” features. Over the coming months and quarters, we will make many of our most advanced capabilities available to all of our customers.

The change will help build a more secure Internet by removing barriers to the adoption of the most advanced tools available. The change improves the experience for all customers. Smaller teams on our self-service plans will have access to the most powerful configuration options we offer. Existing enterprise teams will have easier pathways to adopt new tools without calling their account manager. And our own Product teams have even more reason to continue to make all features we ship easy to use.

Today we are beginning with dashboard SSO; instructions on how to set it up right now are below. It is the first of many, though: capabilities like apex proxying and expanded upload limits, along with many others of our most requested enterprise features, will follow.

Starting with how you sign in to Cloudflare

One example of a feature we launched only to enterprise customers because of the complexity in setting it up is SSO. Enterprise teams maintain their own identity provider where they can manage internal employee accounts and how their team members log into different services.

They integrate these identity providers with the tools their employees need so that team members do not need to create and remember a username and password for each and every service. More importantly, the management of identity in a single place gives enterprises the ability to control authentication policies, onboard and offboard users, and hand out licenses for tools.

We first launched our own SSO support way back in 2018. In the last seven years we have been helping thousands of enterprise customers manually set this up, but we know that teams of all sizes rely on the security and convenience of an identity provider. As part of this announcement, the first enterprise feature we are making available to everyone is dashboard SSO.

The functionality is available immediately to anyone on any plan. To get started, follow the instructions here to integrate your identity provider with Cloudflare and to then connect your domain with your account. By setting up your identity provider for dashboard SSO, you will also be able to begin using the vast majority of our Zero Trust security features, which are available at no cost for up to 50 users.

We also know that some teams are too early or distributed to have a full-fledged identity provider but want the convenience and security of managing logins in one place. To that end, we are also excited to launch support for GitHub as a social login provider to the Cloudflare dashboard as part of today’s announcement.

And extending to almost everything else over the next year

We prioritized dashboard SSO because just about every team that uses Cloudflare wants it. This one change helps make nearly every customer safer by allowing them to centrally manage team access. As we burn down the list of previously enterprise-only features, we will continue targeting those that have similar broad impact.

Some capabilities, like Magic Transit, have less broad appeal. The organizations that maintain their own networks and want to deploy Magic Transit tend to already want to be enterprise customers for account management reasons. That said, we still can improve their experience by making tools like Magic Transit available to all plans because we will have to remove some of the friction in the setup that we have historically just solved with people hours from our solution engineers and partners.

We also realize that the way some of these features are priced only made sense with an invoice or enterprise license agreement model. To make this work, we need to revisit how some of our usage metering and billing works. That will continue to be a priority for us, and we are excited about how this will push us to continue making our packaging and billing even simpler for all customers.

There are some features that we can’t make available to everyone for non-technical reasons. For example, using our China Network has complicated legal requirements in China that are impossible for us to manage for millions of customers.

Self-service by default going forward

One thing we are not announcing today is a strategy to continue to release “enterprise-only” features for a while before they eventually make it to the self-service plans. Going forward, to launch something at Cloudflare the team will need to make sure that any customer can buy it off the shelf without talking to someone.

We expect that requirement to improve how all products are built here, not just the more advanced capabilities. We also consider it mission-critical. We have a long history of making the kinds of tools that only the largest businesses could buy available to anyone, from universal SSL over a decade ago to newer features this week that were available for self-service plans immediately like per-customer bot detection IDs and security of data in transit between SaaS applications. We are excited to continue this tradition.

What’s next?

You can get started right now setting up dashboard SSO in your Cloudflare account using the documentation available here. We will continue to share updates as previously enterprise-only features are made available to any plan.

Choice: the path to AI sovereignty

2025-09-25 Carly Ramsey

Post Syndicated from Carly Ramsey original https://blog.cloudflare.com/sovereign-ai-and-choice/

Every government is laser-focused on the potential for national transformation by AI. Many view AI as an unparalleled opportunity to solve complex national challenges, drive economic growth, and improve the lives of their citizens. Others are concerned about the risks AI can bring to their societies and economies. Some sit somewhere between these two perspectives. But as plans are drawn up by governments around the world to address the question of AI development and adoption, all are grappling with the critical question of sovereignty: how much of this technology, mostly centered in the United States and China, needs to be in their direct control?

Each nation has its own response to that question. Some seek ‘self-sufficiency’ and total authority. Others, particularly those that do not have the capacity to build the full AI technology stack, are approaching it layer-by-layer, seeking to build on the capacities their country does have and then forming strategic partnerships to fill the gaps.

We believe AI sovereignty at its core is about choice. Each nation should have the ability to select the right tools for the task, to control its own data, and to deploy applications at will, all without being locked into a single provider or a single way of doing things. It’s about autonomy and options, realized through a diversified, resilient digital supply chain.

Cloudflare’s mission is to help build a better Internet. We make tools for developers around the world to build Internet and AI applications that are widely, and in many cases, freely, available. We work on standards to improve interoperability and prevent vendor lock-in. And we are global — our network spans 330 cities in over 125 countries. By supporting local developers to build and deploy AI tools and services right where they are, Cloudflare can help each nation on their path to greater AI sovereignty.

Creating a future that enables many AI options

Many nations recognize the practical challenge of realizing a robust AI-driven future that incorporates sovereignty — the significant cost and complexity of the infrastructure needed to set AI in action. Cloudflare believes that countries can achieve their objectives by creating vibrant marketplaces that allow multiple options, and we are creating a path for governments that provides maximum choice:

Infrastructure accessibility: Countries often focus on building large data centers that have the compute capacity to train general purpose AI models, neglecting the infrastructure needed to effectively deploy AI. Because of their proximity to end users, distributed edge networks are critical to ensuring that consumers can actually use AI technologies at scale. Although some AI technologies will be designed to work on-device, many will need more power to run AI inference, the tasks that users ask an AI engine to complete. Distributed networks are equipped to run AI workloads at the edge, to help deliver the low latency and high performance needed for advanced technologies. Cloudflare’s distributed network gives developers a path to rapidly deploy their apps globally without massive upfront investments.

Inclusivity: Nations want their entire economies, from the small businesses, to research institutions, to non-profits and enterprises, to benefit from AI transformation. Serverless models like Cloudflare’s make it easy to get started. Developers pay only for what they use, rather than being locked into paying for expensive and unnecessary compute, dramatically lowering the barrier to entry. Our free tier allows developers to experiment, build, and even launch applications without any cost, while our pay-as-you-go model for increased usage removes the significant financial barriers that might otherwise keep advanced AI out of reach.

Control over data: An important part of sovereignty is the ability to control your own data. We believe countries should avoid equating this type of control with data locality, focusing instead on integrating security tools that provide visibility and the ability to restrict access to data. Cloudflare’s global, distributed network ensures that developers can experiment, build, and deploy AI-powered applications right where they are, setting rules and controls at the Internet edge.

Multi-modal, dynamic markets: Building new applications with closed AI models can make it challenging to switch models later, and can make developers dependent on particular providers. AI strategies must embrace diversity — developers should have access to a wide variety of both open source and closed AI models. Cloudflare’s Workers AI platform, with over 50 open source models, is model agnostic, helping to create a competitive, dynamic environment where developers can swap models in and out as better, cheaper, or more specialized options become available. Cloudflare’s AI Gateway allows our customers to connect and control all their AI models, regardless of vendor, in a single, unified, interoperable platform.

Underpinning all of this is the importance of open standards that encourage interoperability. Open standards and protocols throughout the AI technology stack help prevent dependency, create dynamic and competitive markets, and create choice for governments and their developers.

Championing regional AI innovation

Many countries have started to put their own mark on how to spur innovation in their markets, starting with large language models (LLMs). AI development to date has mostly centered around LLMs trained on English-centric data, and increasingly, Chinese-centric data, leaving behind those who can’t fully access this technology in these two languages. Recognizing this gap, these nations are building and freely offering AI models trained on local language datasets that are fine-tuned to the nuances of their own cultures and languages. This approach lowers the barrier to entry for local businesses, organizations, and governments to create customized AI solutions for their specific markets. Open-sourcing these LLMs recognizes that AI sovereignty is a means to an end. The goal is innovation, economic growth, and the ability to solve meaningful problems.

Cloudflare is now supporting these sovereign AI initiatives in India, Japan, and Southeast Asia. We are bringing these locally-developed, open-source AI models to developers around the world through our serverless inference platform, Workers AI.

India: India’s national vision is “AI for All”, which focuses on AI driving inclusive growth and social empowerment. India will host the momentous global AI Impact Summit in 2026, a key element of which is showcasing empowering technological advancements that are accessible to the Global South. With its immense linguistic diversity, India is at the forefront of creating models that serve its hundreds of millions of Internet users in their native tongues. A cornerstone in this endeavor is the Government of India’s Bhashini, a digital public good platform that enables all Indian citizens to access the Internet and digital services in 22 official languages.

Cloudflare is now offering AI4Bharat’s IndicTrans2 model, a key open source language model that is also part of the Bhashini initiative. The model is able to translate text across 22 Indic languages, including Bengali, Gujarati, Hindi, Tamil, Sanskrit and even traditionally low-resourced languages like Kashmiri, Manipuri and Sindhi.

You can use the @cf/ai4bharat/indictrans2-en-indic-1B model on Workers AI as follows:

curl --request POST \
  --url https://api.cloudflare.com/client/v4/accounts/ACCOUNT_ID/ai/run/@cf/ai4bharat/indictrans2-en-indic-1B \
  --header 'Authorization: Bearer TOKEN' \
  --header 'Content-Type: application/json' \
  --data '{
    "text": ["What is your favourite food?", "I like pizza"],
    "target_language": "guj_Gujr"
}'

Japan: Japan has a very clear and expansive vision of AI development. Concerned about Japan’s slow AI uptake, the Japanese government aims to make the country “the world’s most friendly AI nation” by creating the ideal conditions for AI growth, both at home and abroad. A major initiative for Japan’s government is supporting AI that deeply understands the complexities and cultural context of the Japanese language.

Cloudflare is offering PLaMo-Embedding-1B from Preferred Networks, Inc. (PFN), a home-grown Japanese text embedding model, made freely and openly available. The Japanese government supported PFN through its Generative AI Accelerator Challenge (GENIAC) program, which supports local LLM development through subsidized access to compute resources for training. The PLaMo Embedding model enables users to generate high-quality embeddings for Japanese text, which is helpful for building RAG-powered applications and semantic search use cases.

You can use the @cf/pfnet/plamo-embedding-1b model on Workers AI as follows:

curl --request POST \
  --url https://api.cloudflare.com/client/v4/accounts/ACCOUNT_ID/ai/run/@cf/pfnet/plamo-embedding-1b \
  --header 'Authorization: Bearer TOKEN' \
  --header 'Content-Type: application/json' \
  --data '{
    "text": [
      "PLaMo-Embedding-1Bは、Preferred Networks, Inc. によって開発された日本語テキスト埋め込みモデルです。",
      "最近は随分と暖かくなりましたね。"
    ]
}'
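
Once a model such as PLaMo-Embedding-1B has turned text into vectors, semantic search largely comes down to comparing those vectors. The TypeScript below is a minimal sketch of that comparison step, assuming each document has already been embedded. The names `cosineSimilarity`, `rankBySimilarity`, and `Doc` are illustrative and not part of the Workers AI API, and the response format of the embedding endpoint is not shown because the text above does not specify it.

```typescript
// Cosine similarity between two embedding vectors of equal length.
// Returns a value in [-1, 1]; higher means the texts are closer in meaning.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as "not similar".
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical document record: an id plus a precomputed embedding.
interface Doc {
  id: string;
  embedding: number[];
}

// Rank documents by similarity to a query embedding, best match first.
function rankBySimilarity(
  query: number[],
  docs: Doc[],
): { id: string; score: number }[] {
  return docs
    .map((d) => ({ id: d.id, score: cosineSimilarity(query, d.embedding) }))
    .sort((x, y) => y.score - x.score);
}
```

In a RAG-style application, the top-ranked documents would then be passed to a text generation model as context.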

Southeast Asia: Singapore chairs the Association of Southeast Asian Nations (ASEAN) Working Group on AI Governance, and its ambitious National AI Strategy 2.0 aims to ensure that AI is a public good, both for Southeast Asia and the world. As a cornerstone of this strategy, Singapore is championing the development and adoption of SEA-LION, a family of open-source LLMs designed for Southeast Asia’s diverse languages and cultures. The initiative aims to establish the nation as an inclusive global AI leader, ensuring the technology is both accessible and regionally relevant to its multilingual and multicultural populaces. The models are adept in numerous regional languages, including Bahasa Indonesia, Bahasa Malaysia, Thai, Vietnamese, and Tamil, unlocking AI technologies for a significant portion of the Asian and global population.

SEA-LION model v4-27B is now available on the Workers AI platform. SEA-LION v4 stands out on the Singapore government’s leaderboard as its most powerful, efficient, multimodal and multilingual model yet.

You can use the @cf/aisingapore/gemma-sea-lion-v4-27b-it model on Workers AI as follows:

curl --request POST \
  --url https://api.cloudflare.com/client/v4/accounts/ACCOUNT_ID/ai/run/@cf/aisingapore/gemma-sea-lion-v4-27b-it \
  --header 'Authorization: Bearer TOKEN' \
  --header 'Content-Type: application/json' \
  --data '{
  "messages": [
    {
      "role": "user",
      "content": "แล้วทำผัดไทยอย่างไร"
    }
  ]
}'
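
The three curl calls above share one shape: a POST to the account's `ai/run` endpoint with a bearer token and a JSON body. As a rough sketch, the same request can be assembled in TypeScript and passed to `fetch`. The helper name `buildWorkersAiRequest` is hypothetical, and `ACCOUNT_ID` and `TOKEN` remain placeholders exactly as in the curl examples.

```typescript
// Shape of the arguments that fetch(url, init) expects for this call.
interface WorkersAiRequest {
  url: string;
  init: {
    method: string;
    headers: Record<string, string>;
    body: string;
  };
}

// Hypothetical helper: builds the same request the curl examples send.
// `model` is a model name such as "@cf/aisingapore/gemma-sea-lion-v4-27b-it".
function buildWorkersAiRequest(
  accountId: string,
  token: string,
  model: string,
  payload: unknown,
): WorkersAiRequest {
  return {
    url: `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/run/${model}`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(payload),
    },
  };
}

// Example usage (not executed here because it needs real credentials):
//   const { url, init } = buildWorkersAiRequest("ACCOUNT_ID", "TOKEN",
//     "@cf/ai4bharat/indictrans2-en-indic-1B",
//     { text: ["I like pizza"], target_language: "guj_Gujr" });
//   const response = await fetch(url, init);
```

Only the payload differs between models: `text` plus `target_language` for translation, `text` for embeddings, and `messages` for chat.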

Bringing AI models to the world

Singapore, India and Japan have all chosen to open-source many of their local language models, a strategy that champions an expansive vision of AI sovereignty. This approach demonstrates a crucial understanding: true AI sovereignty is ensuring you have choices.

Supporting local language open source models is more than just supporting technology; this is a shared commitment to fostering an open, interoperable, and competitive AI ecosystem by empowering governments and developers to solve local problems, create economic opportunities, and preserve their digital and cultural heritages.

We are honored to support the initiatives of the governments of India, Japan, and Singapore on this journey. We believe that by putting their sovereign AI models into the hands of developers in their economies, we can help unlock a powerful wave of innovation that is more diverse, equitable, and representative of the world we live in. The future of AI is being built today, and we are proud to ensure that AI developers everywhere are at the forefront.

Choice is the foundation of AI sovereignty. We’re starting with the models from India, Japan, and Singapore on our serverless inference platform, but it’s only the start. Come build with us! Take the first step for free on Workers AI.

Announcing the Cloudflare Data Platform: ingest, store, and query your data directly on Cloudflare

2025-09-25 Micah Wylde

Post Syndicated from Micah Wylde original https://blog.cloudflare.com/cloudflare-data-platform/

For Developer Week in April 2025, we announced the public beta of R2 Data Catalog, a fully managed Apache Iceberg catalog on top of Cloudflare R2 object storage. Today, we are building on that foundation with three launches:

Cloudflare Pipelines receives events sent via Workers or HTTP, transforms them with SQL, and ingests them into Iceberg or as files on R2
R2 Data Catalog manages the Iceberg metadata and now performs ongoing maintenance, including compaction, to improve query performance
R2 SQL is our in-house distributed SQL engine, designed to perform petabyte-scale queries over your data in R2

Together, these products make up the Cloudflare Data Platform, a complete solution for ingesting, storing, and querying analytical data tables.

Like all Cloudflare Developer Platform products, they run on our global compute infrastructure. They’re built around open standards and interoperability. That means that you can bring your own Iceberg query engine — whether that’s PyIceberg, DuckDB, or Spark — connect with other platforms like Databricks and Snowflake — and pay no egress fees to access your data.

Analytical data is critical for modern companies. It allows you to understand your users’ behavior and your company’s performance, and it alerts you to issues. But traditional data infrastructure is expensive and hard to operate, requiring fixed cloud infrastructure and in-house expertise. We built the Cloudflare Data Platform to be easy enough for anyone to use with affordable, usage-based pricing.

If you’re ready to get started now, follow the Data Platform tutorial for a step-by-step guide through creating a Pipeline that processes and delivers events to an R2 Data Catalog table, which can then be queried with R2 SQL. Or read on to learn about how we got here and how all of this works.

How did we end up building a Data Platform?

We launched R2 Object Storage in 2021 with a radical pricing strategy: no egress fees. Egress fees are the bandwidth costs that traditional cloud providers charge to get data out, effectively ransoming your data. This was possible because we had already built one of the largest global networks, interconnecting with thousands of ISPs, cloud services, and other enterprises.

Object storage powers a wide range of use cases, from media to static assets to AI training data. But over time, we’ve seen an increasing number of companies using open data and table formats to store their analytical data warehouses in R2.

The technology that enables this is Apache Iceberg. Iceberg is a table format, which provides database-like capabilities (including updates, ACID transactions, and schema evolution) on top of data files in object storage. In other words, it’s a metadata layer that tells clients which data files make up a particular logical table, what the schemas are, and how to efficiently query them.

The adoption of Iceberg across the industry meant users were no longer locked in to one query engine. But egress fees still made it cost-prohibitive to query data across regions and clouds. R2, with zero-cost egress, solved that problem: users would no longer be locked in to their clouds either. They could store their data in a vendor-neutral location and let teams use whatever query engine made sense for their data and query patterns.

But users still had to manage all of the metadata and other infrastructure themselves. We realized there was an opportunity for us to solve a major pain point and reduce the friction of storing data lakes on R2. This became R2 Data Catalog, our managed Iceberg catalog.

With the data stored on R2 and metadata managed, that still left a few gaps for users to solve.

How do you get data into your Iceberg tables? Once it’s there, how do you optimize for query performance? And how do you actually get value from your data without needing to self-host a query engine or use another cloud platform?

In the rest of this post, we’ll walk through how the three products that make up the Data Platform solve these challenges.

Cloudflare Pipelines

Analytical data tables are made up of events, things that happened at a particular point in time. They might come from server logs, mobile applications, or IoT devices, and are encoded in data formats like JSON, Avro, or Protobuf. They ideally have a schema — a standardized set of fields — but might just be whatever a particular team thought to throw in there.

But before you can query your events with Iceberg, they need to be ingested, structured according to a schema, and written into object storage. This is the role of Cloudflare Pipelines.

Built on top of Arroyo, a stream processing engine we acquired earlier this year, Pipelines receives events, transforms them with SQL queries, and sinks them to R2 and R2 Data Catalog.

Pipelines is organized around three central objects:

Streams are how you get data into Cloudflare. They’re durable, buffered queues that receive events and store them for processing. Streams can accept events in two ways: via an HTTP endpoint or from a Cloudflare Worker binding.

Sinks define the destination for your data. We support ingesting into R2 Data Catalog, as well as writing raw files to R2 as JSON or Apache Parquet. Sinks can be configured to frequently write files, prioritizing low-latency ingestion, or to write less frequent, larger files to get better query performance. In either case, ingestion is exactly-once, which means that we will never duplicate or drop events on their way to R2.

Pipelines connect streams and sinks via SQL transformations, which can modify events before writing them to storage. This enables you to shift left, pushing validation, schematization, and processing to your ingestion layer to make your queries easy, fast, and correct.

For example, here’s a pipeline that ingests events from a clickstream data source and writes them to Iceberg:

INSERT INTO events_table
SELECT
  user_id,
  lower(event) AS event_type,
  to_timestamp_micros(ts_us) AS event_time,
  regexp_match(url, '^https?://([^/]+)')[1] AS domain,
  url,
  referrer,
  user_agent
FROM events_json
WHERE event = 'page_view'
  AND NOT regexp_like(user_agent, '(?i)bot|spider');
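
To make the effect of that SQL concrete, here is a rough TypeScript equivalent of the same filter and projection applied to one event at a time. It is only an illustration of the logic, not how Pipelines executes queries. The field types in `RawEvent` are assumptions, since the schema of `events_json` is not shown here.

```typescript
// Assumed shape of one incoming event (the real stream schema may differ).
interface RawEvent {
  user_id: string;
  event: string;
  ts_us: number; // microseconds since the Unix epoch
  url: string;
  referrer: string;
  user_agent: string;
}

interface CleanEvent {
  user_id: string;
  event_type: string;
  event_time: Date;
  domain: string | null;
  url: string;
  referrer: string;
  user_agent: string;
}

// Returns the transformed event, or null when the WHERE clause would drop it.
function transformEvent(e: RawEvent): CleanEvent | null {
  // WHERE event = 'page_view'
  if (e.event !== "page_view") return null;
  // AND NOT regexp_like(user_agent, '(?i)bot|spider')
  if (/bot|spider/i.test(e.user_agent)) return null;

  // regexp_match(url, '^https?://([^/]+)')[1]: the first capture group
  const match = e.url.match(/^https?:\/\/([^/]+)/);

  return {
    user_id: e.user_id,
    event_type: e.event.toLowerCase(),
    // JavaScript dates have millisecond precision, so sub-millisecond
    // detail from ts_us is lost in this sketch.
    event_time: new Date(e.ts_us / 1000),
    domain: match ? match[1] : null,
    url: e.url,
    referrer: e.referrer,
    user_agent: e.user_agent,
  };
}
```

Doing this work at ingestion means every downstream query sees clean, typed rows instead of repeating the same filtering and parsing.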

SQL transformations are very powerful and give you full control over how data is structured and written into the table. For example, you can

Schematize and normalize your data, even using JSON functions to extract fields from arbitrary JSON
Filter out events or split them into separate tables with their own schemas
Redact sensitive information before storage with regexes
Unroll nested arrays and objects into separate events

Initially, Pipelines supports stateless transformations. In the future, we’ll leverage more of Arroyo’s stateful processing capabilities to support aggregations, incrementally-updated materialized views, and joins.

Cloudflare Pipelines is available today in open beta. You can create a pipeline using the dashboard, Wrangler, or the REST API. To get started, check out our developer docs.

We aren’t currently billing for Pipelines during the open beta. However, R2 storage and operations incurred by sinks writing data to R2 are billed at standard rates. When we start billing, we anticipate charging based on the amount of data read, the amount of data processed via SQL transformations, and data delivered.

R2 Data Catalog

We launched the open beta of R2 Data Catalog in April and have been amazed by the response. Query engines like DuckDB have added native support, and we’ve seen useful integrations like marimo notebooks.

It makes getting started with Iceberg easy. There’s no need to set up a database cluster, connect to object storage, or manage any infrastructure. You can create a catalog with a couple of Wrangler commands:

$ npx wrangler r2 bucket create mycatalog
$ npx wrangler r2 bucket catalog enable mycatalog

This provisions a data lake that can scale to petabytes of storage, queryable by whatever engine you want to use with zero egress fees.

But just storing the data isn’t enough. Over time, as data is ingested, the number of underlying data files that make up a table will grow, leading to slower and slower query performance.

This is a particular problem with low-latency ingestion, where the goal is to have events queryable as quickly as possible. Writing data frequently means the files are smaller, and there are more of them. Each file needed for a query has to be listed, downloaded, and read. The overhead of too many small files can dominate the total query time.

The solution is compaction, a periodic maintenance operation performed automatically by the catalog. Compaction rewrites small files into larger files which reduces metadata overhead and increases query performance.
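
The idea behind small-file compaction can be sketched as a simple grouping problem: collect files below a size threshold and pack them into batches that approach a target output size. The TypeScript below is an illustrative sketch only. It is not a description of how R2 Data Catalog plans compaction, and the names `DataFile` and `planCompaction` and both size parameters are hypothetical.

```typescript
interface DataFile {
  path: string;
  sizeBytes: number;
}

// Groups small files into batches; each batch would be rewritten as one
// larger file. Files at or above the threshold are left untouched.
function planCompaction(
  files: DataFile[],
  smallFileThreshold: number,
  targetSize: number,
): DataFile[][] {
  const small = files
    .filter((f) => f.sizeBytes < smallFileThreshold)
    .sort((a, b) => a.sizeBytes - b.sizeBytes);

  const groups: DataFile[][] = [];
  let current: DataFile[] = [];
  let currentSize = 0;

  for (const f of small) {
    // Close the current batch if adding this file would exceed the target.
    if (current.length > 0 && currentSize + f.sizeBytes > targetSize) {
      groups.push(current);
      current = [];
      currentSize = 0;
    }
    current.push(f);
    currentSize += f.sizeBytes;
  }
  if (current.length > 0) groups.push(current);

  // Rewriting a single file on its own gains nothing, so drop such batches.
  return groups.filter((g) => g.length > 1);
}
```

Fewer, larger files mean fewer objects to list, download, and open per query, which is where the performance gain comes from.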

Today we are launching compaction support in R2 Data Catalog. Enabling it for your catalog is as easy as:

$ npx wrangler r2 bucket catalog compaction enable mycatalog

We’re starting with support for small-file compaction, and will expand to additional compaction strategies in the future. Check out the compaction documentation to learn more about how it works and how to enable it.

At this time, during open beta, we aren’t billing for R2 Data Catalog. Below is our current thinking on future pricing:

Item	Pricing*
R2 storage (standard storage class)	$0.015 per GB-month (no change)
R2 Class A operations	$4.50 per million operations (no change)
R2 Class B operations	$0.36 per million operations (no change)
Data Catalog operations (e.g., create table, get table metadata, update table properties)	$9.00 per million catalog operations
Data Catalog compaction data processed	$0.005 per GB processed and $2.00 per million objects processed
Data egress	$0 (no change, always free)

*prices subject to change prior to General Availability

We will provide at least 30 days notice before billing starts or if anything changes.
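
As a worked example of how these proposed prices would combine, the TypeScript sketch below estimates a monthly bill from usage figures. It rests on several assumptions: the prices are the pre-GA numbers listed above and may change, the two compaction charges are treated as additive, and free tiers and other discounts are ignored. The type and function names are hypothetical.

```typescript
// Proposed prices from the table above, in US dollars (subject to change).
const PRICES = {
  storagePerGbMonth: 0.015,
  classAPerMillion: 4.5,
  classBPerMillion: 0.36,
  catalogOpsPerMillion: 9.0,
  compactionPerGb: 0.005,
  compactionPerMillionObjects: 2.0,
};

interface MonthlyUsage {
  storageGb: number; // average GB stored over the month
  classAOps: number;
  classBOps: number;
  catalogOps: number;
  compactionGb: number;
  compactionObjects: number;
}

// Sums each usage dimension at its listed price. Egress is always $0.
function estimateMonthlyCostUsd(u: MonthlyUsage): number {
  const million = 1_000_000;
  return (
    u.storageGb * PRICES.storagePerGbMonth +
    (u.classAOps / million) * PRICES.classAPerMillion +
    (u.classBOps / million) * PRICES.classBPerMillion +
    (u.catalogOps / million) * PRICES.catalogOpsPerMillion +
    u.compactionGb * PRICES.compactionPerGb +
    (u.compactionObjects / million) * PRICES.compactionPerMillionObjects
  );
}
```

For instance, 1,000 GB stored, 2 million Class A operations, 10 million Class B operations, 1 million catalog operations, and compaction over 500 GB and 1 million objects would come to roughly $41.10 under these assumptions.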

R2 SQL

Having data in R2 Data Catalog is only the first step; the real goal is getting insights and value from it. Traditionally, that means setting up and managing DuckDB, Spark, Trino, or another query engine, adding a layer of operational overhead between you and those insights. What if instead you could run queries directly on Cloudflare?

Now you can. We’ve built a query engine specifically designed for R2 Data Catalog and Cloudflare’s edge infrastructure. We call it R2 SQL, and it’s available today as an open beta.

With Wrangler, running a query on an R2 Data Catalog table is as easy as

$ npx wrangler r2 sql query "{WAREHOUSE}" "\
  SELECT user_id, url FROM events \
  WHERE domain = 'mywebsite.com'"

Cloudflare’s ability to schedule compute anywhere on its global network is the foundation of R2 SQL’s design. This lets us process data directly where it lives, instead of requiring you to manage centralized clusters for your analytical workloads.

R2 SQL is tightly integrated with R2 Data Catalog and R2, which allows the query planner to go beyond simple storage scanning and make deep use of the rich statistics stored in the R2 Data Catalog metadata. This provides a powerful foundation for a new class of query optimizations, such as auxiliary indexes or enabling more complex analytical functions in the future.
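
One common way a planner uses such statistics is file pruning. Iceberg metadata records lower and upper bounds per column for each data file, so a query can skip files whose value range cannot satisfy a filter. The TypeScript below is a simplified sketch of that general technique for a single numeric column. It does not describe R2 SQL's actual planner, and the names `FileStats`, `Predicate`, and `pruneFiles` are hypothetical.

```typescript
// Min/max statistics for one numeric column in one data file.
interface FileStats {
  path: string;
  lowerBound: number;
  upperBound: number;
}

// A simple filter on that column, e.g. { op: "gt", value: 20 } for col > 20.
interface Predicate {
  op: "eq" | "lt" | "gt";
  value: number;
}

// True when the file's value range could contain a matching row.
// False means the file can be skipped without reading it.
function mayContainMatch(f: FileStats, p: Predicate): boolean {
  if (p.op === "eq") {
    return p.value >= f.lowerBound && p.value <= f.upperBound;
  }
  if (p.op === "lt") {
    return f.lowerBound < p.value;
  }
  return f.upperBound > p.value; // p.op === "gt"
}

// Keeps only the files that actually need to be scanned.
function pruneFiles(files: FileStats[], p: Predicate): FileStats[] {
  return files.filter((f) => mayContainMatch(f, p));
}
```

Pruning is conservative: a kept file may still contain no matching rows, but a skipped file never does.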

The result is a fully serverless experience for users. You can focus on your SQL without needing a deep understanding of how the engine operates. If you are interested in how R2 SQL works, the team has written a deep dive into how R2 SQL’s distributed query engine works at scale.

The open beta is an early preview of R2 SQL querying capabilities, and is initially focused around filter queries. Over time, we will be expanding its capabilities to cover more SQL features, like complex aggregations.

We’re excited to see what our users do with R2 SQL. To try it out, see the documentation and tutorials. During the beta, R2 SQL usage is not billed, but R2 storage and operations incurred by queries are billed at standard rates. We plan to charge for the volume of data scanned by queries in the future and will provide notice before billing begins.

Wrapping up

Today, you can use the Cloudflare Data Platform to ingest events into R2 Data Catalog and query them via R2 SQL. In the first half of 2026, we’ll be expanding on the capabilities in all of these products, including:

Integration with Logpush, so you can transform, store, and query your logs directly within Cloudflare
User-defined functions via Workers, and stateful processing support for streaming transformations
Expanding the feature set of R2 SQL to cover aggregations and joins

In the meantime, you can get started with the Cloudflare Data Platform by following the tutorial to create an end-to-end analytical data system, from ingestion with Pipelines, through storage in R2 Data Catalog, and querying with R2 SQL.

We’re excited to see what you build! Come share your feedback with us on our Developer Discord.

Announcing Cloudflare Email Service’s private beta

2025-09-25 Thomas Gauvin

Post Syndicated from Thomas Gauvin original https://blog.cloudflare.com/email-service/

If you are building an application, you rely on email to communicate with your users. You validate their signup, notify them about events, and send them invoices through email. The service continues to find new purpose with agentic workflows and other AI-powered tools that rely on a simple email as an input or output.

And it is a pain for developers to manage. It’s frequently the most annoying burden for most teams. Developers deserve a solution that is simple, reliable, and deeply integrated into their workflow.

Today, we’re excited to announce just that: the private beta of Email Sending, a new capability that allows you to send transactional emails directly from Cloudflare Workers. Email Sending joins and expands our popular Email Routing product, and together they form the new Cloudflare Email Service — a single, unified developer experience for all your email needs.

With Cloudflare Email Service, we’re distilling our years of experience securing and routing emails, and combining it with the power of the developer platform. Now, sending an email is as easy as adding a binding to a Worker and calling send:

export default {
  async fetch(request, env, ctx) {

    await env.SEND_EMAIL.send({
      to: [{ email: "[email protected]" }],
      from: { email: "[email protected]", name: "Your App" },
      subject: "Hello World",
      text: "Hello World!"
    });

    return new Response(`Successfully sent email!`);
  },
};

Email experience is user experience

Email is a core tenet of your user experience. It’s how you stay in touch with your users when they are outside your applications. Users rely on emails such as password resets, purchase receipts, magic login links, and onboarding messages. When those emails fail, your application fails.

That means it’s crucial that emails land in your users’ inboxes, both reliably and quickly. A magic link that arrives ten minutes late is a lost user. An email delivered to a spam folder breaks user flows and can erode trust in your product. That’s why we’re focusing on deliverability and time-to-inbox with Cloudflare Email Service.

To do this, we’re tightly integrating with DNS to automatically configure the necessary DNS records — like SPF, DKIM and DMARC — such that email providers can verify your sending domain and trust your emails. Plus, in true Cloudflare fashion, Email Service is a global service. That means that we can deliver your emails with low latency anywhere in the world, without the complexity of managing servers across regions.

Simple and flexible for developers

Treating email as a core piece of your application also means building for every touchpoint in your development workflow. We’re building Email Service as part of the Cloudflare stack to make developing with email feel as natural as writing a Worker.

In practice, that means solving for every part of the transactional email workflow:

Starting with Email Service is easy. Instead of managing API keys and secrets, you can add the Email binding to your wrangler.jsonc and send emails securely, with no risk of leaked credentials.
You can use Workers to process incoming mail, store attachments in R2, and add tasks to Queues to get email sending off the hot path of your application. And you can use wrangler to emulate Email Sending locally, allowing you to test your user journeys without jumping between tools and environments.
In production, you have clear observability over your emails with bounce rates and delivery events. And, when a user reports a missing email, you can dive into the delivery status to debug the issue quickly and help get your user back on track.
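As a rough sketch of getting email sending off the hot path, the request handler below only enqueues a job, and a queue consumer performs the actual send. The binding names EMAIL_JOBS and SEND_EMAIL, the job shape, and the example.com addresses are assumptions made for illustration; the send() payload mirrors the example earlier in this post.

```javascript
// Hypothetical bindings: EMAIL_JOBS (a Queue) and SEND_EMAIL (Email Sending).
const worker = {
  // Producer: enqueue the email job and respond right away.
  async fetch(request, env, ctx) {
    await env.EMAIL_JOBS.send({
      to: "user@example.com",
      subject: "Your receipt",
      text: "Thanks for your purchase!",
    });
    return new Response("Queued", { status: 202 });
  },

  // Consumer: send each queued email, asking for a retry on failure.
  async queue(batch, env, ctx) {
    for (const message of batch.messages) {
      const job = message.body;
      try {
        await env.SEND_EMAIL.send({
          to: [{ email: job.to }],
          from: { email: "noreply@example.com", name: "Your App" },
          subject: job.subject,
          text: job.text,
        });
        message.ack();
      } catch (err) {
        message.retry();
      }
    }
  },
};

export default worker;
```

With this split, a slow or failing send never delays the response to the user, and a failed message is retried by the queue rather than by your request handler.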

We’re also making sure Email Service seamlessly fits into your existing applications. If you’ve been leaning on existing email frameworks (like React Email) to send rich, HTML-rendered emails to users, you can continue to use them with Email Service. Import the library, render your template, and pass it to the send method just as you would elsewhere.

import { render, pretty, toPlainText } from '@react-email/render';
import { SignupConfirmation } from './templates';

export default {
  async fetch(request, env, ctx) {

    // Convert React Email template to html
    const html = await pretty(await render(<SignupConfirmation url="https://your-domain.com/confirmation-id"/>));

    // Use the Email Sending binding to send emails
    await env.SEND_EMAIL.send({
      to: [{ email: "[email protected]" }],
      from: { email: "[email protected]", name: "Welcome" },
      subject: "Signup Confirmation",
      html,
      text: toPlainText(html)
    });

    return new Response(`Successfully sent email!`);
  }
};

Email Routing and Email Sending: Better together

Sending email is only half the story. Applications often need to receive and parse emails to create powerful workflows. By combining Email Sending with our existing Email Routing capabilities, we’re providing a complete, end-to-end solution for all your application’s email needs.

Email Routing allows you to create custom email addresses on your domain and handle incoming messages programmatically with a Worker, which can enable powerful application flows such as:

Using Workers AI to parse, summarize and even label incoming emails: flagging security events from customers, early signs of a bug or incident, and/or generating automatic responses based on those incoming emails.
Creating support tickets in systems like JIRA or Linear from emails sent to [email protected].
Processing invoices sent to [email protected] and storing attachments in R2.

To use Email Routing, add the email handler to your Worker application and process it as needed:

export default {
  // Create an email handler to process emails delivered to your Worker
  async email(message, env, ctx) {

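    // message.raw is a stream, so read it into a string first
    const text = await new Response(message.raw).text();

    // Classify incoming emails using Workers AI
    const { score, label } = await env.AI.run("@cf/huggingface/distilbert-sst-2-int8", { text });

    await env.PROCESSED_EMAILS.send({ score, label, message });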
  },
};

When you combine inbound routing with outbound sending, you can close the loop entirely within Cloudflare. Imagine a user emails your support address. A Worker can receive the email, parse its content, call a third-party API to create a ticket, and then use the Email Sending binding to send an immediate confirmation back to the user with their ticket number. That’s the power of a unified Email Service.
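The support flow described above could be sketched roughly as follows. This is an illustration under assumptions, not a reference implementation: the ticketing endpoint, its response shape, the createTicket and buildConfirmation helpers, and the example.com addresses are all hypothetical, while SEND_EMAIL follows the binding shown earlier in this post.

```javascript
// Hypothetical helper: the ticketing endpoint and its { id } response are assumptions.
async function createTicket(from, subject) {
  const res = await fetch("https://tickets.example.com/api/tickets", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ from, subject }),
  });
  if (!res.ok) throw new Error(`Ticket API responded with ${res.status}`);
  const { id } = await res.json();
  return id;
}

// Hypothetical helper: builds the payload passed to the Email Sending binding.
function buildConfirmation(userAddress, ticketId) {
  return {
    to: [{ email: userAddress }],
    from: { email: "support@example.com", name: "Support" },
    subject: `We received your request (ticket ${ticketId})`,
    text: `Thanks for reaching out. Your ticket number is ${ticketId}.`,
  };
}

export default {
  // Receive the inbound email, open a ticket, then confirm to the sender.
  async email(message, env, ctx) {
    const subject = message.headers.get("subject") ?? "(no subject)";
    const ticketId = await createTicket(message.from, subject);
    await env.SEND_EMAIL.send(buildConfirmation(message.from, ticketId));
  },
};
```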

Email Sending will require a paid Workers subscription, and we’ll be charging based on messages sent. We’re still finalizing the packaging, and we’ll update our documentation and changelog and notify users as soon as we have final pricing and long before we start charging. Email Routing limits will remain unchanged.

What’s next

Email is core to your application today, and it’s becoming essential for the next generation of AI agents, background tasks, and automated workflows. We built the Cloudflare Email Service to be the engine for this new era of applications, and we’ll be making it available in private beta this November.

Interested in Email Sending? Sign up to the waitlist here.
Want to start processing inbound emails? Get started with Email Routing, which is available now, remains free, and will be folded into the upcoming email sending APIs.

We’re excited to be adding Email Service to our Developer Platform, and we’re looking forward to seeing how you reimagine user experiences that increasingly rely on emails!

A year of improving Node.js compatibility in Cloudflare Workers

2025-09-25 James M Snell

Post Syndicated from James M Snell original https://blog.cloudflare.com/nodejs-workers-2025/

We’ve been busy.

Compatibility with the broad JavaScript developer ecosystem has always been a key strategic investment for us. We believe in open standards and an open web. We want you to see Workers as a powerful extension of your development platform with the ability to just drop code in that Just Works. To deliver on this goal, the Cloudflare Workers team has spent the past year significantly expanding compatibility with the Node.js ecosystem, enabling hundreds (if not thousands) of popular npm modules to now work seamlessly, including the ever-popular Express framework.

We have implemented a substantial subset of the Node.js standard library, focusing on the most commonly used, and asked for, APIs. These include:

Module	API documentation
node:console	https://nodejs.org/docs/latest/api/console.html
node:crypto	https://nodejs.org/docs/latest/api/crypto.html
node:dns	https://nodejs.org/docs/latest/api/dns.html
node:fs	https://nodejs.org/docs/latest/api/fs.html
node:http	https://nodejs.org/docs/latest/api/http.html
node:https	https://nodejs.org/docs/latest/api/https.html
node:net	https://nodejs.org/docs/latest/api/net.html
node:process	https://nodejs.org/docs/latest/api/process.html
node:timers	https://nodejs.org/docs/latest/api/timers.html
node:tls	https://nodejs.org/docs/latest/api/tls.html
node:zlib	https://nodejs.org/docs/latest/api/zlib.html

Each of these has been carefully implemented to approximate Node.js’ behavior as closely as is feasible. Where matching Node.js’ behavior is not possible, our implementations will throw a clear error when called, rather than silently failing or not being present at all. This ensures that packages that check for the presence of these APIs will not break, even if the functionality is not available.
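One practical consequence is that a presence check alone does not tell you whether an API is usable. A minimal sketch of combining a presence check with a guarded call is shown below. The works() helper is hypothetical, and net.createServer() is used only as an example of an API that may exist without being functional in a given runtime.

```javascript
import net from 'node:net';

// Hypothetical helper: returns true only if calling the probe does not throw.
function works(probe) {
  try {
    probe();
    return true;
  } catch {
    return false;
  }
}

// The function may be present even where it is not functional,
// so pair the presence check with a guarded call.
const hasCreateServer = typeof net.createServer === 'function';
const canCreateServer = hasCreateServer && works(() => net.createServer());

console.log({ hasCreateServer, canCreateServer });
```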

In some cases, we had to implement entirely new capabilities within the runtime in order to provide the necessary functionality. For node:fs, we added a new virtual file system within the Workers environment. In other cases, such as with node:net, node:tls, and node:http, we wrapped the new Node.js APIs around existing Workers capabilities such as the Sockets API and fetch.

Most importantly, all of these implementations are done natively in the Workers runtime, using a combination of TypeScript and C++. Whereas our earlier Node.js compatibility efforts relied heavily on polyfills and shims injected at deployment time by developer tooling such as Wrangler, we are moving towards a model where future Workers will have these APIs available natively, without need for any additional dependencies. This not only improves performance and reduces memory usage, but also ensures that the behavior is as close to Node.js as possible.

The networking stack

Node.js has a rich set of networking APIs that allow applications to create servers, make HTTP requests, work with raw TCP and UDP sockets, send DNS queries, and more. Workers do not have direct access to raw kernel-level sockets though, so how can we support these Node.js APIs so packages still work as intended? We decided to build on top of the existing managed Sockets and fetch APIs. These implementations allow many popular Node.js packages that rely on networking APIs to work seamlessly in the Workers environment.

Let’s start with the HTTP APIs.

HTTP client and server support

From the moment we announced that we would be pursuing Node.js compatibility within Workers, users have been asking specifically for an implementation of the node:http module. There are countless modules in the ecosystem that depend directly on APIs like http.get(...) and http.createServer(...).

The node:http and node:https modules provide APIs for creating HTTP clients and servers. We have implemented both, allowing you to create HTTP clients using http.request() and servers using http.createServer(). The HTTP client implementation is built on top of the Fetch API, while the HTTP server implementation is built on top of the Workers runtime’s existing request handling capabilities.

The client side is fairly straightforward:

import http from 'node:http';

export default {
  async fetch(request) {
    return new Promise((resolve, reject) => {
      const req = http.request('http://example.com', (res) => {
        let data = '';
        res.setEncoding('utf8');
        res.on('data', (chunk) => {
          data += chunk;
        });
        res.on('end', () => {
          resolve(new Response(data));
        });
      });
      req.on('error', (err) => {
        reject(err);
      });
      req.end();
    });
  }
}

The server side is just as simple but likely even more exciting. We’ve often been asked about the possibility of supporting Express, or Koa, or Fastify within Workers, but it was difficult to do because these were so dependent on the Node.js APIs. With the new additions it is now possible to use both Express and Koa within Workers, and we’re hoping to be able to add Fastify support later.

import { createServer } from "node:http";
import { httpServerHandler } from "cloudflare:node";

const server = createServer((req, res) => {
  res.writeHead(200, { "Content-Type": "text/plain" });
  res.end("Hello from Node.js HTTP server!");
});

export default httpServerHandler(server);

The httpServerHandler() function from the cloudflare:node module integrates the HTTP server with the Workers fetch event, allowing it to handle incoming requests.

The `node:dns` module

The node:dns module provides an API for performing DNS queries.

At Cloudflare, we happen to have a DNS-over-HTTPS (DoH) service and our own DNS service called 1.1.1.1. We took advantage of this when exposing node:dns in Workers. When you use this module to perform a query, it will just make a subrequest to 1.1.1.1 to resolve the query. This way the user doesn’t have to think about DNS servers, and the query will just work.

The `node:net` and `node:tls` modules

The node:net module provides an API for creating TCP sockets, while the node:tls module provides an API for creating secure TLS sockets. As we mentioned before, both are built on top of the existing Workers Sockets API. Note that not all features of the node:net and node:tls modules are available in Workers. For instance, it is not yet possible to create a TCP server using net.createServer() (but maybe soon!), but we have implemented enough of the APIs to allow many popular packages that rely on these modules to work in Workers.

import net from 'node:net';
import tls from 'node:tls';

export default {
  async fetch(request) {
    const { promise, resolve } = Promise.withResolvers();
    const socket = net.connect({ host: 'example.com', port: 80 }, () => {
      let buf = '';
      socket.setEncoding('utf8');
      socket.on('data', (chunk) => buf += chunk);
      socket.on('end', () => resolve(new Response('ok')));
      socket.end();
    });
    return promise;
  }
}

A new virtual file system and the `node:fs` module

What does supporting filesystem APIs mean in a serverless environment? When you deploy a Worker, it runs in Region:Earth and we don’t want you needing to think about individual servers with individual file systems. There are, however, countless existing applications and modules in the ecosystem that leverage the file system to store configuration data, read and write temporary data, and more.

Workers do not have access to a traditional file system like a Node.js process does, and for good reason! A Worker does not run on a single machine; a single request to one Worker can run on any one of thousands of servers anywhere in Cloudflare’s global network. Coordinating and synchronizing access to shared physical resources such as a traditional file system poses major technical challenges, including the risk of deadlocks, that are inherent in any massively distributed system. Fortunately, Workers provide powerful tools like Durable Objects that provide a solution for coordinating access to shared, durable state at scale. To address the need for a file system in Workers, we built on what already makes Workers great.

We implemented a virtual file system that allows you to use the node:fs APIs to read and write temporary, in-memory files. This virtual file system is specific to each Worker. When using a stateless worker, files created in one request are not accessible in any other request. However, when using a Durable Object, this temporary file space can be shared across multiple requests from multiple users. This file system is ephemeral (for now), meaning that files are not persisted across Worker restarts or deployments, so it does not replace the use of the Durable Object Storage mechanism, but it provides a powerful new tool that greatly expands the capabilities of your Durable Objects.

The node:fs module provides a rich set of APIs for working with files and directories:

import fs from 'node:fs';

export default {
  async fetch(request) {
    // Write a temporary file
    await fs.promises.writeFile('/tmp/hello.txt', 'Hello, world!');

    // Read the file
    const data = await fs.promises.readFile('/tmp/hello.txt', 'utf-8');

    return new Response(`File contents: ${data}`);
  }
}

The virtual file system supports a wide range of file operations, including reading and writing files, creating and removing directories, and working with file descriptors. It also supports standard input/output/error streams via process.stdin, process.stdout, and process.stderr, symbolic links, streams, and more.
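As a small illustration of these operations, the sketch below uses standard node:fs calls to create a directory, write through a file descriptor, and read the file back through a symbolic link. It assumes a writable /tmp directory and that each of these calls is among those supported by the runtime you are targeting; the file and directory names are arbitrary.

```javascript
import fs from 'node:fs';

// Create a uniquely named scratch directory under /tmp.
const dir = fs.mkdtempSync('/tmp/demo-');
const file = `${dir}/data.txt`;
const link = `${dir}/data-link.txt`;

// Write through a file descriptor.
const fd = fs.openSync(file, 'w');
fs.writeSync(fd, 'Hello from a file descriptor');
fs.closeSync(fd);

// Create a symbolic link and read the file back through it.
fs.symlinkSync(file, link);
const viaLink = fs.readFileSync(link, 'utf8');
const isLink = fs.lstatSync(link).isSymbolicLink();

// Remove the directory and everything in it.
fs.rmSync(dir, { recursive: true, force: true });

console.log({ viaLink, isLink });
```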

While the current implementation of the virtual file system is in-memory only, we are exploring options for adding persistent storage in the future that would link to existing Cloudflare storage solutions like R2 or Durable Objects. But you don’t have to wait on us! When combined with powerful tools like Durable Objects and JavaScript RPC, it’s certainly possible to create your own general purpose, durable file system abstraction backed by SQLite storage.

Cryptography with `node:crypto`

The node:crypto module provides a comprehensive set of cryptographic functionality, including hashing, encryption, decryption, and more. We have implemented a full version of the node:crypto module, allowing you to use familiar cryptographic APIs in your Workers applications. There will be some difference in behavior compared to Node.js due to the fact that Workers uses BoringSSL under the hood, while Node.js uses OpenSSL. However, we have strived to make the APIs as compatible as possible, and many popular packages that rely on node:crypto now work seamlessly in Workers.

To accomplish this, we didn’t just copy the implementation of these cryptographic operations from Node.js. Rather, we worked within the Node.js project to extract the core crypto functionality out into a separate dependency project called ncrypto that is used – not only by Workers but Bun as well – to implement Node.js compatible functionality by simply running the exact same code that Node.js is running.

import crypto from 'node:crypto';

export default {
  async fetch(request) {
    const hash = crypto.createHash('sha256');
    hash.update('Hello, world!');
    const digest = hash.digest('hex');

    return new Response(`SHA-256 hash: ${digest}`);
  }
}

All major capabilities of the node:crypto module are supported, including:

Hashing (e.g., SHA-256, SHA-512)
HMAC
Symmetric encryption/decryption
Asymmetric encryption/decryption
Digital signatures
Key generation and management
Random byte generation
Key derivation functions (e.g., PBKDF2, scrypt)
Cipher and Decipher streams
Sign and Verify streams
KeyObject class for managing keys
Certificate handling (e.g., X.509 certificates)
Support for various encoding formats (e.g., PEM, DER, base64)
and more…
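To make a couple of the listed capabilities concrete, here is a short sketch that combines HMAC, random byte generation, and a constant-time comparison using standard node:crypto calls. The sign() and verify() helpers and the payload strings are illustrative names, not part of any API.

```javascript
import crypto from 'node:crypto';

// Sign a payload with HMAC-SHA256 and return the hex digest.
function sign(payload, secret) {
  return crypto.createHmac('sha256', secret).update(payload).digest('hex');
}

// Verify a signature using a constant-time comparison.
function verify(payload, secret, signatureHex) {
  const expected = Buffer.from(sign(payload, secret), 'hex');
  const received = Buffer.from(signatureHex, 'hex');
  // timingSafeEqual throws if the lengths differ, so check that first.
  return expected.length === received.length &&
    crypto.timingSafeEqual(expected, received);
}

const secret = crypto.randomBytes(32);
const signature = sign('order:1234', secret);

console.log(verify('order:1234', secret, signature)); // true
console.log(verify('order:9999', secret, signature)); // false
```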

Process & Environment

In Node.js, the node:process module provides a global object that gives information about, and control over, the current Node.js process. It includes properties and methods for accessing environment variables, command-line arguments, the current working directory, and more. It is one of the most fundamental modules in Node.js, and many packages rely on it for basic functionality and simply assume its presence. There are, however, some aspects of the node:process module that do not make sense in the Workers environment, such as process IDs and user/group IDs which are tied to the operating system and process model of a traditional server environment and have no equivalent in the Workers environment.

When nodejs_compat is enabled, the process global will be available in your Worker scripts or you can import it directly via import process from 'node:process'. Note that the process global is only available when the nodejs_compat flag is enabled. If you try to access process without the flag, it will be undefined and the import will throw an error.

Let’s take a look at the process APIs that do make sense in Workers, and that have been fully implemented, starting with process.env.

Environment variables

Workers have had support for environment variables for a while now, but previously they were only accessible via the env argument passed to the Worker function. Accessing the environment at the top-level of a Worker was not possible:

export default {
  async fetch(request, env) {
    const config = env.MY_ENVIRONMENT_VARIABLE;
    // ...
  }
}

With the new process.env implementation, you can now access environment variables in a more familiar way, just like in Node.js, and at any scope, including the top-level of your Worker:

import process from 'node:process';
const config = process.env.MY_ENVIRONMENT_VARIABLE;

export default {
  async fetch(request, env) {
    // You can still access env here if you need to
    const configFromEnv = env.MY_ENVIRONMENT_VARIABLE;
    // ...
  }
}

Environment variables are set in the same way as before, via the wrangler.toml or wrangler.jsonc configuration file, or via the Cloudflare dashboard or API. They may be set as simple key-value pairs or as JSON objects:

{
  "name": "my-worker-dev",
  "main": "src/index.js",
  "compatibility_date": "2025-09-15",
  "compatibility_flags": [
    "nodejs_compat"
  ],
  "vars": {
    "API_HOST": "example.com",
    "API_ACCOUNT_ID": "example_user",
    "SERVICE_X_DATA": {
      "URL": "service-x-api.dev.example",
      "MY_ID": 123
    }
  }
}

When accessed via process.env, all environment variable values are strings, just like in Node.js.
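Because every value is a string, structured values need to be parsed before use. The sketch below assumes that an object-valued variable such as SERVICE_X_DATA is exposed as its JSON serialization; readEnvJson() is a hypothetical helper, and fakeEnv stands in for process.env so the example is self-contained.

```javascript
// Hypothetical helper: parse a value that may hold JSON, else return the raw string.
// Note that strings such as "123" or "true" parse to a number or a boolean.
function readEnvJson(env, name) {
  const raw = env[name];
  if (raw === undefined) return undefined;
  try {
    return JSON.parse(raw);
  } catch {
    return raw;
  }
}

// Stand-in for process.env, using values from the configuration above.
const fakeEnv = {
  API_HOST: 'example.com',
  SERVICE_X_DATA: '{"URL":"service-x-api.dev.example","MY_ID":123}',
};

const serviceX = readEnvJson(fakeEnv, 'SERVICE_X_DATA');
console.log(serviceX.URL);                     // service-x-api.dev.example
console.log(readEnvJson(fakeEnv, 'API_HOST')); // example.com
```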

Because process.env is accessible at the global scope, it is important to note that environment variables are accessible from anywhere in your Worker script, including third-party libraries that you may be using. This is consistent with Node.js behavior, but it is something to be aware of from a security and configuration management perspective. The Cloudflare Secrets Store can provide enhanced handling around secrets within Workers as an alternative to using environment variables.

Importable environment and waitUntil

For Workers that do not use the nodejs_compat flag, we decided to go a step further and make it possible to import both the environment and the waitUntil mechanism as a module, rather than forcing users to always access them via the env and ctx arguments passed to the Worker function. This can make it easier to access the environment in a more modular way, and can help to avoid passing the env argument through multiple layers of function calls. This is not a Node.js-compatibility feature, but we believe it is a useful addition to the Workers environment:

import { env, waitUntil } from 'cloudflare:workers';

const config = env.MY_ENVIRONMENT_VARIABLE;

export default {
  async fetch(request) {
    // You can still access env here if you need to
    const configFromEnv = env.MY_ENVIRONMENT_VARIABLE;
    // ...
  }
}

function doSomething() {
  // Bindings and waitUntil can now be accessed without
  // passing the env and ctx through every function call.
  waitUntil(env.RPC.doSomethingRemote());
}

One important note about process.env: changes to environment variables via process.env will not be reflected in the env argument passed to the Worker function, and vice versa. The process.env is populated at the start of the Worker execution and is not updated dynamically. This is consistent with Node.js behavior, where changes to process.env do not affect the actual environment variables of the running process. We did this to minimize the risk that a third-party library, originally meant to run in Node.js, could inadvertently modify the environment assumed by the rest of the Worker code.

Stdin, stdout, stderr

Workers do not have traditional standard input/output/error streams like a Node.js process does. However, we have implemented process.stdin, process.stdout, and process.stderr as stream-like objects that can be used similarly. These streams are not connected to any actual process stdin and stdout, but anything written to them is captured in the Worker’s logs in the same way as output from console.log and friends, and it will show up in Workers Logs.

The process.stdout and process.stderr are Node.js writable streams:

import process from 'node:process';

export default {
  async fetch(request) {
    process.stdout.write('This will appear in the Worker logs\n');
    process.stderr.write('This will also appear in the Worker logs\n');
    return new Response('Hello, world!');
  }
}

Support for stdin, stdout, and stderr is also integrated with the virtual file system, allowing you to write to the standard file descriptors 0, 1, and 2 (representing stdin, stdout, and stderr respectively) using the node:fs APIs:

import fs from 'node:fs';
import process from 'node:process';

export default {
  async fetch(request) {
    // Write to stdout
    fs.writeSync(process.stdout.fd, 'Hello, stdout!\n');
    // Write to stderr
    fs.writeSync(process.stderr.fd, 'Hello, stderr!\n');

    return new Response('Check the logs for stdout and stderr output!');
  }
}

Other process APIs

We cannot cover every node:process API in detail here, but here are some of the other notable APIs that we have implemented:

process.nextTick(fn): Schedules a callback to be invoked after the current execution context completes. Our implementation uses the same microtask queue as promises so that it behaves exactly the same as queueMicrotask(fn).
process.cwd() and process.chdir(): Get and change the current virtual working directory. The current working directory is initialized to /bundle when the Worker starts, and every request has its own isolated view of the current working directory. Changing the working directory in one request does not affect the working directory in other requests.
process.exit(): Immediately terminates the current Worker request execution. This is unlike Node.js where process.exit() terminates the entire process. In Workers, calling process.exit() will stop execution of the current request and return an error response to the client.

Compression with `node:zlib`

The node:zlib module provides APIs for compressing and decompressing data using various algorithms such as gzip, deflate, and brotli. We have implemented the node:zlib module, allowing you to use familiar compression APIs in your Workers applications. This enables a wide range of use cases, including data compression for network transmission, response optimization, and archive handling.

import zlib from 'node:zlib';

export default {
  async fetch(request) {
    const input = 'Hello, world! Hello, world! Hello, world!';
    const compressed = zlib.gzipSync(input);
    const decompressed = zlib.gunzipSync(compressed).toString('utf-8');

    return new Response(`Decompressed data: ${decompressed}`);
  }
}

While Workers has had built-in support for gzip and deflate compression via the Web Platform Standard Compression API, the node:zlib module support brings additional support for the Brotli compression algorithm, as well as a more familiar API for Node.js developers.
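A Brotli round trip looks much like the gzip example. The sketch below uses the standard synchronous node:zlib calls; the quality setting of 5 and the repeated input string are arbitrary choices for illustration.

```javascript
import zlib from 'node:zlib';

// Repetitive input so the effect of compression is easy to see.
const input = Buffer.from('Hello, world! '.repeat(100));

// Compress with Brotli at a mid-range quality setting.
const compressed = zlib.brotliCompressSync(input, {
  params: { [zlib.constants.BROTLI_PARAM_QUALITY]: 5 },
});

// Decompress and confirm that the round trip is lossless.
const restored = zlib.brotliDecompressSync(compressed);

console.log(`${input.length} bytes -> ${compressed.length} bytes`);
console.log(restored.equals(input)); // true
```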

Timing & scheduling

Node.js provides a set of timing and scheduling APIs via the node:timers module. We have implemented these in the runtime as well.

import timers from 'node:timers';

export default {
  async fetch(request) {
    timers.setInterval(() => {
      console.log('This will log every half-second');
    }, 500);

    timers.setImmediate(() => {
      console.log('This will log immediately after the current event loop');
    });

    return new Promise((resolve) => {
      timers.setTimeout(() => {
        resolve(new Response('Hello after 1 second!'));
      }, 1000);
    });
  }
}

The Node.js implementations of the timers APIs are very similar to their standard Web Platform counterparts, with one key difference: the Node.js timers APIs return Timeout objects that can be used to manage the timers after they have been created. We have implemented the Timeout class in Workers to provide this functionality, allowing you to clear or re-fire timers as needed.
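As a brief sketch of what managing a timer through its Timeout object looks like, the example below uses refresh(), which is part of the Node.js Timeout class. Whether a given runtime supports every Timeout method is something to verify rather than assume; the 50 ms delay is arbitrary.

```javascript
import timers from 'node:timers';

const log = [];
const timeout = timers.setTimeout(() => log.push('fired'), 50);

// A Timeout is an object with methods, not a plain numeric ID.
const isObject = typeof timeout === 'object';
const hasRefresh = typeof timeout.refresh === 'function';

// refresh() restarts the countdown from now, reusing the same timer.
timeout.refresh();

// Cancel the timer before it fires.
timers.clearTimeout(timeout);

console.log({ isObject, hasRefresh, fired: log.length });
```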

Console

The node:console module provides a set of console logging APIs that are similar to the standard console global, but with some additional features. We have implemented the node:console module as a thin wrapper around the existing globalThis.console that is already available in Workers.

How to enable the Node.js compatibility features

To enable the Node.js compatibility features as a whole within your Workers, you can set the nodejs_compat compatibility flag in your wrangler.jsonc or wrangler.toml configuration file. If you are not using Wrangler, you can also set the flag via the Cloudflare dashboard or API:

{
  "name": "my-worker",
  "main": "src/index.js",
  "compatibility_date": "2025-09-21",
  "compatibility_flags": [
    // Get everything Node.js compatibility related
    "nodejs_compat",
  ]
}

The compatibility date here is key! Update that to the most current date, and you’ll always be able to take advantage of the latest and greatest features.

The nodejs_compat flag is an umbrella flag that enables all the Node.js compatibility features at once. This is the recommended way to enable Node.js compatibility, as it ensures that all features are available and work together seamlessly. However, if you prefer, you can also enable or disable some features individually via their own compatibility flags:

Module	Enable Flag (default)	Disable Flag
node:console	enable_nodejs_console_module	disable_nodejs_console_module
node:fs	enable_nodejs_fs_module	disable_nodejs_fs_module
node:http (client)	enable_nodejs_http_modules	disable_nodejs_http_modules
node:http (server)	enable_nodejs_http_server_modules	disable_nodejs_http_server_modules
node:os	enable_nodejs_os_module	disable_nodejs_os_module
node:process	enable_nodejs_process_v2
node:zlib	nodejs_zlib	no_nodejs_zlib
process.env	nodejs_compat_populate_process_env	nodejs_compat_do_not_populate_process_env

By separating these features, you can have more granular control over which Node.js APIs are available in your Workers. At first, we started rolling out these features under the one nodejs_compat flag, but we quickly realized that some users perform feature detection based on the presence of certain modules and APIs and that by enabling everything all at once we were risking breaking some existing Workers. Users who are checking for the existence of these APIs manually can ensure new changes don’t break their Workers by opting out of specific APIs:

{
  "name": "my-worker",
  "main": "src/index.js",
  "compatibility_date": "2025-09-15",
  "compatibility_flags": [
    // Get everything Node.js compatibility related
    "nodejs_compat",
    // But disable the `node:zlib` module if necessary
    "no_nodejs_zlib",
  ]
}

But, to keep things simple, we recommend starting with the nodejs_compat flag, which will enable everything. You can always disable individual features later if needed. There is no performance penalty to having the additional features enabled.

Handling end-of-life’d APIs

One important difference between Node.js and Workers is that Node.js has a defined long term support (LTS) schedule that allows it to make breaking changes at certain points in time. More specifically, Node.js can remove APIs and features when they reach end-of-life (EOL). On Workers, however, we have a rule that once a Worker is deployed, it will continue to run as-is indefinitely, without any breaking changes as long as the compatibility date does not change. This means that we cannot simply remove APIs when they reach EOL in Node.js, since this would break existing Workers. To address this, we have introduced a new set of compatibility flags that allow users to specify that they do not want the nodejs_compat features to include end-of-life APIs. These flags are based on the Node.js major version in which the APIs were removed:

The remove_nodejs_compat_eol flag will remove all APIs that have reached EOL up to your current compatibility date:

{
  "name": "my-worker",
  "main": "src/index.js",
  "compatibility_date": "2025-09-15",
  "compatibility_flags": [
    // Get everything Node.js compatibility related
    "nodejs_compat",
    // Remove Node.js APIs that have reached EOL up to your
    // current compatibility date
    "remove_nodejs_compat_eol",
  ]
}

The remove_nodejs_compat_eol_v22 flag will remove all APIs that reached EOL in Node.js v22. When using remove_nodejs_compat_eol, this flag will be automatically enabled if your compatibility date is set to a date after Node.js v22’s EOL date (April 30, 2027).
The remove_nodejs_compat_eol_v23 flag will remove all APIs that reached EOL in Node.js v23. When using remove_nodejs_compat_eol, this flag will be automatically enabled if your compatibility date is set to a date after Node.js v24’s EOL date (April 30, 2028).
The remove_nodejs_compat_eol_v24 flag will remove all APIs that reached EOL in Node.js v24. When using remove_nodejs_compat_eol, this flag will be automatically enabled if your compatibility date is set to a date after Node.js v24’s EOL date (April 30, 2028).

If you look at the date for remove_nodejs_compat_eol_v23 you’ll notice that it is the same as the date for remove_nodejs_compat_eol_v24. That is not a typo! Node.js v23 is not an LTS release, and as such it has a very short support window. It was released in October 2024 and reached EOL in June 2025. Accordingly, we have decided to group the end-of-life handling of non-LTS releases into the next LTS release. This means that when you set your compatibility date to a date after the EOL date for Node.js v24, you will also be opting out of the APIs that reached EOL in Node.js v23. Importantly, these flags will not be automatically enabled until your compatibility date is set to a date after the relevant Node.js version’s EOL date, ensuring that existing Workers will have plenty of time to migrate before any APIs are removed, or can simply keep using the older APIs indefinitely by using the reverse compatibility flags like add_nodejs_compat_eol_v24.
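The date rules above can be modeled in a few lines. The sketch below only restates the cutoffs given in this post; impliedEolFlags() is a hypothetical helper written for illustration and is not how the runtime itself resolves flags.

```javascript
// EOL cutoffs as stated above. ISO dates compare correctly as strings.
const EOL_FLAGS = [
  { flag: 'remove_nodejs_compat_eol_v22', after: '2027-04-30' },
  { flag: 'remove_nodejs_compat_eol_v23', after: '2028-04-30' },
  { flag: 'remove_nodejs_compat_eol_v24', after: '2028-04-30' },
];

// Hypothetical helper: the versioned flags that remove_nodejs_compat_eol
// would imply for a given compatibility date (YYYY-MM-DD).
function impliedEolFlags(compatibilityDate) {
  return EOL_FLAGS
    .filter(({ after }) => compatibilityDate > after)
    .map(({ flag }) => flag);
}

console.log(impliedEolFlags('2025-09-15')); // []
console.log(impliedEolFlags('2027-05-01')); // [ 'remove_nodejs_compat_eol_v22' ]
console.log(impliedEolFlags('2028-05-01').length); // 3
```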

Giving back

One other important bit of work that we have been doing is expanding Cloudflare’s investment back into the Node.js ecosystem as a whole. There are now five members of the Workers runtime team (plus one summer intern) who are actively contributing to the Node.js project on GitHub, two of whom are members of Node.js’ Technical Steering Committee. While we have made a number of new feature contributions such as an implementation of the Web Platform Standard URLPattern API and improved implementation of crypto operations, our primary focus has been on improving the ability of other runtimes to interoperate and be compatible with Node.js, fixing critical bugs, and improving performance. As we continue to grow our efforts around Node.js compatibility we will also grow our contributions back to the project and ecosystem as a whole.

Aaron Snell (flakey5): 2025 Summer Intern, Cloudflare Containers; Node.js Web Infrastructure Team
Dario Piotrowicz (dario-piotrowicz): Senior System Engineer; Node.js Collaborator
Guy Bedford (guybedford): Principal Systems Engineer; Node.js Collaborator
James Snell (jasnell): Principal Systems Engineer; Node.js TSC
Nicholas Paun (npaun): Systems Engineer; Node.js Contributor
Yagiz Nizipli (anonrig): Principal Systems Engineer; Node.js TSC

Cloudflare is also proud to continue supporting critical infrastructure for the Node.js project through its ongoing strategic partnership with the OpenJS Foundation, providing the project with free access to services such as Workers, R2, DNS, and more.

Give it a try!

Our vision for Node.js compatibility in Workers is not just about implementing individual APIs, but about creating a comprehensive platform that allows developers to run existing Node.js code seamlessly in the Workers environment. This involves not only implementing the APIs themselves, but also ensuring that they work together harmoniously, and that they integrate well with the unique aspects of the Workers platform.

In some cases, such as with node:fs and node:crypto, we have had to implement entirely new capabilities that were not previously available in Workers and did so at the native runtime level. This allows us to tailor the implementations to the unique aspects of the Workers environment and ensure both performance and security.

And we’re not done yet. We are continuing to work on implementing additional Node.js APIs, as well as improving the performance and compatibility of the existing implementations. We are also actively engaging with the community to understand their needs and priorities, and to gather feedback on our implementations. If there are specific Node.js APIs or npm packages that you would like to see supported in Workers, please let us know! If there are any issues or bugs you encounter, please report them on our GitHub repository. While we might not be able to implement every single Node.js API, nor match Node.js’ behavior exactly in every case, we are committed to providing a robust and comprehensive Node.js compatibility layer that meets the needs of the community.

All the Node.js compatibility features described in this post are available now. To get started, simply enable the nodejs_compat compatibility flag in your wrangler.toml or wrangler.jsonc file, or via the Cloudflare dashboard or API. You can then start using the Node.js APIs in your Workers applications right away.
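As a rough illustration, the sketch below shows a Worker that imports a Node.js built-in once the flag is on (for example, with compatibility_flags = ["nodejs_compat"] in wrangler.toml). The sha256Hex helper and the "text" query parameter are made up for this example; only createHash from node:crypto is an existing Node.js API.

```typescript
// Sketch of a Worker using a Node.js built-in module.
// Assumes the nodejs_compat compatibility flag is enabled for this Worker.
import { createHash } from "node:crypto";

// Hypothetical helper: hex-encoded SHA-256 digest of a UTF-8 string.
function sha256Hex(text: string): string {
  return createHash("sha256").update(text).digest("hex");
}

const worker = {
  // Responds with the SHA-256 digest of the (illustrative) "text" query parameter.
  async fetch(request: Request): Promise<Response> {
    const text = new URL(request.url).searchParams.get("text") ?? "";
    return new Response(sha256Hex(text));
  },
};

export default worker;
```

The same module also runs unchanged under Node.js itself, which is the point of the compatibility layer: code written against node:crypto does not need a Workers-specific rewrite.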

Malicious-Looking URL Creation Service

2025-09-25 Bruce Schneier

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2025/09/malicious-looking-url-creation-service.html

This site turns your URL into something sketchy-looking.

For example, www.schneier.com becomes
https://cheap-bitcoin.online/firewall-snatcher/cipher-injector/phishing_sniffer_tool.html?form=inject&host=spoof&id=bb1bc121&parameter=inject&payload=%28function%28%29%7B+return+%27+hi+%27.trim%28%29%3B+%7D%29%28%29%3B&port=spoof.

Found on Boing Boing.

Freedom of Speech as a Pancake

2025-09-25 Светла Енчева

Post Syndicated from Светла Енчева original https://www.toest.bg/svobodata-na-slovoto-kato-palachinka/

There are concepts that many people would place at the top of their value system, yet fill with different content. Take freedom. Here are several fundamentally different kinds of freedom: that of the Bulgarian national liberation movement of the 1870s, that of the sexual revolution of the 1960s, that of economic initiative independent of the state, and that of Buddhists striving for freedom from their ego and its inherent emotions.

Freedom of speech (and, more broadly, of expression) is an equally slippery concept. Not only can it carry different content depending on the values of its adherents, but the understanding of its limits can also vary with a range of factors, one of which is one’s attitude toward power.

How freedom of speech was understood until recently

Until not long ago, the cliché held that freedom of speech in the US is complete because it is guaranteed by the First Amendment to the American Constitution, whereas Europe has more regulations: protection of privacy, prohibition of hatred, and so on. And indeed, across the ocean, insults, including those aimed at public figures, as well as expressions of hatred toward entire social groups, generally went without consequences.

This form of free speech was defended as a supreme value mostly by right-leaning Americans, but also by non-Americans influenced by the ideal of “free America.” In Bulgaria, for example, this worldview enjoys wide popularity, not least because after 1989 the US was regarded in our part of the world as the model of the free world.

That is why so-called political correctness, that is, the attempt to introduce language that includes all members of society and is not offensive to or dismissive of particular groups, became practically a dirty word. Even for people who identify as liberal, it became incomprehensible and unacceptable to use, for example, certain pronouns that include trans people, or to avoid words such as “Negro” and “Gypsy.”

If we peek behind the cliché, however,

a rather different picture emerges. The American notion of freedom of speech and expression does not question certain taboos on things that are considered acceptable in Europe.

When Americans come to a European country, one of the first things that astonishes them is how freely people drink alcohol outdoors. In most US cities, anyone who has bought even a beer, let alone whiskey, has to hide the bottle in an opaque bag when outside, lest some fragile child’s soul see it and be corrupted.

Europe also allows considerably more freedom than the US when it comes to sexual content and nudity in art and the media. In many European countries it is perfectly normal to openly sell magazines whose covers show people with scant or no clothing. Many films and series that are prohibited for persons under 18 in the States carry a more liberal restriction in Europe, for example 16+.

Unlike in the US, equating nudity with sex is also less taken for granted in Europe. In German saunas, for example, it is not customary to enter clothed, and it is assumed that the people there, although stark naked, have no sexual interest in one another. Not that there are no devotees of nudity across the ocean, but they are far more niche and subcultural.

A 180-degree turn

One of the main campaign promises of Trump and the circle around him was the restoration of freedom of speech. Nothing was to stand in its way any longer: not political correctness, not the rules of social networks, not institutions. In short, total freedom.

Even before Trump’s official inauguration it began to become clear that a reversal was coming, and over time this reversal has taken on dimensions unseen for a country considered democratic. Here are just a few gusts of the wind of change:

Trump halted funding for the media group that includes Voice of America and Radio Free Europe (it was partially restored as a result of lawsuits).
He launched a fight against American universities, including by cutting off funding, because they were supposedly “left-wing” and “liberal” and had students protesting in support of Palestine.
He initiated a process of erasing scientific research and concepts he does not like.
In the same way, he began rewriting history, erasing topics related to slavery, racism, and the contributions of Americans of color.
He tied the granting of visas and the right to stay in the US to what a person writes on social networks. If you do not like Trump and/or you support Palestine, you are out.
He replaced the director of the statistics agency because he did not like the data her institution had collected.
He threatened media outlets critical of him with revocation of their licenses.

And “hate speech” rode back in on a white horse.

Rejected by Republicans until yesterday, the concept of “hate speech” is now fully legitimate and is no longer perceived as an impermissible assault on freedom of expression. It reached a veritable apogee after the killing of conservative leader Charlie Kirk: any criticism of his views, or even the mere mention that Kirk defended the right to bear arms even at the cost of human lives, is branded as hate speech deserving the harshest sanction.

Vice President JD Vance, for example, called on Americans to report colleagues to their bosses if those colleagues rejoice at Kirk’s death. And for Republican congressman Clay Higgins, a lifetime ban from all social networks for the authors of “every post or comment that belittles the assassination of Charlie Kirk” is not even punishment enough. He insists that their businesses be blacklisted, that they be expelled from schools and universities, and even that their driver’s licenses be revoked. In short, he sentences the “offenders” to civil death.

I’m going to use Congressional authority and every influence with big tech platforms to mandate immediate ban for life of every post or commenter that belittled the assassination of Charlie Kirk. If they ran their mouth with their smartass hatred celebrating the heinous murder of…

— Rep. Clay Higgins (@RepClayHiggins) September 11, 2025

The Jimmy Kimmel case

Amid the calls for reprisals against anyone gloating over Charlie Kirk’s death, the late-night show of well-known comedian Jimmy Kimmel was suspended and, a few days later, reinstated. The official reason for suspending the show was precisely the host’s inappropriate comment about the slain Kirk.

However hard you look for the words with which Kimmel insulted Kirk, you are unlikely to find them, because there simply are none. His “scandalous” remark is not even about the victim himself but about Trump’s followers, who are

desperately trying to characterize this kid who murdered Charlie Kirk as anything other than one of them and doing everything they can to score political points from it.

Kimmel’s comment is no sharper than the reactions of other comedians, journalists, and public figures with Democratic convictions. Nor is it bolder. Soon after it aired, however, the chairman of the Federal Communications Commission, Brendan Carr, a Trump ally, said the show should be taken off the air, adding:

We can do this the easy way or the hard way.

Disney, owner of the ABC network on which Kimmel’s show airs, chose the easy way. Carr may not have the right to order which show is pulled, but media acquisitions and mergers of the kind Disney carries out depend on his commission. After a wave of protests and canceled subscriptions, the company brought Kimmel back, since the damage from “canceling” him might well turn out to be greater than the risks of keeping him on air.

The real reason for Kimmel’s suspension was recalled by another “canceled” comedian, Stephen Colbert. Although he is the face of the highest-rated late-night show in the US, it will not be renewed after May 2026. The official reason is financial losses; the more likely one is that Trump dislikes the host. Regarding Kimmel, Colbert said:

“So, whatever they claim, this is not entirely about what Jimmy said on Monday, but part of a plan. How do I know? Two months ago, when the president was so graciously celebrating the cancellation of my show, he posted: ‘Jimmy Kimmel is next to go.’”

Kimmel has been a thorn in Trump’s side for years. It was clear that the host would say something on his show about Charlie Kirk’s death. And it did not matter what exactly he said; a pretext for removing him was simply being sought. If the first attempt failed, that does not mean there will not be another. Trump has already threatened ABC over the network’s decision to bring the comedian back.

Double standards on freedom of speech and hate speech

The Republicans who are settling scores with their political opponents while claiming to fight hate speech have very quickly forgotten what they themselves were saying before Trump returned to power. CNN recalls that just a year ago the same Brendan Carr, according to whom Jimmy Kimmel must be removed “the easy way or the hard way,” called freedom of speech a counterweight of democracy that checks government control, and censorship the authoritarian leader’s dream.

The same CNN program, as well as The Daily Show, hosted by Jon Stewart, another comedian inconvenient to those in power, cites a number of cases in which Trump’s followers (and sometimes Trump himself) express hatred, mock crime victims, and call for violence, yet their words go without consequences.

Fox host Jesse Watters, for example, spread the conspiracy theory that immunologist Anthony Fauci is to blame for the spread of COVID-19 in the US and suggested that Fauci be shot from ambush. As a result, Fauci received death threats and his daughters were harassed, but Watters was not punished.

In 2022 Paul Pelosi, husband of then Speaker of the House of Representatives Nancy Pelosi, was attacked with a hammer. This unleashed a wave of mockery in Trump’s camp, even though the 82-year-old Pelosi suffered a skull fracture and underwent surgery. Pete Hegseth, a Fox host at the time, laughed on air at a remark by another host at the outlet that Pelosi perhaps needed a hammer. “There were consequences: that gentleman had to leave television,” said Jon Stewart, ironizing the fact that Hegseth is now the US Secretary of Defense.

Donald Trump Jr., for his part, posted a photo of a hammer on a pair of underpants with the words: “My Paul Pelosi Halloween costume is ready.” Donald Trump himself declared: “We will stand up to crazy Nancy Pelosi, who ruined San Francisco,” and snidely addressed her: “How is your husband, by the way?”

Hygiene of values

During Trump’s second term, the government is turning against freedom of speech in a way unprecedented for the US. Never before has anyone been afraid to take a trip to America because of something they wrote on Facebook. This reversal inevitably has an influence elsewhere in the world too, including in Europe, not only because of the instruments of pressure at the disposal of the US, but also because until very recently it was a symbol of the free world.

On a smaller scale, however, such reversals are not without precedent, and what is happening across the ocean is a fitting occasion to look at our own backyard. Here, too, many swear by freedom of speech. But everyone who says it is absolute still draws the line somewhere. Would an investigative journalist, for example, approve if someone published his personal data (address, phone number, national ID number, bank card numbers) so that anyone with a grudge against him could use it as they see fit? Would a fighter against censorship be glad if sexually compromising material featuring him were leaked?

Jon Stewart gives the example of people from Trump’s innermost circle who declare that it is unacceptable to call your opponents fascists and enemies of the state and to dehumanize them. At the same time, Trump himself does all three. But he may, and the Democrats may not. Double standards are practiced in a similar way in Bulgaria: one and the same thing can be interpreted as a display of courage or as hate speech, depending on who says it and to whom.

None of this means that freedom of speech should not be defended. On the contrary. But it is devoid of content unless its limits, and the values behind it, are defined clearly and honestly.

Otherwise we risk becoming as absurd as those fighters against censorship who ask rhetorically, “How am I going to explain this to my children...?” and for whom freedom of speech boils down to expressing hatred. And, paradoxically, to striving to impose censorship.

An American Education | 2. Testing Teachers for ‘Wokeness’

2025-09-25 The Atlantic

Post Syndicated from The Atlantic original https://www.youtube.com/watch?v=LJy7iuu78JM

To Show Human Fragility

2025-09-25 Стефан Иванов

Post Syndicated from Стефан Иванов original https://www.toest.bg/da-pokazha-choveshkata-krehkost/

Stefan Ivanov talks with Alisa Kovalenko

Alisa Kovalenko is a Ukrainian documentary filmmaker, born in 1987 in Zaporizhzhia. She studied documentary filmmaking at university in Kyiv and later in Warsaw. Her films often examine the consequences of Russian aggression, from the annexation of Crimea in 2014 to the war that began eight years later. Among her better-known works are Alisa in Warland (2015), Home Games (2018), and We Will Not Fade Away (2023), films that combine a personal perspective with a documentary exploration of the war and its traumas. In 2022 she joined the Ukrainian Volunteer Army, after which she completed We Will Not Fade Away and began work on My Dear Théo, a video diary dedicated to her son. Alisa is also an activist, through her films as well as through her personal story and commitment, especially on the subject of sexual violence in wartime.

Her latest film, My Dear Théo, an intimate work in the form of letters from the front to her son, shows the war through a personal lens. The screening will take place as part of the Sofia DocuMENTAL ’25 festival on 2 October at 19:00 in the Cinema House (Dom na kinoto) and will be followed by a meeting with the crew.

My Dear Théo is structured as letters to your son. How did writing and filming in this form help you process the reality of the war?

This film was not conceived as a film from the outset. From my first days at the front I did not feel like a director; cinema seemed to have lost all meaning. My camera spent its time in my backpack, in dugouts and trenches. I spontaneously filmed small fragments of life. It felt more as if I were making a video album as a keepsake.

With the letters it was different: they helped me make sense of the reality of the war. They were about my inner world, feelings, thoughts, emotions, something existential that the camera cannot capture. And there was probably something therapeutic in that. When you put the reality around you into words, it helps you understand it better. But the deepest realization came at the editing stage.

As a mother and a soldier you fulfil two different roles. How do they coexist in you, and how have they influenced the tone of the film?

The coexistence of the two roles, mother and soldier, brings me both enormous pain and incredible strength. During heavy shelling, when I was thinking about death, I realized that if I die, I die twice. I die for my son as well. And what I can give my child, all the love, dies too. It is a terrible realization, but it gave me strength, because I wanted to survive so that the love could go on.

War makes you harder; it drains part of your tenderness. Motherhood was a balancing factor for me. All the letters and the film were an attempt to reconcile the identities of soldier and mother. The real idea came from my reflections on the transience of memory and on how we can preserve it for our children, especially after close brothers-in-arms of mine, who were also parents, were killed.

Your film captures the silence between attacks and the tenderness between soldiers. Why was it important for you to emphasize these quieter moments and not only the violence?

In fact, life at the front is more waiting than action, and it can often be hellishly boring. There is a certain distortion of reality on social media and in the news, where we constantly see images of events and battles. Of course, no one is going to watch news about a group of soldiers waiting for something in a trench in the woods. And even though you can die while waiting, that does not look appealing or dramatic to either cinema or the media world. It is important to understand that action cannot go on at full intensity for that long. You do it in bursts, bordered by waiting. You advance, then you wait. You are always waiting. And silence, too, is essential in the midst of all this; it is a category of its own. Every soldier knows that silence can be more menacing than explosions.

I wanted to speak about the invisible side, about the routine work, to broaden the idea, the perception, of the reality of the front line. For me war is routine; it is hard, long, exhausting work that repeats over and over and does not resemble a heroic battle, and not every death is heroic either. You arrive, eat some canned meat, watch the leaves through a thermal imaging camera for several days, a mortar shell flies in over your head, and you die. But there is another dimension to these visions: all the thoughts, memories, and feelings that pierce you. That is an entirely different emotional, existential universe. I wanted to try to capture and merge these two dimensions of the front line.

Many war films emphasize heroism. You, however, have chosen intimacy and vulnerability. What led you to this perspective?

I think true heroism begins with an inner human sensitivity. It is through that sensitivity that we can understand one another better and feel empathy, friendship, love. It was important for me not to talk about some abstract category of military personnel or heroic warriors, but about the people close to me, to reveal our faces behind the helmets, to show that behind them are ordinary people who love, dream, and are afraid; to show human fragility.

And, through universal human nature, to show that all of us are at the front, the whole cross-section of our society, people from every profession: artists, ordinary workers, engineers, IT specialists, as well as fathers, sons, daughters, brothers, sisters, mothers.

We were not brave soldiers all the time; most of us were not in the military before the Russian aggression and did not want to be. We had to do it to defend our home, our family. And in general we should be careful with heroism, because it can often erase the true features from people’s faces.

What was the most difficult moment to film, not technically but emotionally, and how did you decide to keep it in the final cut?

The beginning was the hardest. Even before the beginning. I had just started watching the footage to get an idea of what it was like when someone close to me was killed, and soon afterwards another. I could not cope. I would watch for a minute, then go and cry for ten minutes, and this went on for a long time. The footage waited for a year, and I was afraid to touch it. I even stopped believing that anything would come of it; I thought perhaps I should leave it for later.

When I came to our first editing session in the French town of Montségur, and after the first days of watching the footage together, I felt I could not work on the edit. These were the faces of my dear dead friends, who were close to me, and when I think about their deaths I am overcome by indescribable pain. I realized that if I did not find some method, some way out of these emotions, I would go mad. I kept thinking nervously that I urgently had to come up with some trick. And finally I managed. I suddenly realized that it was not about the pain but about the chance to preserve this life, this memory. I overcame the despair with the feeling that we were creating a time capsule that you can enter whenever you wish and see us all alive.

To Show Human Fragility: stills from the film My Dear Théo

Writing to your son during a war is something deeply personal and at the same time universal. Did you imagine other mothers, children, or future generations as part of your audience?

When I was writing the letters, I could not imagine anyone other than my son. Later it began to reveal itself as something universal: a shared voice, a bridge of memories, a tribute to the parents who gave their lives for their children’s future. I saw that it is in fact something more than my personal story and that it could be valuable for people with similar experiences, for those who have lost loved ones, for future generations.

Before finishing the edit, I showed the nearly final version to an important and special person, a colleague and friend, Olha Birzul. Her husband, Viktor Onysko, who was a well-known film editor in Ukraine, was killed at the front at the end of 2022. After watching the film she wrote to me: “Thank you. I have been waiting for exactly such a film.” That was the most precious feedback for me.

I also reflected on the younger generation that is coming up and wondered what we can give them, how we should talk to them, how we can understand one another. During one of the screenings, in the Q&A session, one of my comrades-in-arms said something that was significant for me: this is not a film about our past; it is a film that serves as a guide to our future, showing us the price we have paid for freedom and how we must take care that it is not squandered.

How has the war changed your idea of what cinema can or should do? You are a soldier but also an artist. Do you think art can be a form of resistance?

At the start of the full-scale war I completely lost my faith in cinema and its role in resistance. The war found me on a train, traveling to the final shoot for my previous documentary, We Will Not Fade Away, about teenagers from front-line villages in Donbas. I spent the first few days with the family of one of my protagonists in a village a few kilometers from the front line, where everyone expected the Russians to take the village at any moment. I was confused. I filmed somewhat mechanically, but in my heart I had completely lost any sense of meaning in doing it. What can you do with your camera when rockets are falling on your loved ones, your family, your friends, the protagonists of your unfinished film? It seems absurd.

After the front it was not easy to regain my inner faith in cinema. I probably never fully regained it, nor was I able to return to myself. But I had to finish the film, and we all understood that it is necessary to talk about Ukraine, that our voices must be heard.

Rationally I understand that culture and cinema are important, of course; they are a code of human values and feelings. But I am still tormented by this inner conflict and by the fact that I have to return to the front, because that is where our main line of defense is now. Yet culture is also one of those precious things we must defend. It is like the concept of negative and positive freedom: you have to know what you are fighting against, but also what you are fighting for.

I have developed a clear rejection of the concept of “art for art’s sake.” And it irritates me when I hear something like “culture beyond politics.” I do not think I will ever be able to accept that kind of cinema. Cinema, at least documentary cinema, has probably become for me equivalent to human action and to responsibility for it, more so than before.

The film is anti-war, yet it was created as a result of the war. How do you reconcile this paradox in your work and in your life?

Right now we are living in complete paradox. And although it is truly exhausting to exist within these paradoxes, it seems that they are precisely what allows us to discover certain deep existential meanings.

A colleague and I were once talking about the misconception that birds do not sing when shells explode; but they do sing, and they even sing like mad. When we were working on the sound for the film, we listened to those incredible trills in my original video recordings from the front line during terrible shelling. That is the paradox: amid death and destruction, nature strives to carry life on. We go to work after the shelling, joke, have children, write books, somehow make films, like those birds. We learn to live and to balance between the abyss and the pain, not to lose our senses, not to lose our sensitivity, the moments of small happiness, the love, the warmth inside us.

How would you like your son Théo to watch this film one day?

Théo has already seen the film, perhaps even five times. I thought it was important for him to see it before the world premiere in Copenhagen, as I was planning to take him there with me. We arranged a private family screening in a small cinema in Kyiv. I held his hand on one side and his father held it on the other. I was worried about how he would react emotionally, because the film was not for little Théo; the letters in the film were for the future, older Théo, so I was not sure he would fully understand everything existential in it. But he felt it. He reacted to everything that happened; he even remembered moments when I had called him from the front. At one point, while watching a fragment from our personal archive in the film, he looked at me and said: “Mom, I’m sinking into nostalgia.” One moment affected me especially strongly. In one scene in the film I speak about the parents who go to the front so that our children will not have to. Théo turned to me and said:

Mom, I don’t want you to go back to the front. I’ll go instead of you.

I hope he will come to understand more clearly that this film is about light, human sensitivity, tenderness, closeness, the love that remains in the heart despite death. And that it is love and light that give us the strength to withstand the darkness and carry on the fight.

When you look beyond the war, what stories do you feel obliged to tell after it?

I am currently finishing a new documentary, Traces, which reveals the testimonies of Ukrainian women who survived sexual violence and torture resulting from the Russian aggression in Ukraine. We plan to complete post-production by the end of the year. Since 2019 I have also been a member of SEMA Ukraine, an organization of women who have experienced sexual violence, because I too was captured and subjected to violence in 2014, when I went to film the events in Donbas; that is in fact when the Russian aggression began.

In my heart I want to make films for children. I have started thinking about the importance of film education for children. Our world is going mad, and it does not look as though we adults will manage to fix it, but it is our responsibility to provide some pillars, something for the new generation to hold on to in order to withstand this whole tsunami of disinformation, violence, trampling of values, and populism. To give them tools with which to protect themselves. Because after us, they will be the ones to keep the world from ruin. We must give the most we are capable of.

Teaching Experience AI: Lessons from educators in Mexico

2025-09-25 Liz Eaton

Post Syndicated from Liz Eaton original https://www.raspberrypi.org/blog/teaching-experience-ai-lessons-from-educators-in-mexico/

In classrooms across Mexico, a transformation is unfolding. The Experience AI programme isn’t just teaching students about artificial intelligence, it’s empowering teachers and learners to explore, question, and create with it. By equipping educators with accessible tools and sparking curiosity among students, the initiative is shaping a new generation ready to use AI responsibly and creatively.

Educators like Guadalupe Cortes, Lilia Violeta Garvia Sanchez, Ines Martinez, and Ana Judith Zavaleta are at the forefront of this shift. Their experiences reveal just how transformative Experience AI has become.

From fear to fascination: Demystifying AI

For many, AI can feel abstract, something from science fiction. Science and math teacher Lilia Violeta Garvia Sanchez remembers that both she and her students once viewed AI as “robots that would take over the world.” Fear gave way to fascination, however, once Experience AI entered the classroom.

Through hands-on lessons, students quickly discovered AI as a practical tool rather than a threat. “I’ve seen a change in the students,” Lilia explains. “They were afraid at first, but now they’re curious and engaged.”

Technology teacher Ines Martinez admits she was also surprised: “I thought the language would be more technical or complex, but it was pleasantly accessible — and very useful.”

Equipping educators with tools that work

A defining strength of Experience AI is its adaptability. Teachers can tailor materials to fit their classrooms while still leaning on the programme’s robust foundation.

Guadalupe Cortes points to the built-in glossary as a game-changer: “It was really helpful for me.” She values being able to choose what fits her teaching to keep it relevant: “I selected certain parts to connect with projects I was already running.”

Sparking critical thinking and ethical awareness

Experience AI pushes students to think deeply about the ethics and implications of AI.

In Ines’s class, students raised concerns about water use in data centres and debated how to protect their digital identities. They weren’t just learning facts; they were making connections to real-world issues.

Another teacher, Ileana Beurini, described an exercise where students asked different AI models the same political question. When answers varied, they discussed bias and the importance of consulting multiple sources. In another activity, searching for images of “worker” led to a conversation about gender equity in technology.

As Ines puts it: “They don’t want it to do all the thinking for them. They said it should be a support — a tool to generate better information, not to replace reasoning or reflection.”

Transforming engagement and performance

The impact on student motivation has been striking. For Ana Judith Zavaleta, the shift was clear: “They’re much more hands-on now — they don’t rely as much on textbooks or theory.” One student who typically struggled academically became one of the most enthusiastic participants, even passing where he previously failed.

Guadalupe Cortes has seen similar enthusiasm: “They’re finding a real purpose in using AI for their own benefit.” At an entrepreneurship fair, her students applied AI concepts to improve their projects, proof that these lessons extend far beyond the classroom.

A call to action for educators

The teachers’ message is unanimous: embrace AI.

“We should give it a try,” urges Lilia. “Just because we’re teachers doesn’t mean we have to know everything. The world is evolving every single day, and we need to innovate with our students so they feel motivated to keep learning.”

For Ines, the takeaway is simple but powerful: “Take the risk — really, take the chance to learn. Just like the internet became essential, AI will become part of our daily lives and necessary for all areas of teaching — and life itself.”

More than just a set of resources

Experience AI is more than a set of resources; it’s a movement preparing students to navigate the future with curiosity, critical thinking, and ethical awareness. By igniting minds in Mexico, it’s helping to cultivate responsible digital citizens who will shape not just their classrooms, but the world beyond them.

For more information about Experience AI, visit our website: rpf.io/experienceai

For more information about our global Experience AI partner in Mexico, visit: educacionparacompartir.org

The post Teaching Experience AI: Lessons from educators in Mexico appeared first on Raspberry Pi Foundation.

Time Is Within Us and We Are Under the Weather

2025-09-25

Post Syndicated from original https://www.toest.bg/vremieto-e-v-nas-i-nie-sme-pod-vremeto/

At the beginning of September I had to cancel my plans to go to Sozopol for health reasons. I wasn’t exactly ill, but I wasn’t entirely well either, and while I was informing the friends I had already arranged to meet there, I regretted that Bulgarian has no equivalent of the English idiom under the weather, which literally means „под времето“ (“beneath the weather”) but is used for precisely such cases of „неразположение“ (“indisposition”). Besides being spatially indeterminate („(раз)положение“, “(dis)position”, relative to what?), the word, and especially its derived adjective „неразположен(а)“ (“indisposed”), also sounds quaintly euphemistic, as if it came from a Jane Austen novel rather than from the present day.

The English expression under the weather is also spatially puzzling (how exactly does a person end up “under the weather”?), and although it sounds considerably more modern, it turns out that it too dates from Austen’s time. The phrase in fact began to be used idiomatically in the first few decades of the 19th century, often as a description of financial difficulties, an inability to cope, or a general failure to prosper, sometimes with regard to entire institutions, industries, or cities. Over time, however, the phrase settled into English as an idiom describing an individual being out of sorts, and it continues to be used with that meaning today.

There are several theories about the exact original semantics of the expression, but according to all of them it is connected with seafaring, or more precisely with the challenges of the adverse weather conditions that seafarers often face. According to one version of the idiom’s origin, it comes from the practice of sailors suffering from seasickness¹ seeking relief from bad weather by sheltering below deck. A kind of „времеубежище“ (“time shelter”), only not from the chronology of history but from atmospheric conditions.

If we leave literary neologisms aside, the idiom under the weather unfortunately does not exist in Bulgarian, neither in literal translation nor in the form of some approximate equivalent. Its absence, however, is downright negligible compared with a far more substantial, and truly curious, linguistic gap.

Unlike English, where the word weather denotes the meteorological phenomenon and time is used for the chronological one, Bulgarian has only one name for the two concepts: the word „време“ (vreme).

Unsurprisingly, our language is not the only one in which this gap is observed. The two concepts are denoted by one and the same word (similar to the Bulgarian „време“, from the Old Bulgarian врѣмѧ²) in the other South Slavic languages as well, in Hungarian (idő), and in the Romance languages (temps in French, tiempo in Spanish, tempo in Italian and Portuguese), where the word derives from the Latin tempus (which in turn also carries both meanings)³.

The surprise, at least for me, comes not so much from the differences between language groups as from those within them. The West and East Slavic languages, for example, part ways with their South Slavic cousins and join English and the other Germanic languages (and indeed most of the world’s languages), where the two concepts are denoted by two words. In Russian, Ukrainian, and Polish these two words are entirely separate (respectively „время“ and „погода“; „час“ and „погода“; czas and pogoda), while in Czech and Slovak they are similar but still distinct: čas and počasí(e).

Romanian is an interesting exception too; its situation reflects both its formal membership of the Romance language family and its practical ties to its Balkan neighbours. Unlike both of those groups, however, Romanian uses two separate names: for the concepts connected with chronological time, and for time as a philosophical concept, it uses the Latin-rooted word timp, while for atmospheric conditions it uses the Slavic word vreme.

It turns out that dividing the languages along geographic lines gives a much cleaner picture than their membership of one language group or another.

If we imagine a map of Europe drawn according to the countries that use one common word for „време“ and those that use two separate ones, it would be clearly divided into a southern and a northern part.

But this division, too, has curious exceptions, among them Latvian (unlike the languages surrounding it, here the word is a single one, laiks) and Albanian (here there are two, kohë and mot). The most striking exception, however, comes from Greek, where the concept of „време“ has not one, not two, but three separate words: χρόνος (khrónos), time as the interval between two events, as well as a concept expressing the sequence and duration of phenomena (it also carries the meaning “year”); καιρός (kairós)⁴, weather as a meteorological concept, as well as a period or stretch of time or an era; and ώρα (óra), which besides “hour” can also be used for time in the sense of a particular part of the day suitable for certain activities. (The last word also serves as the root of the adjective ωραίο (oraío), that is, “nice” or “beautiful”, which originally meant “timely”. According to some theories, the name of Oreo biscuits comes from there too.)

Time is kept separate in Turkish as well. There the word for the concept connected with duration, era, or period is zaman (which entered Bulgarian along with the expression „зор заман“), while the one denoting meteorological weather is hava, literally “air” (hence the Bulgarian question „Как е хавата?“, roughly “How’s the hava?”).

My expectations of finding some particularly exciting explanation for the different approaches of different languages prove unjustified: linguists offer none and point to “linguistic inertia” and “predictable regularities”. Some languages simply retain the old polysemy (one word with several meanings) and rely on context, while others develop or borrow new vocabulary, so that the separate meanings come to be named with different words.

With or without exciting explanations, the two concepts are obviously conceptually linked: the passage of time and the change of seasons inevitably affect atmospheric conditions. At the same time, the world around us (atmospheric conditions included) undoubtedly affects the language with which we perceive and describe it. Just as, it seems to me, the reverse is also true: language certainly influences the ways in which we perceive and describe the world around us.

That is why I suppose that the north/south division, although tentative and applying only to Europe, is probably not entirely accidental. Perhaps the much harsher and more extreme conditions in the northern part of the continent require specific words to name them, while the comparatively mild climate of the Mediterranean and the Balkans has no need of separate names?

In this connection I am reminded of the claim that the Eskimos have a huge number (dozens, even hundreds) of words for different kinds of snow. Following this logic, I wonder whether the fact that English has a separate word for meteorological conditions at least partly explains the proverbial fondness of the English for discussing the weather. That is, they have the word weather, so they use it⁵.

Be that as it may, the English may well be world champions at discussing the weather, but they are far from the only front-runners in this discipline: according to various studies, the weather is a popular topic of conversation in the USA, Canada, Ireland, Australia, and New Zealand as well, all places where not only is English an official language, but the climate, especially compared with the southern part of Europe, is changeable, unpredictable, and/or extreme.

Discussing the weather (the meteorological „време“) is a favourite way to fill the time (the chronological „време“) and to break the ice in the icy, both literally and figuratively, Scandinavian countries too. I saw this for myself recently: this summer I never did make it to Sozopol, but I did spend a few days in Stockholm. There I learned that Swedish, like English and the other Scandinavian languages, has separate words for the two concepts, and that väder (atmospheric weather) is a popular topic of conversation.

It was precisely during, and thanks to, a windy afternoon in Stockholm, while I listened to some people discussing the weather, that a quite unexpected discovery dawned on me. Bulgarian may not have a separate word for meteorological conditions, but the Swedish väder (like its relatives in the other Germanic languages) actually shares a common origin, the Proto-Indo-European *h₂weh₁- (“to blow”), with the Bulgarian name for one of the main meteorological phenomena, namely the word „вятър“ (“wind”)!⁶ Etymology, it turns out, is no „вятър работа“ (literally “wind work”, that is, empty nonsense).⁷

The other Bulgarian word which, even more surprisingly, shares a common Proto-Indo-European root with the Proto-Germanic wedrą, from which come the English weather, the Swedish väder, the German Wetter, the Icelandic veður, and so on, is the adjective „ведро“ (vedro). Only with the stress not on the second syllable, as in „вали като из ведро“⁸ (“it is raining as if from a bucket”), but on the first, in the sense of “clear, fine (weather)”.

I take this opportunity to move from wind to rain, and not in the metaphorical-temporal but in the literal-meteorological sense. While cancelling my plans to go to Sozopol, even though the forecast was for warm and sunny weather, I remembered another English idiom that would have suited the situation well: the expression [to take a] rain check, that is, postponing a plan to some unspecified later moment. I was just beginning to regret that Bulgarian, as with under the weather, has no worthy equivalent, when a friend, in reply to my message that we would have to postpone our arrangement, wrote to me: „Не се притеснявай! Пишем го дъждовен!“ (“Don’t worry! We’ll write it down as a rainy one!”)

And when I tried (unsuccessfully) to return my unused ticket for the train to Burgas, the short but eloquent Bulgarian expression „след дъжд качулка“ (“a hood after the rain”) sprang to mind, which wonderfully unites both the temporal and the meteorological sense of „време“. That train had already left. Or, as they say in English, a language that abounds in seafaring vocabulary, that ship had already sailed.

¹ The etymology of the names of seasickness and its various symptoms in different languages is extremely curious in itself. In Bulgarian this “indisposition” is also known by the general term „кинетоза“ (kinetosis), from the Greek word κίνησις (kínēsis ‘movement’), from which, via German, the word „кино“ (cinema) also entered Bulgarian in the sense of “moving picture”. In French, besides the literal and euphonious name mal de mer, seasickness is also called by the term naupathie, from the Greek word ναῦς (naús ‘ship’), which is also at the root of the English name for one of the main symptoms of the illness: nausea.

² The Slavic word „време“ has nothing in common with the Proto-Germanic *tīdiz, from which come the terms for temporal time in the modern Germanic languages (time, Zeit, tid, and so on), but it does share a surprising etymological kinship with another word in them, namely the word for “worm”: worm in English and Dutch, orm in Norwegian and Danish, Wurm in German, and so on. It turns out that all of them derive from the Proto-Indo-European root *wert- (‘to turn, to rotate, to turn over’). Wurm is also part of one of my favourite compound words in German: Ohrwurm, literally from Ohr (‘ear’) + Wurm (‘worm’), which I bet you will experience for yourself after reading the last note to this text.

³ The word „време“ itself may have an Old Bulgarian root, but several words deriving from the Latin tempus have nonetheless entered our language. Besides the obvious „темпо“ (tempo) and „темпорално“ (temporal), they also include „температура“ (temperature) and „темперамент“ (temperament).

⁴ The Greek word καιρός (kairós), which describes weather as a meteorological concept, began to be used in this sense only in the Byzantine period. In ancient Greek, instead of a general concept, words for specific atmospheric phenomena were used, for example χειμών (cheimón ‘snowstorm’, ‘winter’) or αἰθήρ (aithḗr ‘clear sky’), from which the Bulgarian word „етер“ (ether) also derives, while changes in meteorological conditions were, of course, attributed to the gods.

⁵ According to a study published this summer, residents of Great Britain spend on average about 57 hours (close to two and a half days) a year, or four and a half months of their whole lives, discussing the weather.

⁶ The words for “wind” in the Germanic languages (wind/vind), as well as in the Romance languages (vent/vento/viento), derive from the same Proto-Indo-European root *h₂weh₁- (‘to blow’) as their Bulgarian equivalent. Interestingly, in some of the Germanic and Romance languages the words for “wind” are in turn the root of the words for “window” (for example window in English, vindu in Norwegian, and ventana in Spanish). I supposed that this might also apply to the Bulgarian word „витрина“ (shop window), but it turns out that it originates, via the French vitrine, from the Latin vitrum (“glass”), from the Proto-Indo-European root *wed-, from which the word *wódr (‘water’) also derives.

⁷ Discovering the etymological link between the English word weather and the Bulgarian word „вятър“ also sheds new light on the idiom under the weather from the beginning of this text. According to one of the popular theories of its origin, it is actually a shortened version of the longer phrase under the weather bow, where weather bow is the name for the side of the ship exposed to the wind. In Bulgarian this side is called „наветрена“ (windward). By some accidental linguistic coincidence, the side that is protected from the wind and sheltered is called „подветрена“ (leeward, literally “under the wind”).

⁸ The word „ведрò“ (bucket) shares its origin with „вода“ (water), and hence with its relatives in the Germanic languages (water in English, Wasser in German, vatten in Swedish, and so on), which also derive from the Proto-Indo-European root *wódr. Water is not etymologically related to time, but there is an undoubted metaphorical link between the two concepts: it is no accident that English has the expression water under the bridge, used for past events that no longer matter. In Bulgarian, meanwhile, we say that time „тече“ (“flows”). As we well know, it has no shore and it carries us along, there is no helping it. And so, every boy is in fact a future man, poet, or sailor.

In the column „От дума на дума“ (“From Word to Word”), Ekaterina Petrova looks for topical, interesting, or newly emerged words from our everyday life and traces their often surprising origins, the development of their meanings over time, and their connections with near and distant languages.

[$] LWN.net Weekly Edition for September 25, 2025

2025-09-25 corbet

Post Syndicated from corbet original https://lwn.net/Articles/1038643/

Inside this week’s LWN.net Weekly Edition:

Front: Debian stable bug; Canceling async Rust; CHERI Linux; Time-slice extension; Multikernel; Revocable references; Blender 4.5.
Briefs: Bluefin LTS; RPM 6.0.0; Tails 7.0; Rust 1.90.0; Infrastructure costs; Quotes; …
Announcements: Newsletters, conferences, security updates, patches, and more.

Investigating a forged PDF

2025-09-25 Matthew Garrett

Post Syndicated from Matthew Garrett original https://mjg59.dreamwidth.org/73317.html

I had to rent a house for a couple of months recently, which is long enough in California that it pushes you into proper tenant protection law. As landlords tend to do, they failed to return my security deposit within the 21 days required by law, having already failed to provide the required notification that I was entitled to an inspection before moving out. Cue some tedious argumentation with the letting agency, and eventually me threatening to take them to small claims court.

This post is not about that.

Now, under Californian law, the onus is on the landlord to hold and return the security deposit – the agency has no role in this. The only reason I was talking to them is that my lease didn’t mention the name or address of the landlord (another legal violation, but the outcome is just that you get to serve the landlord via the agency). So it was a bit surprising when I received an email from the owner of the agency informing me that they did not hold the deposit and so were not liable – I already knew this.

The odd bit about this, though, is that they sent me another copy of the contract, asserting that it made it clear that the landlord held the deposit. I read it, and instead found a clause reading “SECURITY: The security deposit will secure the performance of Tenant’s obligations. IER may, but will not be obligated to, apply all portions of said deposit on account of Tenant’s obligations. Any balance remaining upon termination will be returned to Tenant. Tenant will not have the right to apply the security deposit in payment of the last month’s rent. Security deposit held at IER Trust Account.”, where IER is International Executive Rentals, the agency in question. Why send me a contract that says you hold the money while you’re telling me you don’t? And then I read further down and found this:

Ok, fair enough, there’s an addendum that says the landlord has it (I’ve removed the landlord’s name, it’s present in the original).

Except. I had no recollection of that addendum. I went back to the copy of the contract I had and discovered:

[Image: the same text as the previous picture, but addendum 1 is empty]

Huh! But obviously I could just have edited that to remove it (there’s no obvious reason for me to, but whatever), and then it’d be my word against theirs. However, I’d been sent the document via RightSignature, an online document signing platform, and they’d added a certification page that looked like this:

[Image: a Signature Certificate, containing a bunch of data about the document including a checksum of the original]

Interestingly, the certificate page was identical in both documents, including the checksums, despite the content being different. So, how do I show which one is legitimate? You’d think given this certificate page this would be trivial, but RightSignature provides no documented mechanism whatsoever for anyone to verify any of the fields in the certificate, which is annoying but let’s see what we can do anyway.

First up, let’s look at the PDF metadata. pdftk has a dump_data command that dumps the metadata in the document, including the creation date and the modification date. My file had both set to identical timestamps in June, both listed in UTC, corresponding to the time I’d signed the document. The file containing the addendum? The same creation time, but a modification time of this Monday, shortly before it was sent to me. This time, the modification timestamp was in Pacific Daylight Time, the timezone currently observed in California. In addition, the data included two ID fields, ID0 and ID1. In my document both were identical, in the one with the addendum ID0 matched mine but ID1 was different.
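To make the timezone observation concrete, here is a minimal sketch of parsing PDF date strings of the form `D:YYYYMMDDHHmmSS` followed by `Z` or an offset such as `-07'00'`, which is the kind of value a metadata dump shows for the creation and modification dates. The function name `parse_pdf_date` and the sample timestamps are assumptions made up for illustration; they are not the values from the documents discussed here.

```python
import re
from datetime import datetime, timedelta, timezone

# PDF date strings look like D:YYYYMMDDHHmmSS followed by "Z" or an
# offset such as -07'00'. This pattern only handles full timestamps.
_PDF_DATE = re.compile(
    r"^D:(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})"
    r"(?:(Z)(?:00'?00'?)?|([+-])(\d{2})'?(\d{2})?'?)?$"
)


def parse_pdf_date(value):
    """Return a timezone-aware datetime for a PDF date string."""
    match = _PDF_DATE.match(value.strip())
    if match is None:
        raise ValueError(f"not a full PDF date string: {value!r}")
    groups = match.groups()
    year, month, day, hour, minute, second = (int(g) for g in groups[:6])
    _zulu, sign, off_hours, off_minutes = groups[6:]
    if sign:
        delta = timedelta(hours=int(off_hours), minutes=int(off_minutes or 0))
        tz = timezone(delta if sign == "+" else -delta)
    else:
        # "Z" means UTC; treating a missing offset as UTC is an assumption.
        tz = timezone.utc
    return datetime(year, month, day, hour, minute, second, tzinfo=tz)


# Hypothetical values, for illustration only.
created = parse_pdf_date("D:20250610183000Z")
modified = parse_pdf_date("D:20250922101500-07'00'")

print("created :", created.isoformat())
print("modified:", modified.isoformat())
print("offsets differ:", created.utcoffset() != modified.utcoffset())
print("modified after creation:", modified > created)
```

A UTC creation date next to a modification date carrying a local offset suggests the two timestamps were written by different software, which is consistent with a later edit on someone's desktop, though on its own it is only a hint.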

These ID tags are intended to be some form of representation (such as a hash) of the document. ID0 is set when the document is created and should not be modified afterwards – ID1 is initially identical to ID0, but changes when the document is modified. This is intended to allow tooling to identify whether two documents are modified versions of the same document. The identical ID0 indicated that the document with the addendum was originally identical to mine, and the different ID1 that it had been modified.
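The reasoning about the two identifiers can be written down as a small decision procedure. This is only a sketch of the logic described above: the function names and the hex strings are hypothetical, and it assumes the ID0 and ID1 values have already been extracted from each file as text.

```python
def was_modified(ids):
    """True if ID1 no longer matches ID0, i.e. the file changed after creation."""
    id0, id1 = ids
    return id0.lower() != id1.lower()


def compare_pdf_ids(ids_a, ids_b):
    """Classify two (ID0, ID1) pairs according to the scheme described above."""
    a0, a1 = (part.lower() for part in ids_a)
    b0, b1 = (part.lower() for part in ids_b)
    if a0 != b0:
        return "different original documents"
    if a1 == b1:
        return "same document, same revision"
    return "same original document, different revisions"


# Hypothetical identifiers, for illustration only.
my_copy = ("9f2c41aa", "9f2c41aa")      # ID1 still equals ID0
their_copy = ("9f2c41aa", "03d7be15")   # same ID0, changed ID1

print(compare_pdf_ids(my_copy, their_copy))
print("my copy modified:   ", was_modified(my_copy))
print("their copy modified:", was_modified(their_copy))
```

Note that these identifiers are ordinary fields that an editor could rewrite, so a match or mismatch is supporting evidence rather than proof.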

Well, ok, that seems like a pretty strong demonstration. I had the “I have a very particular set of skills” conversation with the agency and pointed these facts out, that they were an extremely strong indication that my copy was authentic and their one wasn’t, and they responded that the document was “re-sealed” every time it was downloaded from RightSignature and that would explain the modifications. This doesn’t seem plausible, but it’s an argument. Let’s go further.

My next move was pdfalyzer, which allows you to pull a PDF apart into its component pieces. This revealed that the documents were identical, other than page 3, the one with the addendum. This page included tags entitled “touchUp_TextEdit”, evidence that the page had been modified using Acrobat. But in itself, that doesn’t prove anything – obviously it had been edited at some point to insert the landlord’s name, it doesn’t prove whether it happened before or after the signing.

But in the process of editing, Acrobat appeared to have renamed all the font references on that page into a different format. Every other page had a consistent naming scheme for the fonts, and they matched the scheme in the page 3 I had. Again, that doesn’t tell us whether the renaming happened before or after the signing. Or does it?

You see, when I completed my signing, RightSignature inserted my name into the document, and did so using a font that wasn’t otherwise present in the document (Courier, in this case). That font was named identically throughout the document, except on page 3, where it was named in the same manner as every other font that Acrobat had renamed. Given the font wasn’t present in the document until after I’d signed it, this is proof that the page was edited after signing.

But eh this is all very convoluted. Surely there’s an easier way? Thankfully yes, although I hate it. RightSignature had sent me a link to view my signed copy of the document. When I went there it presented it to me as the original PDF with my signature overlaid on top. Hitting F12 gave me the network tab, and I could see a reference to a base.pdf. Downloading that gave me the original PDF, pre-signature. Running sha256sum on it gave me an identical hash to the “Original checksum” field. Needless to say, it did not contain the addendum.
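The same check that `sha256sum` performs can be sketched with Python's standard `hashlib`. The helper names below are assumptions for illustration, and the expected value would be whatever string appears in the certificate's checksum field; nothing here is specific to RightSignature.

```python
import hashlib


def sha256_of_stream(stream, chunk_size=65536):
    """Hash a binary file-like object in chunks and return the hex digest."""
    digest = hashlib.sha256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
    return digest.hexdigest()


def sha256_of_file(path):
    """Hash the file at `path`, e.g. a downloaded base.pdf."""
    with open(path, "rb") as handle:
        return sha256_of_stream(handle)


def same_checksum(expected, actual):
    """Compare two hex digests, ignoring case and surrounding whitespace."""
    return expected.strip().lower() == actual.strip().lower()
```

Usage would be along the lines of `same_checksum(value_from_certificate, sha256_of_file("base.pdf"))`, where `value_from_certificate` is a placeholder for the checksum printed on the certificate page.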

Why do this? The only explanation I can come up with (and I am obviously guessing here, I may be incorrect!) is that International Executive Rentals realised that they’d sent me a contract which could mean that they were liable for the return of my deposit, even though they’d already given it to my landlord, and after realising this added the addendum, sent it to me, and assumed that I just wouldn’t notice (or that, if I did, I wouldn’t be able to prove anything). In the process they went from an extremely unlikely possibility of having civil liability for a few thousand dollars (even if they were holding the deposit it’s still the landlord’s legal duty to return it, as far as I can tell) to doing something that looks extremely like forgery.

There’s a hilarious followup. After this happened, the agency offered to do a screenshare with me showing them logging into RightSignature and showing the signed file with the addendum, and then proceeded to do so. One minor problem – the “Send for signature” button was still there, just below a field saying “Uploaded: 09/22/25”. I asked them to search for my name, and it popped up two hits – one marked draft, one marked completed. The one marked completed? Didn’t contain the addendum.
