How AWS Data Lab helped BMW Financial Services design and build a multi-account modern data architecture

Post Syndicated from Rahul Shaurya original

This post is co-written by Martin Zoellner, Thomas Ehrlich and Veronika Bogusch from BMW Group.

BMW Group and AWS announced a comprehensive strategic collaboration in 2020. The goal of the collaboration is to further accelerate BMW Group’s pace of innovation by placing data and analytics at the center of its decision-making. A key element of the collaboration is the further development of the Cloud Data Hub (CDH) of BMW Group. This is the central platform for managing company-wide data and data solutions in the cloud. At the AWS re:Invent 2019 session, BMW and AWS demonstrated the new Cloud Data Hub platform by outlining different archetypes of data platforms and then walking through the journey of building BMW Group’s Cloud Data Hub. To learn more about the Cloud Data Hub, refer to BMW Cloud Data Hub: A reference implementation of the modern data architecture on AWS.

As part of this collaboration, BMW Group is migrating hundreds of data sources across several data domains to the Cloud Data Hub. Several of these sources pertain to BMW Financial Services.

In this post, we talk about how the AWS Data Lab is helping BMW Financial Services build a regulatory reporting application for one of the European BMW market using the Cloud Data Hub on AWS.

Solution overview

In the context of regulatory reporting, BMW Financial Services works with critical financial services data that contains personally identifiable information (PII). We need to provide monthly insights on our financial data to one of the European National Regulator, and we also need to be compliant with the Schrems II and GDPR regulations as we process PII data. This requires the PII to be pseudonymized when it’s loaded into the Cloud Data Hub, and it has to be processed further in pseudonymized form. For an overview of pseudonymization process, check out Build a pseudonymization service on AWS to protect sensitive data .

To address these requirements in a precise and efficient way, BMW Financial Services decided to engage with the AWS Data Lab. The AWS Data Lab has two offerings: the Design Lab and the Build Lab.

Design Lab

The Design Lab is a 1-to-2-day engagement for customers who need a real-world architecture recommendation based on AWS expertise, but aren’t ready to build. In the case of BMW Financial Services, before beginning the build phase, it was key to get all the stakeholders in the same room and record all the functional and non-functional requirements introduced by all the different parties that might influence the data platform—from owners of the various data sources to end-users that would use the platform to run analytics and get business insights. Within the scope of the Design Lab, we discussed three use cases:

  • Regulatory reporting – The top priority for BMW Financial Services was the regulatory reporting use case, which involves collecting and calculating data and reports that will be declared to the National Regulator.
  • Local data warehouse – For this use case, we need to calculate and store all key performance indicators (KPIs) and key value indicators (KVIs) that will be defined during the project. The historical data needs to be stored, but we need to apply a pseudonymization process to respect GDPR directives. Moreover, historical data has to be accessed on a daily basis through a tableau visualization tool. Regarding the structure, it would be valuable to define two levels (at minimum): one at the contract level to justify the calculation of all KPIs, and another at an aggregated level to optimize restitutions. Personal data is limited in the application, but a reidentification process must be possible for authorized consumption patterns.
  • Accounting details – This use case is based on the BMW accounting tool IFT, which provides the accounting balance at the contract level from all local market applications. It must run at least once a month. However, if some issues are identified on IFT during closing, we must be able to restart it and erase the previous run. When the month-end closing is complete, this use case has to keep the last accounting balance version generated during the month and store it. In parallel, all accounting balance versions have to be accessible by other applications for queries and be able to retrieve the information for 24 months.

Design Lab Solution Architecture

Based on these requirements, we developed the following architecture during the Design Lab.

This solution contains the following components:

  1. The main data source that hydrates our three use cases is the already available in the Cloud Data Hub. The Cloud Data Hub uses AWS Lake Formation resource links to grant access to the dataset to the consumer accounts.
  2. For standard, periodic ETL (extract, transform, and load) jobs that involve operations such as converting data types, or creating labels based on numerical values or Boolean flags based on a label, we used AWS Glue ETL jobs.
  3. For historical ETL jobs or more complex calculations such as in the account details use case, which may involve huge joins with custom configurations and tuning, we recommended to use Amazon EMR. This gives you the opportunity to control cluster configurations at a fine-grained level.
  4. To store job metadata that enables features such as reprocessing inputs or rerunning failed jobs, we recommended building a data registry. The goal of the data registry is to create a centralized inventory for any data being ingested in the data lake. A schedule-based AWS Lambda function could be triggered to register data landing on the semantic layer of the Cloud Data Hub in a centralized metadata store. We recommended using Amazon DynamoDB for the data registry.
  5. Amazon Simple Storage Service (Amazon S3) serves as the storage mechanism that powers the regulatory reporting use case using the data management framework Apache Hudi. Apache Hudi is useful for our use cases because we need to develop data pipelines where record-level insert, update, upsert, and delete capabilities are desirable. Hudi tables are supported by both Amazon EMR and AWS Glue jobs via the Hudi connector, along with query engines such as Amazon Athena and Amazon Redshift Spectrum.
  6. As part of the data storing process in the regulatory reporting S3 bucket, we can populate the AWS Glue Data Catalog with the required metadata.
  7. Athena provides an ad hoc query environment for interactive analysis of data stored in Amazon S3 using standard SQL. It has an out-of-the-box integration with the AWS Glue Data Catalog.
  8. For the data warehousing use case, we need to first de-normalize data to create a dimensional model that enables optimized analytical queries. For that conversion, we use AWS Glue ETL jobs.
  9. Dimensional data marts in Amazon Redshift enable our dashboard and self-service reporting needs. Data in Amazon Redshift is organized into several subject areas that are aligned with the business needs, and a dimensional model allows for cross-subject area analysis.
  10. As a by-product of creating an Amazon Redshift cluster, we can use Redshift Spectrum to access data in the regulatory reporting bucket of the architecture. It acts as a front to access more granular data without actually loading it in the Amazon Redshift cluster.
  11. The data provided to the Cloud Data Hub contains personal data that is pseudonymized. However, we need our pseudonymized columns to be re-personalized when visualizing them on Tableau or when generating CSV reports. Both Athena and Amazon Redshift support Lambda UDFs, which can be used to access Cloud Data Hub PII APIs to re-personalize the pseudonymized columns before presenting them to end-users.
  12. Both Athena and Amazon Redshift can be accessed via JDBC (Java Database Connectivity) to provide access to data consumers.
  13. We can use a Python shell job in AWS Glue to run a query against either of our analytics solutions, convert the results to the required CSV format, and store them to a BMW secured folder.
  14. Any business intelligence (BI) tool deployed on premises can connect to both Athena and Amazon Redshift and use their query engines to perform any heavy computation before it receives the final data to fuel its dashboards.
  15. For the data pipeline orchestration, we recommended using AWS Step Functions because of its low-code development experience and its full integration with all the other components discussed.

With the preceding architecture as our long-term target state, we concluded the Design Lab and decided to return for a Build Lab to accelerate solution development.

Preparing for Build Lab

The typical preparation of a Build Lab that follows a Design Lab involves identifying a few examples of common use case patterns, typically the more complex ones. To maximize the success in the Build Lab, we reduce the long-term target architecture to a subset of components that addresses those examples and can be implemented within a 3-to-5-day intense sprint.

For a successful Build Lab, we also need to identify and resolve any external dependencies, such as network connectivity to data sources and targets. If that isn’t feasible, then we find meaningful ways to mock them. For instance, to make the prototype closer to what the production environment would look like, we decided to use separate AWS accounts for each use case, based on the existing team structure of BMW, and use a consumer S3 bucket instead of BMW network-attached storage (NAS).

Build Lab

The BMW team set aside 4 days for their Build Lab. During that time, their dedicated Data Lab Architect worked alongside the team, helping them to build the following prototype architecture.

Build Lab Solution

This solution includes the following components:

  1. The first step was to synchronize the AWS Glue Data Catalog of the Cloud Data Hub and regulatory reporting accounts.
  2. AWS Glue jobs running on the regulatory reporting account had access to the data in the Cloud Data Hub resource accounts. During the Build Lab, the BMW team implemented ETL jobs for six tables, addressing insert, update, and delete record requirements using Hudi.
  3. The result of the ETL jobs is stored in the data lake layer stored in the regulatory reporting S3 bucket as Hudi tables that are catalogued in the AWS Glue Data Catalog and can be consumed by multiple AWS services. The bucket is encrypted using AWS Key Management Service (AWS KMS).
  4. Athena is used to run exploratory queries on the data lake.
  5. To demonstrate the cross-account consumption pattern, we created an Amazon Redshift cluster on it, created external tables from the Data Catalog, and used Redshift Spectrum to query the data. To enable cross-account connectivity between the subnet group of the Data Catalog of the regulatory reporting account and the subnet group of the Amazon Redshift cluster on the local data warehouse account, we had to enable VPC peering. To accelerate and optimize the implementation of these configurations during the Build Lab, we received support from an AWS networking subject matter expert, who ran a valuable session, during which the BMW team understood the networking details of the architecture.
  6. For data consumption, the BMW team implemented an AWS Glue Python shell job that connected to Amazon Redshift or Athena using a JDBC connection, ran a query, and stored the results in the reporting bucket as a CSV file, which would later be accessible by the end-users.
  7. End-users can also directly connect to both Athena and Amazon Redshift using a JDBC connection.
  8. We decided to orchestrate the AWS Glue ETL jobs using AWS Glue Workflows. We used the resulting workflow for the end-of-lab demo.

With that, we completed all the goals we had set up and concluded the 4-day Build Lab.


In this post, we walked you through the journey the BMW Financial Services team took with the AWS Data Lab team to participate in a Design Lab to identify a best-fit architecture for their use cases, and the subsequent Build Lab to implement prototypes for regulatory reporting in one of the European BMW market.

To learn more about how AWS Data Lab can help you turn your ideas into solutions, visit AWS Data Lab.

Special thanks to everyone who contributed to the success of the Design and Build Lab: Lionel Mbenda, Mario Robert Tutunea, Marius Abalarus, Maria Dejoie.

About the authors

Martin Zoellner is an IT Specialist at BMW Group. His role in the project is Subject Matter Expert for DevOps and ETL/SW Architecture.

Thomas Ehrlich is the functional maintenance manager of Regulatory Reporting application in one of the European BMW market.

Veronika Bogusch is an IT Specialist at BMW. She initiated the rebuild of the Financial Services Batch Integration Layer via the Cloud Data Hub. The ingested data assets are the base for the Regulatory Reporting use case described in this article.

George Komninos is a solutions architect for the Amazon Web Services (AWS) Data Lab. He helps customers convert their ideas to a production-ready data product. Before AWS, he spent three years at Alexa Information domain as a data engineer. Outside of work, George is a football fan and supports the greatest team in the world, Olympiacos Piraeus.

Rahul Shaurya is a Senior Big Data Architect with AWS Professional Services. He helps and works closely with customers building data platforms and analytical applications on AWS. Outside of work, Rahul loves taking long walks with his dog Barney.

Customize Amazon QuickSight dashboards with the new bookmarks functionality

Post Syndicated from Mei Qu original

Amazon QuickSight users now can add bookmarks in dashboards to save customized dashboard preferences into a list of bookmarks for easy one-click access to specific views of the dashboard without having to manually make multiple filter and parameter changes every time. Combined with the “Share this view” functionality, you can also now share your bookmark views with other readers for easy collaboration and discussion.

In this post, we introduce the bookmark functionality and its capabilities, and demonstrate typical use cases. We also discuss several scenarios extending the usage of bookmarks for sharing and presentation.

Create a bookmark

Open the published dashboard that you want to view and make changes to the filters, controls, or select the sheet that you want. For example, you can filter to the Region that interests you, or you can select a specific date range using a sheet control on the dashboard.

To create a bookmark of this view, choose the bookmark icon, then choose Add bookmark.

Enter a name, then choose Save.

The bookmark is saved, and the dashboard name on the banner updates with the bookmark name.

You can return to the original dashboard view that the author published at any time by going back to the bookmark icon and choosing Original dashboard.

Rename or delete a bookmark

To rename or delete a bookmark, on the bookmark pane, choose the context menu (three vertical dots) and choose Rename or Delete.

Although you can make any of these edits to bookmarks you have created, the Original dashboard view can’t be renamed, deleted, or updated, because it’s what the author has published.

Update a bookmark

At any time, you can change a bookmark dashboard view and update the bookmark to always reflect those changes.

After you make your edits, on the banner, you can see Modified next to the bookmark you’re currently editing.

To save your updates, on the bookmarks pane, choose the context menu and choose Update.

Make a bookmark the default view

After you create a bookmark, you can set it as the default view of the dashboard that you see when you open the dashboard in a new session. This doesn’t affect anyone else’s view of the dashboard. You can also set the Original dashboard that the author published as the default view. When a default view is set, any time that you open the dashboard, the bookmark view is presented to you, regardless of the changes you made during your last session. However, if no default bookmark has been set, QuickSight will persist your current filter selections when you leave the dashboard. This way, readers can still pick up where they left off and don’t have to re-select filters.

To do this, in the Bookmarks pane, choose the context menu (the three dots) for the bookmark that you want to set as your default view, then choose Set as default.

Share a bookmark

After you create a bookmark, you can share a URL link for the view with others who have permission to view the dashboard. They can then save that view as their own bookmark. The shared view is a snapshot of your bookmark, meaning that if you make any updates to your bookmark after generating the URL link, the shared view doesn’t change.

To do this, choose the bookmark that you want to share so that the dashboard updates to that view. Choose the share icon, then choose Share this view. You can then copy the generated URL to share it with others.

Presentation with bookmarks

One extended use case of the new bookmarks feature is the ability to create a presentation through visuals in a much more elegant fashion. Previously, you may have had to change multiple filters and controls before arriving at your next presentation. Now you can save each presentation as a bookmark beforehand and just go through each bookmark like a slideshow. By creating a snapshot for each of the configurations you want to show, you create a smoother and cleaner experience for your audience.

For example, let’s prepare a presentation through bookmarks using a Restaurant Operations dashboard. We can start with the global view of all the states and their related KPIs.

Best vs. worst performing restaurant

We’re interested in analyzing the difference between the best performing and worst performing store based on revenue. If we had to filter this dashboard on the fly, it would be much more time-consuming because we would have to manually enter both store names into the filter. However, with bookmarks, we can pre-filter to the two addresses, save the bookmark, and be automatically directed to the desired stores when we select it.

Worst performing restaurant analysis

When comparing the best and worst performing stores, we see that the worst performing store (10 Resort Plz) had a below average revenue amount in December 2021. We’re interested in seeing how we can better boost sales around the holiday season and if there is something the owner lacked in that month. We can create a bookmark that directs us to the Owner View with the filters selected to December 2021 and the address and city pre-filtered.

Pennsylvania December 2021 analysis

After looking at this bookmark, we found that in the month of December, the call for support score in the restaurant category was high. Additionally, in the week of December 19, 2021 product sales were at the monthly low. We want to see how these KPIs compare to other stores in the same state for that week. Is it just a seasonal trend, or do we need to look even deeper and take action? Again, we can create a bookmark that gives us this comparison on a state level.

From the last bookmark, we can see that the other store in the same state (Pennsylvania) also had lower than average product sales in the week of December 19, 2021. For the purpose of this demo, we can go ahead and stop here, but we could create even more bookmarks to help answer additional questions, such as if we see this trend across other cities and states.

With these three bookmarks, we have created a top-down presentation looking at the worst performing store and its relevant metrics and comparisons. If we wanted to present this, it would be as simple as cycling through the bookmarks we’ve created to put together a cohesive presentation instead of having distracting onscreen filtering.


In this post, we discussed how we can create, modify, and delete bookmarks, along with setting it as a default view and sharing bookmarks. We also demonstrated an extended use case of presentation with bookmarks. The new bookmarks functionality is now generally available in all supported QuickSight Regions.

We’re looking forward to your feedback and stories on how you apply these calculations for your business needs.

Did you know that you can also ask questions of you data in natural language with QuickSight Q? Watch this video to learn more.

About the Authors

Mei Qu is a Business Intelligence Engineer on the AWS QuickSight ProServe Team. She helps customers and partners leverage their data to develop key business insights and monitor ongoing initiatives. She also develops QuickSight proof of concept projects, delivers technical workshops, and supports implementation projects.

Emily Zhu is a Senior Product Manager at Amazon QuickSight, AWS’s cloud-native, fully managed SaaS BI service. She leads the development of QuickSight analytics and query capabilities. Before joining AWS, she was working in the Amazon Prime Air drone delivery program and the Boeing company as senior strategist for several years. Emily is passionate about potentials of cloud-based BI solutions and looks forward to helping customers advance in their data-driven strategy-making.

AWS IoT FleetWise Now Generally Available – Easily Collect Vehicle Data and Send to the Cloud

Post Syndicated from Channy Yun original

Today we announce the general availability of AWS IoT FleetWise, a fully managed AWS service that makes it easier to collect, transform, and transfer vehicle data to the cloud. Last AWS re:Invent 2021, we previewed AWS IoT FleetWise, heard customer feedback, and improved features for various use cases of near-real-time vehicle data processing.

With AWS IoT FleetWise, automakers, fleet operators, and automotive suppliers can take the complex variability out of collecting data from vehicle fleets at scale. You can access standardized fleet-wide vehicle data and avoid developing custom data collection systems, or you can integrate AWS IoT FleetWise to enhance your existing systems. AWS IoT FleetWise enables intelligent data collection that sends the exact data you need from the vehicle to the cloud. You can use the data to analyze vehicle fleet health to more quickly identify potential maintenance issues or make in-vehicle infotainment systems smarter. Furthermore, you can use it to train machine learning (ML) models that improve autonomous driving and advanced driver assistance systems (ADAS).

For example, electric vehicle (EV) battery temperature is a critical metric that should be continuously analyzed for the entire vehicle fleet. In order to avoid costly continuous data ingestion, you may want to optimize the data collection by setting a threshold on EV battery temperature. The results of this analysis would be provided to the automaker’s quality engineering department, enabling fast assessment of the criticality and possible root causes of any issues identified at certain temperatures. Based on the root cause analysis, the automaker can then take short-term actions to support the driver affected by the issue, as well as midterm actions to improve vehicle quality.

How AWS IoT FleetWise Works
AWS IoT FleetWise provides a vehicle modeling framework that you can use to model your vehicle and its sensors and actuators in the cloud. To enable secure communication between your vehicle and the cloud, AWS IoT FleetWise also provides the AWS IoT FleetWise Edge Agent application that you can use to download and install in-vehicle electronic control units (ECUs) such as the gateway, in-vehicle infotainment controller, etc. You define data collection schemes in the cloud and deploy them to your vehicle.

The AWS IoT FleetWise Edge Agent running in your vehicle uses data collection schemes to control what data to collect and when to transfer it to the cloud. Data collected and ingested through AWS IoT FleetWise Edge Agent software goes directly into your Amazon Timestream table or Amazon Simple Storage Service (Amazon S3) repositories via AWS IoT Core.

AWS IoT FleetWise Features
To get started with AWS IoT FleetWise, you can register your account and configure the settings via the AWS console. AWS IoT FleetWise automatically registers your AWS account, IAM role, and Amazon Timestream resources.

The Edge Agent software is a C++ application distributed as source code and is available on GitHub to collect, decode, normalize, cache, and ingest vehicle data to AWS. It supports multiple deployment options, such as vehicle gateways, infotainment systems, telematics control units (TCUs), or aftermarket devices. When vehicles are connected to the cloud, the Edge Agent continually receives data collection schemes and collects, decodes, normalizes and ingests the transformed vehicle data to AWS.

Let’s see the benefits and features of AWS IoT FleetWise:

Signal catalog
A signal catalog contains a collection of vehicle signals. Signals are fundamental structures that you define to contain vehicle data and its metadata. A signal can be a sensor and its status, an attribute as static information of the manufacturer, a branch to represent a nested structure such as Vehicle.Powertrain.combustionEngine expression, or an actuator such as the state of a vehicle device. For example, you can create a sensor to receive in-vehicle temperature values and store its metadata, including a sensor name, a data type, and a unit.

Signals in a signal catalog can be used to model vehicles that use different protocols and data formats. For example, there are two cars made by different automakers: one uses the Controller Area Network (CAN) to transmit the in-vehicle temperature data and the other uses On-board Diagnostic (OBD) protocol.

You can define a sensor in the signal catalog to receive in-vehicle temperature values. This sensor can be used to represent the thermocouples in both cars, irrespective of how this temperature data is available within the vehicle networks. For more information, see Create and manage signal catalogs in the AWS documentation.

Vehicle models
Vehicle models are virtual declarative representations that standardize the format of your vehicles and define relationships between signals in the vehicles. Vehicle models enforce consistent information across multiple vehicles of the same type so that you can quickly configure and create a vehicle fleet. In each vehicle model, you can add signals, including attributes, branches (signal hierarchies), sensors, and actuators.

You can define condition-based schemes to control what data to collect, such as data in-vehicle temperature values that are greater than 40 degrees. You can also define time-based schemes to control how often to collect data. For more information, see Create and manage vehicle models in the AWS documentation.

When a decoder manifest is associated with a vehicle model, you can create a vehicle. Each vehicle corresponds to an AWS IoT thing. You can use an existing AWS IoT thing to create a vehicle or set AWS IoT FleetWise to automatically create an AWS IoT thing for your vehicle. For more information, see Provision vehicles in the AWS documentation. After you create vehicles, you can create campaigns for them.

A campaign gives the AWS IoT FleetWise Edge Agent instructions on how to select, collect, and transfer data to the cloud. You can make a campaign with vehicle attributes that you added when creating vehicles, and a data collection scheme. You can manually define the data collection scheme either condition-based logical expressions such as $variable.myVehicle.InVehicleTemperature > 40.0, or time-based data collection in milliseconds such as from 10000 – 60000 milliseconds. To learn more, see Create a campaign in the AWS documentation.

After you create and approve the campaign, AWS IoT FleetWise automatically deploys the campaign to the listed vehicles. The AWS IoT FleetWise Edge Agent software doesn’t start collecting data until a running campaign is deployed to the vehicle. If you want to pause collecting data from vehicles connected to the campaign, on the Campaign summary page, choose Suspend. To resume collecting data from vehicles connected to the campaign, choose Resume.

Demo – Visualizing Vehicle Data
Here is a demo that aims to show how AWS IoT FleetWise can make it easy to collect vehicle data and use it to build visualizing applications. In this demo, you can simulate two kinds of vehicles, an NXP GoldBox powered by an Automotive Grade Linux distribution that runs the AWS IoT FleetWise agent as an AWS IoT Greengrass component or a completely virtual vehicle implemented as an AWS Graviton ARM-based Amazon EC2 instance. To learn more, see the getting started guide and source code in the GitHub repository.

The vehicle in CARLA Simulator can self-drive or be driven with a game steering wheel connected to your desktop. You can watch a live demo video.

Data is collected by AWS IoT FleetWise and stored in the Amazon Timestream table, and visualized on a Grafana Dashboard.

Customer and Partner Voices
During the preview period, we heard lots of feedback from our customers and partners in automotive industry such as automakers, fleet operators, and automotive suppliers.

For example, Hyundai Motor Group (HMG) is a global vehicle manufacturer that offers consumers a technology-rich lineup of cars, sport utility vehicles, and electrified vehicles. HMG has used AWS services, such as using Amazon SageMaker, to reduce its ML model training time for autonomous driving models.

Hae Young Kwon, vice president and head of the infotainment development group at HMG, said:

“As a leading global vehicle manufacturer, we have come to appreciate the breadth and depth of AWS services to help create new connected vehicle capabilities. With more data available from our expanding global fleet of connected cars, we look forward to leveraging AWS IoT FleetWise to discover how we can build more personalized ownership experiences for our customers.”

LG CNS is a global IT service provider and AWS Premier Consulting Partner that is transforming smart transportation services by building an advanced transportation system that is convenient and safe by maximizing the operational efficiency of multiple modes of transport, including buses, subways, taxis, railways, and airplanes.

Jae Seung Lee, vice president at LG CNS, said:

“At LG CNS, we are committed to advancing the technology that is powering the future of transportation. By using AWS IoT FleetWise, we are creating a new data platform that allows us to ingest, analyze, and simulate vehicle conditions in real-time. With these advanced insights, our customers can gain a better understanding of their vehicles and, as a result, improve decision-making about their fleets.”

Bridgestone is a global leader in tires and rubber building on its expertise to provide solutions for safe and sustainable mobility. Bridgestone has worked with AWS for several years to develop a system that delivers insights derived from the interaction between a tire and a vehicle using advanced machine learning capabilities on Amazon SageMaker.

Brian Goldstine, president of mobility solutions and fleet management at Bridgestone Americas Inc. said:

“Bridgestone has been working with AWS to transform the digital services we provide to our automotive manufacturer, fleet, and retail customers. We look forward to exploring how AWS IoT FleetWise will make it easier for our customers to collect detailed tire data, which can provide new insights for their products and applications.”

Renesas Electronics Corporation is a global leader in microcontrollers, analog, power, and system on chips (SoC) products. Renesas launched cellular-to-cloud IoT development platforms and its cloud development kits to run on AWS IoT Core and FreeRTOS.

Yusuke Kawasaki, director at Renesas Electronics Corporation, said:

“The volume of connected vehicle data is forecast to increase dramatically over the next few years, driven by new and evolving customer expectations. As a result, Renesas is focused on addressing the needs of automotive engineers facing increasing system complexity. Incorporating AWS IoT FleetWise into our vehicle gateway solution will enable our customers to enjoy our market-ready approach for large-scale data collection and accelerate their cloud development strategy. We look forward to further collaborating with AWS to provide a better and simpler development environment for our customers.”

By working with AWS IoT FleetWise Partners, you can take advantage of solutions to streamline your IoT projects, reduce the risk of your efforts, and accelerate time to value. To learn more how AWS accelerates the automotive industry’s digital transformation, see AWS for Automotive.

Now Available
AWS IoT FleetWise is now generally available in the US East (N. Virginia) and Europe (Frankfurt) Regions. You pay for the vehicles you have created and messages per vehicle per month. Additional services used alongside AWS IoT FleetWise, such as AWS IoT Core and Amazon Timestream, are billed separately. For more detail, see the AWS IoT FleetWise pricing page.

To learn more, see the AWS IoT FleetWise resources page including documentations, videos, and blog posts. Please send feedback to AWS re:Post for AWS IoT FleetWise or through your usual AWS support contacts.


Get a quick start with Apache Hudi, Apache Iceberg, and Delta Lake with Amazon EMR on EKS

Post Syndicated from Amir Shenavandeh original

A data lake is a centralized repository that allows you to store all your structured and unstructured data at any scale. You can keep your data as is in your object store or file-based storage without having to first structure the data. Additionally, you can run different types of analytics against your loosely formatted data lake—from dashboards and visualizations to big data processing, real-time analytics, and machine learning (ML) to guide better decisions. Due to the flexibility and cost effectiveness that a data lake offers, it’s very popular with customers who are looking to implement data analytics and AI/ML use cases.

Due to the immutable nature of the underlying storage in the cloud, one of the challenges in data processing is updating or deleting a subset of identified records from a data lake. Another challenge is making concurrent changes to the data lake. Implementing these tasks is time consuming and costly.

In this post, we explore three open-source transactional file formats: Apache Hudi, Apache Iceberg, and Delta Lake to help us to overcome these data lake challenges. We focus on how to get started with these data storage frameworks via real-world use case. As an example, we demonstrate how to handle incremental data change in a data lake by implementing a Slowly Changing Dimension Type 2 solution (SCD2) with Hudi, Iceberg, and Delta Lake, then deploy the applications with Amazon EMR on EKS.

ACID challenge in data lakes

In analytics, the data lake plays an important role as an immutable and agile data storage layer. Unlike traditional data warehouses or data mart implementations, we make no assumptions on the data schema in a data lake and can define whatever schemas required by our use cases. It’s up to the downstream consumption layer to make sense of that data for their own purposes.

One of the most common challenges is supporting ACID (Atomicity, Consistency, Isolation, Durability) transactions in a data lake. For example, how do we run queries that return consistent and up-to-date results while new data is continuously being ingested or existing data is being modified?

Let’s try to understand the data problem with a real-world scenario. Assume we centralize customer contact datasets from multiple sources to an Amazon Simple Storage Service (Amazon S3)-backed data lake, and we want to keep all the historical records for analysis and reporting. We face the following challenges:

  • We keep creating append-only files in Amazon S3 to track the contact data changes (insert, update, delete) in near-real time.
  • Consistency and atomicity aren’t guaranteed because we just dump data files from multiple sources without knowing whether the entire operation is successful or not.
  • We don’t have an isolation guarantee whenever multiple workloads are simultaneously reading and writing to the same target contact table.
  • We track every single activity at source, including duplicates caused by the retry mechanism and accidental data changes that are then reverted. This leads to the creation of a large volume of append-only files. The performance of extract, transform, and load (ETL) jobs decreases as all the data files are read each time.
  • We have to shorten the file retention period to reduce the data scan and read performance.

In this post, we walk through a simple SCD2 ETL example designed for solving the ACID transaction problem with the help of Hudi, Iceberg, and Delta Lake. We also show how to deploy the ACID solution with EMR on EKS and query the results by Amazon Athena.

Custom library dependencies with EMR on EKS

By default, Hudi and Iceberg are supported by Amazon EMR as out-of-the-box features. For this demonstration, we use EMR on EKS release 6.8.0, which contains Apache Iceberg 0.14.0-amzn-0 and Apache Hudi 0.11.1-amzn-0. To find out the latest and past versions that Amazon EMR supports, check out the Hudi release history and the Iceberg release history tables. The runtime binary files of these frameworks can be found in the Spark’s class path location within each EMR on EKS image. See Amazon EMR on EKS release versions for the list of supported versions and applications.

As of this writing, Amazon EMR does not include Delta Lake by default. There are two ways to make it available in EMR on EKS:

  • At the application level – You install Delta libraries by setting a Spark configuration spark.jars or --jars command-line argument in your submission script. The JAR files will be downloaded and distributed to each Spark Executor and Driver pod when starting a job.
  • At the Docker container level – You can customize an EMR on EKS image by packaging Delta dependencies into a single Docker container that promotes portability and simplifies dependency management for each workload

Other custom library dependencies can be managed the same way as for Delta Lake—passing a comma-separated list of JAR files in the Spark configuration at job submission, or packaging all the required libraries into a Docker image.

Solution overview

The solution provides two sample CSV files as the data source: initial_contact.csv and update_contacts.csv. They were generated by a Python script with the Faker package. For more details, check out the tutorial on GitHub.

The following diagram describes a high-level architecture of the solution and different services being used.

The workflow steps are as follows:

  1. Ingest the first CSV file from a source S3 bucket. The data is being processed by running a Spark ETL job with EMR on EKS. The application contains either the Hudi, Iceberg, or Delta framework.
  2. Store the initial table in Hudi, Iceberg, or Delta file format in a target S3 bucket (curated). We use the AWS Glue Data Catalog as the hive metastore. Optionally, you can configure Amazon DynamoDB as a lock manager for the concurrency controls.
  3. Ingest a second CSV file that contains new records and some changes to the existing ones.
  4. Perform SCD2 via Hudi, Iceberg, or Delta in the Spark ETL job.
  5. Query the Hudi, Iceberg, or Delta table stored on the target S3 bucket in Athena

To simplify the demo, we have accommodated steps 1–4 into a single Spark application.


Install the following tools:

curl -fsSL -o \

chmod 700
export DESIRED_VERSION=v3.8.2
helm version

For a quick start, you can use AWS CloudShell which includes the AWS CLI and kubectl already.

Clone the project

Download the sample project either to your computer or the CloudShell console:

git clone
cd emr-on-eks-hudi-iceberg-delta

Set up the environment

Run the following script to set up a test environment. The infrastructure deployment includes the following resources:

  • A new S3 bucket to store sample data and job code.
  • An Amazon Elastic Kubernetes Service (Amazon EKS) cluster (version 1.21) in a new VPC across two Availability Zones.
  • An EMR virtual cluster in the same VPC, registered to the emr namespace in Amazon EKS.
  • An AWS Identity and Access Management (IAM) job execution role contains DynamoDB access, because we use DynamoDB to provide concurrency controls that ensure atomic transaction with the Hudi and Iceberg tables.
export AWS_REGION=us-east-1
export EKSCLUSTER_NAME=eks-quickstart
# Upload sample contact data to S3
export ACCOUNTID=$(aws sts get-caller-identity --query Account --output text)
aws s3 sync data s3://emr-on-eks-quickstart-${ACCOUNTID}-${AWS_REGION}/blog/data

Job execution role

The provisioning includes an IAM job execution role called emr-on-eks-quickstart-execution-role that allows your EMR on EKS jobs access to the required AWS services. It contains AWS Glue permissions because we use the Data Catalog as our metastore.

See the following code:

    "Effect": "Allow",
    "Action": ["glue:Get*","glue:BatchCreatePartition","glue:UpdateTable","glue:CreateTable"],
    "Resource": [

Additionally, the role contains DynamoDB permissions, because we use the service as the lock manager. It provides concurrency controls that ensure atomic transaction with our Hudi and Iceberg tables. If a DynamoDB table with the given name doesn’t exist, a new table is created with the billing mode set as pay-per-request. More details can be found in the following framework examples.

    "Sid": "DDBLockManager",
    "Effect": "Allow",
    "Action": [
    "Resource": [

Example 1: Run Apache Hudi with EMR on EKS

The following steps provide a quick start for you to implement SCD Type 2 data processing with the Hudi framework. To learn more, refer to Build Slowly Changing Dimensions Type 2 (SCD2) with Apache Spark and Apache Hudi on Amazon EMR.

The following code snippet demonstrates the SCD type2 implementation logic. It creates Hudi tables in a default database in the Glue Data Catalog. The full version is in the script

# Read incremental contact CSV file with extra SCD columns
delta_csv_df ="csv")\
.withColumn("ts", lit(current_timestamp()).cast(TimestampType()))\
.withColumn("valid_from", lit(current_timestamp()).cast(TimestampType()))\
.withColumn("valid_to", lit("").cast(TimestampType()))\
.withColumn('iscurrent', lit(1).cast("int"))

## Find existing records to be expired
join_cond = [initial_hudi_df.checksum != delta_csv_df.checksum,
             initial_hudi_df.iscurrent == 1]
contact_to_update_df = (initial_hudi_df.join(delta_csv_df, join_cond)
                      .withColumn('iscurrent', lit(0).cast("int"))
merged_contact_df = delta_csv_df.unionByName(contact_to_update_df)

# Upsert
                    .option('hoodie.datasource.write.operation', 'upsert')\
                    .options(**hudiOptions) \

In the job script, the hudiOptions were set to use the AWS Glue Data Catalog and enable the DynamoDB-based Optimistic Concurrency Control (OCC). For more information about concurrency control and alternatives for lock providers, refer to Concurrency Control.

hudiOptions = {
    # sync to Glue catalog
    # DynamoDB based locking mechanisms
    "hoodie.write.concurrency.mode":"optimistic_concurrency_control", #default is SINGLE_WRITER
    "hoodie.cleaner.policy.failed.writes":"LAZY", #Hudi will delete any files written by failed writes to re-claim space
    "hoodie.write.lock.dynamodb.region": REGION,
    "hoodie.write.lock.dynamodb.endpoint_url": f"dynamodb.{REGION}"
  1. Upload the job scripts to Amazon S3:
    export AWS_REGION=us-east-1
    export ACCOUNTID=$(aws sts get-caller-identity --query Account --output text)
    aws s3 sync hudi/ s3://emr-on-eks-quickstart-${ACCOUNTID}-${AWS_REGION}/blog/

  2. Submit Hudi jobs with EMR on EKS to create SCD2 tables:
    export EMRCLUSTER_NAME=emr-on-eks-quickstart
    export AWS_REGION=us-east-1

    Hudi supports two tables types: Copy on Write (CoW) and Merge on Read (MoR). The following is the code snippet to create a CoW table. For the complete job scripts for each table type, refer to and

    aws emr-containers start-job-run \
      --virtual-cluster-id $VIRTUAL_CLUSTER_ID \
      --name em68-hudi-cow \
      --execution-role-arn $EMR_ROLE_ARN \
      --release-label emr-6.8.0-latest \
      --job-driver '{
      "sparkSubmitJobDriver": {
          "entryPoint": "s3://'$S3BUCKET'/blog/",
          "sparkSubmitParameters": "--jars local:///usr/lib/hudi/hudi-spark-bundle.jar,local:///usr/lib/spark/external/lib/spark-avro.jar --conf spark.executor.memory=2G --conf spark.executor.cores=2"}}' \
      --configuration-overrides '{
        "applicationConfiguration": [
            "classification": "spark-defaults", 
            "properties": {
              "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
              "spark.hadoop.hive.metastore.client.factory.class": "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory"

  3. Check the job status on the EMR virtual cluster console.
  4. Query the output in Athena:

    select * from hudi_contact_cow where id=103

    select * from hudi_contact_mor_rt where id=103

Example 2: Run Apache Iceberg with EMR on EKS

Starting with Amazon EMR version 6.6.0, you can use Apache Spark 3 on EMR on EKS with the Iceberg table format. For more information on how Iceberg works in an immutable data lake, see Build a high-performance, ACID compliant, evolving data lake using Apache Iceberg on Amazon EMR.

The sample job creates an Iceberg table iceberg_contact in the default database of AWS Glue. The full version is in the script. The following code snippet shows the SCD2 type of MERGE operation:

# Read incremental CSV file with extra SCD2 columns\

# Update existing records which are changed in the update file
contact_update_qry = """
    WITH contact_to_update AS (
          SELECT target.*
          FROM glue_catalog.default.iceberg_contact AS target
          JOIN iceberg_contact_update AS source 
          ON =
          WHERE target.checksum != source.checksum
            AND target.iscurrent = 1
          SELECT * FROM iceberg_contact_update
    ),contact_updated AS (
        SELECT *, LEAD(valid_from) OVER (PARTITION BY id ORDER BY valid_from) AS eff_from
        FROM contact_to_update
    SELECT id,name,email,state,ts,valid_from,
        CAST(COALESCE(eff_from, null) AS Timestamp) AS valid_to,
        CASE WHEN eff_from IS NULL THEN 1 ELSE 0 END AS iscurrent,
    FROM contact_updated
# Upsert
    MERGE INTO glue_catalog.default.iceberg_contact tgt
    USING ({contact_update_qry}) src
    ON =
    AND tgt.checksum = src.checksum

As demonstrated earlier when discussing the job execution role, the role emr-on-eks-quickstart-execution-role granted access to the required DynamoDB table myIcebergLockTable, because the table is used to obtain locks on Iceberg tables, in case of multiple concurrent write operations against a single table. For more information on Iceberg’s lock manager, refer to DynamoDB Lock Manager.

  1. Upload the application scripts to the example S3 bucket:
    export AWS_REGION=us-east-1
    export ACCOUNTID=$(aws sts get-caller-identity --query Account --output text)
    aws s3 sync iceberg/ s3://emr-on-eks-quickstart-${ACCOUNTID}-${AWS_REGION}/blog/

  2. Submit the job with EMR on EKS to create an SCD2 Iceberg table:
    export EMRCLUSTER_NAME=emr-on-eks-quickstart
    export AWS_REGION=us-east-1

    The full version code is in the script. The code snippet is as follows:

    aws emr-containers start-job-run \
    --virtual-cluster-id $VIRTUAL_CLUSTER_ID \
    --name em68-iceberg \
    --execution-role-arn $EMR_ROLE_ARN \
    --release-label emr-6.8.0-latest \
    --job-driver '{
        "sparkSubmitJobDriver": {
        "entryPoint": "s3://'$S3BUCKET'/blog/",
        "sparkSubmitParameters": "--jars local:///usr/share/aws/iceberg/lib/iceberg-spark3-runtime.jar --conf spark.executor.memory=2G --conf spark.executor.cores=2"}}' \
    --configuration-overrides '{
        "applicationConfiguration": [
        "classification": "spark-defaults",
        "properties": {
        "spark.sql.extensions": "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
        "spark.sql.catalog.glue_catalog": "org.apache.iceberg.spark.SparkCatalog",
        "spark.sql.catalog.glue_catalog.catalog-impl": "",
        "spark.sql.catalog.glue_catalog.warehouse": "s3://'$S3BUCKET'/iceberg/",
        "": "",
        "spark.sql.catalog.glue_catalog.lock-impl": "",
        "spark.sql.catalog.glue_catalog.lock.table": "myIcebergLockTable"

  3. Check the job status on the EMR on EKS console.
  4. When the job is complete, query the table in Athena:
    select * from iceberg_contact where id=103

Example 3: Run open-source Delta Lake with EMR on EKS

Delta Lake 2.1.x is compatible with Apache Spark 3.3.x. Check out the compatibility list for other versions of Delta Lake and Spark. In this post, we use Amazon EMR release 6.8 (Spark 3.3.0) to demonstrate the SCD2 implementation in a data lake.

The following is the Delta code snippet to load initial dataset; the incremental load MERGE logic is highly similar to the Iceberg example. As a one-off task, there should be two tables set up on the same data:

  • The Delta table delta_table_contact – Defined on the TABLE_LOCATION at ‘s3://{S3_BUCKET_NAME}/delta/delta_contact’. The MERGE/UPSERT operation must be implemented on the Delta destination table. Athena can’t query this table directly, instead it reads from a manifest file stored in the same location, which is a text file containing a list of data files to read for querying a table. It is described as an Athena table below.
  • The Athena table delta_contact – Defined on the manifest location s3://{S3_BUCKET_NAME}/delta/delta_contact/_symlink_format_manifest/. All read operations from Athena must use this table.

The full version code is in the  script. The code snippet is as follows:

# Read initial contact CSV file and create a Delta table with extra SCD2 columns
df_intial_csv =\

spark.sql("GENERATE symlink_format_manifest FOR TABLE delta_table_contact")
spark.sql("ALTER TABLE delta_table_contact SET TBLPROPERTIES(delta.compatibility.symlinkFormatManifest.enabled=true)")

# Create a queriable table in Athena
    CREATE EXTERNAL TABLE IF NOT EXISTS default.delta_contact (
LOCATION '{TABLE_LOCATION}/_symlink_format_manifest/'""")

The SQL statement GENERATE symlink_format_manifest FOR TABLE ... is a required step to set up the Athena for Delta Lake. Whenever the data in a Delta table is updated, you must regenerate the manifests. Therefore, we use ALTER TABLE .... SET TBLPROPERTIES(delta.compatibility.symlinkFormatManifest.enabled=true) to automate the manifest refresh as a one-off setup.

  1. Upload the Delta sample scripts to the S3 bucket:
    export AWS_REGION=us-east-1
    export ACCOUNTID=$(aws sts get-caller-identity --query Account --output text)
    aws s3 sync delta/ s3://emr-on-eks-quickstart-${ACCOUNTID}-${AWS_REGION}/blog/

  2. Submit the job with EMR on EKS:
    export EMRCLUSTER_NAME=emr-on-eks-quickstart
    export AWS_REGION=us-east-1

    The full version code is in the script. The open-source Delta JAR files must be included in the spark.jars. Alternatively, follow the instructions in How to customize Docker images and build a custom EMR on EKS image to accommodate the Delta dependencies.

    "spark.jars": ","

    The code snippet is as follows:

    aws emr-containers start-job-run \
    --virtual-cluster-id $VIRTUAL_CLUSTER_ID \
    --name em68-delta \
    --execution-role-arn $EMR_ROLE_ARN \
    --release-label emr-6.8.0-latest \
    --job-driver '{
       "sparkSubmitJobDriver": {
       "entryPoint": "s3://'$S3BUCKET'/blog/",
       "sparkSubmitParameters": "--conf spark.executor.memory=2G --conf spark.executor.cores=2"}}' \
    --configuration-overrides '{
       "applicationConfiguration": [
        "classification": "spark-defaults",
        "properties": {
        "spark.sql.extensions": "",
    "spark.jars": ","

  3. Check the job status from the EMR on EKS console.
  4. When the job is complete, query the table in Athena:
    select * from delta_contact where id=103

Clean up

To avoid incurring future charges, delete the resources generated if you don’t need the solution anymore. Run the following cleanup script (change the Region if necessary):

export EMRCLUSTER_NAME=emr-on-eks-quickstart
export AWS_REGION=us-east-1


Implementing an ACID-compliant data lake with EMR on EKS enables you focus more on delivering business value, instead of worrying about managing complexities and reliabilities at the data storage layer.

This post presented three different transactional storage frameworks that can meet your ACID needs. They ensure you never read partial data (Atomicity). The read/write isolation allows you to see consistent snapshots of the data, even if an update occurs at the same time (Consistency and Isolation). All the transactions are stored directly to the underlying Amazon S3-backed data lake, which is designed for 11 9’s of durability (Durability).

For more information, check out the sample GitHub repository used in this post and the EMR on EKS Workshop. They will get you started with running your familiar transactional framework with EMR on EKS. If you want dive deep into each storage format, check out the following posts:

About the authors

Amir Shenavandeh is a Sr Analytics Specialist Solutions Architect and Amazon EMR subject matter expert at Amazon Web Services. He helps customers with architectural guidance and optimisation. He leverages his experience to help people bring their ideas to life, focusing on distributed processing and big data architectures.

Melody Yang is a Senior Big Data Solutions Architect for Amazon EMR at AWS. She is an experienced analytics leader working with AWS customers to provide best practice guidance and technical advice in order to assist their success in data transformation. Her areas of interests are open-source frameworks and automation, data engineering, and DataOps.

Amit Maindola is a Data Architect focused on big data and analytics at Amazon Web Services. He helps customers in their digital transformation journey and enables them to build highly scalable, robust, and secure cloud-based analytical solutions on AWS to gain timely insights and make critical business decisions.

Bash 5.2 released

Post Syndicated from corbet original

Version 5.2 of the bash shell has been released.

The most notable new feature is the rewritten command substitution
parsing code, which calls the bison parser recursively. This
replaces the ad-hoc parsing used in previous versions, and allows
better syntax checking and catches syntax errors much earlier. The
shell attempts to do a much better job of parsing and expanding
array subscripts only once; this has visible effects in the `unset’
builtin, word expansions, conditional commands, and other builtins
that can assign variable values as a side effect.

How to Deploy a SIEM That Actually Works

Post Syndicated from Robert Holzer original

I deployed my SIEM in days, not months – here’s how you can too

How to Deploy a SIEM That Actually Works

As an IT administrator at a highly digitized manufacturing company, I spent many sleepless nights with no visibility into the activity and security of our environment before deploying a security information and event management (SIEM) solution. At the company I work for, Schlotterer Sonnenschutz Systeme GmbH, we have a lot of manufacturing machines that rely on internet access and external companies that remotely connect to our company’s environment – and I couldn’t see any of it happening. One of my biggest priorities was to source and implement state-of-the-art security solutions – beginning with a SIEM tool.

I asked colleagues and partners in the IT sector about their experience with deploying and leveraging SIEM technology. The majority of the feedback I received was that deploying a SIEM was a lengthy and difficult process. Then, once stood up, SIEMs were often missing information or difficult to pull actionable data from.

The feedback did not instill much confidence – particularly as this would be the first time I personally had deployed a SIEM. I was prepared for a long deployment road ahead, with the risk of shelfware looming over us. However, to my surprise, after identifying Rapid7’s InsightIDR as our chosen solution, the process was manageable and efficient, and we began receiving value just days after deployment. Rapid7 is clearly an outlier in this space: able to deliver an intuitive and accelerated onboarding experience while still driving actionable insights and sophisticated security results.

3 key steps for successful SIEM deployment

Based on my experience, our team identified three critical steps that must be taken in order to have a successful SIEM deployment:

  1. Identifying core event sources and assets you intend to onboard before deployment
  2. Collecting and correlating relevant and actionable security telemetry to form a holistic and accurate view of your environment while driving reliable early threat detection (not noise)
  3. Putting data to work in your SIEM so you can begin visualizing and analyzing to validate the success of your deployment

1. Identify core event sources and assets to onboard

Before deploying a SIEM, gather as much information as possible about your environment so you can easily begin the deployment process. Rapid7 provided easy-to-understand help documentation all throughout our deployment process in order to set us up for success. The instructions were highly detailed and easy to understand, making the setup quick and painless. Additionally, they provide a wide selection of pre-built event sources out-of-the-box, simplifying my experience. Within hours, I had all the information I needed in front of me.

Based on Rapid7’s recommendations, we set up what is referred to as the six core event sources:

  • Active Directory (AD)
  • Protocol (LDAP) server
  • Dynamic Host Configuration Protocol (DHCP) event logs
  • Domain Name System (DNS)
  • Virtual Private Network (VPN)
  • Firewall

Creating these event sources will get the most information flowing through your SIEM and if your solution has user behavior entity analytics (UEBA) capabilities like InsightIDR. Getting all the data in quickly begins the baselining process so you can identify anomalies and potential user and insider threats down the road.

2. Collecting and correlating relevant and actionable security telemetry

When deployed and configured properly, a good SIEM will unify your security telemetry into a single cohesive picture. When done ineffectively, a SIEM can create an endless maze of noise and alerts. Striking a balance of ingesting the right security telemetry and threat intelligence to drive meaningful, actionable threat detections is critical to effective detection, investigation, and response. A great solution harmonizes otherwise disparate sources to give a cohesive view of the environment and malicious activity.

InsightIDR came with a native endpoint agent, network sensor, and a host of integrations to make this process much easier. To provide some context, at Schlotterer Sonnenschutz Systeme GmbH, we have a large number of mobile devices, laptops, surface devices, and other endpoints that exist outside the company. The combined Insight Network Sensor and Insight Agent monitor our environment beyond the physical borders of our IT for complete visibility across offices, remote employees, virtual devices, and more.

Personally, when it comes to installing any agent, I prefer to take a step-by-step approach to reduce any potential negative effects the agent might have on endpoints. With InsightIDR, I easily deployed the Insight Agent on my own computer; then, I pushed it to an additional group of computers. The Rapid7 Agent’s lightweight software deployment is easy on our infrastructure. It took me no time to deploy it confidently to all our endpoints.

With data effectively ingested, we prepared to turn our attention to threat detection. Traditional SIEMs we had explored left much of the detection content creation to us to configure and manage – significantly swelling the scope of deployment and day-to-day operations. However, Rapid7 comes with an expansive managed library of curated detections out of the box – eliminating the need for upfront customizing and configuring and giving us coverage immediately. The Rapid7 detections are vetted by their in-house MDR SOC, which means they don’t create too much noise, and I had to do little to no tuning so that they aligned with my environment.

3. Putting your data to work in your SIEM

For our resource-constrained team, ensuring that we had relevant dashboarding and reports to track critical systems, activity across our network, and support audits and regulatory requirements was always a big focus. From talking to my peers, we were weary of building dashboards that would require our team to take on complicated query writing to create sophisticated visuals and reports. The prebuilt dashboards included within InsightIDR were again a huge time-saver for our team and helped us mobilize around sophisticated security reporting out of the gate. For example, I am using InsightIDR’s Active Directory Admin Actions dashboard to identify:

  • What accounts were created in the past 24 hours?
  • What accounts were deleted in the past 24 hours?
  • What accounts changed their password?
  • Who was added as a domain administrator?

Because the dashboards are already built into the system, it takes me just a few minutes to see the information I need to see and export that data to an interactive HTML report I can provide to my stakeholders. When deploying your own SIEM, I recommend really digging into the visualization options, seeing what it will take to build your own cards, and exploring any available prebuilt content to understand how long it may take you to begin seeing actionable data.  

I now have knowledge about my environment. I know what happens. I know for sure that if there is anything malicious or suspicious in my environment, Rapid7, the Insight Agent, or any of the sources we have integrated to InsightIDR will catch it, and I can take action right away.

Additional reading:


Get the latest stories, expertise, and news about security today.

Cloudflare named a Leader in WAF by Forrester

Post Syndicated from Michael Tremante original

Cloudflare named a Leader in WAF by Forrester

Cloudflare named a Leader in WAF by Forrester

Forester has recognised Cloudflare as a Leader in The Forrester Wave™: Web Application Firewalls, Q3 2022 report. The report evaluated 12 Web Application Firewall (WAF) providers on 24 criteria across current offering, strategy and market presence.

You can register for a complimentary copy of the report here. The report helps security and risk professionals select the correct offering for their needs.

We believe this achievement, along with recent WAF developments, reinforces our commitment and continued investment in the Cloudflare Web Application Firewall (WAF), one of our core product offerings.

The WAF, along with our DDoS Mitigation and CDN services, has in fact been an offering since Cloudflare’s founding, and we could not think of a better time to receive this recognition: Birthday Week.

We’d also like to take this opportunity to thank Forrester.

Leading WAF in strategy

Cloudflare received the highest score of all assessed vendors in the strategy category. We also received the highest possible scores in 10 criteria, including:

  • Innovation
  • Management UI
  • Rule creation and modification
  • Log4Shell response
  • Incident investigation
  • Security operations feedback loops

According to Forrester, “Cloudflare Web Application Firewall shines in configuration and rule creation”, “Cloudflare stands out for its active online user community and its associated response time metrics”, and “Cloudflare is a top choice for those prioritizing usability and looking for a unified application security platform.”

Protecting web applications

The core value of any WAF is to keep web applications safe from external attacks by stopping any compromise attempt. Compromises can in fact lead to complete application take over and data exfiltration resulting in financial and reputational damage to the targeted organization.

The Log4Shell criterion in the Forrester Wave report is an excellent example of a real world use case to demonstrate this value.

Log4Shell was a high severity vulnerability discovered in December 2021 that affected the popular Apache Log4J software commonly used by applications to implement logging functionality. The vulnerability, when exploited, allows an attacker to perform remote code execution and consequently take over the target application.

Due to the popularity of this software component, many organizations worldwide were potentially at risk after the immediate public announcement of the vulnerability on December 9, 2021.

We believe that we scored the highest possible score in the Log4Shell criterion due to our fast response to the announcement, by ensuring that all customers using the Cloudflare WAF were protected against the exploit in less than 17 hours globally.

We did this by deploying new managed rules (virtual patching) that were made available to all customers. The rules were deployed with a block action ensuring exploit attempts never reached customer applications.

Additionally, our continuous public updates on the subject, including regarding internal processes, helped create clarity and understanding around the severity of the issue and remediation steps.

In the following weeks from the initial announcement, we updated WAF rules several times following discovery of multiple variations of the attack payloads.

The Cloudflare WAF ultimately “bought” valuable time for our customers to patch their back end systems before attackers may have been able to find and attempt compromise of vulnerable applications.

You can read about our response and our actions following the Log4Shell announcement in great detail on our blog.

Use the Cloudflare WAF today

Cloudflare WAF keeps organizations safer while they focus on improving their applications and APIs. We integrate leading application security capabilities into a single console to protect applications with our WAF while also securing APIs, stopping DDoS attacks, blocking unwanted bots, and monitoring for 3rd party JavaScript attacks.

To start using our Cloudflare WAF today, sign up for an account.

Wuyts: Why async Rust

Post Syndicated from corbet original

Yoshua Wuyts gives an overview of async
and why it is interesting.

Conversations around “why async” often focus on performance – a
topic which is highly dependent on workloads, and results with
people wholly talking past each other. While performance is not a
bad reason to choose async Rust, we often we only notice
performance when we experience a lack of it. So I want to instead
on which features async Rust provides which aren’t present in
non-async Rust.

Security updates for Tuesday

Post Syndicated from corbet original

Security updates have been issued by Debian (dovecot and firefox-esr), Fedora (firefox and grafana), Red Hat (firefox and thunderbird), Slackware (dnsmasq and vim), SUSE (dpdk, firefox, kernel, libarchive, libcaca, mariadb, openvswitch, opera, permissions, podofo, snakeyaml, sqlite3, unzip, and vsftpd), and Ubuntu (expat, libvpx, linux-azure-fde, linux-oracle, squid, squid3, and webkit2gtk).

The world’s leading venture capitalist firms commit $1.25 BILLION to back startups built on Cloudflare Workers

Post Syndicated from Mia Wang original

The world’s leading venture capitalist firms commit $1.25 BILLION to back startups built on Cloudflare Workers

The world’s leading venture capitalist firms commit $1.25 BILLION to back startups built on Cloudflare Workers

From our earliest days, Cloudflare has stood for helping build a better Internet that’s accessible to all. It’s core to our mission that anyone who wants to start building on the Internet should be able to do so easily, and without the barriers of prohibitively expensive or difficult to use infrastructure.

Nowhere is this philosophy more important – and more impactful to the Internet – than with our developer platform, Cloudflare Workers. Workers is, quite simply, where developers and entrepreneurs start on Day 1. It’s a full developer platform that includes cloud storage; website hosting; SQL databases; and of course, the industry’s leading serverless product. The platform’s ease-of-use and accessible pricing (all the way down to free) are critical in advancing our mission. For startups, this translates into fast, easy deployment and iteration, that scales seamlessly with predictable, transparent and cost-effective pricing. Building a great business from scratch is hard enough – we ought to know! – and so we’re aiming to take all the complexity out of your application infrastructure.

Announcing the Workers Launchpad funding program

Today, we’re taking things a step further and making it easier for startups to build the business of their dreams. We’re announcing a $1.25 billion Workers Launchpad funding program in partnership with some of the world’s leading venture capital firms. Any startup built on Workers can apply. As is the case with the Workers Platform itself, we’ve tried to make applying dead simple: it should take you less than five minutes to submit your application through the Workers Launchpad portal.

How does it work? The only requirement for being eligible for the funding program is that you’ve built your core infrastructure on Workers. If you’re new to Cloudflare and Cloudflare Workers, check out our Startup Plan to get started. We hope these resources will be helpful to all startups and help level the playing field, no matter where in the world you might be.

Once you submit your application, it will be reviewed by our Launchpad team, several of whom are former entrepreneurs and venture capital folks themselves. They’ll match promising applicants with our VC partners who have the most expertise in your space (more on them below). Every quarter, we’ll announce the winners of our Launchpad program. Winners, our “Workers Founders”, will be guaranteed the opportunity to pitch the VC partner(s) that we’ve determined would be a good match for your business. It’s a win-win all around. VCs get the opportunity to invest in businesses they know are being built on a forward-looking, world-class, development platform. Entrepreneurs get connected to world-class VCs. And for the first class of winners, we’ll have a few added perks that we describe in more detail below.

Who are the VCs that you might get a chance to pitch to?

When we approached our friends in the venture community with our vision for the Workers Launchpad, we received incredibly positive feedback and excitement. Many have seen firsthand the competitive advantages of building on Workers through their own portfolio companies. Moreover, Cloudflare is home to one of the largest developer communities on Earth with approximately 20% of the world’s websites on our network. As such, we can play a unique role in matching great entrepreneurs with great VCs to further not only the Workers platform, but also the Internet ecosystem, for everyone.

We’re honored to announce a world-class group of VC Launch Partners supporting this program and the ecosystem of Workers-based startups:

The world’s leading venture capitalist firms commit $1.25 BILLION to back startups built on Cloudflare Workers

More on why we’re doing this

So why are we doing this? The simple answer is we’re proud of our Workers Developer Platform and think that everyone should be using it. Entrepreneurs who develop on Workers can ship faster; more easily and cost-effectively; and in a way that future proofs your infrastructure:

Speed. Development velocity isn’t just a convenience for an entrepreneur. It’s a massive competitive advantage. In fact, development velocity is one of Cloudflare’s competitive advantages – we’re able to develop quickly because we build on Workers. When you develop on Workers, you don’t need to spend time configuring DNS records, maintaining certificates, scaling up clusters, or building complex deployment pipelines. Focus on developing your application, and Cloudflare will handle the rest.

Ease of use. Startup teams and founders are some of the busiest people on earth. You shouldn’t have to think about – or make complicated decisions about – IT infrastructure. Questions like: “Which availability zone should I choose?”, or “Will I be able to scale up my infrastructure in time for our next viral marketing campaign?” shouldn’t have to cross your mind! And on the Workers Platform, they don’t. The code you and your team writes automatically deploys quickly and consistently across Cloudflare’s global network in 275+ cities in over 100 countries. Cloudflare securely and scalably connects your users to your applications, regardless of where those applications are hosted or how many users suddenly sign up for your product. Developers can easily manage globally distributed applications with a programmable network that easily connects to whatever services they need to talk to.

Future-proofing your infrastructure and your wallet. Cloudflare’s massive global network – that’s distributed across 275+ cities in over 100 countries – is able to scale with your business, no matter how large it grows to become. We also help you remain compliant with local laws and regulations as you expand around the world, with capabilities like Workers’ Jurisdictional Restrictions for Durable Objects. You can sleep soundly at night instead of worrying about how to level up your infrastructure in the midst of shifting regulations, and equally importantly, knowing that you will not wake up to any surprise bills. Many of us have had the experience of being charged unexpected and / or exorbitant fees from our cloud providers. For example, providers will often make it easy and free to onboard your application or data, but charge exorbitant rates when you want to move them out (i.e. egress fees). Cloudflare will never charge for egress. Our pricing is simple, and we constantly aim to be the low-cost provider, no matter how large your business grows to be.

The world’s leading venture capitalist firms commit $1.25 BILLION to back startups built on Cloudflare Workers

Dogfooding our own product

We’re excited about Workers not only because we’ve built our own infrastructure on it, but also because we’re seeing the incredible things others have built on it. In fact, we acquired a company built entirely on Workers at the end of last year, an Isareli start-up named Zaraz, which secures and accelerates third party web tools. Workers allowed Zaraz to replace the multiple network requests of each tool running on a website with one single request, effectively streamlining a messy web of extensions into a single lightweight application. This acquisition opened our eyes to the power of the global community that’s built on our platform, and left us motivated to help startups built on Workers find the funding, mentorship, and support needed to grow.

But wait, there’s more!

To make it even easier for startups to take advantage of all the benefits that Workers has to offer, applicants to the Workers Launchpad program who have raised less than $3 million in total external funding will automatically have the option to receive Cloudflare’s Startup Plan. This plan includes all the elements of Cloudflare’s Pro and Business Plans ($2,400 annual value) plus higher tiers of our Stream video product, our Teams Zero Trust security suite and the Workers platform. To make sure the full range of our developer platform is accessible to startups, we recently more than tripled the number of products available in this plan, which now includes email security, R2, Pages, KV, and many others.

Furthermore, all startups that apply by October 31, 2022, will be eligible to be selected for the Winter 2022 class of Workers Founders, which will unlock additional support, mentorship, and marketing opportunities. Being selected as a Workers Founder will get you a chance to practice your pitch with investors, engage with leaders from Cloudflare, and get advice on how to build a successful business from topics like recruiting to marketing, sales, and beyond during a virtual Workers Founders Bootcamp Week. The program will culminate in a virtual Demo Day, so you can show the world what you’ve been building. We’re leaning in to help promising entrepreneurs join us in our mission to help build a better Internet.

Helping make the Internet better for all

Accessibility and ease of use are core to everything we do at Cloudflare. We will always make our products and platforms so easy to use that even the smallest business or hobbyist can easily use them. We hope the Workers Launchpad funding program encourages entrepreneurs from all around the world, and from all backgrounds, to start building on Workers, and makes it easier for you to find the funding you need to build the business of your dreams.

Head to the Workers Launchpad page to apply and join the Cloudflare Developer Discord to engage with the Workers community. If you’re a VC that is interested in supporting the program, reach out to Workers[email protected].

Cloudflare is not providing any funding or making any funding decisions, and there is no guarantee that any particular company will receive funding through the program. All funding decisions will be made by the venture capital firms that participate in the program. Cloudflare is not a registered broker-dealer, investment adviser, or other similar intermediary.

Introducing workerd: the Open Source Workers runtime

Post Syndicated from Kenton Varda original

Introducing workerd: the Open Source Workers runtime

Introducing workerd: the Open Source Workers runtime

Today I’m proud to introduce the first beta release of workerd, the JavaScript/Wasm runtime based on the same code that powers Cloudflare Workers. workerd is Open Source under the Apache License version 2.0.

workerd shares most of its code with the runtime that powers Cloudflare Workers, but with some changes designed to make it more portable to other environments. The name “workerd” (pronounced “worker dee”) comes from the Unix tradition of naming servers with a “-d” suffix standing for “daemon”. The name is not capitalized because it is a program name, which are traditionally lower-case in Unix-like environments.

What it’s for

Self-hosting Workers

workerd can be used to self-host applications that you’d otherwise run on Cloudflare Workers. It is intended to be a production-ready web server for this purpose. workerd has been designed to be unopinionated about hosting environments, so that it should fit nicely into whatever server/VM/container hosting and orchestration system you prefer. It’s just a web server.

Workers has always been based on standardized APIs, so that code is not locked into Cloudflare, and we work closely with other runtimes to promote compatibility. workerd provides another option to ensure that applications built on Workers can run anywhere, by leveraging the same underlying code to get exact, “bug-for-bug” compatibility.

Local development and testing

workerd is also designed to facilitate realistic local testing of Workers. Up until now, this has been achieved using Miniflare, which simulated the Workers API within a Node.js environment. Miniflare has worked well, but in a number of cases its behavior did not exactly match Workers running on Cloudflare. With the release of workerd, Miniflare and the Wrangler CLI tool will now be able to provide a more accurate simulation by leveraging the same runtime code we use in production.

Programmable proxies

workerd can act as an application host, a proxy, or both. It supports both forward and reverse proxy modes. In all cases, JavaScript code can be used to intercept and process requests and responses before forwarding them on. Traditional web servers and proxies have used bespoke configuration languages with quirks that are hard to master. Programming proxies in JavaScript instead provides more power while making the configuration easier to write and understand.

What it is

workerd is not just another way to run JavaScript and Wasm. Our runtime is uniquely designed in a number of ways.


Many non-browser JavaScript and Wasm runtimes are designed to be general-purpose: you can use them to build command-line apps, local GUI apps, servers, or anything in between. workerd is not. It specifically focuses on servers, in particular (for now, at least) HTTP servers.

This means in particular that workerd-based applications are event-driven at the top level. Applications do not open listen sockets and accept connections from them; instead, the runtime pushes events to the application. It may seem like a minor difference, but this basic change in perspective directly enables many of the features below.

Web standard APIs

Wherever possible, Workers (and workerd in particular) offers the same standard APIs found in web browsers, such as Fetch, URL, WebCrypto, and others. This means that code built on workerd is more likely to be portable to browsers as well as to other standards-based runtimes. When Workers launched five years ago, it was unusual for a non-browser to offer web APIs, but we are pleased to see that the broader JavaScript ecosystem is now converging on them.


workerd is a nanoservice runtime. What does that mean?

Microservices have become popular over the last decade as a way to split monolithic servers into smaller components that could be maintained and deployed independently. For example, a company that offers several web applications with a common user authentication flow might have a separate team that maintains the authentication logic. In a monolithic model, the authentication logic might have been offered to the application teams as a library. However, this could be frustrating for the maintainers of that logic, as making any change might require waiting for every application team to deploy an update to their respective server. By splitting the authentication logic into a separate server that all the others talk to, the authentication team is able to deploy changes on their own schedule.

However, microservices have a cost. What was previously a fast library call instead now requires communicating over a network. In addition to added overhead, this communication requires configuration and administration to ensure security and reliability. These costs become greater as the codebase is split into more and more services. Eventually, the costs outweigh the benefits.

Nanoservices are a new model that achieve the benefits of independent deployment with overhead closer to that of library calls. With workerd, many Workers can be configured to run in the same process. Each Worker runs in a separate “isolate”, which gives the appearance of running independently of the others: each isolate loads separate code and has its own global scope. However, when one Worker explicitly sends a request to another Worker, the destination Worker actually runs in the same thread with zero latency. So, it performs more like a function call.

With nanoservices, teams can now break their code into many more independently-deployed pieces without worrying about the overhead.

(Some in the industry prefer to call nanoservices “functions”, implying that each individual function making up an application could be its own service. I feel, however, that this puts too much emphasis on syntax rather than logical functionality. That said, it is the same concept.)

To really make nanoservices work well, we had to minimize the baseline overhead of each service. This required designing workerd very differently from most other runtimes, so that common resources could be shared between services as much as possible. First, as mentioned, we run many nanoservices within a single process, to share basic process overhead and minimize context switching costs. A second big architectural difference between workerd and other runtimes is how it handles built-in APIs. Many runtimes implement significant portions of their built-in APIs in JavaScript, which must then be loaded separately into each isolate. workerd does not; all the APIs are implemented in native code, so that all isolates may share the same copy of that code. These design choices would be difficult to retrofit into another runtime, and indeed these needs are exactly why we chose to build a custom runtime for Workers from the start.

Homogeneous deployment

In a typical microservices model, you might deploy different microservices to containers running across a cluster of machines, connected over a local network. You might manually choose how many containers to dedicate to each service, or you might configure some form of auto-scaling based on resource usage.

workerd offers an alternative model: Every machine runs every service.

workerd’s nanoservices are much lighter-weight than typical containers. As a result, it’s entirely reasonable to run a very large number of them – hundreds, maybe thousands – on a single server. This in turn means that you can simply deploy every service to every machine in your fleet.

Homogeneous deployment means that you don’t have to worry about scaling individual services. Instead, you can simply load balance requests across the entire cluster, and scale the cluster as needed. Overall, this can greatly reduce the amount of administration work needed.

Cloudflare itself has used the homogeneous model on our network since the beginning. Every one of Cloudflare’s edge servers runs our entire software stack, so any server can answer any kind of request on its own. We’ve found it works incredibly well. This is why services on Cloudflare – including ones that use Workers – are able to go from no traffic at all to millions of requests per second instantly without trouble.

Capability bindings: cleaner configuration and SSRF safety

workerd takes a different approach to most runtimes – indeed, to most software development platforms – in how an application accesses external resources.

Most development platforms start from assuming that the application can talk to the whole world. It is up to the application to figure out exactly what it wants to talk to, and name it in some global namespace, such as using a URL. So, an application server that wants to talk to the authentication microservice might use code like this:

// Traditional approach without capability bindings.
fetch("", {
  method: "POST",
  body: JSON.stringify(authRequest),
  headers: { "Authorization": env.AUTH_SERVICE_TOKEN }

In workerd, we do things differently. An application starts out with no ability to talk to the rest of the world, and must be configured with specific capability bindings that provide it access to specific external resources. So, an application which needs to be able to talk to the authentication service would be configured with a binding called authService, and the code would look something like this:

// Capability-based approach. Hostname doesn't matter; all
// requests to AUTH_SERVICE.fetch() go to the auth service.
env.AUTH_SERVICE.fetch("https://auth/api", {
 method: "POST",
 body: JSON.stringify(authRequest),

This may at first appear to be a trivial difference. In both cases, we have to use configuration to control access to external services. In the traditional approach, we’d provide access tokens (and probably the service’s hostname) as environment variables. In the new approach, the environment goes a bit further to provide a full-fledged object. Is this just syntax sugar?

It turns out, this slight change has huge advantages:

First, we can now restrict the global fetch() function to accept only publicly-routable URLs. This makes applications totally immune to SSRF attacks! You cannot trick an application into accessing an internal service unintentionally if the code to access internal services is explicitly different. (In fact, the global fetch() is itself backed by a binding, which can be configured. workerd defaults to connecting it to the public internet, but you can also override it to permit private addresses if you want, or to route to a specific proxy service, or to be blocked entirely.)

With that done, we now have an interesting property: All internal services which an application uses must be configurable. This means:

  • You can easily see a complete list of the internal services an application talks to, without reading all the code.
  • You can always replace these services with mocks for testing purposes.
  • You can always configure an application to authenticate itself differently (e.g. client certificates) or use a different back end, without changing code.

The receiving end of a binding benefits, too. Take the authentication service example, above. The auth service may be another Worker running in workerd as a nanoservice. In this case, the auth service does not need to be bound to any actual network address. Instead, it may be made available strictly to other Workers through their bindings. In this case, the authentication service doesn’t necessarily need to verify that a request received came from an allowed client – because only allowed clients are able to send requests to it in the first place.

Overall, capability bindings allow simpler code that is secure by default, more composable, easier to test, and easier to understand and maintain.

Always backwards compatible

Cloudflare Workers has a hard rule against ever breaking a live Worker running in production. This same dedication to backwards compatibility extends to workerd.

workerd shares Workers’ compatibility date system to manage breaking changes. Every Worker must be configured with a “compatibility date”. The runtime then ensures that the API behaves exactly as it did on that date. At your leisure, you may check the documentation to see if new breaking changes are introduced at a future date, and update your code for them. Most such changes are minor and most code won’t require any changes. However, you are never obliged to update. Old dates will continue to be supported by newer versions of workerd. It is always safe to update workerd itself without updating your code.

What it’s not

To avoid misleading or disappointing anyone, I need to take a moment to call out what workerd is not.

workerd is not a Secure Sandbox

It’s important to note that workerd is not, on its own, a secure way to run possibly-malicious code. If you wish to run code you don’t trust using workerd, you must enclose it in an additional sandboxing layer, such as a virtual machine configured for sandboxing.

workerd itself is designed such that a Worker should not be able to access any external resources to which it hasn’t been granted a capability. However, a complete sandbox solution not only must be designed to restrict access, but also must account for the possibility of bugs – both in software and in hardware. workerd on its own is not sufficient to protect against hardware bugs like Spectre, nor can it adequately defend against the possibility of vulnerabilities in V8 or in workerd’s own code.

The Cloudflare Workers service uses the same code found in workerd, but adds many additional layers of security on top to harden against such bugs. I described some of these in a past blog post. However, these measures are closely tied to our particular environment. For example, we rely on build automation to push V8 patches to production immediately upon becoming available; we separate customers according to risk profile; we rely on non-portable kernel features and assumptions about the host system to enforce security and resource limits. All of this is very specific to our environment, and cannot be packaged up in a reusable way.

workerd is not an independent project

workerd is the core of Cloudflare Workers, a fast-moving project developed by a dedicated team at Cloudflare. We are not throwing code over the wall to forget about, nor are we expecting volunteers to do our jobs for us. workerd’s GitHub repository will be the canonical source used by Cloudflare Workers and our team will be doing much of their work directly in this repository. Just like V8 is developed primarily by the Chrome team for use in Chrome, workerd will be developed primarily by the Cloudflare Workers team for use in Cloudflare Workers.

This means we cannot promise that external contributions will sit on a level playing field with internal ones. Code reviews take time, and work that is needed for Cloudflare Workers will take priority. We also cannot promise we will accept every feature contribution. Even if the code is already written, reviews and maintenance have a cost. Within Cloudflare, we have a product management team who carefully evaluates what features we should and shouldn’t offer, and plenty of ideas generated internally ultimately don’t make the cut.

If you want to contribute a big new feature to workerd, your best bet is to talk to us before you write code, by raising an issue on GitHub early to get input. That way, you can find out if we’re likely to accept a PR before you write it. We also might be able to give hints on how best to implement.

It’s also important to note that while workerd’s internal interfaces may sometimes appear clean and reusable, we cannot make any guarantee that those interfaces won’t completely change on a whim. If you are trying to build on top of workerd internals, you will need to be prepared either to accept a fair amount of churn, or pin to a specific version.

workerd is not an off-the-shelf edge compute platform

As hinted above, the full Cloudflare Workers service involves a lot of technology beyond workerd itself, including additional security, deployment mechanisms, orchestration, and so much more. workerd itself is a portion of our runtime codebase, which is itself a small (albeit critical) piece of the overall Cloudflare Workers service.

We are pleased, though, that this means it is possible for us to release this code under a permissive Open Source license.

Try the Beta

As of this blog post, workerd is in beta. If you want to try it out,

Build real-time video and audio apps on the world’s most interconnected network

Post Syndicated from Zaid Farooqui original

Build real-time video and audio apps on the world’s most interconnected network

Build real-time video and audio apps on the world’s most interconnected network

In the last two years, there has been a rapid rise in real-time apps that help groups of people get together virtually with near-zero latency. User expectations have also increased: your users expect real-time video and audio features to work flawlessly. We found that developers building real-time apps want to spend less time building and maintaining low-level infrastructure. Developers also told us they want to spend more time building features that truly make their idea special.

So today, we are announcing a new product that lets developers build real-time audio/video apps. Cloudflare Calls exposes a set of APIs that allows you to build things like:

  • A video conferencing app with a custom UI
  • An interactive conversation where the moderators can invite select audience members “on stage” as speakers
  • A privacy-first group workout app where only the instructor can view all the participants while the participants can only view the instructor
  • Remote ‘fireside chats’ where one or multiple people can have a video call with an audience of 10,000+ people in real time (<100ms delay)

The protocol that makes all these possible is WebRTC. And Cloudflare Calls is the product that abstracts away the complexity by turning the Cloudflare network into a “super peer,” helping you build reliable and secure real-time experiences.

What is WebRTC?

WebRTC is a peer-to-peer protocol that enables two or more users’ devices to talk to each other directly and without leaving the browser. In a native implementation, peer-to-peer typically works well for 1:1 calls with only two participants. But as you add additional participants, it is common for participants to experience reliability issues, including video freezes and participants getting out of sync. Why? Because as the number of participants increases, the coordination overhead between users’ devices also increases. Each participant needs to send media to each other participant, increasing the data consumption from each computer exponentially.

A selective forwarding unit (SFU) solves this problem. An SFU is a system that connects users with each other in real-time apps by intelligently managing and routing video and audio data between the participants. Apps that use an SFU reduce the data capacity required from each user because each user doesn’t have to send data to every other user. SFUs are required parts of a real-time application when the applications need to determine who is currently speaking or when they want to send appropriate resolution video when WebRTC simulcast is used.

Beyond SFUs

The centralized nature of an SFU is also its weakness. A centralized WebRTC server needs a region, which means that it will be slow in most parts of the world for most users while being fast for only a few select regions.

Typically, SFUs are built on public clouds. They consume a lot of bandwidth by both receiving and sending high resolution media to many devices. And they come with significant devops overhead requiring your team to manually configure regions and scalability.

We realized that merely offering an SFU-as-a-service wouldn’t solve the problem of cost and bandwidth efficiency.

Biggest WebRTC server in the world

When you are on a five-person video call powered by a classic WebRTC implementation, each person’s device talks directly with each other. In WebRTC parlance, each of the five participants is called a peer. And the reliability of the five-person call will only be as good as the reliability of the person (or peer) with the weakest Internet connection.

We built Calls with a simple premise: “What if Cloudflare could act as a WebRTC peer?”.  Calls is a “super peer” or a “giant server that spans the whole world” allows applications to be built beyond the limitations of the lowest common denominator peer or a centralized SFU. Developers can focus on the strength of their app instead of trying to compensate for the weaknesses of the weakest peer in a p2p topology.

Calls does not use the traditional SFU topology where every participant connects to a centralized server in a single location. Instead, each participant connects to their local Cloudflare data center. When another participant wants to retrieve that media, the datacenter that homes that original media stream is found and the tracks are forwarded between datacenters automatically. If two participants are physically close their media does not travel around the world to a centralized region, instead they use the same datacenter, greatly reducing latency and improving reliability.

Calls is a configurable, global, regionless WebRTC server that is the size of Cloudflare’s ever-growing network. The WebRTC protocol enables peers to send and receive media tracks. When you are on a video call, your computer is typically sending two tracks: one that contains the audio of you speaking and another that contains the video stream from your camera. Calls implements the WebRTC RTCPeerConnection API across the Cloudflare Network where users can push media tracks. Calls also exposes an API where other media tracks can be requested within the same Peer Connection context.

Build real-time video and audio apps on the world’s most interconnected network

Cloudflare Calls will be a good solution if you operate your own WebRTC server such as Janus or MediaSoup. Cloudflare Calls can also replace existing deployments of Janus or MediaSoup, especially in cases where you have clients connecting globally to a single, centralized deployment.

Region: Earth

Building and maintaining your own real-time infrastructure comes with unique architecture and scaling challenges. It requires you to answer and constantly revise your answers to thorny questions such as “which regions do we support?”, “how many users do we need to justify spinning up more infrastructure in yet another cloud region?”, “how do we scale for unplanned spikes in usage?” and “how do we not lose money during low-usage hours of our infrastructure?” when you run your own WebRTC server infrastructure.

Cloudflare Calls eliminates the need to answer these questions. Calls uses anycast for every connection, so every packet is always routed to the closest Cloudflare location. It is global by nature: your users are automatically served from a location close to them. Calls scales with your use and your team doesn’t have to build its own auto-scaling logic.

Calls runs on every Cloudflare location and every single Cloudflare server. Because the Cloudflare network is within 10 milliseconds of 90% of the world’s population, it does not add any noticeable latency.

Answer “where’s the problem?”, only faster

When we talk to customers with existing WebRTC workloads, there is one consistent theme: customers wish it was easier to troubleshoot issues. When a group of people are talking over a video call, the stakes are much higher when users experience issues. When a web page fails to load, it is common for users to simply retry after a few minutes. When a video call is disruptive, it is often the end of the call.

Cloudflare Calls’ focus on observability will help customers get to the bottom of the issues faster. Because Calls is built on Cloudflare’s infrastructure, we have end-to-end visibility from all layers of the OSI model.

Calls provides a server side view of the WebRTC Statistics API, so you can drill into issues each Peer Connection and the flow of media within without depending only on data sent from clients. We chose this because the Statistics API is a standardized place developers are used to getting information about their experience. It is the same API available in browsers, and you might already be using it today to gain insight into the performance of your WebRTC connections.

Privacy and security at the core

Calls eliminates the need for participants to share information such as their IP address with each other. Let’s say you are building an app that connects therapists and patients via video calls. With a traditional WebRTC implementation, both the patient and therapist’s devices would talk directly with each other, leading to exposure of potentially sensitive data such as the IP address. Exposure of information such as the IP address can leave your users vulnerable to denial-of-service attacks.

When using Calls, you are still using WebRTC, but the individual participants are connecting to the Cloudflare network. If four people are on a video call powered by Cloudflare Calls, each of the four participants’ devices will be talking only with the Cloudflare network. To your end users, the experience will feel just like a peer-to-peer call, only with added security and privacy upside.

Finally, all video and audio traffic that passes through Cloudflare Calls is encrypted by default. Calls leverages existing Cloudflare products including Argo to route the video and audio content in a secure and efficient manner. Calls API enables granular controls that cannot be implemented with vanilla WebRTC alone. When you build using Calls, you are only limited by your imagination; not the technology.

What’s next

We’re releasing Cloudflare Calls in closed beta today. To try out Cloudflare Calls, request an invitation and check your inbox in coming weeks.
Calls will be free during the beta period. We’re looking to work with early customers who want to take Calls from beta to generally available with us. If you are building a real-time video app today, having challenges scaling traditional WebRTC infrastructure, or just have a great idea you want to explore, leave a comment when you are requesting an invitation, and we’ll reach out.

Dynamic URL redirects: 301 to the future

Post Syndicated from Sam Marsh original

Dynamic URL redirects: 301 to the future

Dynamic URL redirects: 301 to the future

The Internet is a dynamic place. Websites are constantly changing as technologies and business practices evolve. What was front-page news is quickly moved into a sub-directory. To ensure website visitors continue to see the correct webpage even if it has been moved, administrators often implement URL redirects.

A URL redirect is a mapping from one location on the Internet to another, effectively telling the visitor’s browser that the location of the page has changed, and where they can now find it. This is achieved by providing a virtual ‘link’ between the content’s original and new location.

URL Redirects have typically been implemented as Page Rules within Cloudflare, however Page Rules only match on the URL, rather than other elements such as the visitor’s source country or preferred language. This limitation meant customers with a need for more dynamic URL redirects had to implement alternative solutions such Cloudflare Workers to achieve their goals.

To simplify the management of these more complex use cases we have created Dynamic Redirects. With Dynamic Redirects, users can redirect visitors to another webpage or website based upon hundreds of options such as the visitor’s country of origin or language, without having to write a single line of code.

More than a URL

For nine years users were limited to 125 URL redirects per zone. This limitation meant those with a need for more URL redirects had to implement alternative solutions such Cloudflare Workers to achieve their goals.

In December 2021, we launched Bulk Redirects, allowing up to 100,000 URL redirects per account at the time. In April 2022 we increased this maximum number to over six million URL redirects per account. However, there is still a gap in the ‘URL redirect’ product unfulfilled. Until now.

Bulk Redirects, much like the ‘Forwarding URL’ Page Rule, are prescriptive URL redirects. You tell us what URL to look for, and where to redirect the user to when they visit it. We can support this use case at a huge scale.

If a visitor asks for.. Redirect them to…

That’s a simple concept to understand, however user needs have evolved. What if a user wanted to redirect visitors to a localized version of the requested page based on their preferred language? What if a user wanted to redirect visitors to their local subsidiary on the website? Or direct them to an optimized site when they visit from a mobile device? Suddenly, this well understood concept doesn’t work – and they have to deploy code in Workers to solve what is actually a common problem. And common problems deserve to be productized.

This is where Dynamic Redirects can help. The new product provides the same consistent user interface as Transform Rules, Custom Rules, Bulk Redirects, etc. and provides a new action allowing for the target URL to be dynamically created, much like the dynamic rewrite action offered in Transform Rules.

This dynamic action frees the user from having to define explicitly what the target URL should look like, and instead provides them with a full gamut of fields and functions to custom generate the target URL based upon the parameters of the request. For example, rather than redirecting all traffic for to, users can conceptually redirect the traffic to{PREFERRED_LANGUAGE}/shop (not actual syntax!). With this, traffic from a browser with a preferred language of French will be redirected to, likewise those with a preferred language of German will be redirected to

Dynamic URL redirects: 301 to the future

The other big difference between Dynamic Redirects and  Page Rules is in the filtering. Page Rules are limited to filtering on a URL, or a URL with asterisks as wildcards. Dynamic Redirects is built atop our lightning-fast Rulesets Engine, which also runs products such as Transform Rules, Custom Rules (WAF), Bulk Redirects and API Shield.

Due to this, Dynamic Redirects offers almost the entire suite of Ruleset Engine fields for use in filtering; from http.request.full_uri for the whole URL, to (where is the visitor located) and http.request.accepted_languages[] (the language preferred by the visitor). The possibilities are endless.

Users can also now use logical operators such as ‘OR’. Where previously, if a user wanted to redirect five distinct URLs to the same URL they would need to deploy five Page Rules. Today, they can simply use an ‘OR’ to consolidate this use case into just one Dynamic Redirect rule:

# URL Destination URL
1 https:/
2 https:/
3 https:/
4 https:/
5 https:/

# Expression Destination URL
1 (http.request.full_uri eq “https:/”) or (http.request.full_uri eq “https:/”) or (http.request.full_uri eq “https:/”) or (http.request.full_uri eq “https:/”) or (http.request.full_uri eq “https:/”)

We can further simplify this use case in the future by adding hostname lists, allowing users to add URLs to a list and reference it from within the rule expression, similar to IP Lists. This allows an expression like (http.request.full_uri in $vanity_urls), for example.

A dedicated quota, just for U(RL)

Page Rules are used to set everything from configuration and caching behaviour to header modification and also URL redirection (Forwarding URL). This means that users tend to run out of available rules quickly.

To address this, we’re matching the Page Rule quota in each of the four new products that are being announced today. This means where in Page Rules an Enterprise customer would get 125 Page Rules to share amongst the aforementioned functions, in Dynamic Redirects they have 125 rules just for redirects. This number can be increased for Enterprise customers, also.

We’re also increasing the quota for free plans; where today free plans get three Page Rules, they will now get 10 rules for dynamic redirects, along with each of the other three products (cache, origin, config rules). That’s a net increase of 37 additional rules!

Plan Page Rules Dynamic Redirects
Enterprise 125 125+
Business 50 50
Pro 20 25
Free 3 10

Users can now get more out of their Cloudflare setup, being more specific about when traffic is redirected, optimizing cache and adjusting which settings are and aren’t applied – without having to trade off between these areas due to a limit in rules quota.

Localized redirects

One of the examples covered earlier is being able to redirect visitors to localized content depending on their preferred language.

This can be done by analyzing the ‘Accept-Language’ header sent by the browser, which is stored as an array in the field http.request.accepted_languages[]. This field is an array of the values received by Cloudflare within the Accept-Language HTTP request header, sorted in descending weight order. This header is calculated based on the preferences set by the visitor in the ‘Language’ section of their web browser.

For example, if the visitors browser sends an ‘Accept-Language’ header containing “Accept-Language: fr-CH, fr;q=0.8, en;q=0.9, de;q=0.7, *;q=0.5”, then the field http.request.accepted_languages[0] would contain “en”, with http.request.accepted_languages[1] containing “fr” and so forth.

With this information, we can create a dynamic redirect using the action:

Dynamic URL redirects: 301 to the future

With this rule in place, traffic from visitors with a preferred language of English (en) will be redirected to The rule can be duplicated for other languages also, ensuring those with a preferred language of French (fr) will be redirected to

There are so many use cases for Dynamic Redirects we couldn’t fit them all in this blog.

Other possible use cases include mobile redirects, redirects based on cookies, redirects to different endpoints based on request headers for live testing. The potential list is huge, and we can’t wait to hear what you come up with. Try it today!

Build your next startup on Cloudflare with our comprehensive Startup Plan, v2.0

Post Syndicated from Jade Q. Wang original

Build your next startup on Cloudflare with our comprehensive Startup Plan, v2.0

Build your next startup on Cloudflare with our comprehensive Startup Plan, v2.0

Starting a business is hard. And we know that the first few years of your business are crucial to your success.

Cloudflare’s Startup Plan is here to help.

Last year, we piloted a program to a select group of startups for free, with a selection of products that are very high leverage for young startups, early in their product development, like Workers, Stream, and Zero Trust.

Over the past year, startup founders repeatedly wrote into [email protected], and most of these emails followed one of 2 patterns:

  1. A startup would like to request additional products that are not a part of the startup plan, often Workers KV, Pages, Cloudflare for SaaS, R2, Argo, etc.
  2. A startup that is not a part of any accelerator program but would like to get on the startup plan.

Based on this feedback, we are thrilled to announce that today we will be increasing the scope of the program to also include popularly requested products! Beyond that, we’re also super excited to be broadening the eligibility criteria, so more startups can qualify for the plan.

What does the Cloudflare Startup Plan include?

There’s a lot of additional value that’s in the latest version of the plan. Here’s how things have evolved from v1.0:

Product Description v1.0 v2.0*
Content Delivery Network (CDN) Cloudflare’s CDN is a distributed network of servers that provides several advantages for a website. By caching website content, Cloudflare helps improve page load speeds, reduce bandwidth usage, and reduce CPU usage on the server.
Web Application Firewall (WAF) Cloudflare WAF protects a customer’s Internet properties from common vulnerabilities like SQL injection attacks, cross-site scripting, and cross-site forgery requests, with no changes to its existing infrastructure.
DNS Cloudflare DNS is an enterprise-grade authoritative DNS service that offers the fastest response time, unparalleled redundancy, and advanced security with built-in DDoS mitigation and DNSSEC.
Advanced DDoS Protection Cloudflare DDoS protection secures websites, applications, and entire networks while ensuring the performance of legitimate traffic is not compromised.
Page Rules Page Rules allows you to conditionally change zone settings such as security level or rocket loader settings. It is also how cache settings are configured for traffic passing through the zone.
Workers Cloudflare Workers allows developers to augment existing applications or create entirely new ones through a lightweight execution environment without configuring or maintaining infrastructure.
Stream Cloudflare Stream provides out-of-the-box video infrastructure that handles storage, encoding, and delivery. Using the Stream API, customers can build video functionality affordably and with minimal engineering effort.
Zero Trust Cloudflare Zero Trust replaces legacy security perimeters with our global edge, making the Internet faster and safer for teams around the world.
R2 Cloudflare R2 Storage allows developers to store large amounts of unstructured data without the costly egress bandwidth fees associated with typical cloud storage services.

Pages Cloudflare Pages allows frontend developers to quickly and easily build, collaborate on, and deploy websites.
Workers KV Workers KV is a global, low-latency, key-value data store. It supports high read volumes with low-latency, making it possible to build highly dynamic APIs and websites which respond as quickly as a cached static file would.
Workers Unbound Workers Unbound is a serverless platform for compute-intensive workloads, supporting up to 30s CPU time (GA), and 15 minute cron triggers (beta).
Cloudflare for SaaS Offers the benefits of Cloudflare’s SSL management to SaaS company’s customers. Allows SaaS customers to use their custom vanity domains while maintaining a branded customer experience, improved SEO rankings, and all the perks of dedicated certificates.

Advanced Certificate Manager (ACM) Advanced Certificate Manager is a flexible and customizable way to issue and manage certificates in Cloudflare. Customers use Advanced Certificate Manager when they want something more customizable than Universal SSL but still want the convenience of SSL certificate issuance and renewal.

Image Resizing Resize images for a variety of device types and connections from a single-source master image. Images can be manipulated by dimensions, compression ratios, and format (WebP conversion where supported). It is a feature of Speed→Optimization on the Dashboard.
Images Cloudflare Images lets you set up an image pipeline in minutes. Build a scalable image pipeline to store, resize, optimize and deliver images in a fast and secure manner.
Argo Smart Routing Argo Smart Routing improves Internet performance by intelligently routing end users through less congested and more reliable paths over the Internet using our network.
Spectrum DDoS protection, firewalling features and performance benefits for any TCP or UDP based application (such as gaming, FTP, email etc.)
Load Balancing Improve application performance and availability by steering traffic away from unhealthy origin servers and dynamically distributing it to the most available and responsive server pools.
Rate Limiting Protect your site or API from malicious traffic by blocking client IP addresses that hit a URL pattern and exceed a threshold you define.
Waiting Room Cloudflare Waiting Room allows organizations to route excess users to a custom-branded waiting room, helping preserve customer experience and protect origin servers from being overwhelmed with requests.
Area 1 Email Security Comprehensive email security solution to eliminate the risk from phishing, which is the root cause of damage in 95% of all cybersecurity incidents. Configuration with Google and Microsoft in 5 minutes.

*Product-specific usage limits apply

With the expanded product basket, the Startup Plan v2.0 provides four times the value of the original program. As an added bonus when new Cloudflare products are in beta, participants on the Startup Plan are welcome to share their use case and potentially receive early access.

How to join the Cloudflare Startup Program

Eligibility criteria for startups new to the program

To be eligible for the Startup Plan, your startup must be at an early stage (i.e. raised less than $3 million) and enrolled or recently graduated from a participating accelerator program.

If you are a founder in a participating accelerator program, navigate to the Cloudflare perk from your program’s vendor perk page and follow the instructions there. Those instructions will lead you to this plan page where you can select your program from the list.

If your accelerator program is not on the list, please email us at [email protected] and include the details of your perk manager. We’ll get in touch with them directly.

What if I’m already on the V1 Startup Plan? Can I get access to the products on the V2 plan?

Yes! Startups currently on the Startup Plan will automatically have the additional products added to your accounts in the coming weeks, no action required. This upgrade will apply for the remainder of the year that your company is on the Startup Plan.

If you have any questions about making the most of your plan, drop a line to [email protected] with any questions.

Special Eligibility: Referral Program, Alumni Program, Workers Launchpad

Over our years at Cloudflare, we have had the privilege of working with some remarkable technologists who have gone on to work on extraordinary projects after their time at Cloudflare. Moreover, there are startup founders who interface with teams at Cloudflare, whether through participation in the Cloudflare Community or because they were former coworkers elsewhere.

These startups are now also eligible:

  • Workers Launchpad funding program: If you are building on Cloudflare Workers, simultaneously apply for the Workers Launchpad funding program here, which will make you eligible for introductions to leading investors, additional mentorship opportunities, and other perks.
  • Referral Program: “Referred by (include details in comments)” is now an option in the Select Accelerator dropdown menu on the Startup Plan landing page. Please include the email address of your contact at Cloudflare in the comments field. Your account will be upgraded once your contact validates the referral.
  • Alumni Program: “Cloudflare Alumni” is now also an option in the Select Accelerator dropdown menu.

Questions about the program? Drop us a line at [email protected], and we’ll be in touch.

D1: our quest to simplify databases

Post Syndicated from Nevi Shah original

D1: our quest to simplify databases

D1: our quest to simplify databases

When we announced D1 in May of this year, we knew it would be the start of something new – our first SQL database with Cloudflare Workers. Prior to D1 we’ve announced storage options like KV (key-value store), Durable Objects (single location, strongly consistent data storage) and R2 (blob storage). But the question always remained “How can I store and query relational data without latency concerns and an easy API?”

The long awaited “Cloudflare Database” was the true missing piece to build your application entirely on Cloudflare’s global network, going from a blank canvas in VSCode to a full stack application in seconds. Compatible with the popular SQLite API, D1 empowers developers to build out their databases without getting bogged down by complexity and having to manage every underlying layer.

Since our launch announcement in May and private beta in June, we’ve made great strides in building out our vision of a serverless database. With D1 still in private beta but an open beta on the horizon, we’re excited to show and tell our journey of building D1 and what’s to come.

The D1 Experience

We knew from Cloudflare Workers feedback that using Wrangler as the mechanism to create and deploy applications is loved and preferred by many. That’s why when Wrangler 2.0 was announced this past May alongside D1, we took advantage of the new and improved CLI for every part of the experience from data creation to every update and iteration. Let’s take a quick look on how to get set up in a few easy steps.

Create your database

With the latest version of Wrangler installed, you can create an initialized empty database with a quick

npx wrangler d1 create my_database_name

To get your database up and running! Now it’s time to add your data.

Bootstrap it

It wouldn’t be the “Cloudflare way” if you had to sit through an agonizingly long process to get set up. So we made it easy and painless to bring your existing data from an old database and bootstrap your new D1 database.  You can run

wrangler d1 execute my_database-name --file ./filename.sql

and pass through an existing SQLite .sql file of your choice. Your database is now ready for action.

Develop & Test Locally

With all the improvements we’ve made to Wrangler since version 2 launched a few months ago, we’re pleased to report that D1 has full remote & local wrangler dev support:

D1: our quest to simplify databases

When running wrangler dev -–local -–persist, an SQLite file will be created inside .wrangler/state. You can then use a local GUI program for managing it, like SQLiteFlow ( or Beekeeper (

Or you can simply use SQLite directly with the SQLite command line by running sqlite3 .wrangler/state/d1/DB.sqlite3:

D1: our quest to simplify databases

Automatic backups & one-click restore

No matter how much you test your changes, sometimes things don’t always go according to plan. But with Wrangler you can create a backup of your data, view your list of backups or restore your database from an existing backup. In fact, during the beta, we’re taking backups of your data every hour automatically and storing them in R2, so you will have the option to rollback if needed.

D1: our quest to simplify databases

And the best part – if you want to use a production snapshot for local development or to reproduce a bug, simply copy it into the .wrangler/state directory and wrangler dev –-local –-persist will pick it up!

Let’s download a D1 backup to our local disk. It’s SQLite compatible.

D1: our quest to simplify databases

Now let’s run our D1 worker locally, from the backup.

D1: our quest to simplify databases

Create and Manage from the dashboard

However, we realize that CLIs are not everyone’s jam. In fact, we believe databases should be accessible to every kind of developer – even those without much database experience! D1 is available right from the Cloudflare dashboard giving you near total command parity with Wrangler in just a few clicks. Bootstrapping your database, creating tables, updating your database, viewing tables and triggering backups are all accessible right at your fingertips.

D1: our quest to simplify databases

Changes made in the UI are instantly available to your Worker — no deploy required!

We’ve told you about some of the improvements we’ve landed since we first announced D1, but as always, we also wanted to give you a small taste (with some technical details) of what’s ahead. One really important functionality of a database is transactions — something D1 wouldn’t be complete without.

Sneak peek: how we’re bringing JavaScript transactions to D1

With D1, we strive to present a dramatically simplified interface to creating and querying relational data, which for the most part is a good thing. But simplification occasionally introduces drawbacks, where a use-case is no longer easily supported without introducing some new concepts. D1 transactions are one example.

Transactions are a unique challenge

You don’t need to specify where a Cloudflare Worker or a D1 database run—they simply run everywhere they need to. For Workers, that is as close as possible to the users that are hitting your site right this second. For D1 today, we don’t try to run a copy in every location worldwide, but dynamically manage the number and location of read-only replicas based on how many queries your database is getting, and from where. However, for queries that make changes to a database (which we generally call “writes” for short), they all have to travel back to the single Primary D1 instance to do their work, to ensure consistency.

But what if you need to do a series of updates at once? While you can send multiple SQL queries with .batch() (which does in fact use database transactions under the hood), it’s likely that, at some point, you’ll want to interleave database queries & JS code in a single unit of work.

This is exactly what database transactions were invented for, but if you try running BEGIN TRANSACTION in D1 you’ll get an error. Let’s talk about why that is.

Why native transactions don’t work
The problem arises from SQL statements and JavaScript code running in dramatically different places—your SQL executes inside your D1 database (primary for writes, nearest replica for reads), but your Worker is running near the user, which might be on the other side of the world. And because D1 is built on SQLite, only one write transaction can be open at once. Meaning that, if we permitted BEGIN TRANSACTION, any one Worker request, anywhere in the world, could effectively block your whole database! This is a quite dangerous thing to allow:

  • A Worker could start a transaction then crash due to a software bug, without calling ROLLBACK. The primary would be blocked, waiting for more commands from a Worker that would never come (until, probably, some timeout).
  • Even without bugs or crashes, transactions that require multiple round-trips between JavaScript and SQL could end up blocking your whole system for multiple seconds, dramatically limiting how high an application built with Workers & D1 could scale.

But allowing a developer to define transactions that mix both SQL and JavaScript makes building applications with Workers & D1 so much more flexible and powerful. We need a new solution (or, in our case, a new version of an old solution).

A way forward: stored procedures
Stored procedures are snippets of code that are uploaded to the database, to be executed directly next to the data. Which, at first blush, sounds exactly like what we want.

However, in practice, stored procedures in traditional databases are notoriously frustrating to work with, as anyone who’s developed a system making heavy use of them will tell you:

  • They’re often written in a different language to the rest of your application. They’re usually written in (a specific dialect of) SQL or an embedded language like Tcl/Perl/Python. And while it’s technically possible to write them in JavaScript (using an embedded V8 engine), they run in such a different environment to your application code it still requires significant context-switching to maintain them.
  • Having both application code and in-database code affects every part of the development lifecycle, from authoring, testing, deployment, rollbacks and debugging. But because stored procedures are usually introduced to solve a specific problem, not as a general purpose application layer, they’re often managed completely manually. You can end up with them being written once, added to the database, then never changed for fear of breaking something.

With D1, we can do better.

The point of a stored procedure was to execute directly next to the data—uploading the code and executing it inside the database was simply a means to that end. But we’re using Workers, a global JavaScript execution platform, can we use them to solve this problem?

It turns out, absolutely! But here we have a few options of exactly how to make it work, and we’re working with our private beta users to find the right API. In this section, I’d like to share with you our current leading proposal, and invite you all to give us your feedback.

When you connect a Worker project to a D1 database, you add the section like the following to your wrangler.toml:

[[ d1_databases ]]
# What binding name to use (e.g. env.DB):
binding = "DB"
# The name of the DB (used for wrangler d1 commands):
database_name = "my-d1-database"
# The D1's ID for deployment:
database_id = "48a4224e-...3b09"
# Which D1 to use for `wrangler dev`:
# (can be the same as the previous line)
preview_database_id = "48a4224e-...3b09"

# NEW: adding "procedures", pointing to a new JS file:
procedures = "./src/db/procedures.js"

That D1 Procedures file would contain the following (note the new db.transaction() API, that is only available within a file like this):

export default class Procedures {
  constructor(db, env, ctx) {
    this.db = db

  // any methods you define here are available on env.DB.Procedures
  // inside your Worker
  async Checkout(cartId: number) {
    // Inside a Procedure, we have a new db.transaction() API
    const result = await this.db.transaction(async (txn) => {
      // Transaction has begun: we know the user can't add anything to
      // their cart while these actions are in progress.
      const [cart, user] = Helpers.loadCartAndUser(cartId)

      // We can update the DB first, knowing that if any of the later steps
      // fail, all these changes will be undone.
      await this.db
        .prepare(`UPDATE cart SET status = ?1 WHERE cart_id = ?2`)
        .bind('purchased', cartId)
      const newBalance = user.balance - cart.total_cost
      await this.db
        .prepare(`UPDATE user SET balance = ?1 WHERE user_id = ?2`)
        // Note: the DB may have a CHECK to guarantee 'user.balance' can not
        // be negative. In that case, this statement may fail, an exception
        // will be thrown, and the transaction will be rolled back.
        .bind(newBalance, cart.user_id)

      // Once all the DB changes have been applied, attempt the payment:
      const { ok, details } = await PaymentAPI.processPayment(
      if (!ok) {
        // If we throw an Exception, the transaction will be rolled back
        // and result.error will be populated:
        // throw new PaymentFailedError(details)
        // Alternatively, we can do both of those steps explicitly
        await txn.rollback()
        // The transaction is rolled back, our DB is now as it was when we
        // started. We can either move on and try something new, or just exit.
        return { error: new PaymentFailedError(details) }

      // This is implicitly called when the .transaction() block finishes,
      // but you can explicitly call it too (potentially committing multiple
      // times in a single db.transaction() block).
      await txn.commit()

      // Anything we return here will be returned by the 
      // db.transaction() block
      return {
        amount_charged: cart.total_cost,
        remaining_balance: newBalance,

    if (result.error) {
      // Our db.transaction block returned an error or threw an exception.

    // We're still in the Procedure, but the Transaction is complete and
    // the DB is available for other writes. We can either do more work
    // here (start another transaction?) or return a response to our Worker.
    return result

And in your Worker, your DB binding now has a “Procedures” property with your function names available:

const { error, amount_charged, remaining_balance } =
  await env.DB.Procedures.Checkout(params.cartId)

if (error) {
  // Something went wrong, `error` has details
} else {
  // Display `amount_charged` and `remaining_balance` to the user.

Multiple Procedures can be triggered at one time, but only one db.transaction() function can be active at once: any other write queries or other transaction blocks will be queued, but all read queries will continue to hit local replicas and run as normal. This API gives you the ability to ensure consistency when it’s essential but with the minimal impact on total overall performance worldwide.

Request for feedback

As with all our products, feedback from our users drives the roadmap and development. While the D1 API is in beta testing today, we’re still seeking feedback on the specifics. However, we’re pleased that it solves both the problems with transactions that are specific to D1 and the problems with stored procedures described earlier:

  • Code is executing as close as possible to the database, removing network latency while a transaction is open.
  • Any exceptions or cancellations of a transaction cause an instant rollback—there is no way to accidentally leave one open and block the whole D1 instance.
  • The code is in the same language as the rest of your Worker code, in the exact same dialect (e.g. same TypeScript config as it’s part of the same build).
  • It’s deployed seamlessly as part of your Worker. If two Workers bind to the same D1 instance but define different procedures, they’ll only see their own code. If you want to share code between projects or databases, extract a library as you would with any other shared code.
  • In local development and test, the procedure works just like it does in production, but without the network call, allowing seamless testing and debugging as if it was a local function.
  • Because procedures and the Worker that define them are treated as a single unit, rolling back to an earlier version never causes a skew between the code in the database and the code in the Worker.

The D1 ecosystem: contributions from the community

We’ve told you about what we’ve been up to and what’s ahead, but one of the unique things about this project is all the contributions from our users. One of our favorite parts of private betas is not only getting feedback and feature requests, but also seeing what ideas and projects come to fruition. While sometimes this means personal projects, with D1, we’re seeing some incredible contributions to the D1 ecosystem. Needless to say, the work on D1 hasn’t just been coming from within the D1 team, but also from the wider community and other developers at Cloudflare. Users have been showing off their D1 additions within our Discord private beta channel and giving others the opportunity to use them as well. We wanted to take a moment to highlight them.


Dealing with raw SQL syntax is powerful (and using the D1 .bind() API, safe against SQL injections) but it can be a little clumsy. On the other hand, most existing query builders assume direct access to the underlying DB, and so aren’t suitable to use with D1. So Cloudflare developer Gabriel Massadas designed a small, zero-dependency query builder called workers-qb:

import { D1QB } from 'workers-qb'
const qb = new D1QB(env.DB)

const fetched = await qb.fetchOne({
    tableName: "employees",
    fields: "count(*) as count",
    where: {
      conditions: "active = ?1",
      params: [true]

Check out the project homepage for more information:

D1 console

While you can interact with D1 through both Wrangler and the dashboard, Cloudflare Community champion, Isaac McFadyen created the very first D1 console where you can quickly execute a series of queries right through your terminal. With the D1 console, you don’t need to spend time writing the various Wrangler commands we’ve created – just execute your queries.

This includes all bells and whistles you would expect from a modern database console including multiline input, command history, validation for things D1 may not yet support, and ability to save your Cloudflare credentials for later use.

Check out the full project on GitHub or NPM for more information.

Miniflare test Integration

The Miniflare project, which powers Wrangler’s local development experience, also provides fully-fledged test environments for popular JavaScript test runners, Jest and Vitest. With this comes the concept of Isolated Storage, allowing each test to run independently, so that changes made in one don’t affect the others. Brendan Coll, creator of Miniflare, guided the D1 test implementation to give the same benefits:

import Worker from ‘../src/index.ts’
const { DB } = getMiniflareBindings();

beforeAll(async () => {
  // Your D1 starts completely empty, so first you must create tables
  // or restore from a schema.sql file.
  await DB.exec(`CREATE TABLE entries (id INTEGER PRIMARY KEY, value TEXT)`);

// Each describe block & each test gets its own view of the data.
describe(‘with an empty DB’, () => {
  it(‘should report 0 entries’, async () => {
    await Worker.fetch(...)
  it(‘should allow new entries’, async () => {
    await Worker.fetch(...)

// Use beforeAll & beforeEach inside describe blocks to set up particular DB states for a set of tests
describe(‘with two entries in the DB’, () => {
  beforeEach(async () => {
    await DB.prepare(`INSERT INTO entries (value) VALUES (?), (?)`)
            .bind(‘aaa’, ‘bbb’)
  // Now, all tests will run with a DB with those two values
  it(‘should report 2 entries’, async () => {
    await Worker.fetch(...)
  it(‘should not allow duplicate entries’, async () => {
    await Worker.fetch(...)

All the databases for tests are run in-memory, so these are lightning fast. And fast, reliable testing is a big part of building maintainable real-world apps, so we’re thrilled to extend that to D1.

Want access to the private beta?

Feeling inspired?

We love to see what our beta users build or want to build especially when our products are at an early stage. As we march toward an open beta, we’ll be looking specifically for your feedback. We are slowly letting more folks into the beta, but if you haven’t received your “golden ticket” yet with access, sign up here! Once you’ve been invited in, you’ll receive an official welcome email.

As always, happy building!

Introducing Cache Rules: precision caching at your fingertips

Post Syndicated from Alex Krivit original

Introducing Cache Rules: precision caching at your fingertips

Introducing Cache Rules: precision caching at your fingertips

Ten years ago, in 2012, we released a product that put “a powerful new set of tools” in the hands of Cloudflare customers, allowing website owners to control how Cloudflare would cache, apply security controls, manipulate headers, implement redirects, and more on any page of their website. This product is called Page Rules and since its introduction, it has grown substantially in terms of popularity and functionality.

Page Rules are a common choice for customers that want to have fine-grained control over how Cloudflare should cache their content. There are more than 3.5 million caching Page Rules currently deployed that help websites customize their content. We have spent the last ten years learning how customers use those rules to cache content, and it’s clear the time is ripe for evolving rules-based caching on Cloudflare. This evolution will allow for greater flexibility in caching different types of content through additional rule configurability, while providing more visibility into when and how different rules interact across Cloudflare’s ecosystem.

Today, we’ve announced that Page Rules will be re-imagined into four product-specific rule sets: Origin Rules, Cache Rules, Configuration Rules, and Redirect Rules.

In this blog we’re going to discuss Cache Rules, and how we’re applying ten years of product iteration and learning from Page Rules to give you the tools and options to best optimize your cache.

Activating Page Rules, then and now

Adding a Page Rule is very simple: users either make an API call or navigate to the dashboard, enter a full or wildcard URL pattern (e.g. or*), and tell us which actions to perform when we see that pattern. For example a Page Rule could tell browsers– keep a copy of the response longer via “Browser Cache TTL”, or tell our cache that via “Edge Cache TTL”. Low effort, high impact. All this is accomplished without fighting origin configuration or writing a single line of code.

Under the hood, a lot is happening to make that rule scale: we turn every rule condition into regexes, matching them against the tens of millions of requests per second across 275+ data centers globally. The compute necessary to process and apply new values on the fly across the globe is immense and corresponds directly to the number of rules we are able to offer to users. By moving cache actions from Page Rules to Cache Rules we can allow for users to not only set more rules, but also to trigger these rules more precisely.

More than a URL

Users of Page Rules are limited to specific URLs or URL patterns to define how browsers or Cloudflare cache their websites files. Cache Rules allows users to set caching behavior on additional criteria, such as the HTTP request headers or the requested file type. Users can continue to match on the requested URL also, as used in our Page Rules example earlier. With Cache Rules, users can now define this behavior on one or more fields available.

For example, if a user wanted to specify cache behavior for all image/png content-types, it’s now as simple as pushing a few buttons in the UI or writing a small expression in the API. Cache Rules give users precise control over when and how Cloudflare and browsers cache their content. Cache Rules allow for rules to be triggered on request header values that can be simply defined like

any(http.request.headers["content-type"][*] == "image/png")

Which triggers the Cache Rule to be applied to all image/png media types. Additionally, users may also leverage other request headers like cookie values, user-agents, or hostnames.

As a plus, these matching criteria can be stacked and configured with operators like AND and OR, providing additional simplicity in building complex rules from many discrete blocks, e.g. if you would like to target both image/png AND image/jpeg.

For the full list of fields available conditionals you can apply Cache Rules to, please refer to the Cache Rules documentation.

Introducing Cache Rules: precision caching at your fingertips

Visibility into how and when Rules are applied

Our current offerings of Page Rules, Workers, and Transform Rules can all manipulate caching functionality for our users’ content. Often, there is some trial and error required to make sure that the confluence of several rules and/or Workers are behaving in an expected manner.

As part of upgrading Page Rules we have separated it into four new products:

  1. Origin Rues
  2. Cache Rules
  3. Configuration Rules
  4. Redirect Rules

This gives users a better understanding into how and when different parts of the Cloudflare stack are activated, reducing the spin-up and debug time. We will also be providing additional visibility in the dashboard for when rules are activated as they go through Cloudflare. As a sneak peek please see:

Introducing Cache Rules: precision caching at your fingertips

Our users may take advantage of this strict precedence by chaining the results of one product into another. For example, the output of URL rewrites in Transform Rules will feed into the actions of Cache Rules, and the output of Cache Rules will feed into IP Access Rules, and so on.

In the future, we plan to increase this visibility further to allow for inputs and outputs across the rules products to be observed so that the modifications made on our network can be observed before the rule is even deployed.

Cache Rules. What are they? Are they improved? Let’s find out!

To start, Cache Rules will have all the caching functionality currently available in Page Rules. Users will be able to:

  • Tell Cloudflare to cache an asset or not,
  • Alter how long Cloudflare should cache an asset,
  • Alter how long a browser should cache an asset,
  • Define a custom cache key for an asset,
  • Configure how Cloudflare serves stale, revalidates, or otherwise uses header values to direct cache freshness and content continuity,

And so much more.

Cache Rules are intuitive and work similarly to our other ruleset engine-based products announced today: API or UI conditionals for URL or request headers are evaluated, and if matching, Cloudflare and browser caching options are configured on behalf of the user. For all the different options available, see our Cache Rules documentation.

Under the hood, Cache Rules apply targeted rule applications so that additional rules can be supported per user and across the whole engine. What this means for our users is that by consuming less CPU for rule evaluations, we’re able to support more rules per user. For specifics on how many additional Cache Rules you’ll be able to use, please see the Future of Rules Blog.

Introducing Cache Rules: precision caching at your fingertips

How can you use Cache Rules today?

Cache Rules are available today in beta and can be configured via the API, Terraform, or UI in the Caching tab of the dashboard. We welcome you to try the functionality and provide us feedback for how they are working or what additional features you’d like to see via community posts, or however else you generally get our attention 🙂.

If you have Page Rules implemented for caching on the same path, Cache Rules will take precedence by design. For our more patient users, we plan on releasing a one-click migration tool for Page Rules in the near future.

What’s in store for the future of Cache Rules?

In addition to granular control and increased visibility, the new rules products also opens the door to more complex features that can recommend rules to help customers achieve better cache hit ratios and reduce their egress costs, adding additional caching actions and visibility, so you can see precisely how Cache Rules will alter headers that Cloudflare uses to cache content, and allowing customers to run experiments with different rule configurations and see the outcome firsthand. These possibilities represent the tip of the iceberg for the next iteration of how customers will use rules on Cloudflare.

Try it out!

We look forward to you trying Cache Rules and providing feedback on what you’d like to see us build next.

WebRTC live streaming to unlimited viewers, with sub-second latency

Post Syndicated from Kyle Boutette original

WebRTC live streaming to unlimited viewers, with sub-second latency

WebRTC live streaming to unlimited viewers, with sub-second latency

Creators and broadcasters expect to be able to go live from anywhere, on any device. Viewers expect “live” to mean “real-time”. The protocols that power most live streams are unable to meet these growing expectations.

In talking to developers building live streaming into their apps and websites, we’ve heard near universal frustration with the limitations of existing live streaming technologies. Developers in 2022 rightly expect to be able to deliver low latency to viewers, broadcast reliably, and use web standards rather than old protocols that date back to the era of Flash.

Today, we’re excited to announce in open beta that Cloudflare Stream now supports live video streaming over WebRTC, with sub-second latency, to unlimited concurrent viewers. This is a new feature of Cloudflare Stream, and you can start using it right now in the Cloudflare Dashboard — read the docs to get started.

WebRTC with Cloudflare Stream leapfrogs existing tools and protocols, exclusively uses open standards with zero dependency on a specific SDK, and empowers any developer to build both low latency live streaming and playback into their website or app.

The status quo of streaming live video is broken

The status quo of streaming live video has high latency, depends on archaic protocols and is incompatible with the way developers build apps and websites. A reasonable person’s expectations of what the Internet should be able to deliver in 2022 are simply unmet by the dominant set of protocols carried over from past eras.

Viewers increasingly expect “live” to mean “real-time”. People want to place bets on sports broadcasts in real-time, interact and ask questions to presenters in real-time, and never feel behind their friends at a live event.

In practice, the HLS and DASH standards used to deliver video have 10+ seconds of latency. LL-HLS and LL-DASH bring this down to closer to 5 seconds, but only as a hack on top of the existing protocol that delivers segments of video in individual HTTP requests. Sending mini video clips over TCP simply cannot deliver video in real-time. HLS and DASH are here to stay, but aren’t the future of real-time live video.

Creators and broadcasters expect to be able to go live from anywhere, on any device.

In practice, people creating live content are stuck with a limited set of native apps, and can’t go live using RTMP from a web browser. Because it’s built on top of TCP, the RTMP broadcasting protocol struggles under even the slightest network disruption, making it a poor or often unworkable option when broadcasting from mobile networks. RTMP, originally built for use with Adobe Flash Player, was last updated in 2012, and while Stream supports the newer SRT protocol, creators need an option that works natively on the web and can more easily be integrated in native apps.

Developers expect to be able to build using standard APIs that are built into web browsers and native apps.

In practice, RTMP can’t be used from a web browser, and creating a native app that supports RTMP broadcasting typically requires diving into lower-level programming languages like C and Rust. Only those with expertise in both live video protocols and these languages have full access to the tools needed to create novel live streaming client applications.

We’re solving this by using new open WebRTC standards: WHIP and WHEP

WebRTC is the real-time communications protocol, supported across all web browsers, that powers video calling services like Zoom and Google Meet. Since inception it’s been designed for real-time, ultra low-latency communications.

While WebRTC is well established, for most of its history it’s lacked standards for:

  • Ingestion — how broadcasters should send media content (akin to RTMP today)
  • Egress — how viewers request and receive media content (akin to DASH or HLS today)

As a result, developers have had to implement this on their own, and client applications on both sides are often tightly coupled to provider-specific implementations. Developers we talk to often express frustration, having sunk months of engineering work into building around a specific vendor’s SDK, unable to switch without a significant rewrite of their client apps.

At Cloudflare, our mission is broader — we’re helping to build a better Internet. Today we’re launching not just a new feature of Cloudflare Stream, but a vote of confidence in new WebRTC standards for both ingestion and egress. We think you should be able to start using Stream without feeling locked into an SDK or implementation specific to Cloudflare, and we’re committed to using open standards whenever possible.

For ingestion, WHIP is an IETF draft on the Standards Track, with many applications already successfully using it in production. For delivery (egress), WHEP is an IETF draft with broad agreement. Combined, they provide a standardized end-to-end way to broadcast one-to-many over WebRTC at scale.

Cloudflare Stream is the first cloud service to let you both broadcast using WHIP and playback using WHEP — no vendor-specific SDK needed. Here’s how it works:

WebRTC live streaming to unlimited viewers, with sub-second latency

Cloudflare Stream is already built on top of the Cloudflare developer platform, using Workers and Durable Objects running on Cloudflare’s global network, within 50ms of 95% of the world’s Internet-connected population.

Our WebRTC implementation extends this to relay WebRTC video through our network. Broadcasters stream video using WHIP to the point of presence closest to their location, which tells the Durable Object where the live stream can be found. Viewers request streaming video from the point of presence closest to them, which asks the Durable Object where to find the stream, and video is routed through Cloudflare’s network, all with sub-second latency.

Using Durable Objects, we achieve this with zero centralized state. And just like the rest of Cloudflare Stream, you never have to think about regions, both in terms of pricing and product development.

While existing ultra low-latency streaming providers charge significantly more to stream over WebRTC, because Stream runs on Cloudflare’s global network, we’re able to offer WebRTC streaming at the same price as delivering video over HLS or DASH. We don’t think you should be penalized with higher pricing when choosing which technology to rely on to stream live video. Once generally available, WebRTC streaming will cost $1 per 1000 minutes of video delivered, just like the rest of Stream.

What does sub-second latency let you build?

Ultra low latency unlocks interactivity within your website or app, removing the time delay between creators, in-person attendees, and those watching remotely.

Developers we talk to are building everything from live sports betting, to live auctions, to live viewer Q&A and even real-time collaboration in video post-production. Even streams without in-app interactivity can benefit from real-time — no sports fan wants to get a text from their friend at the game that ruins the moment, before they’ve had a chance to watch the final play. Whether you’re bringing an existing app or have a new idea in mind, we’re excited to see what you build.

If you can write JavaScript, you can let your users go live from their browser

While hobbyist and professional creators might take the time to download and learn how to use an application like OBS Studio, most Internet users won’t get past this friction of new tools, and copying RTMP keys from one tool to another. To empower more people to go live, they need to be able to broadcast from within your website or app, just by enabling access to the camera and microphone.

Cloudflare Stream with WebRTC lets you build live streaming into your app as a front-end developer, without any special knowledge of video protocols. And our approach, using the WHIP and WHEP open standards, means you can do this with zero dependencies, with 100% your code that you control.

Go live from a web browser with just a few lines of code

You can go live right now, from your web browser, by creating a live input in the Cloudflare Stream dashboard, and pasting a URL into the example linked below.

Read the docs or run the example code below in your browser using Stackbltiz.

<video id="input-video" autoplay autoplay muted></video>

import WHIPClient from "./WHIPClient.js";

const videoElement = document.getElementById("input-video");
const client = new WHIPClient(url, videoElement);

This example uses an example WHIP client, written in just 100 lines of Javascript, using APIs that are native to web browsers, with zero dependencies. But because WHIP is an open standard, you can use any WHIP client you choose. Support for WHIP is growing across the video streaming industry — it has recently been added to Gstreamer, and one of the authors of the WHIP specification has written a Javascript client implementation. We intend to support the full WHIP specification, including supporting Trickle ICE for fast NAT traversal.

Play a live stream in a browser, with sub-second latency, no SDK required

Once you’ve started streaming, copy the playback URL from the live input you just created, and paste it into the example linked below.

Read the docs or run the example code below in your browser using Stackbltiz.

<video id="playback" controls autoplay muted></video>

import WHEPClient from './WHEPClient.js';
const videoElement = document.getElementById("playback");
const client = new WHEPClient(url, videoElement);

Just like the WHIP example before, this one uses an example WHEP client we’ve written that has zero dependencies. WHEP is an earlier IETF draft than WHIP, published in July of this year, but adoption is moving quickly. People in the community have already written open-source client implementations in both Javascript, C, with more to come.

Start experimenting with real-time live video, in open beta today

WebRTC streaming is in open beta today, ready for you to use as an integrated feature of Cloudflare Stream. Once Generally Available, WebRTC streaming will be priced like the rest of Cloudflare Stream, based on minutes of video delivered and minutes stored.

Read the docs to get started.

Where to? Introducing Origin Rules

Post Syndicated from Matt Bullock original

Where to? Introducing Origin Rules

Host headers are key

Where to? Introducing Origin Rules

The host header of an HTTP request tells the receiving server (‘origin’) which website or application a client wants to access.

When an origin receives an HTTP request, it checks the value of this ‘host’ header to see if it is responsible for that traffic. If it finds a match the request will be routed appropriately and the correct data will be returned to the visitor. If it doesn’t find a match, it will return an error telling the visitor it doesn’t have an application or website that matches what they are asking for.

In simple setups this is often not an issue. All requests for are sent to the same origin, which sees the host header and returns the relevant files. However, not all setups are as straightforward. SaaS (Software-as-a-Service) platforms use host headers to route visitors to the correct instance or S3-compatible bucket.

To ensure the correct content is still loaded, the host header must equal the name of this instance or bucket to allow the receiving origin to route it correctly. This means at some point in the traffic flow, the host header must be changed to match the instance or bucket name, before being sent to the SaaS platform.

Another common issue is when web applications on an origin are listening on a non-standard port, e.g. 8001.  Requests sent via HTTPS will by default arrive on port 443. To ensure the traffic isn’t subsequently sent to port 443 on the origin the traffic must be intercepted and have the destination port changed to 8001. This ensures the origin is receiving traffic where it expects it. Previously this would be done as a Cloudflare Worker, Cloudflare Spectrum application or by running a dedicated application on the origin.

Both of these scenarios require customers to write and maintain code to intercept HTTP requests and parse them to ensure they go to the correct origin location, the correct port on that origin, and with the correct host header. This is a burden for administrators to maintain, particularly as legacy applications are migrated away from on-premise and into SaaS.

Cloudflare users want more control on where their traffic goes to – when it goes there – and what it looks like when it arrives. And they want this to be simple to set up and maintain.

To meet those demands we are today announcing Origin Rules, a new product which allows for overriding the host header, the Server Name Indication (SNI), destination port and DNS resolution of matching HTTP requests.

Origin Rules is now the one-stop destination for users who want to change which origin traffic goes to, when this should happen, and what that traffic looks like when it arrives – all without ever having to write a single line of code.

One hostname, many origins

Setting up your service on Cloudflare is very simple. You tell us your domain name,, and where traffic should be sent to when we receive requests that match it. Often this is an IP address. You can also create subdomains, e.g., and follow the same pattern.

This allows for the web server running to live on the IP address, and the web server responsible for running to live on a different IP address, e.g. When Cloudflare receives a request for, we send that traffic to the web server at with the host header

Where to? Introducing Origin Rules

As most web servers commonly serve multiple websites, this host header is used to ensure the correct content is loaded. The web server looks at the request it receives, checks the host header, and tries to match it against websites it’s been told to serve. If it finds a match, it will route this request to the corresponding website’s configuration and the correct files are returned to the visitor.

This has been a foundational principle of the Internet for many years now. Unsurprisingly however, new solutions emerge and user needs evolve.

We have heard from users who want to be able to send different URLs to different origins, such as a SaaS provider for their ecommerce platform and a SaaS provider for their support desk. To achieve this, user’s could, and do, decide to run and manage their own reverse proxy running at this IP address to act as a router. This allows a user to send all traffic for to a single IP address, and let the reverse proxy determine where it goes next:

    location ~ ^/shop { 
        proxy_set_header   Host $http_host;
        proxy_pass         "$1";

This reverse proxy would detect the traffic sent with the host header with a URI path starting with /shop, and send those matching HTTP requests to the correct SaaS application.

This is potentially a complex system to maintain, however, and as it is an ‘extra hop’, there is an increase in latency as requests first go through Cloudflare, to the origin server, then back to the SaaS provider – who may also be on Cloudflare. In a world rapidly migrating away from on-premise software to SaaS platforms, running your own server to do this specific function goes against the grain.

Users therefore want a way to tell Cloudflare – ‘for all traffic to, send it to BUT, if you see any traffic to, send it to’. This is what we call a resolve override. It is essentially a DNS override.

With a resolve override in place, HTTP requests to are now correctly sent by Cloudflare to as requested. And they fail. The web server says it doesn’t know what to do with the HTTP request. This is because the host header is still, and the web server does not have any knowledge of that website.

Where to? Introducing Origin Rules

To fix this, we need to make sure these requests are sent to with a host header of This is what is known as a host header override. Now, requests to are not only correctly routed to, but the host header is changed to one that the ecommerce software is expecting – and thus the request is correctly routed, and the visitors sees the correct content.

Where to? Introducing Origin Rules

The management of these selective overrides, and other overrides, is achieved via Origin Rules.

Origin Rules allow users to route HTTP traffic to different destinations and override certain request characteristics based on a number of criteria such as the visitor’s country, IP address or HTTP request headers.

Route on more than a URL

Origin Rules is built on top of our ruleset engine. This gives users the ability to perform routing decisions based on many fields including the requested URL, and also the visitors country, specific request headers, and more.

Using a combination of one or more of these available fields, users can ensure traffic is routed to specific backends, only when specific criteria are met such as host, URI path, visitor’s country, and HTTP request headers.

Historically, host header override and resolve override were achieved with the setting of a Page Rule.

Where to? Introducing Origin Rules

Page Rules is the ‘If This Then That’ of Cloudflare. Where the ‘If…’ is a URL, and the ‘Then That’ is changing how we handle traffic to specific parts of a ‘zone’. It allows users to selectively change how traffic is handled, or in this case, where traffic is sent. It is very well adopted, with over one million Page Rules in the past three months alone.

Page Rules, however, are limited to performing actions based upon the requested URL. This means if users want to change the backend a HTTP request goes to, they need to make that decision based on the URL alone. This can be challenging for users who may want to perform this decision-making on more nuanced aspects, like the user agent of the visitor or on the presence of a specific cookie.

With Origin Rules, users can perform host header override, resolve override, destination port override and SNI overrides – based on any number of criteria – not only the requested URL. This unlocks a number of interesting use cases.

Example use case: integration with cloud storage endpoints

One such use case is using a cloud storage provider as a backend for static assets, such as images. Enterprise zones can use a combination of host header override and resolve override actions to override the destination of outgoing HTTP requests. This allows for all traffic to to be sent to, but requests to*.jpg be sent to a publicly accessible S3-compatible bucket.

Where to? Introducing Origin Rules

To do this, the user would create an Origin Rule setting the resolve override value to be a DNS record on their own zone, pointing to the S3 provider’s URL. This ensures that requests matching the pattern are routed to the S3 URL. However, when the cloud storage provider receives the request it will drop it – as it does not know how to route requests for the host Therefore, users also need to deploy a host header override, changing this value to match the bucket name – e.g.

Combined, this ensures requests matching the pattern correctly reach the cloud storage provider – with a host header it can use to correctly route the request to the correct bucket.

Origin Rules also enable new use cases. For example, a user can use Origin Rules to A/B test different cloud providers prior to a cut over. This is possible by using the field http.request.cookies and routing traffic to a new, test bucket or cloud provider based on the presence of a specific cookie on the request.

Users with multiple storage regions can also use the field within a filter expression to route users to the closest storage instance, reducing latency and time to load for these requests.

Destination port override

Cloudflare listens on 13 ports; seven ports for HTTP, six ports for HTTPS. This means if a request is sent to a URL with the destination port of 443, as is standard for HTTPS, it will be sent to the origin server with a destination port of 443. The same 1:1 mapping applies to the other twelve ports.

But what if a user wanted to change that mapping? For example, when the backend origin server is listening on port 8001. In this scenario, an intermediate service is required to listen for requests on port 443 and create a sub-request with the destination port set to 8001.

Historically this was done on the origin server itself – with a reverse proxy server listening for requests on 443 and other ports and proxying those requests to another port.

 <VirtualHost *:*>
        ProxyPreserveHost On
        ProxyPass /
        ProxyPassReverse /

server {
  listen 443;
    location / {

More recently, users have deployed Cloudflare Workers to perform this service, modifying the destination port before HTTP requests ever reach their servers.

Origin Rules simplifies destination port modifications, letting users change the destination port via a simple rules experience without ever having to write a single line of code or configuration:

Where to? Introducing Origin Rules

This destination port modification can also be triggered on almost any field available in the ruleset engine, also, allowing users to change which port to send requests to based on URL, URI path, the presence of HTTP request header and more.

Server Name Indication

Server Name Indication (SNI) is an addition to the TLS encryption protocol. It enables a client device to specify the domain name it is trying to reach in the first step of the TLS handshake, preventing common “name mismatch” errors. Customers using Cloudflare for SaaS may have millions of hostnames pointing to Cloudflare. However, the origin that these requests are sent to may not have an individual certificate for each of the hostnames.

Users today have the option of doing this on a per custom hostname basis using custom origins in SSL for SaaS, however for Enterprise customers not using this setup it was an impossible task.

Enterprise users can use Origin Rules to override the value of the SNI, providing it matches any other zone in their account. This removes the need for users to manage multiple certificates on the origin or choose not to encrypt connections from Cloudflare to the origin.

Try it now

Origin Rules are available to use now via API, Terraform, and our dashboard. Further details can be found on our Developers Docs. Currently, destination port rewriting is available for all our customers as part of Origin Rules. Resolve Override, Host Header Override and SNI overrides are available to our Enterprise users.

Introducing Configuration Rules

Post Syndicated from Matt Bullock original

Introducing Configuration Rules

A powerful new set of tools

Introducing Configuration Rules

In 2012, we introduced Page Rules to the world, announcing:

“Page Rules is a powerful new set of tools that allows you to control how CloudFlare works on your site on a page-by-page basis.”

Ten years later, and with all F’s lowercase, we are excited to introduce Configuration Rules — a Page Rules successor and a much improved way of controlling Cloudflare features and settings. With Configuration Rules, users can selectively turn on/off features which would typically be applied to every HTTP request going through the zone. They can do this based on URLs – and more, such as cookies or country of origin.

Configuration Rules opens up a wide range of use cases for our users that previously were impossible without writing custom code in a Cloudflare Worker. Such use cases as A/B testing configuration or only enabling features for a set of file extensions are now made possible thanks to the rich filtering capabilities of the product.

Configuration Rules are available for use immediately across all plan levels.

Turn it on, but only when…

As each HTTP request enters a Cloudflare zone we apply a configuration. This configuration tells the Cloudflare server handling the HTTP request which features the HTTP request should ‘go’ through, and with what settings/options. This is defined by the user, typically via the dashboard.

The issue arises when users want to enable these features, such as Polish as Auto Minify, only on a subset of the traffic to their website. For example, users may want to disable Email Obfuscation but only for a specific page on their website so that contact information is shown correctly to visitors. To do this, they can deploy a Configuration Rule.

Introducing Configuration Rules

Configuration Rules lets users selectively enable or disable features based on one or more ruleset engine fields.

Currently, there are 16 available actions within Configuration Rules. These actions range from Disable Apps, Disable Railgun and Disable Zaraz to Auto Minify, Polish and Mirage.

These actions effectively ‘override’ the corresponding zone-wide setting for matching traffic. For example, Rocket Loader may be enabled for the zone

Introducing Configuration Rules

If the user, however, does not want Rocket Loader to be enabled on their checkout page due to an issue it causes with a specific JavaScript element, they could create a Configuration Rule to selectively disable Rocket Loader:

Introducing Configuration Rules

This interplay between zone level settings and Configuration Rules allows users to selectively enable features, allowing them to test Rocket Loader on prior to flipping the zone-level toggle.

With Configuration Rules, users also have access to various other non-URL related fields. For example, users could use the field to ensure that visitors for specific countries always have the ‘Security Level’ set to ‘I’m under attack’.

Historically, these configuration overrides were achieved with the setting of a Page Rule.

Page Rules is the ‘If This Then That’ of Cloudflare. Where the ‘If…’ is a URL, and the ‘Then That’ is changing how we handle traffic to specific parts of a ‘zone’. It allows users to selectively change how traffic is handled, and in this case specifically, which settings are and aren’t applied. It is very well adopted, with over one million Page Rules in the past three months alone.

Page Rules, however, are limited to performing actions based upon the requested URL. This means if users want to disable Rocket Loader for certain traffic, they need to make that decision based on the URL alone. This can be challenging for users who may want to perform this decision-making on more nuanced aspects, like the user agent of the visitor or on the presence of a specific cookie.

For example, users might want to set the ‘Security Level’ to ‘I’m under attack’ when the HTTP request originates in certain countries. This is where Configuration Rules help.

Use case: A/B testing

A/B testing is the term used to describe the comparison of two versions of a single website or application. It allows users to create a copy of their current website (‘A’), change it (‘B’) and compare the difference.

In a Cloudflare context, users might want to A/B test the effect of features such as Mirage or Polish prior to enabling them for all traffic to the website. With Page Rules, this was impractical. Users would have to create Page Rules matching on specific URI query strings and A/B test by appending those query strings to every HTTP request.

Introducing Configuration Rules

With Configuration Rules, this task is much simpler. Leveraging one or more fields, users can filter on other parameters of a HTTP request to define which features and products to enable.

For example, by using the expression any(http.request.cookies["app"][*] == "test") a user can ensure that Auto Minify, Mirage and Polish are enabled only when this cookie is present on the HTTP request. This allows comparison testing to happen before enabling these products either globally, or on a wider set of traffic. All without impacting existing production traffic.

Introducing Configuration Rules

Use case: augmenting URLs

Configuration Rules can be used to augment existing requirements, also. One example given in ‘The Future of Page Rules’ blog is increasing the Security Level to ‘High’ for visitors trying to access the contact page of a website, to reduce the number of malicious visitors to that page.

In Page Rules, this would be done by simply specifying the contact page URL and specifying the security level, e.g. URL:*. This ensures that any “visitors that exhibit threatening behavior within the last 14 days” are served with a challenge prior to having that page load.

Configuration Rules can take this use case and augment it with additional fields, such as whether the source IP address is in a Cloudflare Managed IP List. This allows users to be more specific about when the security level is changed to ‘High’, such as only when the request also is marged as coming from an open HTTP and SOCKS proxy endpoints, which are frequently used to launch attacks and hide attackers identity:

Introducing Configuration Rules

This reduces the chance of a false positive, and a genuine visitor to the contact form being served with a challenge.

Try it now

Configuration Rules are available now via API, UI, and Terraform for all Cloudflare plans! We are excited to see how you will use them in conjunction with all our new rules releases from this week.

The collective thoughts of the interwebz

By continuing to use the site, you agree to the use of cookies. more information

The cookie settings on this website are set to "allow cookies" to give you the best browsing experience possible. If you continue to use this website without changing your cookie settings or you click "Accept" below then you are consenting to this.