Tag Archives: Birthday Week

Birthday week: Cloudflare turns 10

2020-10-04 James Allworth

Post Syndicated from James Allworth original https://blog.cloudflare.com/birthday-week-cloudflare-turns-10/


2020 marks a major milestone for Cloudflare: it’s our 10th birthday.

We’ve always used birthdays as an opportunity to give back to the Internet. But this year — a year in which the Internet has been so central to giving us all some degree of connectedness and normalcy — it feels like giving back to the Internet has been more important than ever.

And while we couldn’t celebrate in person, we were humbled by some of the incredible minds that joined us online to talk about how the Internet has changed over the last ten years — and what we might see over the next ten.

With that, let’s recap the key announcements from Birthday Week 2020.

Day 1, Monday: Workers

During Birthday Week in 2017, Cloudflare announced Workers — a serverless platform that represented a completely new way to build applications: by writing your code directly onto our network edge. On Monday of this year’s Birthday Week, we announced Durable Objects and Cron Triggers — both of which continue to expand the use cases that Workers can address.

Many folks associate the serverless paradigm with functions as a service, which, at its core, is stateless. Workers KV started down the path of changing this, providing high availability storage on the edge. However, there are use cases where consistency (every client making a request to a database gets the same view of the data) is more important than availability (every client making a request to a database always receives a response). Say you want to sell tickets to a concert: you don’t want two people to be able to purchase the same ticket. With a traditional application, with a database running in one location, that’s relatively easy to ensure. But with Workers running in Cloudflare’s data centers all over the world, ensuring consistency is a little bit more challenging. Workers Durable Objects solves this for developers, giving them access to high consistency storage when they’re building on the Workers platform.
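To make the ticket example concrete, here is a minimal sketch of why routing every purchase through a single authoritative owner of the data prevents double-selling. It is plain TypeScript and not the Durable Objects API; the `TicketLedger` class and its method names are hypothetical and exist only for illustration.

```typescript
// Hypothetical illustration: one object is the sole authority for a concert's
// seats, so every purchase is checked against the same, current state.
class TicketLedger {
  private readonly soldTo = new Map<string, string>();

  // Returns true if the seat was free and is now assigned to the buyer,
  // false if someone else already holds it.
  purchase(seat: string, buyer: string): boolean {
    if (this.soldTo.has(seat)) {
      return false;
    }
    this.soldTo.set(seat, buyer);
    return true;
  }

  // Returns the current holder of the seat, if any.
  ownerOf(seat: string): string | undefined {
    return this.soldTo.get(seat);
  }
}

const ledger = new TicketLedger();
const firstAttempt = ledger.purchase("A1", "alice");
const secondAttempt = ledger.purchase("A1", "bob");
console.log(firstAttempt, secondAttempt, ledger.ownerOf("A1"));
```

The second purchase is rejected because both requests consult the same state. With copies of the data in many locations and no coordination, each copy could accept a different buyer for the same seat.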

Similarly, triggering Workers has historically needed a user to do something: visiting a URL, for example. But developers have use cases where they want a Worker to run independent of a user doing something right now. Syncing, for example. Batch jobs. Or perhaps doing something 24 hours after a user has done something. And this is where Cron Triggers come in: now, for developers on the Workers platform, there’s no more need to rely on an eyeball to get things rolling.
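As a rough illustration of what a schedule means, the sketch below checks whether a given moment matches a cron expression. It assumes the conventional five-field layout (minute, hour, day of month, month, day of week) evaluated in UTC, and it handles only `*`, `*/n`, and single numbers. The function names are hypothetical, and this is not how Cloudflare evaluates Cron Triggers internally.

```typescript
// Simplified matcher for one cron field. `min` is the smallest legal value
// for the field, so that "*/n" steps start from it (0 for minutes, 1 for days).
function fieldMatches(field: string, value: number, min: number): boolean {
  if (field === "*") return true;
  if (field.startsWith("*/")) {
    const step = Number(field.slice(2));
    return Number.isInteger(step) && step > 0 && (value - min) % step === 0;
  }
  return Number(field) === value;
}

// Returns true when `when` (interpreted in UTC) matches all five fields.
// Ranges, lists, and names such as "MON" are deliberately not supported.
function cronMatches(expression: string, when: Date): boolean {
  const fields = expression.trim().split(/\s+/);
  if (fields.length !== 5) {
    throw new Error(`expected 5 cron fields, got ${fields.length}`);
  }
  const [minute, hour, dayOfMonth, month, dayOfWeek] = fields;
  return (
    fieldMatches(minute, when.getUTCMinutes(), 0) &&
    fieldMatches(hour, when.getUTCHours(), 0) &&
    fieldMatches(dayOfMonth, when.getUTCDate(), 1) &&
    fieldMatches(month, when.getUTCMonth() + 1, 1) && // cron months are 1-12
    fieldMatches(dayOfWeek, when.getUTCDay(), 0) // 0 = Sunday
  );
}

// A batch job scheduled every 15 minutes would fire at 12:30 UTC.
const halfPastNoon = new Date(Date.UTC(2020, 9, 4, 12, 30));
console.log(cronMatches("*/15 * * * *", halfPastNoon));
```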

Day 2, Tuesday: Analytics

There are a lot of website analytics products out there on the market. Many of those products are, not surprisingly, very good.

But the way they’ve been implemented often leaves a lot to be desired. Most of them operate by tracking individual users, using client-side state like cookies or localStorage — or even fingerprints. This is increasingly a problem. There’s the principle of it: we don’t want to be tracked individually — why would we want visitors to our web properties to feel tracked either? Beyond that though, because so many people are feeling uncomfortable with how they’re being tracked around the web, they’re simply blocking a lot of these analytics products. As a result, all these analytics products are increasingly becoming less accurate.

On Tuesday, we announced a new Web Analytics product that allows you to get the best of both worlds — detailed and accurate analytics, without compromising on the privacy of your users. We don’t use any client-side state, like cookies or localStorage, for the purposes of tracking users. And we don’t “fingerprint” individuals via their IP address, User Agent string, or any other data for the purpose of displaying analytics (we consider fingerprinting even more intrusive than cookies, because users have no way to opt out). Because Cloudflare’s business has never been built around tracking users or selling advertising, we don’t do it. Just the metrics, ma’am.

That wasn’t all on Tuesday, though. Another crucial aspect of owning a web property is website performance. Not only does it impact user experience, but Google also uses a blended measure of performance to inform site ranking in their search results. Google’s Chrome team has been doing some great work on metricizing site performance, and that’s culminated in Web Vitals. We’ve worked with the Chrome team to integrate Web Vitals in our Browser Insights product. You’ve always gotten edge-side performance analytics from Cloudflare, but now, you’re not just seeing the server side view of your web performance: it’s blended with how your users perceive performance, too. We take all that data and present it in a pragmatic way to help you figure out what you need to do to optimize the performance of your site.

Day 3, Wednesday: Cloudflare Radar and Speeding up HTTPS/HTTP3

As of today, Cloudflare sits in front of 14.5% of the world’s top 10 million websites. The privilege of getting to serve so many different customers means we get visibility into a lot of things on the web. Wednesday of Birthday Week was about us taking advantage of that for everyone who is out on the web today.

If you think about the traffic flowing through a city at any given time, it’s like a living, breathing creature. It ebbs and flows; it has rhythms that follow the sun and moon. Unusual events can cause traffic jams, as can accidents. Many cities have traffic reporting services for exactly this reason: knowing what’s going on can immensely help those who need to navigate the city streets. The web is like a global version of this, and given the role that the Internet now plays for humanity, understanding what’s going on is probably as important as all those city traffic reports around the world.

And yet, when you want to get the equivalent of that traffic report, where do you go?

Cloudflare Radar is our answer to that question. Each second, Cloudflare handles on average 18 million HTTP requests and 6 million DNS requests. We block 72 billion cyberthreats every day. Add to that the 1 billion unique IP addresses connecting to Cloudflare’s network, and we have one of the most representative views of Internet traffic worldwide. Before Radar, all this activity, good and bad, was only available internally at Cloudflare: we used it to help improve our service and protect our customers. With the release of Radar, however, we’re exposing it externally: shining a light on the Internet’s patterns for the world to see.

On the subject of spotting interesting patterns: back in late June, our team noticed a weird spike in DNS requests for the 65479 Resource Record. It turns out these spikes were part of Apple’s iOS 14 beta release: Apple were testing out a new SVCB/HTTPS record type. The aim: to patch a limitation that’s been inherent in the HTTPS and HTTP/3 protocols. When a user types in a URL without specifying the protocol (e.g. HTTPS), the initial negotiation happens in plaintext because browsers will start with HTTP. Only once it’s established that an HTTPS or HTTP/3 resource exists will the browser transition over to that. The problem here is twofold: latency, and also security.

But you know what happens before any HTTP negotiation can happen? A DNS request. And that’s what Apple had implemented that created this interesting pattern: the DNS request was effectively asking whether the site supported HTTPS or HTTP/3. As of Wednesday during Birthday Week, Cloudflare’s DNS servers automatically generate HTTPS records on the fly to advertise whether a particular zone supports HTTP/3 and/or HTTP/2, based on whether those features are enabled on the zone. The result: better performance, and improved security. Who says you need to pick just one?
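For a sense of what such a record might look like, the sketch below builds the zone-file (presentation) form of an HTTPS record from a zone’s settings. The function name, the TTL of 300, the priority of 1, and the choice to return `null` when neither protocol is enabled are all assumptions made for illustration. It also uses the plain `h3` and `h2` ALPN tokens, whereas draft HTTP/3 deployments advertised versioned tokens.

```typescript
// Which protocols are enabled on a zone (hypothetical shape).
interface ZoneProtocols {
  http2: boolean;
  http3: boolean;
}

// Builds the presentation form of an HTTPS record that advertises only the
// protocols enabled on the zone. Returns null when there is nothing to
// advertise (an assumption, not documented Cloudflare behavior).
function httpsRecordFor(
  zone: string,
  protocols: ZoneProtocols,
  ttl: number = 300
): string | null {
  const alpn: string[] = [];
  if (protocols.http3) alpn.push("h3");
  if (protocols.http2) alpn.push("h2");
  if (alpn.length === 0) return null;
  // "1 ." means priority 1, with the target being the owner name itself.
  return `${zone}. ${ttl} IN HTTPS 1 . alpn="${alpn.join(",")}"`;
}

console.log(httpsRecordFor("example.com", { http2: true, http3: true }));
```

A client that receives this answer during its DNS lookup can connect over HTTPS, or attempt HTTP/3, straight away instead of starting with a plaintext request.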

Day 4, Thursday: API day

Nobody has ever doubted the importance of user interfaces. Finding ways for humans and computers to engage each other has been an area of focus since the very first computers were invented. But as the web has grown, data has become the new oil, and applications have proliferated, there’s another interface that has grown in importance: the interface between different types of applications. Day 4 of Birthday Week was all about APIs.

The first announcement was beta support for gRPC: a new type of protocol that’s intended for building APIs at scale. Most REST APIs use HTTPS and JSON to communicate values. The problem with these is that they’re really designed for that other type of interface mentioned above: for humans to talk to computers. The upside is it makes things human readable; the downside is they’re really inefficient, and as the use of APIs only continues to explode this inefficiency proliferates. The gRPC protocol is an answer to this: it’s an efficient protocol for computers to talk to each other. But up until now, that also came at a price: because gRPC uses newer technology (like HTTP/2) under the covers, existing security and performance tools did not support gRPC traffic out of the box. This meant that customers adopting gRPC to power their APIs had to pick between modernity on one hand, and things like security, performance, and reliability on the other.

Cloudflare’s announcement of gRPC support fixes this trade-off: when you put your gRPC APIs on Cloudflare, you get all the traditional benefits of Cloudflare along with it. Apprehensive of exposing your APIs to bad actors? Need more performance? Turn on Argo Smart Routing to decrease time to first byte. Increase reliability by adding a Load Balancer. Or add security features such as Bot Management and the WAF.

Speaking of the WAF: if you think about the way our WAF works, it secures web applications from attacks by looking for attack patterns (say, bot patterns that try to imitate human patterns, or abuse of how a browser interacts with a site); in both instances, the attack is intended to break something. But because what computers need to talk to each other is different from what computers need to talk to humans, the attack vectors are different. Therefore, protecting APIs isn’t quite the same as protecting websites.

API Shield is purpose-built for just this. It makes it simple to secure APIs through the use of strong client certificate-based authentication and strict schema-based validation. On the authentication side, API Shield uses mutual TLS, which is not vulnerable to the reuse or sharing of passwords or tokens. And once developers can be sure that only legitimate clients (with SSL certificates in hand) are connecting to their APIs, the next step in API Shield is making sure that those clients are making valid requests. It works by matching the contents of API requests (the query parameters that come after the URL and the contents of the POST body) against a contract or “schema” that contains the rules for what is expected. If validation fails, the API call is blocked, protecting the origin from an invalid request or a malicious payload.
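The schema-matching step can be sketched roughly as follows. This is an illustrative toy, not API Shield’s implementation or schema format: the `ApiSchema` shape, the three supported field types, and the function names are all hypothetical.

```typescript
// Hypothetical, simplified schema: each expected field maps to a primitive type.
type FieldType = "string" | "number" | "boolean";
type FieldRules = { [name: string]: FieldType };
type Fields = { [name: string]: unknown };

interface ApiSchema {
  query: FieldRules;
  body: FieldRules;
}

interface ApiCall {
  query: Fields;
  body: Fields;
}

// Lists every way the actual fields deviate from the rules: missing fields,
// wrongly typed fields, and fields the schema does not mention.
function checkFields(rules: FieldRules, actual: Fields, where: string): string[] {
  const problems: string[] = [];
  for (const name of Object.keys(rules)) {
    if (!(name in actual)) {
      problems.push(`${where}.${name}: missing`);
    } else if (typeof actual[name] !== rules[name]) {
      problems.push(`${where}.${name}: expected ${rules[name]}`);
    }
  }
  for (const name of Object.keys(actual)) {
    if (!(name in rules)) {
      problems.push(`${where}.${name}: not allowed`);
    }
  }
  return problems;
}

// Validates both the query parameters and the POST body against the schema.
function validateCall(schema: ApiSchema, call: ApiCall): string[] {
  return checkFields(schema.query, call.query, "query").concat(
    checkFields(schema.body, call.body, "body")
  );
}

// A call is blocked as soon as it has at least one violation.
function shouldBlock(schema: ApiSchema, call: ApiCall): boolean {
  return validateCall(schema, call).length > 0;
}
```

For example, with a schema expecting a string `id` in the query and a numeric `quantity` in the body, a call whose body is `{ quantity: "many", admin: true }` yields two violations and is blocked before it reaches the origin.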

And, as you’d expect from Cloudflare, gRPC and API Shield support each other out of the box.

Day 5, Friday: Automatic Platform Optimization (starting with WordPress)

The idea of caching static assets is not new, and it’s something Cloudflare has supported from its inception. It works wonders in speeding up websites, particularly if your origin is slow and/or your user is far from the origin server, both of which affect all your performance metrics. Caching also has the added benefit of reducing load on origin servers.

However, things get a little more tricky when it comes to dynamic assets: if the asset could change, shouldn’t you go back to the origin just to make sure? For this reason, by default, Cloudflare doesn’t cache HTML content: there’s a chance it’s going to change for each user. The reality is though, most HTML isn’t really dynamic. It needs to be able to change relatively quickly when the site is updated but for a huge portion of the web, the content is static for months or years at a time. There are special cases like when a user is logged-in (as the admin or otherwise) where the content needs to differ but the vast majority of visits are of anonymous users.

Automatic Platform Optimization, which was announced on Friday, brings more intelligence to this, allowing us to figure out when we should be caching HTML and when we shouldn’t. The advantage of this is that it moves more content closer to the user, and it does it automagically: there’s no configuration required. The benefits aren’t trivial: a 72% reduction in Time to First Byte (TTFB), 23% reduction to First Contentful Paint, and 13% reduction in Speed Index for desktop users at the 90th percentile. We’re starting off with support for WordPress, which powers 38% of all websites, but the plan is to expand this to other platforms in the near future.

All day, every day: Cloudflare TV

Ten years is a long time. The milestone for Cloudflare seemed to be the perfect opportunity to look back over the last ten years of the Internet — what’s changed, what’s surprised us? And more than that: what’s coming over the next ten years?

To look back and then peer out into the future, we were humbled to be joined by some of the most celebrated names in tech and beyond. Among the highlights: Apple co-founder Steve Wozniak, Zoom CEO Eric Yuan, OpenTable CEO Debby Soo, Stripe co-founder and President John Collison, Former CEO & Executive Chairman of Google and Co-Founder of Schmidt Futures Eric Schmidt, former McAfee CEO Chris Young, former Seal Team 6 Commander Dave Cooper, Project Include CEO Ellen Pao, and so many more. All told, it was 24 hours of live discussions over the course of the week.

And with that, it’s a wrap! To everyone who has been a part of the Cloudflare journey over the past 10 years: our customers, folks on the team, friends and supporters, and our partners all around the world: thank you. It’s been an incredible ride.

And, as our co-founder Michelle likes to say, we’re just getting started.

Introducing Automatic Platform Optimization, starting with WordPress

2020-10-02 Garrett Galow

Post Syndicated from Garrett Galow original https://blog.cloudflare.com/automatic-platform-optimizations-starting-with-wordpress/


Today, we are announcing a new service to serve more than just the static content of your website with the Automatic Platform Optimization (APO) service. With this launch, we are supporting WordPress, the most popular website hosting solution serving 38% of all websites. Our testing, as detailed below, showed a 72% reduction in Time to First Byte (TTFB), 23% reduction to First Contentful Paint, and 13% reduction in Speed Index for desktop users at the 90th percentile, by serving nearly all of your website’s content from Cloudflare’s network. This means visitors to your website see not only the first content sooner but all content more quickly.

With Automatic Platform Optimization for WordPress, your customers won’t suffer any slowness caused by common issues like shared hosting congestion, slow database lookups, or misbehaving plugins. This service is now available for anyone using WordPress. It costs $5/month for customers on our Free plan and is included, at no additional cost, in our Professional, Business, and Enterprise plans. No usage fees, no surprises, just speed.

How to get started

The easiest way to get started with APO is from your WordPress admin console.

1. First, install the Cloudflare WordPress plugin on your WordPress website or update to the latest version (3.8.2 or higher).
2. Authenticate the plugin (steps here) to talk to Cloudflare if you have not already done that.
3. From the Home screen of the Cloudflare section, turn on Automatic Platform Optimization.

Free customers will first be directed to the Cloudflare Dashboard to purchase the service.

Why We Built This

At Cloudflare, we jump at the opportunity to make hard problems for our customers disappear with the click of a button. Running a consistently fast website is challenging. Many businesses have neither the time nor the money to spend on complicated and expensive performance solutions for their site. Even if they do, it can be extremely costly to pay for specialized attention to ensure you get the best performance possible. Having a fast website doesn’t have to be complicated, though. The closer your content is to your customers, the better your site will perform. Static content caching does this for files like images, CSS and JavaScript, but that is only part of the equation. Dynamic content is still fetched from the origin, incurring costly round trips and additional processing time. For more info on dynamic versus static content, see our learning center.

WordPress is one of the most open platforms in the world, but that means you are always at risk of incurring performance penalties because of plugins or other sources that, while necessary, may be hard to pinpoint and resolve. With the Automatic Platform Optimization service, we put your website into our network that is within 10 milliseconds of 99% of the Internet-connected population in the developed world, all without having to change your existing hosting provider. This means that for most requests your customers won’t even need to go to your origin, reducing many costly round trips and server processing time. These optimizations run on our edge network, so they also will not impact render or interactivity since no additional JavaScript is run on the client.

How We Measure Web Performance

Evaluating performance of a website is difficult. There are many different metrics you can track and it is not always obvious which metrics most meaningfully represent a user’s experience. As discussed when we blogged about our new Speed page, we aim to simplify this for customers by automating tests using the infrastructure of webpagetest.org, and summarizing both the results visually and numerically in one place.

The visualization gives you a clear idea of what customers are going to see when they come to your site, and the Critical Loading Times provide the most important metrics to judge your performance. On top of seeing your site’s performance, we provide a list of recommendations for ways to even further increase your performance. If you are using WordPress, then we will test your site with Automatic Platform Optimizations to estimate the benefit you will get with the service.

The Benefits of Automatic Platform Optimization

We tested APO on over 500 Cloudflare customer websites that run on WordPress to understand what the performance improvements would be. The results speak for themselves:

Test Results

Metric	Percentiles	Baseline Cloudflare	APO Enabled	Improvement (%)
Time to First Byte (TTFB)	90th	1252 ms	351 ms	71.96%
Time to First Byte (TTFB)	10th	254 ms	261 ms	-2.76%
First Contentful Paint (FCP)	90th	2655 ms	2056 ms	22.55%
First Contentful Paint (FCP)	10th	894 ms	783 ms	12.46%
Speed Index (SI)	90th	6428	5586	13.11%
Speed Index (SI)	10th	1301	1242	4.52%

Note: Results are based on test results of 505 randomly selected websites that are cached by Cloudflare. Tests were run using WebPageTest from South Carolina, USA and the following environment: Chrome, Cable connection speed.
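The Improvement column follows from the baseline and APO timings. As a small sketch (the function name is ours), the percentage is the reduction relative to the baseline. Recomputing some rows from the rounded millisecond values shown in the table gives results that differ slightly in the last digits, presumably because the published figures were rounded.

```typescript
// Relative reduction of a metric, as a percentage of the baseline.
// Positive means APO was faster; negative means it was slower.
function improvementPercent(baseline: number, withApo: number): number {
  return ((baseline - withApo) / baseline) * 100;
}

// TTFB rows from the table above (values in milliseconds).
console.log(improvementPercent(1252, 351).toFixed(2)); // 90th percentile
console.log(improvementPercent(254, 261).toFixed(2)); // 10th percentile
```

The negative value at the 10th percentile shows that sites that were already fast gain little on TTFB; the benefit is concentrated in the slow tail.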

Most importantly, with APO, a site’s TTFB is made both fast and consistent. Because we now serve the HTML from Cloudflare’s edge with 0 origin processing time, getting the first byte to the eyeball is consistently fast. Under heavy load, a WordPress origin can suffer delays in building the HTML and returning it to visitors. APO removes the variance due to load, resulting in consistent TTFB <400 ms.

Additionally, between faster TTFB and additional caching of third party fonts, we see performance improvements in both FCP and SI for both the fastest and slowest of the sites we tested. Some of this comes naturally from reducing the TTFB, since every millisecond you shave off of TTFB is a potential millisecond gain for other metrics. Caching additional third party fonts allows us to reduce the time it takes to fetch that content. Given fonts can often block paints due to text rendering, this improves the rate at which the page paints, and improves the Speed Index.

We asked the folks at Kinsta to try out APO, given their expertise in WordPress, and tell us what they think. Brian Li, a Website Content Manager at Kinsta, ran a set of tests from around the world on a website hosted in Tokyo. I’ll let him explain what they did and the results:

At Kinsta, WordPress performance is something that’s near and dear to our hearts. So, when Cloudflare reached out about testing their new Automatic Platform Optimization (APO) service for WordPress, we were all ears.

This is what we did to test out the new service:

We set up a test site in Tokyo, Japan – one of the 24 high-performance data center locations available for Kinsta customers.

We ran several speed tests from six different locations around the world with and without Cloudflare’s APO.

The results were incredible!

By caching static HTML on Cloudflare’s edge network, we saw a 70-300% performance increase. As expected, the testing locations furthest away from Tokyo saw the biggest reduction in load time.

If your WordPress site uses a traditional CDN that only caches CSS, JS, and images, upgrading to Cloudflare’s WordPress APO is a no-brainer and will help you stay competitive with modern Jamstack and static sites that live on the edge by default.

Brian’s test results are summarized in this image:

One of the clear benefits, from Kinsta’s testing of APO, is the consistency of performance for serving your site no matter where your visitors are in the world. The consistent sub-second performance shown with APO versus two or three second load times in other setups makes it clear that if you have a global customer base, APO delivers an improved experience for all visitors.

How Automatic Platform Optimization Works

Automatic Platform Optimization is the result of being able to use the power of Cloudflare Workers to intelligently cache dynamic content. By caching dynamic content, we can serve the entire website from our edge network. Think ‘static site’ but without any of the work of having to build or maintain a static site. Customers can keep managing and updating content on their website in the same way and leave the hard work for performance to us. Serving both static and dynamic content from our network results, generally, in no origin requests or origin processing time. This means all the communication occurs between the user’s device and our edge. Reducing the multitude of round trips typically required from our edge to the origin for dynamic content is what makes this service so effective. Let’s first see what it normally looks like to load a WordPress site for a visitor.

In a regular request flow, Cloudflare is able to cache some of the content like images, CSS, or JS, while other requests go to either the origin or a third party service in order to fetch the content. Most importantly, the first request to fetch the HTML for the site needs to go to the origin, which is a typical cause of long TTFB, since no other requests get made until the client can receive the HTML and parse it to make subsequent requests.

Once APO is enabled, all those trips to the origin are removed. TTFB benefits greatly because the first hop starts and ends at Cloudflare’s network. This also means the browser starts working on fetching and painting the webpage sooner, meaning each paint event occurs earlier. Lastly, by caching third party fonts, we remove additional requests that would need to leave Cloudflare’s network and extend the time to display text to the user. Often, websites use fonts hosted on third-party domains. While this saves bandwidth costs that would be incurred from hosting it on the origin, depending on where those fonts are hosted, it can still be a costly operation to fetch them. By rehosting the fonts and serving them from our cache, we can reduce one of the remaining costly round trips.

With APO for WordPress, you can say bye bye to database congestion or unwieldy plugins slowing down your customers’ experience. These benefits are stacked on top of our already fast TLS connection times and industry leading protocol support like HTTP/2 that ensure we are using the most efficient and the fastest way to connect and deliver your website to your customers.

For customers with WordPress sites that support authenticated sessions, you do not have to worry about us caching content from authenticated users and serving it to others. We bypass the cache on standard WordPress and WooCommerce cookies for authenticated users. This ensures customized content for a specific user is only visible to that user. While this has been available to customers with our Business-level service, it is now available for any WordPress customer that enables APO.

You might be wondering: “This all sounds great, but what about when I change content on my site?” Because this service works in tandem with our WordPress plugin, we are able to understand when you make changes and ensure we quickly purge the content in Cloudflare’s edge and refresh it with the new content. With the plugin installed, we detect content changes and update our edge network worldwide with automatic cache purges. As part of this release, we have updated our WordPress plugin, so whether or not you use APO, you should upgrade to the latest version today. If you do not or cannot use our WordPress plugin, then APO will still provide the same performance benefits, but may serve stale content for up to 30 minutes, after which it is refreshed when the content is requested again.

This service was built on the prototype work originally blogged about here and here. For a more in depth look at the technical side of the service and how Cloudflare Workers allowed us to build the Automatic Platform Optimization service, see the accompanying blog post.

WordPress Today, Other Platforms Coming Soon

While today’s announcement is focused on supporting WordPress, this is just the start. We plan to bring these same capabilities to other popular platforms used for web hosting. If you operate a platform and are interested in how we may be able to work together to improve things for all your customers, please get in touch. If you are running a website, let us know what platform you want to see us bring Automatic Platform Optimization to next.

Building Automatic Platform Optimization for WordPress using Cloudflare Workers

2020-10-02 Yevgen Safronov

Post Syndicated from Yevgen Safronov original https://blog.cloudflare.com/building-automatic-platform-optimization-for-wordpress-using-cloudflare-workers/


This post explains how we implemented Automatic Platform Optimization for WordPress. In doing so, we have defined a new place to run WordPress plugins: at the edge, written with Cloudflare Workers. We provide the feature as a Cloudflare service, but what’s exciting is that anyone could build this using the Workers platform.

The service is an evolution of the ideas explained in an earlier zero-config edge caching of HTML blog post. The post will explain how Automatic Platform Optimization combines the best qualities of the regular Cloudflare cache with Workers KV to improve cache cold starts globally.

The optimization will work both with and without the Cloudflare for WordPress plugin integration. Not only have we provided a zero config edge HTML caching solution but by using the Workers platform we were also able to improve the performance of Google font loading for all pages.

We are launching the feature first for WordPress specifically but the concept can be applied to any website and/or content management system (CMS).

A new place to run WordPress plugins?

There are many individual WordPress plugins for performance that use similar optimizations to existing Cloudflare services. Automatic Platform Optimization is bringing them all together into one easy to use solution, deployed at the edge.

Traditionally you have to maintain server plugins with your WordPress installation. This comes with maintenance costs and can require a deep understanding of how to fine tune performance and security for each and every plugin. Providing the optimizations on the client side can also lead to performance problems due to the costs of JavaScript execution. In contrast, most of the optimizations could be built into Cloudflare’s edge rather than running on the server or the client. Automatic Platform Optimization will always be up to date with the latest performance and security best practices.

How to optimize for WordPress

By default, Cloudflare CDN caches assets based on file extension and doesn’t cache HTML content. It is possible to configure HTML caching with a Cache Everything Page rule, but it is a manual process and often requires additional features only available on the Business and Enterprise plans. So for the majority of WordPress websites, even with a CDN in front of them, HTML content is not cached. Requests for an HTML document have to go all the way to the origin.


Even if a CDN optimizes the connection between the closest edge and the website’s origin, the origin could be located far away and also be slow to respond, especially under load.

Move content closer to the user

One of the primary recommendations for speeding up websites is to move content closer to the end-user. This reduces the amount of time it takes for packets to travel between the end-user and the web server – the round-trip time (RTT). This improves the speed of establishing a connection as well as serving content from a closer location.

We have previously blogged about the benefits of edge caching HTML. Caching and serving HTML from the Cloudflare edge will greatly improve the time to first byte (TTFB) by optimizing DNS, connection setup, SSL negotiation, and removing the origin server response time. If your origin is slow in generating HTML and/or your user is far from the origin server, then all your performance metrics will be affected.

Most HTML isn’t really dynamic. It needs to be able to change relatively quickly when the site is updated but for a huge portion of the web, the content is static for months or years at a time. There are special cases like when a user is logged-in (as the admin or otherwise) where the content needs to differ but the vast majority of visits are of anonymous users.

Zero config edge caching revisited

The goal is to make updating content to the edge happen automatically. The edge will cache and serve the previous version of the content until there is new content available. This is usually achieved by triggering a cache purge to remove existing content. In fact, using a combination of our WordPress plugin and the Cloudflare cache purge API, we already support Automatic Cache Purge on Website Updates. This feature has been in use for many years.
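As a sketch of what a purge looks like on the wire, the function below assembles a request for Cloudflare’s purge-by-URL endpoint without sending it. The endpoint path and the `files` body field follow the public cache purge API; the function name, the `PurgeRequest` shape, and the placeholder zone ID and token are assumptions for illustration, and this is not necessarily the exact request the WordPress plugin issues.

```typescript
// Plain description of an HTTP request, kept separate from any HTTP client.
interface PurgeRequest {
  url: string;
  method: "POST";
  headers: { [name: string]: string };
  body: string;
}

// Describes a purge of specific URLs from a zone's cache.
// zoneId and apiToken are supplied by the caller.
function buildPurgeRequest(
  zoneId: string,
  apiToken: string,
  urls: string[]
): PurgeRequest {
  return {
    url: `https://api.cloudflare.com/client/v4/zones/${zoneId}/purge_cache`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiToken}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ files: urls }),
  };
}

// Example: purge a post and the home page after the post is edited.
const purge = buildPurgeRequest("ZONE_ID_PLACEHOLDER", "TOKEN_PLACEHOLDER", [
  "https://example.com/hello-world/",
  "https://example.com/",
]);
console.log(purge.url);
```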

Building automatic HTML edge caching is more nuanced than caching traditional static content like images, styles or scripts. It requires defining rules on what to cache and when to update the content. To help with that task we introduced a custom header to communicate caching rules between Cloudflare edge and origin servers.

The Cloudflare Worker runs from every edge data center, and the serverless platform takes care of scaling to our needs. Based on the request type, it will return HTML content from the Cloudflare cache using the Workers Cache API or serve a response directly from the origin. A specifically designed custom header provides information from the origin on how the script should handle the response. For example, the Worker script will never cache responses for authenticated users.

HTML Caching rules

With or without the Cloudflare for WordPress plugin, HTML edge caching requires all of the following conditions to be met:

Origin responds with 200 status
Origin responds with "text/html" content type
Request method is GET
Request path doesn’t contain query strings
Request doesn’t contain any WordPress specific cookies: "wp-*", "wordpress*", "comment_*", "woocommerce_*" unless it’s "wordpress_eli" or "wordpress_test_cookie".
Request doesn’t contain any of the following headers:
- "Cache-Control: no-cache"
- "Cache-Control: private"
- "Pragma: no-cache"
- "Vary: *"

Note that the caching is bypassed if the devtools are open and the “Disable cache” option is active.
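As a rough illustration, the conditions above can be read as one predicate in which every rule must hold. The sketch below is not the Worker's actual code; the function names and argument layout are hypothetical, and cookie names are assumed to be already parsed out of the Cookie header.

```python
from typing import Mapping

# Cookie name prefixes that mark a visitor as non-anonymous (from the rules above).
BYPASS_COOKIE_PREFIXES = ("wp-", "wordpress", "comment_", "woocommerce_")
# Cookies that match a prefix but are explicitly allowed.
ALLOWED_COOKIES = {"wordpress_eli", "wordpress_test_cookie"}
# Header directives that disable caching; header names are compared in lower case.
BYPASS_DIRECTIVES = {
    "cache-control": ("no-cache", "private"),
    "pragma": ("no-cache",),
    "vary": ("*",),
}


def has_bypass_cookie(cookie_names) -> bool:
    """True if any cookie matches a WordPress prefix and is not allow-listed."""
    return any(
        name.startswith(BYPASS_COOKIE_PREFIXES) and name not in ALLOWED_COOKIES
        for name in cookie_names
    )


def has_bypass_header(headers: Mapping[str, str]) -> bool:
    """True if any header carries one of the cache-disabling directives."""
    for name, value in headers.items():
        blocked = BYPASS_DIRECTIVES.get(name.lower(), ())
        directives = [d.strip().lower() for d in value.split(",")]
        if any(d in blocked for d in directives):
            return True
    return False


def is_html_cacheable(method, url, headers, cookie_names, status, content_type) -> bool:
    """Apply every rule; all of them must hold for the response to be cached."""
    return (
        status == 200
        and content_type.lower().startswith("text/html")
        and method == "GET"
        and "?" not in url
        and not has_bypass_cookie(cookie_names)
        and not has_bypass_header(headers)
    )
```

With these definitions, a request carrying only "wordpress_test_cookie" stays cacheable, while one carrying a cookie such as "wordpress_logged_in_abc" does not. The “Disable cache” option in devtools makes the browser send "Cache-Control: no-cache", which is why it triggers the header rule.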

Edge caching with plugin

The preferred solution requires a configured Cloudflare for WordPress plugin. We provide the following feature set when the plugin is activated:

HTML edge caching with 30 days TTL
30 seconds or faster cache invalidation
Bypass HTML caching for logged in users
Bypass HTML caching based on presence of WordPress specific cookies
Decrease load on origin servers. If a request is fetched from Cloudflare CDN Cache we skip the request to the origin server.

How is this implemented?

When an eyeball requests a page from a website and Cloudflare doesn’t have a copy of the content, it will be fetched from the origin. As the response is sent from the origin and goes through Cloudflare’s edge, the Cloudflare for WordPress plugin adds a custom header: cf-edge-cache. It allows an origin to configure caching rules applied on responses.

Based on the X-HTML-Edge-Cache proposal, the plugin adds a cf-edge-cache header to every origin response. There are 2 possible values:

cf-edge-cache: no-cache

The page contains private information that shouldn’t be cached by the edge. For example, an active session exists on the server.

cf-edge-cache: cache, platform=wordpress

This combination of cache and platform will ensure that the HTML page is cached. In addition, we run a number of checks for the presence of WordPress specific cookies to make sure we either bypass or allow caching on the edge.

If the header isn’t present, we assume that the Cloudflare for WordPress plugin is not installed or not up to date. In this case the feature operates without plugin support.
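The three cases (header says no-cache, header says cache for WordPress, header absent) can be sketched as a small dispatch function. The mode names returned here are hypothetical labels for illustration, and treating an unrecognised value like a missing header is an assumption, not documented behavior.

```python
from typing import Optional


def edge_cache_mode(header_value: Optional[str]) -> str:
    """Map a cf-edge-cache header value to a handling mode.

    Returns "bypass", "cache", or "no-plugin". These labels are
    hypothetical and exist only for this sketch.
    """
    if header_value is None:
        # Header absent: the plugin is missing or out of date.
        return "no-plugin"
    directives = [d.strip().lower() for d in header_value.split(",")]
    if "no-cache" in directives:
        return "bypass"
    if "cache" in directives and "platform=wordpress" in directives:
        return "cache"
    # Unrecognised value: handled like a missing header (an assumption).
    return "no-plugin"
```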

Edge caching without plugin

Using the Automatic Platform Optimization feature in combination with Cloudflare for WordPress plugin is our recommended solution. It provides the best feature set together with almost instant cache invalidation. Still, we wanted to provide performance improvements without the need for any installation on the origin server.

We provide the following feature set when the plugin is not activated:

HTML edge caching with 30 days TTL
Cache invalidation may take up to 30 minutes. A manual cache purge could be triggered to speed up cache invalidation
Bypass HTML caching based on presence of WordPress specific cookies
No decreased load on origin servers. If a request is fetched from Cloudflare CDN Cache we still require an origin response to apply cache invalidation logic.

Without the Cloudflare for WordPress plugin, we still cache HTML on the edge and serve the content from the cache when possible. The cache revalidation logic runs after the response has been served to the eyeball. The Worker’s waitUntil() callback allows code to run in the background without affecting the response to the eyeball.

We rely on the following headers to detect whether the content is stale and requires cache update:

ETag. If the cached version and origin response both include ETag and they are different we replace cached version with origin response. The behavior is the same for strong and weak ETag values.
Last-Modified. If the cached version and origin response both include Last-Modified and origin has a later Last-Modified date we replace cached version with origin response.
Date. If no ETag or Last-Modified header is available, we compare the cached version and origin response Date values. If there is more than a 30-minute difference, we replace the cached version with the origin response.
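The header checks above form a priority chain: ETag first, then Last-Modified, then Date. A minimal sketch of that chain follows, assuming headers are given as dictionaries with lower-case names; the function name and the choice to keep the cached copy when no header is comparable are assumptions.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime

# Maximum tolerated gap between cached and origin Date headers.
MAX_DATE_DRIFT = timedelta(minutes=30)


def is_stale(cached: dict, origin: dict) -> bool:
    """Return True when the cached copy should be replaced by the origin response.

    Both arguments map lower-case header names to their values.
    """
    if "etag" in cached and "etag" in origin:
        # Strong and weak validators are compared as plain strings.
        return cached["etag"] != origin["etag"]
    if "last-modified" in cached and "last-modified" in origin:
        cached_time = parsedate_to_datetime(cached["last-modified"])
        origin_time = parsedate_to_datetime(origin["last-modified"])
        return origin_time > cached_time
    if "date" in cached and "date" in origin:
        drift = parsedate_to_datetime(origin["date"]) - parsedate_to_datetime(cached["date"])
        return drift > MAX_DATE_DRIFT
    # No comparable headers: keep the cached copy (an assumption of this sketch).
    return False
```

The Date fallback explains the "up to 30 minutes" invalidation delay: without a validator from the origin, a change only becomes visible once the two Date values drift apart by more than the threshold.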

Getting content everywhere

Cloudflare Cache works great for frequently requested content. Regular requests to the site make sure the content stays in cache. For a typical personal blog, it is more common for the content to stay in cache only in some parts of our vast edge network. With the Automatic Platform Optimization release we wanted to improve loading time for a cold cache from any location in the world. We explored different approaches and decided to use Workers KV to improve Edge Caching.

In addition to Cloudflare’s CDN cache we put the content into Workers KV. It only requires a single request to the page to cache it and within a minute it is made available to be read back from KV from any Cloudflare data center.

Updating content

After an update has been made to the WordPress website, the plugin makes a request to Cloudflare’s API which both purges the cache and marks content as stale in KV. The next request for the asset will trigger revalidation of the content. If the plugin is not enabled, the cache revalidation logic is triggered as detailed previously.

We serve the stale copy of the content still present in KV and asynchronously fetch new content from the origin, apply possible optimizations and then cache it (both regular local CDN cache and globally in KV).

To store the content in KV we use a single namespace. It’s keyed with a combination of a zone identifier and the URL. For instance:

1:example.com/blog-post-1.html => "transformed & cached content"

For marking content as stale in KV we write a new key which will be read from the edge. If the key is present we will revalidate the content.

stale:1:example.com/blog-post-1.html => ""

Once the content has been revalidated, the stale marker key is deleted.
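The key scheme and the stale-marker lifecycle can be simulated with a plain dictionary standing in for the KV namespace. This is a sketch under that assumption: the helper names are hypothetical, and the refresh happens inline here, whereas the real flow revalidates asynchronously after the stale copy has been served.

```python
STALE_PREFIX = "stale:"


def content_key(zone_id: int, url: str) -> str:
    """Key for the cached page, e.g. '1:example.com/blog-post-1.html'."""
    return f"{zone_id}:{url}"


def stale_key(zone_id: int, url: str) -> str:
    """Key for the marker that flags the cached page as stale."""
    return STALE_PREFIX + content_key(zone_id, url)


def mark_stale(kv: dict, zone_id: int, url: str) -> None:
    """What a purge does in this sketch: write an empty marker key."""
    kv[stale_key(zone_id, url)] = ""


def read_page(kv: dict, zone_id: int, url: str, fetch_origin):
    """Return (content, revalidated).

    The stored copy is always what gets served; if a stale marker exists,
    the stored copy is refreshed from the origin and the marker is deleted.
    """
    content = kv.get(content_key(zone_id, url))
    revalidated = False
    if stale_key(zone_id, url) in kv:
        kv[content_key(zone_id, url)] = fetch_origin(url)
        del kv[stale_key(zone_id, url)]
        revalidated = True
    return content, revalidated
```

The first read after a purge still returns the old content, and the following read returns the refreshed copy, which mirrors the serve-stale-then-revalidate behavior described above.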

Moving optimizations to the edge

On top of caching HTML at the edge, we can pre-process and transform the HTML to make the loading of websites even faster for the user. Moving the development of this feature to our Cloudflare Workers environment makes it easy to add performance features such as improving Google Font loading. Using Google Fonts can cause significant performance issues because loading a font requires three sequential steps: loading the HTML page, then loading a CSS file, and finally loading the font itself. All of these steps use different domains.

The solution is for the worker to inline the CSS and serve the font directly from the edge minimizing the number of connections required.

If you read through the previous blog post’s implementation, you will see it required a lot of manual work to provide streaming HTML processing support and to handle character encodings. As the set of Worker APIs has improved over time, it is now much simpler to implement. Specifically, the addition of a streaming HTML rewriter/parser with a CSS-selector based API and the ability to suspend the parsing to asynchronously fetch a resource has reduced the code required to implement this from ~600 lines of source code to under 200.

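// fetchCSS and styleTag are helpers defined elsewhere in the Worker (not shown here)
export function transform(request, res) {
  return new HTMLRewriter()
    .on("link", {
      async element(e) {
        // a <link> without an href yields null, so fall back to an empty string
        const src = e.getAttribute("href") || "";
        const rel = e.getAttribute("rel");
        const isGoogleFont =
          src.startsWith("https://fonts.googleapis.com");

        if (isGoogleFont && rel === "stylesheet") {
          const media = e.getAttribute("media") || "all";
          const id = e.getAttribute("id") || "";
          try {
            // inline the stylesheet in place of the <link> element
            const content = await fetchCSS(src, request);
            e.replace(styleTag({ media, id }, content), {
              html: true
            });
          } catch (err) {
            // on failure, leave the original <link> untouched
            console.error(err);
          }
        }
      }
    })
    .transform(res);
}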

The HTML transformation doesn’t block the response to the user. It runs as a background task which, when complete, updates KV and replaces the globally cached version.

Making edge publishing generic

We are launching the feature for WordPress specifically but the concept can be applied to any website and content management system (CMS).

Introducing API Shield

2020-10-01 Patrick R. Donahue

Post Syndicated from Patrick R. Donahue original https://blog.cloudflare.com/introducing-api-shield/

APIs are the lifeblood of modern Internet-connected applications. Every millisecond they carry requests from mobile applications—place this food delivery order, “like” this picture—and directions to IoT devices—unlock the car door, start the wash cycle, my human just finished a 5k run—among countless other calls.

They’re also the target of widespread attacks designed to perform unauthorized actions or exfiltrate data, as data from Gartner increasingly shows: “by 2021, 90% of web-enabled applications will have more surface area for attack in the form of exposed APIs rather than the UI, up from 40% in 2019”, and “Gartner predicted that, by 2022, API abuses will move from an infrequent to the most-frequent attack vector, resulting in data breaches for enterprise web applications”. Of the 18 million requests per second that traverse Cloudflare’s network, 50% are directed towards APIs, with the majority of these requests blocked as malicious.

To combat these threats, Cloudflare is making it simple to secure APIs through the use of strong client certificate-based identity and strict schema-based validation. As of today, these capabilities are available free for all plans within our new “API Shield” offering. And as of today, the security benefits also extend to gRPC-based APIs, which use binary formats such as protocol buffers rather than JSON, and have been growing in popularity with our customer base.

Continue reading to learn more about the new capabilities, or jump right to the “Demonstration” section for examples of how to get started configuring your first API Shield rule.

Positive security models and client certificates

A “positive security” model is one that allows only known behavior and identities, while rejecting everything else. It is the opposite of the traditional “negative security” model enforced by a Web Application Firewall (WAF) that allows everything except for requests coming from problematic IPs, ASNs, countries or requests with problematic signatures (SQL injection attempts, etc.).

Implementing a positive security model for APIs is the most direct way to eliminate the noise of credential stuffing attacks and other automated scanning tools. And the first step towards a positive model is deploying strong authentication such as mutual TLS authentication, which is not vulnerable to the reuse or sharing of passwords.

Just as we simplified the issuance of server certificates back in 2014 with Universal SSL, API Shield reduces the process of issuing client certificates to clicking a few buttons in the Cloudflare Dashboard. Because we provide a fully hosted private public key infrastructure (PKI), you can focus on your applications and features rather than operating and securing your own certificate authority (CA).

Enforcing valid requests with schema validation

Once developers can be sure that only legitimate clients (with SSL certificates in hand) are connecting to their APIs, the next step in implementing a positive security model is making sure that those clients are making valid requests. Extracting a client certificate from a device and reusing it elsewhere is difficult, but not impossible, so it’s also important to make sure that the API is being called as intended.

Requests containing extraneous input may not have been anticipated by the API developer, and can cause problems if processed directly by the application, so these should be dropped at the edge if possible. API Schema validation works by matching the contents of API requests (the query parameters that come after the URL and the contents of the POST body) against a contract or “schema” that contains the rules for what is expected. If validation fails, the API call is blocked, protecting the origin from an invalid request or a malicious payload.
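To make the idea concrete, here is a minimal hand-rolled validator for a hypothetical API that accepts a numeric temperature and a string timestamp. It is only a sketch of the positive model (anything not in the schema is rejected); the schema format and function name are assumptions and do not reflect how API Shield represents or enforces schemas.

```python
import json

# Hypothetical schema: field name -> tuple of accepted Python types.
TEMPERATURE_SCHEMA = {"temperature": (int, float), "time": (str,)}


def validate(body: str, schema: dict) -> list:
    """Return a list of problems; an empty list means the request is valid."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return ["body is not valid JSON"]
    if not isinstance(payload, dict):
        return ["body must be a JSON object"]
    problems = []
    # Positive model: any field the schema does not name is rejected.
    for field in sorted(payload.keys() - schema.keys()):
        problems.append(f"unexpected field: {field}")
    for field, types in schema.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        # bool is a subclass of int in Python, so exclude it explicitly.
        elif isinstance(payload[field], bool) or not isinstance(payload[field], types):
            problems.append(f"wrong type for field: {field}")
    return problems
```

A request is forwarded only when the returned list is empty; a body with an extra field such as "admin" is rejected even though the expected fields are present and well formed.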

Schema validation is currently in closed beta for JSON payloads, with gRPC/protocol buffer support on the roadmap. If you would like to join the beta please open a support ticket with the subject “API Schema Validation Beta”. After the beta has ended, we plan to make schema validation available as part of the API Shield user interface.

Demonstration

To demonstrate how the APIs powering IoT devices and mobile applications can be secured, we have built an API Shield demonstration using client certificates and schema validation.

Temperatures are captured by an IoT device, represented in the demo by a Raspberry Pi 3 Model B+ with an external infrared temperature sensor, and then transmitted via a POST request to a Cloudflare-protected API. Temperatures are subsequently retrieved by GET requests and then displayed in a mobile application built in Swift for iOS.

In both cases, the API was actually built using Cloudflare Workers® and Workers KV, but can be replaced by any Internet-accessible endpoint.

1. API Configuration

Before configuring the IoT device and mobile application to communicate securely with the API, we need to bootstrap the API endpoints. To keep the example simple, while also allowing for additional customization, we’ve implemented the API as a Cloudflare Worker (borrowing code from the To-Do List tutorial).

In this particular example the temperatures are stored in Workers KV using the source IP address as a key, but this could easily be replaced by a value from the client certificate, e.g., the fingerprint. The code below saves a temperature and timestamp into KV when a POST is made, and returns the most recent 5 temperatures when a GET request is made.

const defaultData = { temperatures: [] }

const getCache = key => TEMPERATURES.get(key)
const setCache = (key, data) => TEMPERATURES.put(key, data)

async function addTemperature(request) {

    // pull previously recorded temperatures for this client
    const ip = request.headers.get('CF-Connecting-IP')
    const cacheKey = `data-${ip}`
    let data
    const cache = await getCache(cacheKey)
    if (!cache) {
        // start from a fresh copy so the shared defaultData object is never mutated
        data = { temperatures: [] }
        await setCache(cacheKey, JSON.stringify(data))
    } else {
        data = JSON.parse(cache)
    }

    // append the recorded temperatures with the submitted reading (assuming it has both temperature and a timestamp)
    try {
        const body = await request.text()
        const val = JSON.parse(body)

        if (val.temperature && val.time) {
            data.temperatures.push(val)
            await setCache(cacheKey, JSON.stringify(data))
            return new Response("", { status: 201 })
        } else {
            return new Response("Unable to parse temperature and/or timestamp from JSON POST body", { status: 400 })
        }
    } catch (err) {
        return new Response(err, { status: 500 })
    }
}

function compareTimestamps(a,b) {
    return -1 * (Date.parse(a.time) - Date.parse(b.time))
}

// return the 5 most recent temperature measurements
async function getTemperatures(request) {
    const ip = request.headers.get('CF-Connecting-IP')
    const cacheKey = `data-${ip}`

    const cache = await getCache(cacheKey)
    if (!cache) {
        return new Response(JSON.stringify(defaultData), { status: 200, headers: { 'content-type': 'application/json' } })
    } else {
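        // declare data locally (it was an implicit global) and use slice() to avoid mutating the array
        const data = JSON.parse(cache)
        const retval = JSON.stringify(data.temperatures.sort(compareTimestamps).slice(0,5))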
        return new Response(retval, { status: 200, headers: { 'content-type': 'application/json' } })
    }
}

async function handleRequest(request) {

    if (request.method === 'POST') {
        return addTemperature(request)
    } else {
        return getTemperatures(request)
    }

}

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

Before adding mutual TLS authentication, we’ll test POST’ing a random temperature reading:

$ TEMPERATURE=$(echo $((361 + RANDOM %11)) | awk '{printf("%.2f",$1/10.0)}')
$ TIMESTAMP=$(date -u +"%Y-%m-%dT%H:%M:%SZ")

$ echo -e "$TEMPERATURE\n$TIMESTAMP"
36.30
2020-09-28T02:57:49Z

$ curl -v -H "Content-Type: application/json" -d '{"temperature":'''$TEMPERATURE''', "time": "'''$TIMESTAMP'''"}' https://shield.upinatoms.com/temps 2>&1 | grep "< HTTP/2"
< HTTP/2 201

And here’s a subsequent read of that temperature, along with the previous 4 that were submitted:

$ curl -s https://shield.upinatoms.com/temps | jq .
[
  {
    "temperature": 36.3,
    "time": "2020-09-28T02:57:49Z"
  },
  {
    "temperature": 36.7,
    "time": "2020-09-28T02:54:56Z"
  },
  {
    "temperature": 36.2,
    "time": "2020-09-28T02:33:08Z"
  },
  {
    "temperature": 36.5,
    "time": "2020-09-28T02:29:22Z"
  },
  {
    "temperature": 36.9,
    "time": "2020-09-28T02:27:19Z"
  }
]

2. Client certificate issuance

With our API in hand, it’s time to lock it down to require a valid client certificate. Before doing so we’ll want to generate those certificates. To do so, you can either go to the SSL/TLS → Client Certificates tab of the Cloudflare Dashboard and click “Create Certificate” or you can automate the process via API calls.

Because most developers at scale will be generating their own private keys and CSRs and requesting that they be signed via API, we’ll show that process here. Using Cloudflare’s PKI toolkit, CFSSL, we’ll first create a bootstrap certificate for the iOS application, and then we’ll create a certificate for the IoT device:

$ cat <<'EOF' | tee -a csr.json
{
    "hosts": [
        "ios-bootstrap.devices.upinatoms.com"
    ],
    "CN": "ios-bootstrap.devices.upinatoms.com",
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [{
        "C": "US",
        "L": "Austin",
        "O": "Temperature Testers, Inc.",
        "OU": "Tech Operations",
        "ST": "Texas"
    }]
}
EOF

$ cfssl genkey csr.json | cfssljson -bare certificate
2020/09/27 21:28:46 [INFO] generate received request
2020/09/27 21:28:46 [INFO] received CSR
2020/09/27 21:28:46 [INFO] generating key: rsa-2048
2020/09/27 21:28:47 [INFO] encoded CSR

$ mv certificate-key.pem ios-key.pem
$ mv certificate.csr ios.csr

# and do the same for the IoT sensor
$ sed -i.bak 's/ios-bootstrap/sensor-001/g' csr.json
$ cfssl genkey csr.json | cfssljson -bare certificate
...
$ mv certificate-key.pem sensor-key.pem
$ mv certificate.csr sensor.csr

Generate a private key and CSR for the IoT device and iOS application

# we need to replace actual newlines in the CSR with '\n' before POSTing
$ CSR=$(cat ios.csr | perl -pe 's/\n/\\n/g')
$ request_body=$(< <(cat <<EOF
{
  "validity_days": 3650,
  "csr":"$CSR"
}
EOF
))

# save the response so we can view it and then extract the certificate
$ curl -H 'X-Auth-Email: YOUR_EMAIL' -H 'X-Auth-Key: YOUR_API_KEY' -H 'Content-Type: application/json' -d "$request_body" https://api.cloudflare.com/client/v4/zones/YOUR_ZONE_ID/client_certificates > response.json

$ cat response.json | jq .
{
  "success": true,
  "errors": [],
  "messages": [],
  "result": {
    "id": "7bf7f70c-7600-42e1-81c4-e4c0da9aa515",
    "certificate_authority": {
      "id": "8f5606d9-5133-4e53-b062-a2e5da51be5e",
      "name": "Cloudflare Managed CA for account 11cbe197c050c9e422aaa103cfe30ed8"
    },
    "certificate": "-----BEGIN CERTIFICATE-----\nMIIEkzCCA...\n-----END CERTIFICATE-----\n",
    "csr": "-----BEGIN CERTIFICATE REQUEST-----\nMIIDITCCA...\n-----END CERTIFICATE REQUEST-----\n",
    "ski": "eb2a48a19802a705c0e8a39489a71bd586638fdf",
    "serial_number": "133270673305904147240315902291726509220894288063",
    "signature": "SHA256WithRSA",
    "common_name": "ios-bootstrap.devices.upinatoms.com",
    "organization": "Temperature Testers, Inc.",
    "organizational_unit": "Tech Operations",
    "country": "US",
    "state": "Texas",
    "location": "Austin",
    "expires_on": "2030-09-26T02:41:00Z",
    "issued_on": "2020-09-28T02:41:00Z",
    "fingerprint_sha256": "84b045d498f53a59bef53358441a3957de81261211fc9b6d46b0bf5880bdaf25",
    "validity_days": 3650
  }
}

$ cat response.json | jq .result.certificate | perl -npe 's/\\n/\n/g; s/"//g' > ios.pem

# now ask that the second client certificate signing request be signed
$ CSR=$(cat sensor.csr | perl -pe 's/\n/\\n/g')
$ request_body=$(< <(cat <<EOF
{
  "validity_days": 3650,
  "csr":"$CSR"
}
EOF
))

$ curl -H 'X-Auth-Email: YOUR_EMAIL' -H 'X-Auth-Key: YOUR_API_KEY' -H 'Content-Type: application/json' -d "$request_body" https://api.cloudflare.com/client/v4/zones/YOUR_ZONE_ID/client_certificates | jq .result.certificate | perl -npe 's/\\n/\n/g; s/"//g' > sensor.pem

Ask Cloudflare to sign the CSRs with the private CA issued for your zone
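The newline juggling in the shell commands above exists because a PEM file contains real line breaks, which must be escaped inside a JSON string. As an alternative sketch, a JSON library can do the escaping and unescaping for you. The field names ("validity_days", "csr", "result", "certificate") are taken from the request and response shown above; the function names are hypothetical, and no network call is made here.

```python
import json


def build_signing_request(csr_pem: str, validity_days: int = 3650) -> str:
    """Serialize the request body; json.dumps escapes each newline in the PEM."""
    return json.dumps({"validity_days": validity_days, "csr": csr_pem})


def extract_certificate(response_json: str) -> str:
    """Pull the signed certificate out of a response; newlines come back unescaped."""
    return json.loads(response_json)["result"]["certificate"]
```

The string returned by extract_certificate can be written straight to a .pem file, with no further replacement of escaped newlines or quotes needed.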

3. API Shield rule creation

With certificates in hand we can now configure the API endpoint to require their use. Below is a demonstration of how to create such a rule.

The steps include specifying which hostnames to prompt for certificates, e.g., shield.upinatoms.com, and then creating the API Shield rule.

4. IoT Device Communication

To prepare the IoT device for secure communication with our API endpoint we need to embed the certificate on the device, and then point our application to it so it can be used when making the POST request to the API endpoint.

We securely copied the private key and certificate into /etc/ssl/private/sensor-key.pem and /etc/ssl/certs/sensor.pem, and then modified our sample script to point to these files:

import requests
import json
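from datetime import datetime, timezone

def readSensor():

    # Takes a reading from a temperature sensor and stores it in measurement
    # (a fixed value of 36.5 stands in for the sensor reading here)

    # use UTC so that the trailing 'Z' in the timestamp is accurate
    dateTimeObj = datetime.now(timezone.utc)
    timestampStr = dateTimeObj.strftime('%Y-%m-%dT%H:%M:%SZ')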

    measurement = {'temperature':str(36.5),'time':timestampStr}
    return measurement

def main():

    print("Cloudflare API Shield [IoT device demonstration]")

    temperature = readSensor()
    payload = json.dumps(temperature)
    
    url = 'https://shield.upinatoms.com/temps'
    json_headers = {'Content-Type': 'application/json'}
    cert_file = ('/etc/ssl/certs/sensor.pem', '/etc/ssl/private/sensor-key.pem')
    
    r = requests.post(url, headers = json_headers, data = payload, cert = cert_file)
    
    print("Request body: ", r.request.body)
    print("Response status code: %d" % r.status_code)

if __name__ == '__main__':
    main()

When the script attempts to connect to https://shield.upinatoms.com/temps, Cloudflare requests that a ClientCertificate is sent, and our script sends the contents of sensor.pem before demonstrating it has possession of sensor-key.pem as required to complete the SSL/TLS handshake.
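For readers who want to see what "requesting a client certificate" looks like from the server side, the standard library's ssl module can express the same requirement. This is only an analogy for what the edge does, not Cloudflare's implementation; the function name and the CA file path parameter are hypothetical, and a real server would also have to load its own certificate and key with load_cert_chain().

```python
import ssl
from typing import Optional


def make_mtls_server_context(client_ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Build a server-side TLS context that rejects clients without a certificate."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # CERT_REQUIRED makes the server ask for a client certificate and abort
    # the handshake if the client does not present one that verifies.
    context.verify_mode = ssl.CERT_REQUIRED
    if client_ca_file is not None:
        # Trust only client certificates issued by this CA (hypothetical path).
        context.load_verify_locations(cafile=client_ca_file)
    return context
```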

If we fail to send the client certificate, the request is rejected; the same happens if we include extraneous fields in the API request and schema validation (configuration not shown) fails:

Cloudflare API Shield [IoT device demonstration]
Request body:  {"temperature": "36.5", "time": "2020-09-28T15:52:19Z"}
Response status code: 403

If instead a valid certificate is presented and the payload follows the schema previously uploaded, our script POSTs the latest temperature reading to the API.

Cloudflare API Shield [IoT device demonstration]
Request body:  {"temperature": "36.5", "time": "2020-09-28T15:56:45Z"}
Response status code: 201

5. Mobile Application (iOS) Communication

Now that temperature readings have been sent to our API endpoint, it’s time to read them securely from our mobile application using one of the client certificates.

For purposes of brevity, we’re going to embed a “bootstrap” certificate and key as a PKCS#12 file within the application bundle. In a real world deployment, this bootstrap certificate should only be used alongside users’ credentials to authenticate to an API endpoint that can return a unique user certificate. Corporate users will want to use MDM to distribute certificates so that the underlying mobile

Package the certificate and private key

Before adding the bootstrap certificate and private key, we need to combine them into a binary PKCS#12 file. This binary file will then be added to our iOS application bundle.

$ openssl pkcs12 -export -out bootstrap-cert.pfx -inkey ios-key.pem -in ios.pem
Enter Export Password:
Verifying - Enter Export Password:

Add the certificate bundle to your iOS application

Within Xcode, click File → Add Files To “[Project Name]” and select your .pfx file. Make sure to check “Add to target” before confirming.

Modify your URLSession code to use the client certificate

This article provides a nice walkthrough of using a PKCS#12 class and URLSessionDelegate to modify your application to complete mutual TLS authentication when connecting to an API that requires it.

Looking Forward

In the coming months, we plan to expand API Shield with a number of additional features designed to protect API traffic. For customers that want to use their own PKI, we will provide the ability to import their own CAs, something available today as part of Cloudflare Access.

As we receive feedback on our schema validation beta, we will look to make the capability generally available to all customers. If you’re trying out the beta and have thoughts to share, we’d love to hear your feedback.

Beyond certificates and schema validation, we’re excited to layer on additional API security capabilities as well as deep analytics to help you better understand your APIs. If there are features you’d like to see, let us know in the comments below!

Announcing support for gRPC

2020-10-01 Achiel van der Mandele

Post Syndicated from Achiel van der Mandele original https://blog.cloudflare.com/announcing-grpc/

Today we’re excited to announce beta support for proxying gRPC, a next-generation protocol that allows you to build APIs at scale. With gRPC on Cloudflare, you get access to the security, reliability and performance features that you’re used to having at your fingertips for traditional APIs. Sign up for the beta today in the Network tab of the Cloudflare dashboard.

gRPC has proven itself to be a popular new protocol for building APIs at scale: it’s more efficient and built to offer superior bi-directional streaming capabilities. However, because gRPC uses newer technology, like HTTP/2, under the covers, existing security and performance tools did not support gRPC traffic out of the box. This meant that customers adopting gRPC to power their APIs had to pick between modernity on one hand, and things like security, performance, and reliability on the other. Because supporting modern protocols and making sure people can operate them safely and performantly is in our DNA, we set out to fix this.

When you put your gRPC APIs on Cloudflare, you immediately gain all the benefits that come with Cloudflare. Apprehensive of exposing your APIs to bad actors? Add security features such as WAF and Bot Management. Need more performance? Turn on Argo Smart Routing to decrease time to first byte. Or increase reliability by adding a Load Balancer.

And naturally, gRPC plugs in to API Shield, allowing you to add more security by enforcing client authentication and schema validation at the edge.

What is gRPC?

Protocols like JSON-REST have been the bread and butter of Internet facing APIs for several years. They’re great in that they operate over HTTP, their payloads are human readable, and a large body of tooling exists to quickly set up an API for another machine to talk to. However, the same things that make these protocols popular are also weaknesses; JSON, as an example, is inefficient to store and transmit, and expensive for computers to parse.

In 2015, Google introduced gRPC, a protocol designed to be fast and efficient, relying on binary protocol buffers to serialize messages before they are transferred over the wire. This prevents (normal) humans from reading them but results in much higher processing efficiency. gRPC has become increasingly popular in the era of microservices because it neatly addresses the shortfalls laid out above.

JSON	Protocol Buffers
{ "foo": "bar" }	0b111001001100001011000100000001100001010
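To see where the size difference comes from, the sketch below hand-encodes a single string field using the protocol buffers wire format and compares it with compact JSON. It assumes a hypothetical message definition in which "foo" is a string field with field number 1; on the wire only the field number is sent, never the field name. A real application would use generated protobuf classes rather than encoding bytes by hand.

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_string_field(field_number: int, text: str) -> bytes:
    """Encode one length-delimited (wire type 2) string field."""
    data = text.encode("utf-8")
    tag = (field_number << 3) | 2
    return encode_varint(tag) + encode_varint(len(data)) + data


json_bytes = json.dumps({"foo": "bar"}, separators=(",", ":")).encode("utf-8")
proto_bytes = encode_string_field(1, "bar")

print(len(json_bytes), json_bytes)    # 13 b'{"foo":"bar"}'
print(len(proto_bytes), proto_bytes)  # 5 b'\n\x03bar'
```

The protobuf form is one tag byte, one length byte, and the three bytes of the value, while the JSON form spends most of its bytes on punctuation and the field name.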

gRPC relies on HTTP/2 as a transport mechanism. This poses a problem for customers trying to deploy common security technologies like web application firewalls, as most reverse proxy solutions (including Cloudflare’s HTTP stack, until today) downgrade HTTP requests to HTTP/1.1 before sending them off to an origin.

Beyond microservices in a datacenter, the original use case for gRPC, adoption has grown in many other contexts. Many popular mobile apps have millions of users who all rely on messages being sent back and forth between mobile phones and servers. We’ve seen many customers wire up API connectivity for their mobile apps by using the same gRPC API endpoints they already have inside their data centers for communication with clients in the outside world.

While this solves the efficiency issues with running services at scale, it exposes critical parts of these customers’ infrastructure to the Internet, introducing security and reliability issues. Today we are introducing support for gRPC at Cloudflare, to secure and improve the experience of running gRPC APIs on the Internet.

How does gRPC + Cloudflare work?

The engineering work our team had to do to add gRPC support is composed of a few pieces:

Changes to the early stages of our request processing pipeline to identify gRPC traffic coming down the wire.
Additional functionality in our WAF to “understand” gRPC traffic, ensuring gRPC connections are handled correctly within the WAF, including inspecting all components of the initial gRPC connection request.
Adding support to establish HTTP/2 connections to customer origins for gRPC traffic, allowing gRPC to be proxied through our edge. HTTP/2 to origin support is currently limited to gRPC traffic, though we expect to expand the scope of traffic proxied back to origin over HTTP/2 soon.
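The first step, identifying gRPC traffic, can be illustrated with the two signals the gRPC-over-HTTP/2 protocol defines: the request arrives over HTTP/2 and its content type is application/grpc, optionally followed by a "+" subtype such as "+proto". The sketch below is a simplification with a hypothetical function name; how Cloudflare's pipeline actually performs this detection is not described in the post.

```python
def is_grpc_request(http_version: str, headers: dict) -> bool:
    """True for an HTTP/2 request whose content type is application/grpc[+subtype]."""
    content_type = ""
    for name, value in headers.items():
        if name.lower() == "content-type":
            # Drop any parameters after ';' and normalise the case.
            content_type = value.split(";")[0].strip().lower()
    if http_version != "HTTP/2":
        return False
    return content_type == "application/grpc" or content_type.startswith("application/grpc+")
```

Requests that match can then be routed to an origin connection that stays on HTTP/2, while everything else follows the existing path.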

What does this mean for you, a Cloudflare customer interested in using our tools to secure and accelerate your API? Because of the hard work we’ve done, enabling support for gRPC is a click of a button in the Cloudflare dashboard.

Using gRPC to build mobile apps at scale

Why does Cloudflare supporting gRPC matter? To dig in on one use case, let’s look at mobile apps. Apps need quick, efficient ways of interacting with servers to get the information needed to show on your phone. There is no browser, so they rely on APIs to get the information. API stands for application programming interface, and it is a standardized way for machines (say, your phone and a server) to talk to each other.

Let’s say we’re a mobile app developer with thousands, or even millions of users. With this many users, using a modern protocol, gRPC, allows us to run less compute infrastructure than would be necessary with older, less efficient protocols like JSON-REST. But exposing these endpoints, naked, on the Internet is really scary. Up until now there were very few, if any, options for protecting gRPC endpoints against application layer attacks with a WAF and guarding against volumetric attacks with DDoS mitigation tools. That changes today, with Cloudflare adding gRPC to its set of supported protocols.

With gRPC on Cloudflare, you get the full benefits of our security, reliability and performance products:

WAF for inspection of incoming gRPC requests. Use managed rules or craft your own.
Load Balancing to increase reliability: configure multiple gRPC backends to handle the load, and let Cloudflare distribute the load across them. Backend selection can be done in round-robin fashion, based on health checks or load.
Argo Smart Routing to increase performance by transporting your gRPC messages faster than the Internet would be able to route them. Messages are routed around congestion on the Internet, resulting in an average reduction of time to first byte by 30%.
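As a toy model of the round-robin selection with health checks mentioned in the list above, the sketch below rotates through a fixed set of backends and skips any that are marked unhealthy. The class and backend names are hypothetical, and this says nothing about how Cloudflare Load Balancing is implemented.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Rotate through backends, skipping any that are marked unhealthy."""

    def __init__(self, backends):
        self._backends = list(backends)
        self._healthy = set(self._backends)
        self._rotation = cycle(self._backends)

    def mark(self, backend, healthy: bool) -> None:
        """Record the result of a health check for one backend."""
        if healthy:
            self._healthy.add(backend)
        else:
            self._healthy.discard(backend)

    def pick(self):
        """Return the next healthy backend, or raise if none is available."""
        # One full turn of the rotation is enough to visit every backend once.
        for _ in range(len(self._backends)):
            candidate = next(self._rotation)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy backends")
```

When a backend fails its health check, the rotation simply passes over it, and it rejoins the rotation once it is marked healthy again.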

And of course, all of this works with API Shield, an easy way to add mTLS authentication to any API endpoint.

Enabling gRPC support

To enable gRPC support, head to the Cloudflare dashboard and go to the Network tab. From there you can sign up for the beta.

We have limited seats available at launch, but will open up more broadly over the next few weeks. After signing up and toggling gRPC support, you’ll have to enable Cloudflare proxying on your domain on the DNS tab to activate Cloudflare for your gRPC API.

We’re excited to bring gRPC support to the masses, allowing you to add the security, reliability and performance benefits that you’re used to getting with Cloudflare. Enabling is just a click away. Take it for a spin and let us know what you think!

Introducing Cloudflare Radar

2020-09-30 Marc Lamik

Post Syndicated from Marc Lamik original https://blog.cloudflare.com/introducing-cloudflare-radar/

Introducing Cloudflare Radar

Unlike the tides, Internet use ebbs and flows with the motion of the sun not the moon. Across the world usage quietens during the night and picks up as morning comes. Internet use also follows patterns that humans create, dipping down when people stopped to applaud healthcare workers fighting COVID-19, or pausing to watch their country’s president address them, or slowing for religious reasons.

And while humans leave a mark on the Internet, so do automated systems. These systems might be doing useful work (like building search engine databases) or harm (like scraping content, or attacking an Internet property).

All the while Internet use (and attacks) is growing. Zoom into any day and you’ll see the familiar daily wave of Internet use reflecting day and night, zoom out and you’ll likely spot weekends when Internet use often slows down a little, zoom out further and you might spot the occasional change in use caused by a holiday, zoom out further and you’ll see that Internet use grows inexorably.

And attacks don’t only grow, they change. New techniques are invented while old ones remain evergreen. DDoS activity continues day and night roaming from one victim to another. Automated scanning tools look for vulnerabilities in anything, literally anything, connected to the Internet.

Sometimes the Internet fails in a country, perhaps because of a cable cut somewhere beneath the sea, or because of government intervention. That too is something we track and measure.

All this activity, good and bad, shows up in the trends and details that Cloudflare tracks to help improve our service and protect our customers. Until today this insight was only available internally at Cloudflare. Today we are launching a new service, Cloudflare Radar, that shines a light on the Internet’s patterns.

Each second, Cloudflare handles on average 18 million HTTP requests and 6 million DNS requests. With 1 billion unique IP addresses connecting to Cloudflare’s network we have one of the most representative views on Internet traffic worldwide.

And by blocking 72 billion cyberthreats every day Cloudflare also has a unique position in understanding and mitigating Internet threats.

Our goal is to help build a better Internet and we want to do this by exposing insights, threats and trends based on the aggregated data that we have. We want to help anyone understand what is happening on the Internet from a security, performance and usage perspective. Every Internet user should have easy access to answer the questions that they have.

There are three key components that we’re launching today: Radar Internet Insights, Radar Domain Insights and Radar IP Insights.

Radar Internet Insights

At the top of Cloudflare Radar we show the latest news about events that are currently happening on the Internet. This includes news about the adoption of new technologies, browsers or operating systems. We are also keeping all users up to date with interesting events around developments in Internet traffic. This could be traffic patterns seen in specific countries or patterns related to events like the COVID-19 pandemic.

Below the news section users can find rapidly updated trend data, all of which can be viewed worldwide or by country. The data is available for several time frames: last hour, last 24 hours, last 7 days. We’ll soon make a 30-day time frame available to help explore longer term trends.

Change in Internet traffic

You can drill down on specific countries and Cloudflare Radar will show you the change in aggregate Internet traffic seen by our network for that country. We also show an info box on the right with a snapshot of interesting data points.

Worldwide and for individual countries we have an algorithm calculating which domains are most popular and have recently started trending (i.e. have seen a large change in popularity). Services with multiple domains and subdomains are aggregated to ensure best comparability. We show here the relative rank of domains and are able to spot big changes in ranking to highlight new trends as they appear.

The trending domains section is still in beta as we are training our algorithm to best detect the next big things as they emerge.

There is also a search bar that enables a user to search for a specific domain or IP address to get detailed information about it. More on that below.

Attack activity

The attack activity section gives information about different types of cyberattacks observed by Cloudflare. First we show the attacks mitigated by our Layer 3 and 4 Denial of Service prevention systems. We show the attack protocol used as well as the change in attack volume over the selected time frame.

Secondly, we show Layer 7 threat information based on requests that we blocked. Layer 7 requests get blocked by a variety of systems (such as our WAF, our layer 7 DDoS mitigation system and our customer configurable firewall). We show the system responsible for blocking as well as the change of blocked requests over the selected time frame.

Technology Trends

Based on the analytics we handle on HTTP requests we are able to show trends over a diverse set of data points. This includes the distribution of mobile vs. desktop traffic, or the percentage of traffic detected as coming from bots. We also dig into longer term trends like the use of HTTPS or the share of IPv6.

The bottom section shows the top browsers worldwide or for the selected country. In this example we selected Vietnam and you can see that over 6% of users are using Cốc Cốc, a local browser.

Radar Domain Insights

We give users the option to dig in deeper on an individual domain, with the opportunity to see its global ranking as well as security information. This enables everyone to identify potential threats and risks.

To look up a domain or hostname in Radar, type it in the search box within the top domains section on the Radar Internet Insights homepage.

For example, suppose you search for cloudflare.com. You’ll get sent to a domain-specific page with information about cloudflare.com.

At the top we provide an overview of the domain’s configuration with Domain Badges. From here you can, at a glance, understand what technologies the domain is using. For cloudflare.com you can see that it supports TLS, IPv6, DNSSEC and eSNI. There’s also an indication of the age of the domain (since registration) and its worldwide popularity.

Below you find the domain’s content categories. If you find a domain that is in the wrong category, please use our Domain Categorization Feedback to let us know.

We also show global popularity trends from our domain ranking formula. For domains with a global audience there’s also a map giving information about popularity by country.

Radar IP Insights

For an individual IP address (instead of a domain) we show different information. To look up an IP address, simply enter it in the search bar within the top domains section on the Radar Internet Insights homepage. For a quick lookup of your own IP just open radar.cloudflare.com/me.

For IPs we show the network (the ASN) and geographic information. For your own IP we also show more detailed location information as well as an invitation to check the speed of your Internet connection using speed.cloudflare.com.

Next Steps

The current product is just the beginning of Cloudflare’s approach to making knowledge about the Internet more accessible. Over the next few weeks and months we will add more data points and the 30-day time frame. And we’ll allow users to filter the charts not only by country but also by categorization (such as by industry).

Stay tuned for more to come.

Free, Privacy-First Analytics for a Better Web

2020-09-29 Jon Levine

Post Syndicated from Jon Levine original https://blog.cloudflare.com/free-privacy-first-analytics-for-a-better-web/

Free, Privacy-First Analytics for a Better Web

Everyone with a website needs to know some basic facts about their website: what pages are people visiting? Where in the world are they? What other sites sent traffic to my website?

There are “free” analytics tools out there, but they come at a cost: not money, but your users’ privacy. Today we’re announcing a brand new, privacy-first analytics service that’s open to everyone — even if they’re not already a Cloudflare customer. And if you’re a Cloudflare customer, we’ve enhanced our analytics to make them even more powerful than before.

The most important analytics feature: Privacy

The most popular analytics services available were built to help ad-supported sites sell more ads. But, a lot of websites don’t have ads. So if you use those services, you’re giving up the privacy of your users in order to understand how what you’ve put online is performing.

Cloudflare’s business has never been built around tracking users or selling advertising. We don’t want to know what you do on the Internet — it’s not our business. So we wanted to build an analytics service that gets back to what really matters for web creators, not necessarily marketers, and to give web creators the information they need in a simple, clean way that doesn’t sacrifice their visitors’ privacy. And giving web creators these analytics shouldn’t depend on their use of Cloudflare’s infrastructure for performance and security. (More on that in a bit.)

What does it mean for us to make our analytics “privacy-first”? Most importantly, it means we don’t need to track individual users over time for the purposes of serving analytics. We don’t use any client-side state, like cookies or localStorage, for the purposes of tracking users. And we don’t “fingerprint” individuals via their IP address, User Agent string, or any other data for the purpose of displaying analytics. (We consider fingerprinting even more intrusive than cookies, because users have no way to opt out.)

Counting visits without tracking users

One of the most essential stats about any website is: “how many people went there”? Analytics tools frequently show counts of “unique” visitors, which requires tracking individual users by a cookie or IP address.

We use the concept of a visit: a privacy-friendly measure of how people have interacted with your website. A visit is defined simply as a successful page view that has an HTTP referer that doesn’t match the hostname of the request. This tells you how many times people came to your website and clicked around before navigating away, but doesn’t require tracking individuals.
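To make the definition concrete, here is a minimal sketch of how a single page view could be classified. It is an illustration rather than Cloudflare's implementation: the function name `is_visit` is hypothetical, "successful" is assumed to mean a 2xx status, and a missing Referer header is assumed to count as a visit because it cannot match the request's hostname.

```python
from typing import Optional
from urllib.parse import urlsplit


def is_visit(status: int, request_host: str, referer: Optional[str]) -> bool:
    """Return True if a page view counts as a visit under the definition above."""
    # Assumption: "successful" means a 2xx response.
    if not 200 <= status < 300:
        return False
    # Assumption: no Referer (typed URL, bookmark) cannot match the request's
    # hostname, so this sketch counts it as a visit.
    if not referer:
        return True
    # urlsplit lowercases the hostname; it is None if the value has no host part.
    referer_host = urlsplit(referer).hostname
    return referer_host != request_host.lower()
```

Arriving from a search engine is a visit, while clicking from one page of the site to another is not, so `is_visit(200, "example.com", "https://example.com/pricing")` returns `False`.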

A visit has slightly different semantics from a “unique”, and you should expect this number to differ from other analytics tools.

All of the details, none of the bots

Our analytics deliver the most important metrics about your website, like page views and visits. But we know that an essential analytics feature is flexibility: the ability to add arbitrary filters, and slice-and-dice data as you see fit. Our analytics can show you the top hostnames, URLs, countries, and other critical metrics like status codes. You can filter on any of these metrics with a click and see the whole dashboard update.

I’m especially excited about two features in our time series charts: the ability to drag-to-zoom into a narrower time range, and the ability to “group by” different dimensions to see data in a different way. This is a super powerful way to drill into an anomaly in traffic and quickly see what’s going on. For example, you might notice a spike in traffic, zoom into that spike, and then try different groupings to see what contributed the extra clicks. A GIF is worth a thousand words:

And for customers of our Bot Management product, we’re working on the ability to detect (and remove) automated traffic. Coming very soon, you’ll be able to see which bots are reaching your website — with just a click, block them by using Firewall Rules.

This is all possible thanks to our ABR analytics technology, which enables us to serve analytics very quickly for websites large and small. Check out our blog post to learn more about how this works.

Edge or Browser analytics? Why not both?

There are two ways to collect web analytics data: at the edge (or on an origin server), or in the client using a JavaScript beacon.

Historically, Cloudflare has collected analytics data at our edge. This has some nice benefits over traditional, client-side analytics approaches:

It’s more accurate because you don’t miss users who block third-party scripts, or JavaScript altogether
You can see all of the traffic back to your origin server, even if an HTML page doesn’t load
We can detect (and block) bots, apply Firewall rules, and generally scrub traffic of unwanted noise
You can measure the performance of your origin server

More commonly, most web analytics providers use client-side measurement. This has some benefits as well:

You can understand performance as your users see it — e.g. how long did the page actually take to render
You can detect errors in client-side JavaScript execution
You can define custom event types emitted by JavaScript frameworks

Ultimately, we want our customers to have the best of both worlds. We think it’s really powerful to get web traffic numbers directly from the edge. We also launched Browser Insights a year ago to augment our existing edge analytics with more performance information, and today Browser Insights are taking a big step forward by incorporating Web Vitals metrics.

But, we know not everyone can modify their DNS to take advantage of Cloudflare’s edge services. That’s why today we’re announcing a free, standalone analytics product for everyone.

How do I get it?

For existing Cloudflare customers on our Pro, Business, and Enterprise plans, just go to your Analytics tab! Starting today, you’ll see a banner to opt in to the new analytics experience. (We plan to make this the default in a few weeks.)

But when building privacy-first analytics, we realized it’s important to make this accessible even to folks who don’t use Cloudflare today. You’ll be able to use Cloudflare’s web analytics even if you can’t change your DNS servers — just add our JavaScript, and you’re good to go.

We’re still putting the finishing touches on our JavaScript-based analytics, but you can sign up here and we’ll let you know when it’s ready.

The evolution of analytics at Cloudflare

Just over a year ago, Cloudflare’s analytics consisted of a simple set of metrics: cached vs uncached data transfer, or how many requests were blocked by the Firewall. Today we provide flexible, powerful analytics across all our products, including Firewall, Cache, Load Balancing and Network traffic.

While we’ve been focused on building analytics about our products, we realized that our analytics are also powerful as a standalone product. Today is just the first step on that journey. We have so much more planned: from real-time analytics, to ever-more performance analysis, and even allowing customers to add custom events.

We want to hear what you want most out of analytics — drop a note in the comments to let us know what you want to see next.

Explaining Cloudflare’s ABR Analytics

2020-09-29 Jamie Herre

Post Syndicated from Jamie Herre original https://blog.cloudflare.com/explaining-cloudflares-abr-analytics/

Explaining Cloudflare's ABR Analytics

Cloudflare’s analytics products help customers answer questions about their traffic by analyzing the mind-boggling, ever-increasing number of events (HTTP requests, Workers requests, Spectrum events) logged by Cloudflare products every day. The answers to these questions depend on the point of view of the question being asked, and we’ve come up with a way to exploit this fact to improve the quality and responsiveness of our analytics.

Useful Accuracy

Consider the following questions and answers:

What is the length of the coastline of Great Britain? 12.4K km
What is the total world population? 7.8B
How many stars are in the Milky Way? 250B
What is the total volume of the Antarctic ice shelf? 25.4M km³
What is the worldwide production of lentils? 6.3M tonnes
How many HTTP requests hit my site in the last week? 22.6M

Useful answers do not benefit from being overly exact. For large quantities, knowing the correct order of magnitude and a few significant digits gives the most useful answer. At Cloudflare, the difference in traffic between different sites or when a single site is under attack can cross nine orders of magnitude and, in general, all our traffic follows a Pareto distribution, meaning that what’s appropriate for one site or one moment in time might not work for another.

Because of this distribution, a query that scans a few hundred records for one customer will need to scan billions for another. A report that needs to load a handful of rows under normal operation might need to load millions when a site is under attack.

To get a sense of the relative difference of each of these numbers, remember “Powers of Ten”, an amazing visualization that Ray and Charles Eames produced in 1977. Notice that the scale of an image determines what resolution is practical for recording and displaying it.

Using ABR to determine resolution

This basic fact informed our design and implementation of ABR for Cloudflare analytics. ABR stands for “Adaptive Bit Rate”. The name is borrowed from the term as used in video streaming such as Cloudflare’s own Stream Delivery. In those cases, the server will select the best resolution for a video stream to match your client and network connection.

In our case, every analytics query that supports ABR will be calculated at a resolution matching the query. For example, if you’re interested to know from which country the most firewall events were generated in the past week, the system might opt to use a lower resolution version of the firewall data than if you had opted to look at the last hour. The lower resolution version will provide the same answer but take less time and fewer resources. By using multiple, different resolutions of the same data, our analytics can provide consistent response times and a better user experience.
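One way to picture "a resolution matching the query" is as a lookup that picks the finest sample that still fits a scan budget. The sketch below is an assumption-laden illustration, not the production logic: the power-of-ten intervals, the `row_budget` parameter and the function name `pick_interval` are all hypothetical.

```python
# Hypothetical sample intervals: keep 1 event in N, for N = 1, 10, ..., 1,000,000.
SAMPLE_INTERVALS = [10**k for k in range(7)]


def pick_interval(estimated_events: int, row_budget: int) -> int:
    """Return the finest sample interval whose expected scan fits row_budget."""
    for interval in SAMPLE_INTERVALS:  # finest resolution first
        if estimated_events / interval <= row_budget:
            return interval
    return SAMPLE_INTERVALS[-1]  # nothing fits: use the coarsest table
```

With a budget of one million rows, a query over 50,000 events in the last hour would read the full data (`pick_interval(50_000, 1_000_000)` returns 1), while a query over 22.6 million events in the last week would read the 1-in-100 sample.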

You might be aware that we use a columnar store called ClickHouse to store and process our analytics data. When using ABR with ClickHouse, we write the same data at multiple resolutions into separate tables. Usually, we cover seven orders of magnitude – from 100% to 0.0001% of the original events. We wind up using an additional 12% of disk storage but enable very fast ad hoc queries on the reduced resolution tables.
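The storage overhead is easy to sanity-check with a little arithmetic. The post does not say how the resolutions are spaced, so the sketch below assumes one table per power of ten between 100% and 0.0001%.

```python
from fractions import Fraction

# Assumption: one table per power of ten, from 100% down to 0.0001% of events.
fractions_kept = [Fraction(1, 10**k) for k in range(7)]

# Every table except the full-resolution one is extra storage.
extra = sum(fractions_kept[1:])
print(f"{float(extra):.4%}")  # prints 11.1111%
```

Under that assumption the reduced tables add about 11% on top of the full data, which is in the same range as the 12% quoted above. The gap could come from per-table overhead or a different spacing, neither of which the post specifies.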

Aggregations and Rollups

The ABR technique facilitates aggregations by making compact estimates of every dimension. Another way to achieve the same ends is with a system that computes “rollups”. Rollups save space by computing either complete or partial aggregations of the data as it arrives.

For example, suppose we wanted to count a total number of lentils. (Lentils are legumes and among the oldest and most widely cultivated crops. They are a staple food in many parts of the world.) We could just count each lentil as it passed through the processing system. Of course, because there are a lot of lentils, that system is distributed, meaning that there are hundreds of separate machines. Therefore we’ll actually have hundreds of separate counters.

Also, we’ll want to include more information than just the count, so we’ll also include the weight of each lentil and maybe 10 or 20 other attributes. And of course, we don’t want just a total for each attribute, but we’ll want to be able to break it down by color, origin, distributor and many other things, and also we’ll want to break these down by slices of time.

In the end, we’ll have tens of thousands or possibly millions of aggregations to be tabulated and saved every minute. These aggregations are expensive to compute, especially when using aggregations more complicated than simple counters and sums. They also destroy some information. For example, once we’ve processed all the lentils through the rollups, we can’t say for sure that we’ve counted them all, and most importantly, whichever attributes we neglected to aggregate are unavailable.
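A toy rollup makes the trade-off easier to see. The sketch below is only an illustration: the event fields (`minute`, `color`, `origin`, `weight_mg`) and the `rollup` helper are invented for the lentil example and do not describe any real pipeline.

```python
from collections import defaultdict

# Hypothetical lentil events; field names are invented for illustration.
events = [
    {"minute": 0, "color": "red", "origin": "CA", "weight_mg": 40},
    {"minute": 0, "color": "red", "origin": "IN", "weight_mg": 35},
    {"minute": 0, "color": "green", "origin": "CA", "weight_mg": 60},
    {"minute": 1, "color": "red", "origin": "CA", "weight_mg": 45},
]


def rollup(events, dimensions):
    """Aggregate count and total weight per (minute, *dimensions) key."""
    totals = defaultdict(lambda: {"count": 0, "weight_mg": 0})
    for event in events:
        key = (event["minute"],) + tuple(event[d] for d in dimensions)
        totals[key]["count"] += 1
        totals[key]["weight_mg"] += event["weight_mg"]
    return dict(totals)


by_color = rollup(events, ["color"])
```

Here `by_color[(0, "red")]` is `{"count": 2, "weight_mg": 75}`. Once only `by_color` is stored, there is no way to ask for a breakdown by `origin`: that attribute was not part of the key, so the information is gone unless a second rollup was computed for it.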

The number we’re counting, 6.3M tonnes, only includes two significant digits which can easily be achieved by counting a sample. Most of the rollup computations used on each lentil (on the order of 10¹³ to account for 6.3M tonnes) are wasted.
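A quick simulation shows why a sample is enough for a couple of significant digits. This is a sketch under stated assumptions: the events are synthetic, each one is kept independently with probability 1 in 100, and `estimate_total` is a hypothetical helper rather than anything from Cloudflare's pipeline.

```python
import random


def estimate_total(events, interval, rng):
    """Estimate the number of events by keeping about 1 in `interval` and scaling up."""
    sampled = sum(1 for _ in events if rng.random() < 1 / interval)
    return sampled * interval


rng = random.Random(0)  # fixed seed so the sketch is repeatable
true_total = 1_000_000
estimate = estimate_total(range(true_total), interval=100, rng=rng)
relative_error = abs(estimate - true_total) / true_total
```

With roughly 10,000 sampled events, the typical relative error is about 1/√10,000, or 1%, which is consistent with reporting two significant digits while touching only 1% of the data.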

Other forms of aggregations

So far, we’ve discussed ABR and its application to aggregations, but we’ve only given examples involving “counts” and “sums”. There are other, more complex forms of aggregations we use quite heavily. Two examples are “topK” and “count-distinct”.

A “topK” aggregation attempts to show the K most frequent items in a set. For example, the most frequent IP address, or country. To compute topK, just count the frequency of each item in the set and return the K items with the highest frequencies. Under ABR, we compute topK based on the set found in the matching resolution sample. Using a sample makes this computation a lot faster and less complex, but there are problems.

The estimate of topK derived from a sample is biased and dependent on the distribution of the underlying data. This can result in overestimating the significance of elements in the set as compared to their frequency in the full set. In practice this effect can only be noticed when the cardinality of the set is very high and you’re not going to notice this effect on a Cloudflare dashboard. If your site has a lot of traffic and you’re looking at the Top K URLs or browser types, there will be no difference visible at different resolutions. Also keep in mind that as long as we’re estimating the “proportion” of the element in the set and the set is large, the results will be quite accurate.
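The proportion-based view of topK can be sketched in a few lines. The traffic below is made up, and the 1-in-10 sample is taken systematically from an ordered list so the result is repeatable. That is a contrived best case: a random sample would land near these proportions rather than exactly on them.

```python
from collections import Counter


def top_k(items, k):
    """Return the k most frequent items with their share of the input."""
    counts = Counter(items)
    total = sum(counts.values())
    return [(item, count / total) for item, count in counts.most_common(k)]


# Hypothetical traffic: country codes for 1,000 requests.
full = ["US"] * 600 + ["DE"] * 300 + ["VN"] * 100
sample = full[::10]  # systematic 1-in-10 sample, used here for repeatability
```

Both `top_k(full, 2)` and `top_k(sample, 2)` return `[("US", 0.6), ("DE", 0.3)]`, even though the second call only looked at 100 items.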

The other fascinating aggregation we support is known as “count-distinct”, or number of uniques. In this case we want to know the number of unique values in a set. For example, how many unique cache keys have been used. We can safely say that a uniform random sample of the set cannot be used to estimate this number. However, we do have a solution.

We can generate another, alternate sample based on the value in question. For example, instead of taking a random sample of all requests, we take a random sample of IP addresses. This is sometimes called distinct reservoir sampling, and it allows us to estimate the true number of distinct IPs based on the cardinality of the sampled set. Again, there are techniques available to improve these estimates, and we’ll be implementing some of those.
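The idea of sampling by value rather than by request can be sketched as follows. Note the substitution: instead of a reservoir, this sketch keeps a value when a hash of it falls into one of `interval` buckets, which is a simpler stand-in with the same key property (every request from a given IP is either always kept or always dropped). The function names and the synthetic IP list are hypothetical.

```python
import hashlib


def in_sample(value: str, interval: int) -> bool:
    """Keep a value if its hash falls in 1 of `interval` buckets."""
    digest = hashlib.sha256(value.encode()).digest()
    return int.from_bytes(digest[:8], "big") % interval == 0


def estimate_distinct(values, interval: int) -> int:
    """Estimate the number of distinct values from a value-based sample."""
    kept = {v for v in values if in_sample(v, interval)}
    return len(kept) * interval


# Synthetic traffic: 20,000 distinct IPs, each seen three times.
ips = [f"10.{i // 65536}.{(i // 256) % 256}.{i % 256}" for i in range(20_000)]
requests = ips * 3
estimate = estimate_distinct(requests, interval=10)
```

Because membership depends only on the value, repeated requests from the same IP do not inflate the sample, and the sampled cardinality can be scaled by the interval to approximate the true number of distinct IPs.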

ABR improves resilience and scalability

Using ABR saves us resources. Even better, it allows us to query all the attributes in the original data, not just those included in rollups. And because the original events are preserved, it allows us to compare results across different sample intervals in separate tables as a check that the system is working correctly.

However, the greatest benefits of employing ABR are the ones that aren’t directly visible. Even under ideal conditions, a large distributed system such as Cloudflare’s data pipeline is subject to high tail latency. This occurs when any single part of the system takes longer than usual for any number of a long list of reasons. In these cases, the ABR system will adapt to provide the best results available at that moment in time.

For example, compare this chart showing Cache Performance for a site under attack with the same chart generated a moment later while we simulate a failure of some of the servers in our cluster. In the days before ABR, your Cloudflare dashboard would fail to load in this scenario. Now, with ABR analytics, you won’t see significant degradation.

Stretching the analogy to ABR in video streaming, we want you to be able to enjoy your analytics dashboard without being bothered by issues related to faulty servers, or network latency, or long running queries. With ABR you can get appropriate answers to your questions reliably and within a predictable amount of time.

In the coming months, we’re going to be releasing a variety of new dashboards and analytics products based on this simple but profound technology. Watch your Cloudflare dashboard for increasingly useful and interactive analytics.

Start measuring Web Vitals with Browser Insights

2020-09-29 Jon Levine

Post Syndicated from Jon Levine original https://blog.cloudflare.com/start-measuring-web-vitals-with-browser-insights/

Start measuring Web Vitals with Browser Insights

Many of us at Cloudflare obsess about how to make websites faster. But to improve performance, you have to measure it first. Last year we launched Browser Insights to help our customers measure web performance from the perspective of end users.

Today, we’re partnering with the Google Chrome team to bring Web Vitals measurements into Browser Insights. Web Vitals are a new set of metrics to help web developers and website owners measure and understand load time, responsiveness, and visual stability. And with Cloudflare’s Browser Insights, they’re easier to measure than ever – and it’s free for anyone to collect data from the whole web.

Why do we need Web Vitals?

When trying to understand performance, it’s tempting to focus on the metrics that are easy to measure — like Time To First Byte (TTFB). While TTFB and similar metrics are important to understand, we’ve learned that they don’t always tell the whole story.

Our partners on the Google Chrome team have tackled this problem by breaking down user experience into three components:

Loading: How long did it take for content to become available?
Interactivity: How responsive is the website when you interact with it?
Visual stability: How much does the page move around while loading? (I think of this as the inverse of “jankiness”)

It’s challenging to create a single metric that captures these high-level components. Thankfully, the folks on the Google Chrome team have thought about this, and earlier this year introduced three “Core” Web Vitals metrics: Largest Contentful Paint, First Input Delay, and Cumulative Layout Shift.

How do Web Vitals help make your website faster?

Measuring the Core Web Vitals isn’t the end of the story. Rather, they’re a jumping off point to understand what factors impact a website’s performance. Web Vitals tells you what is happening at a high level, and other more detailed metrics help you understand why user experience could be slow.

Take loading time, for example. If you notice that your Largest Contentful Paint score is “needs improvement”, you want to dig into what is taking so long to load! Browser Insights still measures navigation timing metrics like DNS lookup time and TTFB. By analyzing these metrics in turn, you might want to dig further into optimizing cache hit rates, tuning the performance of your origin server, or tweaking the order in which resources like JavaScript and CSS load.
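For readers wondering where a rating like “needs improvement” comes from, the sketch below maps a measurement to a rating. The thresholds are the ones published in Google's Web Vitals guidance, not something stated in this post, and the `rate` helper is hypothetical rather than part of Browser Insights.

```python
# Thresholds from Google's Web Vitals guidance (not stated in this post):
# (upper bound for "good", upper bound for "needs improvement").
THRESHOLDS = {
    "LCP": (2500, 4000),  # Largest Contentful Paint, milliseconds
    "FID": (100, 300),    # First Input Delay, milliseconds
    "CLS": (0.1, 0.25),   # Cumulative Layout Shift, unitless score
}


def rate(metric: str, value: float) -> str:
    """Classify a measurement as good, needs improvement, or poor."""
    good, needs_improvement = THRESHOLDS[metric]
    if value <= good:
        return "good"
    if value <= needs_improvement:
        return "needs improvement"
    return "poor"
```

For example, `rate("LCP", 3200)` returns `"needs improvement"`, which is the cue to start looking at DNS lookup time, TTFB and resource ordering.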

For more information about improving web performance, check out Google’s guides to improving LCP, FID, and CLS.

Why measure Web Vitals with Cloudflare?

First, we think that RUM (Real User Measurement) is a critical companion to synthetic measurement. While you can always try a few page loads on your own laptop and see the results, gathering data from real users is the only way to take into account real-life device performance and network conditions.

There are other great RUM tools out there. Google’s Chrome User Experience Report (CrUX) collects data about the entire web and makes it available through tools like Page Speed Insights (PSI), which combines synthetic and RUM results into useful diagnostic information.

One major benefit of Cloudflare’s Browser Insights is that it updates constantly; new data points are available shortly after seeing a request from an end-user. The data in the Chrome UX Report is a 28-day rolling average of aggregated metrics, so it can take weeks for a change to be fully reflected in the data.
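To see how much a rolling window delays feedback, consider a toy calculation. It takes the description above literally and models the report as a plain 28-day mean of one value per day. The real aggregation may well differ, and the site data is invented.

```python
def rolling_mean(daily_values, window=28):
    """Mean of the most recent `window` daily values."""
    recent = daily_values[-window:]
    return sum(recent) / len(recent)


# Hypothetical site: LCP is 4000 ms for 28 days, then a fix brings it to 2000 ms.
history = [4000] * 28
for day in range(1, 29):
    history.append(2000)
    if day in (7, 14, 28):
        print(day, rolling_mean(history))
# prints:
# 7 3500.0
# 14 3000.0
# 28 2000.0
```

A week after the fix, the rolling figure has only moved a quarter of the way, and the full improvement shows up after 28 days. Per-request reporting would show the new value right away.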

Another benefit of Browser Insights is that we can measure any browser — not just Chrome. As of this writing, the APIs necessary to report Web Vitals are only supported in Chromium browsers, but we’ll support Safari and Firefox when they implement those APIs.

Finally, Browser Insights is free to use! We’ve worked really hard to make our analytics blazing fast for websites with any amount of traffic. We’re excited to support slicing and grouping by URL, Browser, OS, and Country, and plan to support several more dimensions soon.

Push a button to start measuring

To start using Browser Insights, just head over to the Speed tab in the dashboard. Starting today, Web Vitals metrics are available for everyone!

Behind the scenes, Browser Insights works by inserting a JavaScript “beacon” into HTML pages. You can control where the beacon loads if you only want to measure specific pages or hostnames. If you’re using CSP version 3, we’ll even automatically detect the nonce (if present) and add it to the script.

Where we’ve been, and where we’re going

We’ve been really proud of the success of Browser Insights. We’ve been hard at work over the last year making lots of improvements — for example, we’ve made the dashboard fast and responsive (and still free!) even for the largest websites.

Coming soon, we’re excited to make this available for all our Web Analytics customers — even those who don’t use Cloudflare today. We’re also hard at work adding much-requested features like client-side error reporting, and diagnostics tools to make it easier to understand where to improve.

Birthday Week on Cloudflare TV: Announcing 24 Hours of Live Discussions on the Future of the Internet

2020-09-28 Jason Kincaid

Post Syndicated from Jason Kincaid original https://blog.cloudflare.com/birthday-week-on-cloudflare-tv-announcing-24-hours-of-live-discussions-on-the-future-of-the-internet/

Birthday Week on Cloudflare TV: Announcing 24 Hours of Live Discussions on the Future of the Internet

This week marks Cloudflare’s 10th birthday, and we’re excited to continue our annual tradition of launching an array of products designed to help give back to the Internet. (Check back here each morning for the latest!)

We also see this milestone as an opportunity to reflect on where the Internet was ten years ago, and where it might be headed over the next decade. So we reached out to some of the people we respect most to see if they’d be interested in joining us for a series of Fireside Chats on Cloudflare TV.

We’ve been blown away by the response, and are thrilled to announce our lineup of speakers, featuring many of the most celebrated names in tech and beyond. Among the highlights: Apple co-founder Steve Wozniak, Zoom CEO Eric Yuan, OpenTable CEO Debby Soo, Stripe co-founder and President John Collison, former Google CEO and Executive Chairman and Schmidt Futures co-founder Eric Schmidt, former McAfee CEO Chris Young, Magic Leap CEO and longtime Microsoft executive Peggy Johnson, former Seal Team 6 Commander Dave Cooper, Project Include CEO Ellen Pao, and so many more. All told, we have over 24 hours of live discussions scheduled throughout the week.

To tune in, just head to Cloudflare TV (no registration required). You can view the details for each session by clicking the links below, where you’ll find handy Add to Calendar buttons to make sure you don’t miss anything. We’ll also be rebroadcasting these talks throughout the week, so they’ll be easily accessible in different timezones.

A tremendous thank you to everyone on this list for helping us celebrate Cloudflare’s 10th annual Birthday Week!

Jay Adelson

Founder of Equinix and Chairman & Co-Founder of Scorbit

Thursday, October 1, 10:00 AM (PDT) // Add to Calendar

Shellye Archambeau

Fortune 500 Board Member and Author & Former CEO of MetricStream

Thursday, October 1, 6:30 PM (PDT) // Add to Calendar

Abhinav Asthana

Founder & CEO of Postman

Wednesday, September 30, 3:30 PM (PDT) // Add to Calendar

Azeem Azhar

Founder of Exponential View

Friday, October 2, 9:00 AM (PDT) // Add to Calendar

John Battelle

Co-Founder & CEO of Recount Media

Wednesday, September 30, 8:30 AM (PDT) // Add to Calendar

Christian Beedgen

CTO & Co-Founder of SumoLogic

Details coming soon

Scott Belsky

Chief Product Officer and Executive Vice President, Creative Cloud at Adobe

Wednesday, September 30, 11:00 AM (PDT) // Add to Calendar

Gleb Budman

CEO & Co-Founder of Backblaze

Wednesday, September 30, 2:00 PM (PDT) // Add to Calendar

Hayden Brown

CEO of Upwork

Details coming soon

Stewart Butterfield

CEO of Slack

Thursday, October 1, 8:30 AM (PDT) // Add to Calendar

John P. Carlin

Former Assistant Attorney General for the US Department of Justice’s National Security Division and current Chair of Morrison & Foerster’s Global Risk + Crisis Management practice

Tuesday, September 29, 12:00 PM (PDT) // Add to Calendar

John Collison

Co-Founder & President of Stripe

Friday, October 2, 3:00 PM (PDT) // Add to Calendar

Dave Cooper

Former Seal Team 6 Commander

Tuesday, September 29, 10:30 AM (PDT) // Add to Calendar

Scott Galloway

Founder & Chair of L2

Wednesday, September 30, 12:00 PM (PDT) // Add to Calendar

Kara Goldin

Founder & CEO of Hint Inc.

Thursday, October 1, 12:30 PM (PDT) // Add to Calendar

David Gosset

Founder of Europe China Forum

Monday, September 28, 5:00 PM (PDT) // Add to Calendar

Jon Green

VP and Chief Technologist for Security at Aruba, a Hewlett Packard Enterprise company

Wednesday, September 30, 9:00 AM (PDT) // Add to Calendar

Arvind Gupta

Former CEO of MyGov, Govt. of India and current Head & Co-Founder of Digital India Foundation

Monday, September 28, 8:00 PM (PDT) // Add to Calendar

Anu Hariharan

Partner at Y Combinator

Monday, September 28, 1:00 PM (PDT) // Add to Calendar

Brett Hautop

VP of Global Design + Build at LinkedIn

Friday, October 2, 11:00 AM (PDT) // Add to Calendar

Erik Hersman

CEO of BRCK

Details coming soon

Jennifer Hyman

CEO & Co-Founder of Rent the Runway

Wednesday, September 30, 1:00 PM (PDT) // Add to Calendar

Peggy Johnson

CEO of Magic Leap and former Executive at Microsoft and Qualcomm

Details coming soon

David Kaye

Former UN Special Rapporteur

Details coming soon

Pam Kostka

CEO of All Raise

Thursday, October 1, 1:30 PM (PDT) // Add to Calendar

Raffi Krikorian

Managing Director at Emerson Collective and former Engineering Executive at Twitter & Uber

Friday, October 2, 1:30 PM (PDT) // Add to Calendar

Albert Lee

Co-Founder of MyFitnessPal

Monday, September 28, 12:00 PM (PDT) // Add to Calendar

Aaron Levie

CEO & Co-Founder of Box

Thursday, October 1, 4:30 PM (PDT) // Add to Calendar

Alexander Macgillivray

Co-Founder & GC of Alloy and former Deputy CTO of US Government

Thursday, October 1, 11:30 AM (PDT) // Add to Calendar

Ellen Pao

Former CEO of Reddit and current CEO of Project Include

Tuesday, September 29, 2:00 PM (PDT) // Add to Calendar

Keith Rabois

General Partner at Founders Fund and former COO of Square

Wednesday, September 30, 3:00 PM (PDT) // Add to Calendar

Eric Schmidt

Former CEO of Google and current Technical Advisor at Alphabet, Inc.

Monday, September 28, 12:30 PM (PDT) // Add to Calendar

Pradeep Sindhu

Founder & Chief Scientist at Juniper Networks, and Founder & CEO at Fungible

Wednesday, September 30, 11:30 AM (PDT) // Add to Calendar

Karan Singh

Co-Founder & Chief Operating Officer of Ginger

Monday, September 28, 3:00 PM (PDT) // Add to Calendar

Debby Soo

CEO of OpenTable and former Chief Commercial Officer of KAYAK

Details coming soon

Dan Springer

CEO of DocuSign

Thursday, October 1, 1:00 PM (PDT) // Add to Calendar

Bonita Stewart

Vice President, Global Partnerships & Americas Partnerships Solutions at Google

Friday, October 2, 2:00 PM (PDT) // Add to Calendar

Hemant Taneja

Managing Director at General Catalyst

Friday, October 2, 4:00 PM (PDT) // Add to Calendar

Bret Taylor

President & Chief Operating Officer of Salesforce

Friday, October 2, 12:00 PM (PDT) // Add to Calendar

Jennifer Tejada

CEO of PagerDuty

Details coming soon

Robert Thomson

Chief Executive at News Corp and former Editor-in-Chief at The Wall Street Journal & Dow Jones

Thursday, October 1, 12:00 PM (PDT) // Add to Calendar

Robin Thurston

Founder & CEO of Pocket Outdoor Media and former EVP, Chief Digital Officer of Under Armour

Monday, September 28, 12:00 PM (PDT) // Add to Calendar

Selina Tobaccowala

Chief Digital Officer at Openfit, Co-Founder of Gixo, and former President & CTO of SurveyMonkey

Details coming soon

Michael Wolf

Founder & CEO of Activate and former President and Chief Operating Officer of MTV Networks

Tuesday, September 29, 3:30 PM (PDT) // Add to Calendar

Josh Wolfe

Co-Founder and Managing Partner of Lux Capital

Details coming soon

Steve Wozniak

Co-Founder of Apple, Inc.

Wednesday, September 30, 10:00 AM (PDT) // Add to Calendar

Chris Young

Former CEO of McAfee

Thursday, October 1, 11:00 AM (PDT) // Add to Calendar

Eric Yuan

Founder & Chief Executive Officer of Zoom

Monday, September 28, 3:30 PM (PDT) // Add to Calendar

Introducing Cron Triggers for Cloudflare Workers

2020-09-28 Nancy Gao

Post Syndicated from Nancy Gao original https://blog.cloudflare.com/introducing-cron-triggers-for-cloudflare-workers/

Introducing Cron Triggers for Cloudflare Workers

Today the Cloudflare Workers team is thrilled to announce the launch of Cron Triggers. Before now, Workers were triggered purely by incoming HTTP requests, but starting today you’ll be able to set a schedule to run your Worker on a timed interval. This was a highly requested feature that we know a lot of developers will find useful; we heard your feedback after Serverless Week.

We are excited to offer this feature at no additional cost, and it will be available on both the Workers free tier and the paid tier, now called Workers Bundled. Since it doesn’t matter which city a Cron Trigger routes the Worker through, we are able to make the most of Cloudflare’s distributed system and send scheduled jobs to underutilized machinery. Running jobs on these quiet machines is both efficient and cost effective, and we are able to pass those cost savings down to you.

What is a Cron Trigger and how might I use such a feature?

In case you’re not familiar with Unix systems, the cron pattern allows you to schedule jobs to run periodically at fixed intervals or at scheduled times. Cron Triggers in the context of Workers allow users to set time-based invocations for a Worker. These Workers run on a recurring schedule, and differ from traditional Workers in that they do not fire on HTTP requests.

Most developers are familiar with the cron pattern and its usefulness across a wide range of applications. Pulling the latest data from APIs or running regular integration tests on a preset schedule are common examples of this.

“We’re excited about Cron Triggers. Workers is crucial to our stack, so using this feature for live integration tests will boost the developer experience.” – Brian Marks, Software Engineer at Bazaarvoice

How much does it cost to use Cron Triggers?

Triggers are included at no additional cost! Scheduled Workers count towards your request cap for both the free tier and Workers Bundled, but rest assured that there will be no hidden or extra fees. Our competitors charge extra for cron events, or in some cases offer a very limited free tier. We want to make this feature widely accessible and have decided not to charge on a per-trigger basis. While there are no limits for the number of triggers you can have across an account, note that there is a limit of 3 triggers per Worker script for this feature. You can read more about limits on Workers plans in this documentation.

How are you able to offer this feature at no additional cost?

Cloudflare supports a massive distributed system that spans 200+ cities across the globe. Our nodes are named for the IATA airport code that they are closest to. Most of the time we run Workers close to the request origin for performance reasons (e.g., SFO if you are in the Bay Area, or CDG if you are lucky enough to be in Paris 🥐🍷🧀). In a typical HTTP Worker, we do this because we know that performance is of material importance when someone is waiting for the response.

In the case of Cron Triggers, where the user is running a task on a timed basis, those performance needs are different. A few milliseconds of extra latency do not matter as much when the user isn’t actively waiting for the response. The nature of the feature gives us much more flexibility on where to run the job, since it doesn’t have to necessarily be in a city close to the end user.

Cron Triggers are run on underutilized machines to make the best use of our capacity and route traffic efficiently. For example, a job scheduled from San Francisco at 7pm Pacific Time might be sent to Paris because it’s 4am there and traffic across Europe is low. Sending traffic to these machines during quiet hours is very efficient, and we are more than happy to pass those cost savings down to you. Aside from this scheduling optimization, Workers that are called by Cron Triggers behave similarly to and have all of the same performance and security benefits as typical HTTP Workers.

What’s happening under the hood?

At a high level, schedules created through our API create records in our database. These records contain the information necessary to execute the Worker on the given cron schedule. These records are then picked up by another service which continuously evaluates the state of our edge and distributes the schedules among cities. Once the schedules have been distributed to the edge, a service running in the node polls for changes to the schedules and makes sure they get sent to our runtime at the appropriate time.

If you want to know more details about how we implemented this feature, please refer to the technical blog.

What’s coming next?

With this feature, we’ve expanded what’s possible to build with Workers, and further simplified the developer experience. While Workers previously only ran on web requests, we believe the future of edge computing isn’t strictly tied to HTTP requests and responses. We want to introduce more types of Workers in the future.

We plan to expand our triggers to include different types, such as data or event-based triggers. Our goal is to give users more flexibility and control over when their Workers run. Cron Triggers are our first step in this direction. In addition, we plan to keep iterating on Cron Triggers to make edge infrastructure selection even more sophisticated and optimized. For example, we might even consider triggers that allow our users to run in the most energy-efficient data centers.

How to try Cron Triggers

Cron Triggers are live today! You can try them in the Workers dashboard by creating a new Worker and setting up a Cron Trigger.

Making Time for Cron Triggers: A Look Inside

2020-09-28 Aaron Lisman

Post Syndicated from Aaron Lisman original https://blog.cloudflare.com/cron-triggers-for-scheduled-workers/

Making Time for Cron Triggers: A Look Inside

Today, we are excited to launch Cron Triggers to the Cloudflare Workers serverless compute platform. We’ve heard the developer feedback, and we want to give our users the ability to run a given Worker on a scheduled basis. In case you’re not familiar with Unix systems, the cron pattern allows developers to schedule jobs to run at fixed intervals. This pattern is ideal for running any type of periodic job, like maintenance or calling third party APIs to get up-to-date data. Cron Triggers has been a highly requested feature even inside Cloudflare and we hope that you will find this feature as useful as we have!

Where are Cron Triggers going to be run?

Cron Triggers are executed from the edge. At Cloudflare, we believe strongly in edge computing and wanted our new feature to get all of the performance and reliability benefits of running on our edge. Thus, we wrote a service in core that is responsible for distributing schedules to a new edge service through Quicksilver which will then trigger the Workers themselves.

What’s happening under the hood?

At a high level, schedules created through our API create records in our database with the information necessary to execute the Worker and the given cron schedule. These records are then picked up by another service which continuously evaluates the state of our edge and distributes the schedules between cities.

Once the schedules have been distributed to the edge, a service running in the edge node polls for changes to the schedules and makes sure they get sent to our runtime at the appropriate time.

New Event Type

Cron Triggers gave us the opportunity to finally recognize a new Worker ‘type’ in our API. While Workers currently only run on web requests, we have lots of ideas for the future of edge computing that aren’t strictly tied to HTTP requests and responses. Expect to see even more new handlers in the future for other non-HTTP events like log information from your Worker (think custom wrangler tail!) or even TCP Workers.

Here’s an example of the new JavaScript API:

addEventListener('scheduled', event => {
  event.waitUntil(someAsyncFunction(event))
})

Where event has the following interface in TypeScript:

interface ScheduledEvent {
  type: 'scheduled';
  scheduledTime: number; // milliseconds since the Unix epoch
}

As long as your Worker has a handler for this new event type, you’ll be able to give it a schedule.
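As a minimal sketch of what such a handler might do, the scheduled event can drive a periodic data refresh. The `refreshData` helper, its injectable `fetchImpl` parameter, and the `https://example.com/api/latest` URL are illustrative assumptions, not part of the Workers API; only the `scheduled` event, `event.waitUntil`, and `scheduledTime` come from the interface above.

```javascript
// Hypothetical helper: pulls fresh data and returns a short summary.
// `fetchImpl` is injectable so the function can be exercised outside Workers.
async function refreshData(event, fetchImpl = fetch) {
  // scheduledTime is milliseconds since the Unix epoch.
  const firedAt = new Date(event.scheduledTime).toISOString();
  const response = await fetchImpl('https://example.com/api/latest'); // assumed URL
  if (!response.ok) {
    throw new Error(`Upstream returned ${response.status} at ${firedAt}`);
  }
  return { firedAt, payload: await response.json() };
}

// Register the handler only where the runtime provides addEventListener.
if (typeof addEventListener === 'function') {
  addEventListener('scheduled', event => {
    // waitUntil keeps the Worker alive until the promise settles.
    event.waitUntil(refreshData(event));
  });
}
```

Keeping the work in a separate function makes it possible to test the logic with a stubbed fetch before attaching a schedule.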

New APIs

PUT /client/v4/accounts/:account_identifier/workers/scripts/:name

The script upload API remains the same, but during script validation we now detect and return the registered event handlers.

PUT /client/v4/accounts/:account_identifier/workers/scripts/:name/schedules

Body

[
 {"cron": "* * * * *"},
 ...
]

This will create or modify all schedules for a script, removing all schedules not in the list. For now, there’s a limit of 3 distinct cron schedules per script. Schedules can run as often as once per minute, and cron expressions with a year field are not accepted (sorry, you’ll have to run your Y3K migration script another way).
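The request for this endpoint could be assembled roughly as follows. This is a sketch under stated assumptions: the `buildSchedulesRequest` helper is hypothetical, and the `https://api.cloudflare.com` host and Bearer-token header are assumptions about how the API is reached, since the post only gives the path and the body shape.

```javascript
const MAX_SCHEDULES = 3; // limit of distinct cron schedules stated above

// Hypothetical helper: describes the PUT request that replaces a script's schedules.
function buildSchedulesRequest(accountId, scriptName, crons, apiToken) {
  // The limit applies to distinct schedules, so drop duplicates first.
  const distinct = [...new Set(crons)];
  if (distinct.length > MAX_SCHEDULES) {
    throw new Error(`At most ${MAX_SCHEDULES} distinct cron schedules are allowed`);
  }
  return {
    // Host is an assumption; the path follows the endpoint shown above.
    url: `https://api.cloudflare.com/client/v4/accounts/${accountId}` +
         `/workers/scripts/${encodeURIComponent(scriptName)}/schedules`,
    init: {
      method: 'PUT',
      headers: {
        'Authorization': `Bearer ${apiToken}`, // assumed auth scheme
        'Content-Type': 'application/json',
      },
      // Body is a JSON array of {"cron": "..."} objects.
      body: JSON.stringify(distinct.map(cron => ({ cron }))),
    },
  };
}
```

The result can be passed straight to `fetch(url, init)`. Because the endpoint removes schedules that are not in the list, the array should always contain the full set you want to keep.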

GET /client/v4/accounts/:account_identifier/workers/scripts/:name/schedules

Response

{
  "schedules": [
    {
      "cron": "* * * * *",
      "created_on": <time>,
      "modified_on": <time>
    },
    ...
  ]
}

The Scheduler service is responsible for reading the schedules from Postgres and generating per-node schedules to place into Quicksilver. For now, the service simply avoids trying to execute your Worker on an edge node that may be disabled for some reason, but such an approach also gives us a lot of flexibility in deciding where your Worker executes.

In addition to edge node availability, we could optimize for compute cost, bandwidth, or even latency in the future!
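To illustrate the idea of generating per-node schedules while skipping disabled nodes, here is a small sketch. It is not the Scheduler service itself: the `distributeSchedules` function, the `{ id, enabled }` node shape, and the least-loaded placement rule are all assumptions chosen for illustration.

```javascript
// Assigns each schedule to the enabled node that currently holds the fewest
// schedules. `nodes` is a list of { id, enabled } records (hypothetical shape).
function distributeSchedules(schedules, nodes) {
  const enabled = nodes.filter(node => node.enabled);
  if (enabled.length === 0) {
    throw new Error('No enabled edge nodes available');
  }
  const perNode = new Map(enabled.map(node => [node.id, []]));
  for (const schedule of schedules) {
    // Pick the least-loaded node; ties go to the earliest node in the list.
    let target = enabled[0].id;
    for (const node of enabled) {
      if (perNode.get(node.id).length < perNode.get(target).length) {
        target = node.id;
      }
    }
    perNode.get(target).push(schedule);
  }
  return perNode; // Map of node id -> schedules to publish for that node
}
```

Swapping the placement rule for one based on compute cost, bandwidth, or latency would only change how `target` is chosen.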

What’s actually executing these schedules?

To consume the schedules and actually trigger the Worker, we built a new service in Rust and deployed it to our edge using HashiCorp Nomad. Nomad ensures that the schedule runner remains running in the edge node and can move it between machines as necessary. Rust was the best choice for this service since it needed to be fast and highly available, and needed Cap’n Proto RPC support for calling into the runtime. With Tokio, Anyhow, Clap, and Serde, it was easy to quickly get the service up and running without having to really worry about async, error handling, or configuration.

On top of that, due to our specific needs for cron parsing, we built a specialized cron parser using nom that allowed us to quickly parse and compile expressions into values that check against a given time to determine if we should run a schedule.
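The "compile once, then check against a time" approach can be sketched in JavaScript. This is not the Rust/nom parser described above: `compileCron` and `compileField` are hypothetical names, only `*`, lists, ranges, and steps are supported, matching is done in UTC (an assumption), and the day-of-month and day-of-week fields must both match, which is simpler than classic cron behaviour.

```javascript
// Allowed value range for each of the five cron fields.
const FIELD_RANGES = [
  [0, 59], // minute
  [0, 23], // hour
  [1, 31], // day of month
  [1, 12], // month
  [0, 6],  // day of week (0 = Sunday)
];

// Expands one field such as "*", "*/15", "1-5" or "0,30" into a Set of numbers.
function compileField(field, [min, max]) {
  const values = new Set();
  for (const part of field.split(',')) {
    const match = /^(?:\*|(\d+)(?:-(\d+))?)(?:\/(\d+))?$/.exec(part);
    if (!match) throw new Error(`Invalid cron field: ${field}`);
    const [, start, end, stepText] = match;
    const step = stepText === undefined ? 1 : Number(stepText);
    const lo = start === undefined ? min : Number(start);
    let hi;
    if (end !== undefined) hi = Number(end);
    else if (start === undefined || stepText !== undefined) hi = max;
    else hi = lo; // a single value such as "30"
    if (step < 1 || lo < min || hi > max || lo > hi) {
      throw new Error(`Invalid cron field: ${field}`);
    }
    for (let v = lo; v <= hi; v += step) values.add(v);
  }
  return values;
}

// Compiles an expression into a predicate that checks a Date (in UTC).
function compileCron(expression) {
  const fields = expression.trim().split(/\s+/);
  if (fields.length !== 5) {
    throw new Error('Expected exactly five fields (no year field)');
  }
  const sets = fields.map((field, i) => compileField(field, FIELD_RANGES[i]));
  return date =>
    sets[0].has(date.getUTCMinutes()) &&
    sets[1].has(date.getUTCHours()) &&
    sets[2].has(date.getUTCDate()) &&
    sets[3].has(date.getUTCMonth() + 1) &&
    sets[4].has(date.getUTCDay());
}
```

A runner would compile each expression once when schedules change, then call the returned predicate every minute, which keeps the per-tick check down to a handful of set lookups.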

Once the schedule runner has the schedules, it checks the time and selects the Workers that need to be run. To let the runtime know it’s time to run, we send a Cap’n Proto RPC message. The runtime then does its thing, calling the new ‘scheduled’ event handler instead of ‘fetch’.

How can I try this?

As of today, the Cron Triggers feature is live! Please try it out by creating a Worker and finding the Triggers tab – we’re excited to see what you build with it!

Workers Durable Objects Beta: A New Approach to Stateful Serverless

2020-09-28 Kenton Varda

Post Syndicated from Kenton Varda original https://blog.cloudflare.com/introducing-workers-durable-objects/

Workers Durable Objects Beta: A New Approach to Stateful Serverless

We launched Cloudflare Workers® in 2017 with a radical vision: code running at the network edge could not only improve performance, but also be easier to deploy and cheaper to run than code running in a single datacenter. That vision means Workers is about more than just edge compute — we’re rethinking how applications are built.

Using a “serverless” approach has allowed us to make deploys dead simple, and using isolate technology has allowed us to deliver serverless more cheaply and without the lengthy cold starts that hold back other providers. We added easy-to-use eventually-consistent edge storage to the platform with Workers KV.

But up until today, it hasn’t been possible to manage state with strong consistency, or to coordinate in real time between multiple clients, entirely on the edge. Thus, these parts of your application still had to be hosted elsewhere.

Durable Objects provide a truly serverless approach to storage and state: consistent, low-latency, distributed, yet effortless to maintain and scale. They also provide an easy way to coordinate between clients, whether it be users in a particular chat room, editors of a particular document, or IoT devices in a particular smart home. Durable Objects are the missing piece in the Workers stack that makes it possible for whole applications to run entirely on the edge, with no centralized “origin” server at all.

Today we are beginning a closed beta of Durable Objects.

Request a beta invite »

What is a “Durable Object”?

I’m going to be honest: naming this product was hard, because it’s not quite like any other cloud technology that is widely-used today. This proverbial bike shed has many layers of paint, but ultimately we settled on “Unique Durable Objects”, or “Durable Objects” for short. Let me explain what they are by breaking that down:

Objects: Durable Objects are objects in the sense of Object-Oriented Programming. A Durable Object is an instance of a class — literally, a class definition written in JavaScript (or your language of choice). The class has methods which define its public interface. An object is an instance of this class, combining the code with some private state.
Unique: Each object has a globally-unique identifier. That object exists in only one location in the whole world at a time. Any Worker running anywhere in the world that knows the object’s ID can send messages to it. All those messages end up delivered to the same place.
Durable: Unlike a normal object in JavaScript, Durable Objects can have persistent state stored on disk. Each object’s durable state is private to it, which means not only that access to storage is fast, but the object can even safely maintain a consistent copy of the state in memory and operate on it with zero latency. The in-memory object will be shut down when idle and recreated later on-demand.

What can they do?

Durable Objects have two primary abilities:

Storage: Each object has attached durable storage. Because this storage is private to a specific object, the storage is always co-located with the object. This means the storage can be very fast while providing strong, transactional consistency. Durable Objects apply the serverless philosophy to storage, splitting the traditional large monolithic databases up into many small, logical units. In doing so, we get the advantages you’ve come to expect from serverless: effortless scaling with zero maintenance burden.
Coordination: Historically, with Workers, each request would be randomly load-balanced to a Worker instance. Since there was no way to control which instance received a request, there was no way to force two clients to talk to the same Worker, and therefore no way for clients to coordinate through Workers. Durable Objects change that: requests related to the same topic can be forwarded to the same object, which can then coordinate between them, without any need to touch storage. For example, this can be used to facilitate real-time chat, collaborative editing, video conferencing, pub/sub message queues, game sessions, and much more.

The astute reader may notice that many coordination use cases call for WebSockets — and indeed, conversely, most WebSocket use cases require coordination. Because of this complementary relationship, along with the Durable Objects beta, we’ve also added WebSocket support to Workers. For more on this, see the Q&A below.

Region: Earth

When using Durable Objects, Cloudflare automatically determines the Cloudflare datacenter that each object will live in, and can transparently migrate objects between locations as needed.

Traditional databases and stateful infrastructure usually require you to think about geographical “regions”, so that you can be sure to store data close to where it is used. Thinking about regions can often be an unnatural burden, especially for applications that are not inherently geographical.

With Durable Objects, you instead design your storage model to match your application’s logical data model. For example, a document editor would have an object for each document, while a chat app would have an object for each chat. There is no problem creating millions or billions of objects, as each object has minimal overhead.

Killer app: Real-time collaborative document editing

Let’s say you have a spreadsheet editor application — or, really, any kind of app where users edit a complex document. It works great for one user, but now you want multiple users to be able to edit it at the same time. How do you accomplish this?

For the standard web application stack, this is a hard problem. Traditional databases simply aren’t designed to be real-time. When Alice and Bob are editing the same spreadsheet, you want every one of Alice’s keystrokes to appear immediately on Bob’s screen, and vice versa. But if you merely store the keystrokes to a database, and have the users repeatedly poll the database for new updates, at best your application will have poor latency, and at worst you may find database transactions repeatedly fail as users on opposite sides of the world fight over editing the same content.

The secret to solving this problem is to have a live coordination point. Alice and Bob connect to the same coordinator, typically using WebSockets. The coordinator then forwards Alice’s keystrokes to Bob and Bob’s keystrokes to Alice, without having to go through a storage layer. When Alice and Bob edit the same content at the same time, the coordinator resolves conflicts instantly. The coordinator can then take responsibility for updating the document in storage — but because the coordinator keeps a live copy of the document in-memory, writing back to storage can happen asynchronously.
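The coordinator pattern can be sketched in a few lines. This is an illustration under stated assumptions rather than production code: the `DocumentCoordinator` class is hypothetical, `storage` is assumed to be any object with an async `put(key, value)` method, the `send` callbacks stand in for WebSocket connections, and "conflict resolution" is reduced to applying edits in arrival order.

```javascript
// Hypothetical coordinator: one instance per document.
class DocumentCoordinator {
  constructor(storage) {
    this.storage = storage;      // assumed async key-value interface with put()
    this.sessions = new Map();   // userId -> send(message) callback
    this.document = [];          // live in-memory copy: ordered list of edits
    this.pendingWrite = Promise.resolve();
  }

  // Registers a user and returns the current state so late joiners catch up.
  join(userId, send) {
    this.sessions.set(userId, send);
    return [...this.document];
  }

  leave(userId) {
    this.sessions.delete(userId);
  }

  applyEdit(userId, edit) {
    // Edits are sequenced in arrival order, so every user sees one history.
    const entry = { seq: this.document.length, userId, edit };
    this.document.push(entry);
    // Forward directly to the other users without touching storage.
    for (const [otherId, send] of this.sessions) {
      if (otherId !== userId) send(entry);
    }
    // Persist in the background; writes are chained to preserve order.
    const snapshot = [...this.document];
    this.pendingWrite = this.pendingWrite.then(() =>
      this.storage.put('document', snapshot));
    return entry;
  }
}
```

Because all edits for a document pass through one instance, ordering is decided in memory and the storage write never sits on the path between Alice's keystroke and Bob's screen.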

Every big-name real-time collaborative document editor works this way. But for many web developers, especially those building on serverless infrastructure, this kind of solution has long been out-of-reach. Standard serverless infrastructure — and even cloud infrastructure more generally — just does not make it easy to assign these coordination points and direct users to talk to the same instance of your server.

Durable Objects make this easy. Not only do they make it easy to assign a coordination point, but Cloudflare will automatically create the coordinator close to the users using it and migrate it as needed, minimizing latency. The availability of local, durable storage means that changes to the document can be saved reliably in an instant, even if the eventual long-term storage is slower. Or, you can even store the entire document on the edge and abandon your database altogether.

With Durable Objects lowering the barrier, we hope to see real-time collaboration become the norm across the web. There’s no longer any reason to make users refresh for updates.

Example: An atomic counter

Here’s a very simple example of a Durable Object which can be incremented, decremented, and read over HTTP. This counter is consistent even when receiving simultaneous requests from multiple clients — none of the increments or decrements will be lost. At the same time, reads are served entirely from memory, no disk access needed.

export class Counter {
  // Constructor called by the system when the object is needed to
  // handle requests.
  constructor(controller, env) {
    // `controller.storage` is an interface to access the object's
    // on-disk durable storage.
    this.storage = controller.storage;
  }

  // Private helper method called from fetch(), below.
  async initialize() {
    let stored = await this.storage.get("value");
    this.value = stored || 0;
  }

  // Handle HTTP requests from clients.
  //
  // The system calls this method when an HTTP request is sent to
  // the object. Note that these requests strictly come from other
  // parts of your Worker, not from the public internet.
  async fetch(request) {
    // Make sure we're fully initialized from storage.
    if (!this.initializePromise) {
      this.initializePromise = this.initialize();
    }
    await this.initializePromise;

    // Apply requested action.
    let url = new URL(request.url);
    switch (url.pathname) {
      case "/increment":
        ++this.value;
        await this.storage.put("value", this.value);
        break;
      case "/decrement":
        --this.value;
        await this.storage.put("value", this.value);
        break;
      case "/":
        // Just serve the current value. No storage calls needed!
        break;
      default:
        return new Response("Not found", {status: 404});
    }

    // Return current value.
    return new Response(this.value);
  }
}

Once the class has been bound to a Durable Object namespace, a particular instance of Counter can be accessed from anywhere in the world using code like:

// Derive the ID for the counter object named "my-counter".
// This name is associated with exactly one instance in the
// whole world.
let id = COUNTER_NAMESPACE.idFromName("my-counter");

// Send a request to it.
let response = await COUNTER_NAMESPACE.get(id).fetch(request);

Demo: Chat

Chat is arguably real-time collaboration in its purest form. And to that end, we have built a demo open source chat app that runs entirely at the edge using Durable Objects.

Try the live demo »

See the source code on GitHub »

This chat app uses a Durable Object to control each chat room. Users connect to the object using WebSockets. Messages from one user are broadcast to all the other users. The chat history is also stored in durable storage, but this is only for history. Real-time messages are relayed directly from one user to others without going through the storage layer.

Additionally, this demo uses Durable Objects for a second purpose: Applying a rate limit to messages from any particular IP. Each IP is assigned a Durable Object that tracks recent request frequency, so that users who send too many messages can be temporarily blocked — even across multiple chat rooms. Interestingly, these objects don’t actually store any durable state at all, because they only care about very recent history, and it’s not a big deal if a rate limiter randomly resets on occasion. So, these rate limiter objects are an example of a pure coordination object with no storage.
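A storage-free rate limiter of this kind might look like the sketch below. It is not the demo's actual code: the `RateLimiter` class, its sliding-window rule, and the `limit` and `windowMs` parameters are assumptions made for illustration.

```javascript
// Hypothetical per-IP rate limiter that keeps all of its state in memory.
class RateLimiter {
  constructor(limit, windowMs) {
    this.limit = limit;       // maximum messages allowed per window
    this.windowMs = windowMs; // window length in milliseconds
    this.timestamps = [];     // recent message times; lost if the object resets
  }

  // Returns true if a message arriving at `now` is allowed.
  allow(now = Date.now()) {
    // Forget messages that have fallen out of the window.
    const cutoff = now - this.windowMs;
    this.timestamps = this.timestamps.filter(t => t > cutoff);
    if (this.timestamps.length >= this.limit) {
      return false;
    }
    this.timestamps.push(now);
    return true;
  }
}
```

If the object is evicted and recreated, the history is simply empty again, which matches the observation above that an occasional reset is acceptable for this use case.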

This chat app is only a few hundred lines of code. The deployment configuration is only a few lines. Yet, it will scale seamlessly to any number of chat rooms, limited only by Cloudflare’s available resources. Of course, any individual chat room’s scalability has a limit, since each object is single-threaded. But, that limit is far beyond what a human participant could keep up with anyway.

Other use cases

Durable Objects have a wide range of uses. Here are just a few ideas, beyond the ones described above:

Shopping cart: An online storefront could track a user’s shopping cart in an object. The rest of the storefront could be served as a fully static web site. Cloudflare will automatically host the cart object close to the end user, minimizing latency.
Game server: A multiplayer game could track the state of a match in an object, hosted on the edge close to the players.
IoT coordination: Devices within a family’s house could coordinate through an object, avoiding the need to talk to distant servers.
Social feeds: Each user could have a Durable Object that aggregates their subscriptions.
Comment/chat widgets: A web site that is otherwise static content can add a comment widget or even a live chat widget on individual articles. Each article would use a separate Durable Object to coordinate. This way the origin server can focus on static content only.

The Future: True Edge Databases

We see Durable Objects as a low-level primitive for building distributed systems. Some applications, like those mentioned above, can use objects directly to implement a coordination layer, or maybe even as their sole storage layer.

However, Durable Objects today are not a complete database solution. Each object can see only its own data. To perform a query or transaction across multiple objects, the application needs to do some extra work.

That said, every big distributed database – whether it be relational, document, graph, etc. – is, at some low level, composed of “chunks” or “shards” that store one piece of the overall data. The job of a distributed database is to coordinate between chunks.

We see a future of edge databases that store each “chunk” as a Durable Object. By doing so, it will be possible to build databases that operate entirely at the edge, fully distributed with no regions or home location. These databases need not be built by us; anyone can potentially build them on top of Durable Objects. Durable Objects are only the first step in the edge storage journey.
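One building block for such a database is a stable mapping from record keys to chunks. The sketch below shows one possible approach, assuming a fixed shard count and a hypothetical `shardNameForKey` helper that hashes keys with 32-bit FNV-1a over UTF-16 code units; nothing here is an announced Cloudflare design.

```javascript
// Maps a record key to one of `shardCount` shard names using a 32-bit
// FNV-1a hash, so the same key always lands on the same shard.
function shardNameForKey(key, shardCount) {
  let hash = 0x811c9dc5; // FNV-1a 32-bit offset basis
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by FNV prime, keep unsigned
  }
  return `shard-${hash % shardCount}`;
}
```

The resulting name could then be passed to a namespace's `idFromName()` so that every shard is backed by exactly one object. A fixed shard count is the main simplification here, since changing it would move most keys to different shards.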

Join the Beta

Storing data is a big responsibility which we do not take lightly. Because of the critical importance of getting it right, we are being careful. We will be making Durable Objects available gradually over the next several months.

As with any beta, this product is a work in progress, and some of what is described in this post is not fully enabled yet. Full details of beta limitations can be found in the documentation.

If you’d like to try out Durable Objects now, tell us about your use case. We’ll be selecting the most interesting use cases for early access.

Request a beta invite »

Q&A

Can Durable Objects serve WebSockets?

Yes.

As part of the Durable Objects beta, we’ve made it possible for Workers to act as WebSocket endpoints, either as a client or as a server. Before now, Workers could proxy WebSocket connections on to a back-end server, but could not speak the protocol directly.

While technically any Worker can speak WebSocket in this way, WebSockets are most useful when combined with Durable Objects. When a client connects to your application using a WebSocket, you need a way for server-generated events to be sent back to the existing socket connection. Without Durable Objects, there’s no way to send an event to the specific Worker holding a WebSocket. With Durable Objects, you can now forward the WebSocket to an Object. Messages can then be addressed to that Object by its unique ID, and the Object can then forward those messages down the WebSocket to the client.

The chat app demo presented above uses WebSockets. Check out the source code to see how it works.
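The routing idea can also be sketched without any platform code. The example below is a simulation under stated assumptions, not the Workers or Durable Objects API: `RoomObject`, `ObjectNamespace`, and `fakeSocket` are hypothetical names, and the fake socket only records what it is sent where a real WebSocket would transmit it.

```javascript
// Hypothetical stand-in for a Durable Object that holds client sockets.
class RoomObject {
  constructor(id) {
    this.id = id;
    this.sockets = new Set();
  }
  accept(socket) { this.sockets.add(socket); }
  // Forward a server-generated event down every socket this object holds.
  deliver(message) {
    for (const socket of this.sockets) socket.send(message);
  }
}

// Hypothetical registry: resolves a unique ID to exactly one object.
class ObjectNamespace {
  constructor() { this.objects = new Map(); }
  get(id) {
    if (!this.objects.has(id)) this.objects.set(id, new RoomObject(id));
    return this.objects.get(id);
  }
}

// Fake socket that records messages instead of sending them over a network.
function fakeSocket() {
  const received = [];
  return { received, send: (message) => received.push(message) };
}

const rooms = new ObjectNamespace();
const alice = fakeSocket();
const bob = fakeSocket();
const carol = fakeSocket();
rooms.get("room-42").accept(alice);
rooms.get("room-42").accept(bob);
rooms.get("room-7").accept(carol);

// Any caller that knows the ID can reach the object holding the sockets.
rooms.get("room-42").deliver("hello");
```

Because the message is addressed to the object by ID, the sender does not need to know which sockets exist or where they are held. Only the clients attached to "room-42" receive it.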

How does this compare to Workers KV?

Two years ago, we introduced Workers KV, a global key-value data store. KV is a fairly minimalist global data store that serves certain purposes well, but is not for everyone. KV is eventually consistent, which means that writes made in one location may not be visible in other locations immediately. Moreover, it implements “last write wins” semantics, which means that if a single key is being modified from multiple locations in the world at once, it’s easy for those writes to overwrite each other. KV is designed this way to support low-latency reads for data that doesn’t frequently change. However, these design decisions make KV inappropriate for state that changes frequently, or when changes need to be immediately visible worldwide.

Durable Objects, in contrast, are not primarily a storage product at all; many use cases for them do not actually utilize durable storage. To the extent that they do provide storage, Durable Objects sit at the opposite end of the storage spectrum from KV. They are extremely well-suited to workloads requiring transactional guarantees and immediate consistency. However, transactions inherently must be coordinated in a single location, so clients on the opposite side of the world from that location will experience moderate latency due to the inherent limitations of the speed of light. Durable Objects will combat this problem by auto-migrating to live close to where they are used.

In short, Workers KV remains the best way to serve static content, configuration, and other rarely-changing data around the world, while Durable Objects are better for managing dynamic state and coordination.
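The difference between "last write wins" and a single coordination point can be illustrated with a toy counter. This is a deliberately simplified model, not a description of how Workers KV or Durable Objects are implemented: `LastWriteWinsStore`, `CounterObject`, the location names, and the integer timestamps are all assumptions made for the sketch.

```javascript
// Toy eventually consistent store: each location works from its own replica,
// and a later sync keeps whichever write carries the newest timestamp.
class LastWriteWinsStore {
  constructor() { this.replicas = new Map(); }
  replica(location) {
    if (!this.replicas.has(location)) {
      this.replicas.set(location, { value: 0, ts: 0 });
    }
    return this.replicas.get(location);
  }
  read(location) { return this.replica(location).value; }
  write(location, value, ts) { this.replicas.set(location, { value, ts }); }
  // Converge every replica on the most recent write.
  sync() {
    let winner = { value: 0, ts: 0 };
    for (const r of this.replicas.values()) {
      if (r.ts > winner.ts) winner = r;
    }
    for (const location of this.replicas.keys()) {
      this.replicas.set(location, { ...winner });
    }
    return winner.value;
  }
}

// Two locations read the same starting value, then each writes "value + 1".
const kv = new LastWriteWinsStore();
const seenInLondon = kv.read("london");
const seenInTokyo = kv.read("tokyo");
kv.write("london", seenInLondon + 1, 1);
kv.write("tokyo", seenInTokyo + 1, 2);
const lastWriteWinsTotal = kv.sync(); // one of the two increments is lost

// Toy single coordination point: every increment passes through one object.
class CounterObject {
  constructor() { this.value = 0; }
  increment() { this.value += 1; return this.value; }
}

const counter = new CounterObject();
counter.increment(); // request arriving from London
counter.increment(); // request arriving from Tokyo
const coordinatedTotal = counter.value;
```

In the first half both writers computed their result from a stale read, so the later write silently replaces the earlier one. In the second half the two increments are applied one after the other, which is the property that frequently changing state needs.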

Going forward, we plan to utilize Durable Objects in the implementation of Workers KV itself, in order to deliver even better performance.

Why not use CRDTs?

You can build CRDT-based storage on top of Durable Objects, but Durable Objects do not require you to use CRDTs.

Conflict-free Replicated Data Types (CRDTs), or their cousins, Operational Transforms (OTs), are a technology that allows data to be edited from multiple places in the world simultaneously without synchronization, and without data loss. For example, these technologies are commonly used in the implementation of real-time collaborative document editors, so that a user’s keypresses can show up in their local copy of the document in real time, without waiting to see if anyone else edited another part of the document first. Without getting into details, you can think of these techniques like a real time version of “git fork” and “git merge”, where all merge conflicts are resolved automatically in a deterministic way, so that everyone ends up with the same state in the end.

CRDTs are a powerful technology, but applying them correctly can be challenging. Only certain kinds of data structures lend themselves to automatic conflict resolution in a way that doesn’t lead to easy data loss. Any developer familiar with git can see the problem: arbitrary conflict resolution is hard, and any automated algorithm for it will likely get things wrong sometimes. It’s all the more difficult if the algorithm has to handle merges in arbitrary order and still get the same answer.
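A grow-only counter is one of the simplest data structures that does merge cleanly, and it shows what "merges in arbitrary order and still get the same answer" requires. The sketch below is a generic textbook G-Counter, not anything specific to Durable Objects; the function names and replica IDs are chosen for illustration.

```javascript
// Grow-only counter (G-Counter): one slot per replica.
// Each replica increments only its own slot.
function increment(state, replicaId) {
  return { ...state, [replicaId]: (state[replicaId] || 0) + 1 };
}

// Merge takes the per-slot maximum, so it is commutative and idempotent.
function merge(a, b) {
  const out = { ...a };
  for (const [id, n] of Object.entries(b)) {
    out[id] = Math.max(out[id] || 0, n);
  }
  return out;
}

// The counter's value is the sum of all slots.
function value(state) {
  return Object.values(state).reduce((sum, n) => sum + n, 0);
}

// Two replicas edit independently, without synchronizing.
let london = {};
let tokyo = {};
london = increment(london, "london");
london = increment(london, "london");
tokyo = increment(tokyo, "tokyo");

// Merging in either order yields the same total.
const londonThenTokyo = merge(london, tokyo);
const tokyoThenLondon = merge(tokyo, london);
```

The trick works only because increments never conflict. Once the data structure must support operations like decrement-with-a-floor or "sell this ticket exactly once", a per-slot maximum no longer captures the intent, which is the limitation the paragraph above describes.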

We feel that, for most applications, CRDTs are overly complex and not worth the effort. Worse, the set of data structures that can be represented as a CRDT is too limited for many applications. It’s usually much easier to assign a single authoritative coordination point for each document, which is exactly what Durable Objects accomplish.

With that said, CRDTs can be used on top of Durable Objects. If an object’s state lends itself to CRDT treatment, then an application could replicate that object into several objects serving different regions, which then synchronize their states via CRDT. This would make sense for applications to implement as an optimization if and when they find it is worth the effort.

Last thoughts: What does it mean for state to be “serverless”?

Traditionally, serverless has focused on stateless compute. In serverless architectures, the logical unit of compute is reduced to something fine-grained: a single event, such as an HTTP request. This works especially well because events just happened to be the logical unit of work that we think about when designing server applications. No one thinks about their business logic in units of “servers” or “containers” or “processes” — we think about events. It is exactly because of this semantic alignment that serverless succeeds in shifting so much of the logistical burden of maintaining servers away from the developer and towards the cloud provider.

However, serverless architecture has traditionally been stateless. Each event executes in isolation. If you wanted to store data, you had to connect to a traditional database. If you wanted to coordinate between requests, you had to connect to some other service that provides that ability. These external services have tended to re-introduce the operational concerns that serverless was intended to avoid. Developers and service operators have to worry not just about scaling their databases to handle increasing load, but also about how to split their database into “regions” to effectively handle global traffic. The latter concern can be especially cumbersome.

So how can we apply the serverless philosophy to state? Just like serverless compute is about splitting compute into fine-grained pieces, serverless state is about splitting state into fine-grained pieces. Again, we seek to find a unit of state that corresponds to logical units in our application. The logical unit of state in an application is not a “table” or a “collection” or a “graph”. Instead, it depends on the application. The logical unit of state in a chat app is a chat room. The logical unit of state in an online spreadsheet editor is a spreadsheet. The logical unit of state in an online storefront is a shopping cart. By making the physical unit of storage provided by the storage layer match the logical unit of state inherent in the application, we can allow the underlying storage provider (Cloudflare) to take responsibility for a wide array of logistical concerns that previously fell on the developer, including scalability and regionality.

This is what Durable Objects do.

A letter from Cloudflare’s founders (2020)

2020-09-27 Matthew Prince

Post Syndicated from Matthew Prince original https://blog.cloudflare.com/a-letter-from-cloudflares-founders-2020/

To our stakeholders:

Cloudflare launched on September 27, 2010 — 10 years ago today. Stopping to look back over the last 10 years is challenging in some ways because so much of who we are has changed radically. A decade ago when we launched we had a few thousand websites using us, our tiny office was above a nail salon in Palo Alto, our team could be counted on less than two hands, and our data center locations on one hand.

Outside our first office in Palo Alto in 2010. Photo by Ray Rothrock.

As the company grew, it would have been easy to stick with accelerating and protecting developers and small business websites and not see the broader picture. But, as this year has shown with crystal clarity, we all depend on the Internet for many aspects of our lives: for access to public information and services, for getting work done, for staying in touch with friends and loved ones, and, increasingly, for educating our children, ordering groceries, learning the latest dance moves, and so many other things. The Internet underpins much of what we do every day, and Cloudflare’s mission to help build a better Internet seems more and more important every day.

Over time Cloudflare has gone from an idea on a piece of paper to one of the largest networks in the world that powers millions of customers. Because we made our network to be flexible and programmable, what we’ve been able to do with it has expanded over time as well. Today we secure the Internet end-to-end — from companies’ infrastructure to individuals seeking a faster, more secure, more private connection. Our programmable, global network is at the core of everything we have been able to achieve so far.

Updating Our Annual Founders’ Letter

This is also the approximate one-year anniversary of Cloudflare going public. At the time, we wrote our first founders’ letter to the potential investors. We thought it made sense on this day, which we think of as our birthday, to reflect on the last year, as well as the last 10 years, and start a tradition of updating our original founders’ letter on September 27th every year.

It’s been quite a year for our business. Since our IPO, we’ve seen record expansion of new customers. That growth has come both from expanding our existing customers and from winning business from new customers.

The percentage of the Fortune 1,000 that pay for one or more of Cloudflare’s services rose from 10% when we went public to more than 16% today. Across the web as a whole, according to W3Techs’ data, over the last year Cloudflare has grown from 10.1% of the top 10 million websites using our services to 14.5% using them today. (Amazon CloudFront, in second place based on the number of websites they serve, grew from 0.8% to 0.9% over the same period.)

Every year to celebrate our birthday we’ve made it a tradition to launch products that surprise the market with new ways to expand how anyone can use our network. We think of them as gifts back to the Internet. Three years ago, for instance, we launched our edge computing platform called Workers. Today, just three years later, hundreds of thousands of developers are using Workers to build applications, many of which we believe would be impossible to build on any other platform.

This year we’re once again launching a series of products to extend Cloudflare’s capabilities and hopefully surprise and delight the Internet. One that we’re especially excited about brings a new data model to Workers, allowing even more sophisticated applications to be built on the platform.

The Year of COVID

It is impossible to reflect on the last year and not see the impact of the COVID-19 pandemic on our business, our customers, our employees, as well as our friends, colleagues, and loved ones in the greater community. It’s heartening to think that for more than half of Cloudflare’s life as a public company our team has worked remotely.

2020 was meant to be an Olympic year, but COVID-19 stopped that, like much else, from happening. Eight years ago, when Cloudflare was just two, the creator of the World Wide Web, Tim Berners-Lee, sent a message from the opening ceremony of the 2012 Olympics. That message read “This is for everyone” and the idea that the Internet is for all of us continues to be a key part of Cloudflare’s ethos today.

When we started Cloudflare we wanted to democratize what we thought were technologies only available to the richest and most Internet-focused organizations. We saw an opportunity to make available to everyone — from individual developers to small businesses to large corporations — the sorts of speed, protection, and reliability that, at the time, only the likes of Google, Amazon, and Facebook could afford.

Giving Back to the Internet

Over 10 years we’ve consistently rolled out the latest technologies, typically ahead of the rest of the industry, to everyone. And in doing so we’ve attracted employees, individuals, developers, and customers to our platform. The Internet is for everyone and we’ve shown that a business can be very successful when we aim to serve everyone, large and small.

Something Steve Jobs said back in 1988 still resonates: “If you want to make a revolution, you’ve got to raise the lowest common denominator in every single machine.” Although we aren’t selling machines, we think that’s right: democratizing features matters.

Just look at the scourge of DDoS attacks. Why should DDoS attack mitigation be expensive when it’s a plague on companies large and small? It shouldn’t, and we optimized our business to make it inexpensive for us and passed that on to our customers through Unmetered DDoS Mitigation — another feature we rolled out to celebrate our Birthday Week three years ago.

In 2014, also during Birthday Week, we launched Universal SSL, making encryption — something that had been expensive and difficult — free for all Cloudflare customers. The week we launched it we doubled the size of the encrypted web. Let’s Encrypt followed shortly after and, together, we’ve brought encryption to more than 90% of the web and made the little padlock in your browser something everyone can afford and should expect.

Helping Customers During Their Time of Need

In January of this year, we rolled out Cloudflare for Teams. The product was designed to replace the legacy VPNs and firewalls that were increasingly anachronistic as work moved to the cloud. Little did we know how much COVID-19 would accelerate their obsolescence and make Cloudflare for Teams essential.

Both of us sat on call after call in mid-March with at first small, then increasingly mid-sized, and eventually large and even governmental organizations who reached out to us looking for a way to survive as their teams shifted to working from home and their legacy hardware couldn’t keep up. We made the decision to sacrifice short term profits in order to help businesses large and small get through this crisis by making Cloudflare for Teams free through September.

As we said during our Q1 earnings call, the superheroes of this crisis are the medical professionals and scientists who are taking care of the sick and looking for a cure to the disease. But the faithful sidekick throughout has been the Internet. And, as one of the guardians of the Internet, we’re proud of helping ensure it was fast, secure, and reliable around the world when it was needed most. We are proud of how Cloudflare’s products could help businesses continue to get work done during this unprecedented time by leaning even more on the Internet.

Meeting the Challenges Ahead

Giving back to the Internet is core to who we are, and we do not shy away from a challenge. And there are many challenges ahead. In a little over a month, the United States will hold elections. After the 2016 elections we, along with the rest of the world, were concerned to see technology intended to bring people together instead be used to subvert the democratic process. We decided we needed to do something to help prevent that from happening again.

Three and a half years ago, we launched the Athenian Project to provide free cybersecurity resources to any local, state, or federal officials helping administer elections in the United States. We couldn’t have built Cloudflare into the company it is today without a stable government as a foundational platform. And, when that foundation is challenged, we believe it is our duty to lend our resources to defend it.

Today, we’re helping secure election infrastructure in more than half of the states in the United States. And, over these last weeks before the election, our team is working around the clock to help ensure the process is fair and not disrupted by cyber attacks.

More challenges lie ahead and we won’t shy away from them. Well-intentioned governments around the world are increasingly seeking to regulate the Internet to protect their citizens. While the aims are noble, the risk is creating a patchwork of laws that only the Internet giants can successfully navigate. We believe it is critical for us to engage in the conversations around these regulations and to work to help ensure that, as operating online becomes more complex, the opportunities the Internet created for us when we started Cloudflare remain available to future startups and entrepreneurs.

Fighting for the Internet

Over the last 10 years, it’s been sad to watch some of the optimism around technology seem to fade. The perception of technology companies shifted from their being able to do no wrong to, today, their being able to do no right. And, as we’ve watched the industry develop, we’ve sympathized with that shift. Too many tech companies have abused customer data, ignored rules, violated privacy, and not been good citizens to the communities in which they operate and serve.

But we continue to believe what we started Cloudflare believing 10 years ago: the Internet itself is a force for good worth fighting to defend. We need to keep striving to make the Internet itself better — always on, always fast, always secure, always private, and available to everyone.

It’s striking to think how much more disruptive the COVID-19 crisis could have been had it struck in 2010 rather than 2020. The difference today is a better Internet. We’re proud of the role we’ve played in helping build that better Internet.

And, ten years in, we’re just getting started.

Welcome to Birthday Week 2020

2020-09-27 John Graham-Cumming

Post Syndicated from John Graham-Cumming original https://blog.cloudflare.com/welcome-to-birthday-week-2020/

Each year we celebrate our launch on September 27, 2010 with a week of product announcements. We call this Birthday Week, but rather than receiving gifts, we give them away. This year is no different, except that it is… Cloudflare is 10 years old.

Before looking forward to the coming week, let’s take a look back at announcements from previous Birthday Weeks.

A year into Cloudflare’s life (in 2011) we launched automatic support for IPv6. This was the first of a long line of announcements that support our goal of making available to everyone the latest technologies. If you’ve been following Cloudflare’s growth you’ll know those include SPDY/HTTP/2, TLS 1.3, QUIC/HTTP/3, DoH and DoT, WebP, … At two years old we celebrated with a timeline of our first two years and the fact that we’d reached 500,000 domains using the service. A year later that number had tripled.

In 2014 we released Universal SSL and gave all our customers SSL certificates. In one go we massively increased the size of the encrypted web and made it free and simple to go from http:// to https://. Other HTTPS related features we’ve rolled out include: Automatic HTTPS Rewrites, Encrypted SNI and our CT Log.

In 2017 we unwrapped a bunch of goodies: Unmetered DDoS Mitigation; our video streaming service, Cloudflare Stream; and the ability to control where private SSL keys are stored through Geo Key Manager. And, last but not least, our hugely popular serverless platform Cloudflare Workers. It’s hard to believe that it’s been three years since we changed the way people think about serverless with our massively distributed, secure and fast to update platform.

Two years ago Cloudflare became a domain registrar with the launch of our “at cost” service: Cloudflare Registrar. We also announced the Bandwidth Alliance which is designed to reduce or eliminate high cloud egress fees. We rolled out support for QUIC and Cloudflare Workers got a globally distributed key value store: Workers KV.

Which brings us to last year, with the launch of WARP Plus to speed up and secure the “last mile” connection between a device and Cloudflare’s network, and Browser Insights so that customers can optimize their website’s performance and see how each Cloudflare tool helps.

We greatly enhanced our bot management tools with Bot Defend Mode, and rolled out Workers Sites to bring the power of Workers and Workers KV to entire websites.

No Spoilers Here

Here are some hints about what to expect this year for our 10th anniversary Birthday Week:

Monday: We’re fundamentally changing how people think about Serverless

If you studied computer science you’ll probably have come across Niklaus Wirth’s book “Algorithms + Data Structures = Programs”. We’re going to start the week with two enhancements to Cloudflare Workers that are fundamentally going to change how people think about serverless. The lambda calculus is a nice theoretical foundation, but it’s Turing machines that won the day. If you want to build large, real programs you need to have algorithms and data structures.

Tuesday and Wednesday are all about observability. Of an Internet property and of the Internet itself. And they are also about privacy. We’ll roll out new functionality so you can see what’s happening without the need to track people.

Thursday is security day with a new service to protect the parts of websites and Internet applications that are behind the scenes. And, finally, on Friday it’s all about one-click performance improvements that leverage our network, which spans more than 200 cities, to speed up static and dynamic content.

Welcome to Birthday Week 2020!

