Tag Archives: Product News

How to customize your HTTP DDoS protection settings

Post Syndicated from Omer Yoachimik original https://blog.cloudflare.com/http-ddos-managed-rules/

We’re excited to announce the availability of the HTTP DDoS Managed Ruleset. This new feature allows Cloudflare customers to independently tailor their HTTP DDoS protection settings. Whether you’re on the Free plan or the Enterprise plan, you can now tweak and optimize the settings directly from within the Cloudflare dashboard or via API.

We expect that in most cases, Cloudflare customers won’t need to customize any settings. Our mission is to make DDoS disruptions a thing of the past, with no customer overhead. To achieve this mission we’re constantly investing in our automated detection and mitigation systems. In some rare cases, there is a need to make some configuration changes, and so now, Cloudflare customers can customize those protection mechanisms independently. The next evolutionary step is to make those settings learn and auto-tune themselves for our customers, based on their unique traffic patterns. Zero-touch DDoS protection at scale.

Unmetered DDoS Protection

Back in 2017, we announced that we would never kick a customer off our network because they face large attacks, even if they are not paying us at all (i.e., using the Free plan). Furthermore, we committed to never charge a customer for DDoS attack traffic — no matter the size and duration of the attack. Just this summer, our systems automatically detected and mitigated one of the largest DDoS attacks of all time. Unlike other vendors, Cloudflare customers don’t need to request a service credit for the attack traffic: we simply exclude DDoS traffic from our billing systems. This is done automatically, just like our attack detection and mitigation mechanisms.

Autonomous DDoS Protection

Our unmetered DDoS protection commitment is possible due to our ongoing investment in our network and technology stack. The global coverage and capacity of our network allows us to absorb the largest attacks ever recorded, without manual intervention. Using BGP Anycast, traffic is routed to the closest Cloudflare edge data center as a form of global inter-data center load balancing. Traffic is then load balanced efficiently inside the data center between servers with the help of Unimog, our home-grown L4 load balancer, to ensure that traffic arrives at the least loaded server. Then, each server scans for malicious traffic and, if detected, applies mitigations in the most optimal location in the tech stack. Each server detects and mitigates attacks completely autonomously without requiring any centralized consensus, and servers share attack details with each other using multicast. This is done using our proprietary autonomous edge detection and mitigation system, and this is how we’re able to continue offering unmetered DDoS protection for free at the scale we operate at.

Configurable DDoS Protection

Our autonomous systems use a set of dynamic rules that scan for attack patterns, known attack tools, suspicious patterns, protocol violations, requests causing large amounts of origin errors, excessive traffic hitting the origin/cache, and additional attack vectors. Each rule has a predefined sensitivity level and default action that varies based on the rule’s confidence that the traffic is indeed part of an attack.

But how do we determine those confidence levels? The answer depends on each specific rule and what that rule is looking for. Some rules look for patterns in HTTP attributes that are generated by known attack tools and botnets, known protocol violations, and other generally suspicious patterns and traffic abnormalities. If a given rule is searching for the HTTP patterns of known attack tools, then once found, the likelihood (i.e., confidence) that this traffic is part of an attack is high, and we can therefore safely block all the traffic that matches that rule. However, in other cases, the detected patterns or abnormal activity might resemble an attack but can actually be caused by faulty applications that generate abnormal HTTP calls, misbehaving API clients that flood their origin server, or even legitimate traffic that naively violates protocol standards. In those cases, we might want to rate-limit the traffic that matches the rule or serve a challenge action to verify and allow legitimate users in while blocking bad bots and attackers.

Configuring the DDoS Protection Settings

In the past, you’d have to go through our support channels to customize any of the default actions and sensitivity levels. In some cases, this may have taken longer to resolve than desired. With today’s release, you can tailor and fine-tune the settings of our autonomous edge system by yourself to quickly improve the accuracy of the protection for your specific application needs.

If you previously contacted Cloudflare Support to apply customizations, the DDoS Ruleset has been set to Essentially Off or Low for your zone, based on your existing customization. You can visit the dashboard to view the settings and change them if needed.

If you’ve requested to exclude or bypass the mitigations for specific HTTP attributes or IPs, or if you’ve requested a significantly high threshold that requires Cloudflare approval, then those customizations are still active but may not yet be visible in the dashboard.

If neither of the above applies to you, there is no action required on your end. However, if you would like to customize your DDoS protection settings, go directly to the DDoS tab or follow these steps:

  1. Log in to the Cloudflare dashboard, and select your account and website.
  2. Go to Firewall > DDoS.
  3. Next to HTTP DDoS attack protection, click Configure.
  4. In Ruleset configuration, select the action and sensitivity values for all the rules in the HTTP DDoS Managed Ruleset.

Alternatively, follow the API documentation to programmatically configure the DDoS protection settings.
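For example, a ruleset-level override can be applied with a single call to the ddos_l7 phase entrypoint. Here is a sketch in TypeScript; the zone ID, API token, and managed ruleset ID are placeholders, and the authoritative payload shape is in the API documentation:

// Sketch: override the HTTP DDoS Managed Ruleset for a zone via the Rulesets API.
const ZONE_ID = "<zone_id>"; // placeholder
const API_TOKEN = "<api_token>"; // placeholder
const DDOS_RULESET_ID = "<managed_ruleset_id>"; // placeholder

async function overrideDdosRuleset(): Promise<void> {
  const res = await fetch(
    `https://api.cloudflare.com/client/v4/zones/${ZONE_ID}/rulesets/phases/ddos_l7/entrypoint`,
    {
      method: "PUT",
      headers: {
        Authorization: `Bearer ${API_TOKEN}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        rules: [
          {
            action: "execute",
            expression: "true", // apply to all incoming requests
            action_parameters: {
              id: DDOS_RULESET_ID,
              // Override the entire ruleset to log-only at low sensitivity.
              overrides: { action: "log", sensitivity_level: "low" },
            },
          },
        ],
      }),
    }
  );
  console.log(await res.json());
}

overrideDdosRuleset();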

In the configuration page, you can select a different Action and Sensitivity Level to override all the DDoS protection rules as a group of rules (i.e., the “ruleset”).

Alternatively, you can click Browse Rules to override specific rules, rather than the ruleset as a whole.

Mitigation Action

The mitigation action defines what action to take when the mitigation rule is applied. Our systems constantly analyze traffic and track potentially malicious activity. When certain request-per-second thresholds exceed the configured sensitivity level, a mitigation rule with a dynamically generated signature will be applied to mitigate the attack. The default mitigation action can vary according to the specific rule. A rule with lower confidence may apply a Challenge action as a form of soft mitigation, while a Block action, a stricter form of mitigation, is applied when there is higher confidence that the traffic is part of an attack.

The available values for the action are:

  • Block
  • Challenge (CAPTCHA)
  • Log
  • Use Rule Defaults

Some examples of when you may want to change the mitigation action include:

  • Safer onboarding: You’re onboarding a new HTTP application which has odd traffic patterns, naively violates protocol standards, or causes spiky behavior. In this case, you can set the action to Log and see what traffic our systems flag. Afterwards, you can make the necessary changes to the sensitivity levels as required and switch the mitigation action back to the default.
  • Stricter mitigation: A DDoS attack has been detected but a Rate-limit or Challenge action has been applied due to the rule’s default logic. However, in this specific case, you’re sure that this is malicious traffic, so you can change the action to Block for a more complete mitigation.

Mitigation Sensitivity

The sensitivity level defines when the mitigation rule is applied. Our systems constantly analyze traffic and track potentially malicious activity. When certain request-per-second thresholds are crossed, a mitigation rule with a dynamically generated signature will be applied to mitigate the attack. Toggling the sensitivity levels allows you to define when the mitigation is applied. The higher the sensitivity, the faster the mitigation is applied. The available values for sensitivity are:

  1. High (default)
  2. Medium
  3. Low
  4. Essentially Off

Essentially Off means that we’ve set an exceptionally low sensitivity level so that, in most cases, traffic won’t be mitigated for you. However, attack traffic will still be mitigated at exceptionally high levels to ensure the safety and stability of the Cloudflare network.

Some examples of when you may want to change the sensitivity level include:

  • Avoid impact to legitimate traffic: One of the rules has applied mitigation to your legitimate traffic due to a suspicious pattern. In this case, you may want to reduce the rule sensitivity to avoid recurrence of the issue and negative impact to your traffic.
  • Legacy applications: One of your legacy HTTP applications is violating protocol standards, or you may have mistakenly introduced a bug into your mobile application/API client. These cases may result in abnormal traffic activity that may be flagged by our systems. In such a case, you can select the Essentially Off sensitivity level until you’ve resolved the issue on your end, to avoid false positives.

Overriding Specific Rules

As mentioned above, you can also select a specific rule to override its action and sensitivity levels. The per-rule override takes priority over the ruleset override.
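In API terms, the same payload shown earlier carries per-rule overrides in an overrides.rules array, and those values win over the ruleset-level ones. A sketch of the relevant request body (both IDs are placeholders):

{
  "rules": [
    {
      "action": "execute",
      "expression": "true",
      "action_parameters": {
        "id": "<managed_ruleset_id>",
        "overrides": {
          "sensitivity_level": "medium",
          "rules": [
            { "id": "<rule_id>", "action": "block", "sensitivity_level": "high" }
          ]
        }
      }
    }
  ]
}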

When configuring per-rule overrides, you’ll see that some rules have a DDoS Dynamic action. This means that the mitigation is multi-staged and will apply different mitigation actions depending on the attack type, request characteristics, and other factors. This dynamic action can also be overridden if you choose to do so.

DDoS Attack Analytics

When a DDoS attack is detected and mitigated, you’ll receive a real-time DDoS alert (if you’ve configured one) and you’ll be able to view the attack in the Firewall analytics dashboard. The attack details and the rule ID that was triggered will also be displayed in the Activity log as part of each HTTP request log that was mitigated.

Helping Build a Better Internet

At Cloudflare, everything we do is guided by our mission to help build a better Internet. A significant part of that mission is to make DDoS downtime and service disruptions a thing of the past. By giving our users the visibility and tools they need in order to understand and improve their DDoS protection, we’re hoping to take another step towards a better Internet.

Do you have feedback about the user interface or anything else? In the new DDoS tab, we’ve added a link to provide feedback, so you too can help shape the future of Cloudflare’s DDoS protection Managed Rules.

Not using Cloudflare yet? Start now.

Announcing Project Turpentine: an easy way to get off Varnish

Post Syndicated from Joao Sousa Botto original https://blog.cloudflare.com/announcing-turpentine/

When Varnish and the Varnish Configuration Language (VCL) were first introduced 15 years ago, they were an incredibly powerful combination to configure caching on servers (and your networks). It seemed a logical choice for a language to configure CDNs — caching in the cloud.

A lot has changed on the Internet since then.

In particular, caching is just one of many things that “CDNs” are expected to do: load balancing, DDoS protection, rate limiting, transformations, synthetic responses, routing and more. But above all what “CDNs” need to be is programmable, not just configurable.

Configuration went from a niche activity to a much more common — and often involved — requirement. We’ve heard from a lot of teams that want to remove critical dependencies on the one person they have who knows how to make configuration changes — because they’re the only one on the team who knows how to write in VCL.

But it’s not just about who can write VCL — it’s what VCL is increasingly being asked to do. A lot of our customers have told us that they expect much more from the network: they don’t just want to configure it anymore… they want to be able to program it! VCL is being pushed and stretched into things it was never envisaged to do.

These are often the frustrations we hear from customers about the use of VCL. And yet, at the same time, migrations are always hard. Taking thousands of lines of code that have been built up over the years for a mission critical service, and rewriting it from scratch? Nobody wants to do that.

Today, we are excited to announce a solution to all these problems: Turpentine. Turpentine is a true VCL-to-TypeScript converter: it is the easiest and most effective way for you to migrate your legacy Varnish to a modern, Turing-complete programming language and onto the edge. But don’t think that because you’re moving up in terms of language abstraction that you’re giving up performance — it’s the opposite. Turpentine enables porting your VCL-based configuration to Cloudflare Workers — which is known for its speed. Beyond being able to eliminate the use of VCL, Turpentine enables you to take full advantage of Cloudflare: including proxies, firewall, load balancers, tiered caching/shielding and everything else Cloudflare offers.

And you’ll be able to configure it using one of the most widely used languages in the world.
Whether you’re using standard configurations, or have a heavily customized VCL file, Turpentine will generate human-readable and well commented TypeScript code and deploy it to our serverless platform that is… fast.

How It Works

To turn a VCL into TypeScript code that is well optimized, human-readable, and commented, Turpentine takes a three-phase approach (an illustrative sketch of the output follows the list):

  1. The VCL is parsed, its meaning is understood, and TypeScript code that preserves its intentions is generated. This is more than transpiling. All the functionality configured in the VCL is rewritten to the best solution available. For example, rate limiting might be custom code on your VCL — but have native functionality on Cloudflare.
  2. The code is cleaned and optimized. During this phase, Turpentine looks for redundancies and inefficiencies, and removes them. Your code will be running on Cloudflare Workers, the fastest serverless platform in the world, and we definitely don’t want to leave behind inefficiencies that would prevent you from enjoying its benefits to the full extent.
  3. The final step is pretty printing. The whole point of this exercise is to make the code easy to understand by as many people as possible — so that virtually any engineer on your team, not just infrastructure experts, can program your network moving forward.
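As an illustration of the end result (not actual Turpentine output), here is the shape such generated code might take: a Workers fetch handler in TypeScript doing what a small VCL routine typically does, normalizing the URL and pinning a cache TTL:

// Illustrative sketch only; the names and logic here are hypothetical.
export default {
  async fetch(request: Request): Promise<Response> {
    // vcl_recv equivalent: strip a tracking parameter before the cache lookup.
    const url = new URL(request.url);
    url.searchParams.delete("utm_source");

    // vcl_backend_response equivalent: cache successful responses for 10 minutes.
    return fetch(new Request(url.toString(), request), {
      cf: { cacheEverything: true, cacheTtl: 600 },
    });
  },
};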

Turpentine in Action

We’ve been perfecting Turpentine in a private beta with some very large customers who have some very complicated VCL files.

Here’s a demo of a VCL file being converted to a Worker:

[Video: a VCL file being converted to a Worker]

Legacy to Modern

Happily, this all occurs without the pains of the usual legacy migration. There’s no project plan. No engineers and IT teams trying to replicate each bit of configuration or investigating whether a specific feature even existed.

If you’re interested in finding out more, please register your interest in the sign-up form here — we’d love to explore with you. For now, we’re staying in private beta, so we can walk every new customer through the process. A Cloudflare engineer will help your team get up to speed and comfortable with the new infrastructure.

Early Hints: How Cloudflare Can Improve Website Load Times by 30%

Post Syndicated from Alex Krivit original https://blog.cloudflare.com/early-hints/

Want to know a secret about Internet performance? Browsers spend an inordinate amount of time twiddling their thumbs waiting to be told what to do. This waiting impacts page load performance. Today, we’re excited to announce support for Early Hints, which dramatically improves browser page load performance and reduces thumb-twiddling time.

In initial tests using Early Hints, we have observed more than 30% improvement to page load time for browsers visiting a website for the first time.

Early Hints is available in beta today — Cloudflare customers can request access to Early Hints in the dashboard’s Speed tab. It’s free for all customers because we think the web should be fast!

Early Hints in a nutshell

Browsers need instructions for what to render and what resources need to be fetched to complete “painting” a given web page. These instructions come from a server response. But the servers sending these responses often need time to compile these resources — this is known as “server think time.” While the servers are busy during this time… browsers sit idle and wait.

Early Hints takes advantage of “server think time” to asynchronously send instructions to the browser to begin loading resources while the origin server is compiling the full response. By sending these hints to a browser before the full response is prepared, the browser can figure out what it needs to do to load the webpage faster for the end user. Faster page loads and lower user-perceived latency means happier users!

More formally, Early Hints is a web standard that defines a new HTTP status code (103 Early Hints) and new interactions between a client and server. 103s are served to clients while a 200 OK (or error) response is being prepared (the so-called “server think time”), and contain hints on which assets will likely be needed to fully render the web page. This hinting speeds up page loads and generally reduces user-perceived latency.

Sending hints on which assets to expect before the entire response is compiled allows the browser to use think time (when it would otherwise have been waiting) to fetch needed assets, prepare parts of the displayed page, and otherwise get ready for the full response to be returned.  

Cloudflare, as an edge network that sits in between client and server, is well positioned to issue these hints to clients on servers’ behalf. This is true for a few reasons:

  1. 103 is an experimental status code that origins may not be able to emit on their own, mostly for legacy reasons. Much of the machinery that powers the web incorrectly assumes that HTTP requests always correspond 1:1 with HTTP responses. This mistaken premise is baked into most HTTP server software, making it difficult for origin servers to emit Early Hints 103 responses prior to a “final” 200 OK response.

    Cloudflare edge servers handling this complexity on behalf of our customers neatly sidesteps these technical challenges and gets the adoption flywheel spinning faster on this exciting new technology (more on this flywheel later).

  2. Cloudflare’s edge is very close to end users. This means we can serve hints very quickly, filling even the smallest of server think blocks with useful information the client can use to get a jump start on loading assets.
  3. Cloudflare already sees request and response flow from our customers. We use this data to generate hints automatically, without a customer needing to make any origin changes at all.

How can we speed up “slow” dynamic page loads?

The typical request/response cycle between browsers and servers leaves a lot of room for optimization. When you type an address into the search bar of your browser and press Enter, a series of things happen to get you the content you need, as quickly as possible. Your browser first converts the hostname in the URL into an IP address, then establishes an initial connection to the server where the content is stored.

After the connection is established, the actual request is sent. This is often a GET request with a bunch of information about what the browser can and cannot display to the end user. Following the request, the browser must wait for the origin server to send the first bytes of the response before it begins to render the page. In this time, the server is busy executing all sorts of business logic (looking things up in databases, personalizing the page, detecting fraud, etc.) before spitting a response out to the browser.

Once the response for the original HTML page is received, the browser then needs to parse the page, generate a Document Object Model (DOM), and begin loading subresources specified on the page, like images, scripts, and additional stylesheets.

Let’s take a look at this in action. Below is the performance waterfall for a checkout page on pinecoffeesupply.com, a coffee shop with a storefront on Shopify:  

[Performance waterfall: the initial document, followed by a stylesheet and four scripts]

Page rendering cannot complete (and the shopper can’t get their coffee fix, and the coffee shop can’t get paid!) until key assets are loaded. Information on the subresources needed for the browser to load the page isn’t available until the server has thought about and returned the initial response (the first document in the above table). In the above example, the page load could have been accelerated had the browser known, prior to receiving the full response, that the stylesheet and the four subsequent scripts would be needed to render the page.

Attempting to parallelize this dependency is what Early Hints is all about — productively using that “server think time” to help the browser get going on critical steps to render the page before the full server response arrives. The green “thinking” bar then overlaps with the blue “content download” bar, allowing both the browser and the server to prepare the page at the same time. No more waiting.

“Entrepreneurs know that first impressions matter. Shopify’s own data shows that on average, when a store improves the speed of the first page in the buyer journey by 10%, there is a 7% increase in conversion. We see great promise in Early Hints being another tool to help improve the performance and experience for all merchants and customers.”
Colin Bendell, Director Performance Engineering at Shopify

How does Early Hints speed things up?

Early Hints is a status code used in non-final HTTP responses. It is designed to speed up overall page load times by giving the browser an early signal about which linked assets might appear in the final response. The browser takes those hints and begins preparing the page for when the final 200 OK response from the server arrives.

The RFC provides this example of how the request/response cycle will look with Early Hints:

Client request:

GET / HTTP/1.1 
Host: example.com

Server responses:
Early Hint Response

HTTP/1.1 103 Early Hints
     Link: </style.css>; rel=preload; as=style
     Link: </script.js>; rel=preload; as=script

…Server Think Time…

Full Response

HTTP/1.1 200 OK
     Date: Thu, 16 Sep 2021 11:30:00 GMT
     Content-Length: 1234
     Content-Type: text/html; charset=utf-8
     Link: </style.css>; rel=preload; as=style
     Link: </script.js>; rel=preload; as=script

[Rest of Response]

Seems simple enough. We’ve neatly communicated information to the browser on what resources it should consider preloading while the server is computing a full response.

This isn’t a new idea though!

Didn’t server push try to solve this problem?

There have been previous attempts at solving this problem, notably HTTP/2 “server push”. However, server push suffered from two major problems:

Server push sent the wrong bytes at the wrong time

The server pushed assets to the client without great awareness of what was already present in browser caches, and without knowing how much “server think time” would elapse or how best to spend it. Web performance guru Pat Meenan summarized server push’s gotchas in a mail thread discussing Chrome’s intent to deprecate and remove server push support:

Push sounds like a great solution, particularly when it can be done intelligently to not push resources already in cache and if it can exactly only fill the wait time while a CDN edge goes back to an origin for the HTML but getting those conditions right in practice is extremely rare. In virtually every case I have seen, the pushed resources end up delaying the HTML itself, the CSS and other render-blocking resources. Delaying the HTML is particularly bad because it delays the browser’s discovery of all of the other resources on the page. Preload works with the normal document parsing and resource discovery, letting preloaded resources intermix with other important resources and giving the dev, browsers and origins more control over prioritization.

Early Hints avoids these issues by “hinting” (vs. push being dictatorial) to clients which assets they should load, allowing them to prioritize resource loads with more complete information about what they will likely need to render a page, what they have cached, and other heuristics.

Using Early Hints with “rel=preload” solves the unsolicited bytes problem, but can still suffer from the ordering issue (forcing clients to load the wrong bytes at the wrong time). Early Hints’ superpower may actually be its use in conjunction with “rel=preconnect”. Many pages load hundreds of third party resources, and setting up connections and TLS sessions with each outside origin is time (but not bandwidth) intensive. Setting up these third party origin connections early, during think time, has practical advantages for page loading experience without the potential collateral damage of using “rel=preload”.

Server push wasn’t broadly implemented

The other notable issue with Server Push was that browsers supported it, and some origin servers implemented the feature, but no major CDN products (Cloudflare’s included) deemed the feature promising enough to implement and support at scale. Support for standards like server push or Early Hints in each key piece of the web content supply chain is critical for mass adoption.

Cracking the chicken and egg problem

Standards like Early Hints often don’t get off the ground because of this chicken and egg problem: clients have no reason to support new standards without server support and servers have no reason to implement support without clients speaking their language. As previously discussed, Early Hints is particularly complicated for origins to directly support, because 103 is an unfamiliar status code and multiple responses to a single request may not be a well-supported pattern in common HTTP server and application stacks.

We’ve tackled this in a few ways:

  1. Close partnership with Google Chrome and other browser teams to build support for the same standard at the same time, ensuring a critical mass of adoption for Early Hints from day one.
  2. Developing ways for hints to be emitted by our edge to clients with support for the standard without requiring support for the standard from the origin. We’ve been fortunate to have a ready, willing, and able design partner in Shopify, whose performance engineering team has been instrumental in shaping our implementation and testing our production deployment.

We plan to support Early Hints in two ways: one enabled now, and the other coming soon.

To avoid requiring origins to emit 103 responses directly, we settled on an approach that takes advantage of something many customers already use to indicate which assets an HTML page depends on, the Link: response header. A minimal origin-side sketch follows the list below.

  • When Cloudflare gets a response from the origin, we will parse it for Link headers with preload or preconnect rel types. These rel types indicate to the browser that the asset should be loaded as soon as possible (preload), or that a connection should be established to the specified origin but no bytes should be transferred (preconnect).
  • Cloudflare takes these headers and caches them at our edge, ready to be served as a 103 Early Hints payload.
  • When subsequent requests come for that page, we immediately send the browser the cached Early Hints response while proxying the request to the origin server to generate the full response.
  • Cloudflare then proxies the full response from origin to the browser when it’s available.
  • When the full response is available, it will contain Link headers (that’s how we started this whole story). We will compare the Link headers in the 200 response with the cached version to make sure that they are the most up to date. If they have changed since they’ve been cached, we will automatically purge the out-of-date Early Hints and re-cache the new ones.
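To opt a page in, then, an origin (or a Worker in front of one) only needs to attach standard Link headers to its responses. A minimal sketch in TypeScript, with illustrative asset paths and third-party host:

// Sketch: an origin response carrying the Link headers that Cloudflare
// caches and later replays to browsers as a 103 Early Hints response.
export default {
  async fetch(request: Request): Promise<Response> {
    const headers = new Headers({ "Content-Type": "text/html; charset=utf-8" });
    headers.append("Link", "</style.css>; rel=preload; as=style");
    headers.append("Link", "</script.js>; rel=preload; as=script");
    headers.append("Link", "<https://fonts.example.com>; rel=preconnect");
    return new Response("<html>…</html>", { status: 200, headers });
  },
};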

Coming soon: Smart Early Hints generation at the edge

We’re working hard on the next version of Early Hints. Smart Early Hints will use machine learning to generate Early Hints even when there isn’t a Link header present in the response from which we can harvest a 103. By analyzing historical request/response cycles for our customers, we can infer what assets being preloaded would help browsers load a webpage more quickly.

Look for Smart Early Hints in the next few months.

Browser support

As previously mentioned, we’ve partnered closely with Google Chrome and other browser vendors to ensure that their implementations of Early Hints interoperate well with Cloudflare’s. Google Chrome, Microsoft Edge, and Mozilla Firefox have announced their intention to support Early Hints. Progress for this support can be tracked here and here. We expect more browsers to announce support soon.

Benchmarking the impact of Early Hints

Once you get accepted into the beta and enable the feature, you can see Early Hints in action using Google Chrome, version 94 or higher. As of Sept 16th 2021, the Chrome Dev channel is at version 94.

Testing Early Hints with Chrome version 94 or higher

Early Hints support is available in Chrome by running (assuming a Mac, but the same flag works on Windows or Linux):

open /Applications/Google\ Chrome\ Dev.app --args --enable-features=EarlyHintsPreloadForNavigation

Testing Early Hints with Web Page Test

  1. Open webpagetest.org (a free performance testing tool).
  2. Specify the desired test URL. It should have the necessary preload/preconnect rel types in the Link header of the response.
  3. Choose Chrome Canary (or any Chrome version higher than 94) for the browser.
  4. Under advanced settings, select Chromium.
  5. At the bottom of the Chromium section there’s a command-line section where you should paste Chrome’s Early Hints flag: --enable-features=EarlyHintsPreloadForNavigation.
  • To see the page load comparison, you can remove this flag and Early Hints won’t be enabled.

You can also test via Chrome’s Origin Trials.

Using Web Page Test, we’ve been able to observe greater than 30% improvement to a page’s content loading in the browser as measured by Largest Contentful Paint (LCP). An example test filmstrip:

[Filmstrip: the same page loading with and without Early Hints]

A few other considerations when testing…

  • Chrome will not support Early Hints on HTTP/1.1 (or earlier protocols).
  • Chrome will not support Early Hints on subresource requests.
  • We encourage testers to see how different rel types improve performance along with other Cloudflare performance enhancements like Argo.
  • It can be hard to see which resources are actually being loaded early as a result of Early Hints. Interrogating the ResourceTiming API will return initiator=EarlyHints for those assets.
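One way to check from the browser console, assuming the initiator value surfaces as described above (the exact string may differ by Chrome version):

// List the resources the browser attributes to Early Hints preloads.
const hinted = performance
  .getEntriesByType("resource")
  .filter((e) => (e as PerformanceResourceTiming).initiatorType === "early-hints");
console.table(hinted.map((e) => e.name));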

Signing up for the Early Hints beta

Early Hints promises to revolutionize web performance. Support for the standard is live at our edge globally and is being tested by customers hungry for speed.

How to sign up for Cloudflare’s Early Hints beta:

  1. Sign in to your Cloudflare Account
  2. In the dashboard, navigate to the Speed tab
  3. Click on the Optimization section
  4. Locate the Early Hints beta sign up card and request access. We’re enabling access for users on the beta list in batches.

Early Hints holds tremendous promise to speed up everyone’s experience on the web. It’s exactly the kind of product we like working on at Cloudflare and exactly the kind of feature we think should be free, for everyone. Be sure to check out the beta and let us know what you think.

You can explore our beta implementation in the Speed tab. And if you’re an engineer interested in improving the performance of the web for all, come join us!

Cloudflare Backbone: A Fast Lane on the Busy Internet Highway

Post Syndicated from Tanner Ryan original https://blog.cloudflare.com/cloudflare-backbone-internet-fast-lane/

The Internet is an amazing place. It’s a communication superhighway, allowing people and machines to exchange exabytes of information every day. But it’s not without its share of issues: whether it’s DDoS attacks, route leaks, cable cuts, or packet loss, the components of the Internet do not always work as intended.

The reason Cloudflare exists is to help solve these problems. As we continue to grow our rapidly expanding global network in more than 250 cities, while directly connecting with more than 9,800 networks, it’s important that our network continues to help bring improved performance and resiliency to the Internet. To accomplish this, we built our own backbone. Other than improving redundancy, the immediate advantage to you as a Cloudflare user? It can reduce your website loading times by up to 45% — and you don’t have to do a thing.

The Cloudflare Backbone

We began building out our global backbone in 2018. It comprises a network of long-distance fiber optic cables connecting various Cloudflare data centers across North America, South America, Europe, and Asia. This also includes Cloudflare’s metro fiber network, directly connecting data centers within a metropolitan area.

Our backbone is a dedicated network, providing guaranteed network capacity and consistent latency between various locations. It gives us the ability to securely, reliably, and quickly route packets between our data centers, without having to rely on other networks.

This dedicated network can be thought of as a fast lane on a busy highway. When traffic in the normal lanes of the highway encounters slowdowns from congestion and accidents, vehicles can use the fast lane to bypass the traffic and get to their destination on time.

Our software-defined network is like a smart GPS device, as we’re always calculating the performance of routes between various networks. If a route on the public Internet becomes congested or unavailable, our network automatically adjusts routing preferences in real-time to make use of all routes we have available, including our dedicated backbone, helping to deliver your network packets to the destination as fast as we can.

Measuring backbone improvements

As we grow our global infrastructure, it’s important that we analyze our network to quantify the impact we’re having on performance.

Here’s a simple, real-world test we’ve used to validate that our backbone helps speed up our global network. We deployed a simple API service hosted on a public cloud provider, located in Chicago, Illinois. Once placed behind Cloudflare, we performed benchmarks from various geographic locations with the backbone disabled and enabled to measure the change in performance.

Instead of comparing the difference in latency our backbone creates, it is important that our experiment captures a real-world performance gain that an API service or website would experience. To validate this, our primary metric is measuring the average request time when accessing an API service from Miami, Seattle, San Jose, São Paulo, and Tokyo. To capture the response of the network itself, we disabled caching on the Cloudflare dashboard and sent 100 requests from each testing location, both while forcing traffic through our backbone, and through the public Internet.
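The measurement itself needs nothing exotic. Here is a simplified sketch of the per-location test loop in TypeScript; the endpoint URL is illustrative:

// Sketch: send 100 uncached requests and report the average request time.
async function benchmark(url: string, runs = 100): Promise<number> {
  const times: number[] = [];
  for (let i = 0; i < runs; i++) {
    const start = Date.now();
    await fetch(url, { cache: "no-store" }); // avoid local caching effects
    times.push(Date.now() - start);
  }
  return times.reduce((a, b) => a + b, 0) / times.length;
}

benchmark("https://api.example.com/ping").then((avg) =>
  console.log(`average request time: ${avg.toFixed(1)} ms`)
);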

Now, before we claim that our backbone solves all Internet problems, you’ll notice that for some tests (Seattle, WA and San Jose, CA) there was actually an increase in response time when we forced traffic through the backbone. Since latency is directly proportional to the length of the fiber optic path, and since we have over 9,800 direct connections with other Internet networks, it is possible that an uncongested path on the public Internet is geographically shorter, causing a speedup compared to our backbone.

Luckily for us, we have technologies like Argo Smart Routing, Argo Tiered Caching, WARP+, and the recently announced Orpheus, which dynamically calculate the performance of each route at our data centers, choosing the fastest healthy route at that time. What might be the fastest path during this test may not be the fastest at the time you are reading this.

With that disclaimer out of the way, now onto the test.

With the backbone disabled, if a visitor from São Paulo performed a request to our service, they would be routed to our São Paulo data center via BGP Anycast. With caching disabled, our São Paulo data center forwarded the request over the public Internet to the origin server in Chicago. On average, the entire process to fetch data from the origin server and return the response to the requesting user took 335.8 milliseconds.

Once the backbone was enabled and requests were sent, our software performed tests to determine the fastest healthy route to the origin, whether a route on the public Internet or through our private backbone. For this test the backbone was faster, resulting in an average total request time of 230.2 milliseconds. Just by routing the request through our private backbone, we improved the average response time by 31%.

We saw even better improvement when testing from Tokyo. When routing the request over the public Internet, the request took an average of 424 milliseconds. By enabling our backbone which created a faster path, the request took an average of 234 milliseconds, creating an average response time improvement of 44%.

Visitor Location | Distance to Chicago | Avg. response time via public Internet (ms) | Avg. response time via backbone (ms) | Change in response time
Miami, FL, US | 1917 km | 84 | 75 | 10.7% decrease
Seattle, WA, US | 2785 km | 118 | 124 | 5.1% increase
San Jose, CA, US | 2856 km | 122 | 132 | 8.2% increase
São Paulo, BR | 8403 km | 336 | 230 | 31.5% decrease
Tokyo, JP | 10129 km | 424 | 234 | 44.8% decrease

We also observed a smaller deviation in the response time of packets routed through our backbone over larger distances.

Our next generation network

Cloudflare is built on top of lossy, unreliable networks that we do not control. It’s our software that turns these traditional tubes of the Internet into the smart, high-performing, and reliable network Cloudflare customers get to use today. Coupled with our new, but rapidly expanding, backbone, it is this software that produces significant performance gains over traditional Internet networks.

Whether you visit a website powered by Cloudflare’s Argo Smart Routing, Argo Tiered Caching, Orpheus, or use our 1.1.1.1 service with WARP+ to access the Internet, you get direct access to the Internet fast lane we call the Cloudflare backbone.

For Cloudflare, a better Internet means improving Internet security, reliability, and performance. The backbone gives us the ability to build out our network in areas that have typically lacked infrastructure investments by other networks. Even with issues on the public Internet, these initiatives allow us to be located within 50 milliseconds of 95% of the Internet-connected population.

In addition to our growing global infrastructure providing 1.1.1.1, WARP, Roughtime, NTP, IPFS Gateway, Drand, and F-Root to the greater Internet, it’s important that we extend our services to those who are most vulnerable. This is why we extend all our infrastructure benefits directly to the community, through projects like Galileo, Athenian, Fair Shot, and Pangea.

And while these thousands of fiber optic connections are already fixing today’s Internet issues, we truly are just getting started.

Want to help build the future Internet? Networks that are faster, safer, and more reliable than they are today? The Cloudflare Infrastructure team is currently hiring!

If you operate an ISP or transit network and would like to bring your users faster and more reliable access to websites and services powered by Cloudflare’s rapidly expanding network, please reach out to our Edge Partnerships team at [email protected]

Unboxing the Last Mile: Introducing Last Mile Insights

Post Syndicated from David Tuber original https://blog.cloudflare.com/last-mile-insights/

“The last 20% of the work requires 80% of the effort.” The Pareto Principle applies in many domains — and nowhere on the Internet more so than the Last Mile. Last Mile networks are heterogeneous and independent of each other, but all of them need to be running for everyone to use the Internet. They’re typically the responsibility of Internet Service Providers (ISPs). However, if you’re an organization running a mission-critical service on the Internet, not paying attention to Last Mile networks is in effect handing responsibility for the uptime and performance of your service over to those ISPs.

Probably not the best idea.

When a customer puts a service on Cloudflare, part of our job is to offer a good experience across the whole Internet. We couldn’t do that without focusing on Last Mile networks. In particular, we’re focused on two things:

  • Cloudflare needs to have strong connectivity to Last Mile ISPs and needs to be as close as possible to every Internet-connected person on the planet.
  • Cloudflare needs good observability tools to know when something goes wrong, and needs to be able to surface that data to you so that you can be informed.

Today, we’re excited to announce Last Mile Insights, to help with this last problem in particular. Last Mile Insights allows customers to see where their end-users are having trouble connecting to their Cloudflare properties. Cloudflare can now show customers the traffic that failed to connect to Cloudflare, where it failed to connect, and why. If you’re an enterprise Cloudflare customer, you can sign up to join the beta in the Cloudflare Dashboard starting today: in the Analytics tab under Edge Reachability.

The Last Mile is historically the most complicated, least understood, and in some ways the most important part of operating a reliable network. We’re here to make it easier.

What is the Last Mile?

The Last Mile is the connection between your home and your ISP. When we talk about how users connect to content on the Internet, we typically do it like this:

[Diagram: user → Internet → content]

This is useful, but in reality, there are lots of things in the path between a user and anything on the Internet. Say that a user is connecting to a resource hosted behind Cloudflare. The path would look like this:

[Diagram: user → Last Mile ISP → Cloudflare → origin]

Cloudflare is a global Anycast network that takes traffic from the Internet and proxies it to your origin. Because we function as a proxy, we think of the life of a request in two legs: before it reaches Cloudflare (end users to Cloudflare), and after it reaches Cloudflare (Cloudflare to origin). However, in Internet parlance, there are generally three legs: the First Mile tends to represent the path from an origin server to the data that you are requesting. The Middle Mile represents the path from an origin server to any proxies or other network hops. And finally, there is the final hop from the ISP to the user, which is known as the Last Mile.

Issues with the Last Mile are difficult to detect. If users are unable to reach something on the Internet, it is difficult for the resource to report that there was a problem. This is because if a user never reaches the resource, then the resource will never know something is wrong. Multiply that one problem across hundreds of thousands of Last Mile ISPs coming from a diverse set of regions, and it can be really hard for services to keep track of all the possible things that can go wrong on the Internet. The above graphic actually doesn’t really reflect the scope of the problem, so let’s revise it a bit more:

[Diagram: the many independent Last Mile ISPs sitting between users and Cloudflare]

It’s not an easy problem to keep on top of.

Brand New Last Mile Insights

Cloudflare is launching a closed beta of a brand new Last Mile reporting tool, Last Mile Insights. Last Mile Insights allows for customers to see where their end-users are having trouble connecting to their Cloudflare properties. Cloudflare can now show customers the traffic that failed to connect to Cloudflare, where it failed to connect, and why.

Access to this data is useful to our customers because when things break, knowing what is broken and why — and then communicating with your end users — is vital. During issues, users and employees may create support/helpdesk tickets and social media posts to understand what’s going on. Knowing what is going on, and then communicating effectively about what the problem is and where it’s happening, can give end users confidence that issues are identified and being investigated… even if the issues are occurring on a third party network. Beyond that, understanding the root of the problem can help with mitigations and speed time to resolution.

How do Last Mile Insights work?

Our Last Mile monitoring tools use a combination of signals and machine learning to detect errors and performance regressions on the Last Mile.

Among the signals: Network Error Logging (NEL) is a browser-based reporting system that allows users’ browsers to report connection failures to an endpoint specified by the webpage that failed to load. When a user is able to connect to Cloudflare on a site with NEL enabled, Cloudflare will pass back two headers that indicate to the browser that they should report any network failures to an endpoint specified in the headers. The browser will then operate as usual, and if something happens that prevents the browser from being able to connect to the site, it will log the failure as a report and send it to the endpoint. This all happens in real time; the endpoint receives failure reports instantly after the browser experiences them.
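Those two headers are the standard Report-To and NEL response headers, which look roughly like this (the group name and endpoint URL here are illustrative):

Report-To: {"group": "nel-endpoint", "max_age": 604800, "endpoints": [{"url": "https://reports.example.com/nel"}]}
NEL: {"report_to": "nel-endpoint", "max_age": 604800}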

The browser can send failure reports for many reasons: it could send reports because the TLS certificate was incorrect, the ISP or an upstream transit was having issues on the request path, the terminating server was overloaded and dropping requests, or a data center was unreachable. The W3C specification outlines specific buckets that the browser should sort failures into, and the browser uploads those buckets as the reasons it could not connect. So the browser is literally telling the reporting endpoint why it was unable to reach the desired site. Here’s a representative report of the kind a browser gives to Cloudflare’s endpoint (the values below are illustrative):

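{
  "age": 0,
  "type": "network-error",
  "url": "https://www.example.com/",
  "body": {
    "referrer": "",
    "sampling_fraction": 1.0,
    "server_ip": "192.0.2.1",
    "protocol": "http/1.1",
    "method": "GET",
    "status_code": 0,
    "elapsed_time": 30000,
    "phase": "connection",
    "type": "tcp.timed_out"
  }
}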

The report itself is a JSON blob that contains a lot of things, but the things we care about are when in the request the failure occurred (phase), why the request failed (tcp.timed_out), the ASN the request came over, and the metro area where the request came from. This information allows anyone looking at the reports to see where things are failing and why. Personally Identifiable Information is not captured in NEL reports. For more information, please see our KB article on NEL.

Many services can operate their own reporting endpoints and set their own headers indicating that users who connect to their site should upload these reports to the endpoint they specify. Cloudflare is also an operator of one such endpoint, and we’re excited to open up the data collected by us for customer use and visibility.  Let’s talk about a customer who used Last Mile Insights to help make a bad day on the Internet a little better.

Case Study: Canva

Canva is a Cloudflare customer that provides a design and collaboration platform hosted in the cloud. With more than 60 million users around the world, constant access to Canva’s platform is critical. Last year, Canva users connecting through Cox Communications in San Diego started to experience connectivity difficulties. Around 50% of Canva’s users connecting via Cox Communications saw disconnects during that period, and these users weren’t able to access Canva or Cloudflare at all. This wasn’t a Canva or Cloudflare outage; rather, it was caused by Cox incorrectly routing traffic destined for Canva, causing errors for mutual Cox/Canva customers as a result.

Normally, this scenario would have taken hours to diagnose and even longer to mitigate. Canva would’ve seen a slight drop in traffic, but as the outage wasn’t on Canva’s side, it wouldn’t have triggered any alerts based on traffic drops. In this case, Canva engineers would have been notified by users, followed by a lengthy investigation to diagnose the problem.

Fortunately, Cloudflare has invested in monitoring systems to proactively identify issues exactly like these. Within minutes of the routing anomaly being introduced on Cox’s network, Canva was made aware of the issue via our monitoring, and a conversation with Cox was started to remediate the issue. Meanwhile, Canva could advise their users on the steps to fix it.

Cloudflare is excited to be offering our internal monitoring solution to our customers so that they can see what we see.

But providing insights into seeing where problems happen on the Last Mile is only part of the solution. In order to truly deliver a reliable, fast network, we also need to be as close to end users as possible.

Getting close to users

Getting close to end users is important for one reason: it minimizes the time spent on the Last Mile. These networks can be unreliable and slow. The best way to improve performance is to spend as little time on them as possible. And the only way to do that is to get close to our users. In order to get close to our users, Cloudflare is constantly expanding our presence into new cities and markets. We’ve just announced expansion into new markets and are adding even more new markets all the time to get as close to every network and every user as we can.

This is because not every network is the same. Some users may be clustered close together in cities with high bandwidth; in other places, this may not be the case. Because user populations are not homogeneous, each ISP operates their network differently to meet the needs of their users. Physical distance from servers matters a great deal, because nobody can outrun the speed of light: if you’re farther away from the content you want, it will take longer to reach it. But distance is not the only variable; bandwidth and speed also vary, because networks are operated differently all over the world. One thing we do know is that your network performance will also be impacted by how healthy your Last Mile network is.

Healthier networks perform better

A healthy network has no downtime, minimal congestion, and low packet loss; downtime, congestion, and packet loss all add latency. If you’re driving somewhere, street closures, traffic, and bad roads will prevent you from getting as fast as possible to where you need to be. Healthy networks provide the best possible conditions for you to connect, and Last Mile performance is better because of it. Consider three networks in the same country: ISP A, ISP B, and ISP C. These ISPs have similar distribution among their users. ISP A is healthy and is directly connected with Cloudflare. ISP B is healthy but is not directly connected to Cloudflare. ISP C is an unhealthy network. Our data shows that Last Mile latencies for ISP C are significantly higher than for ISPs A and B because the network quality of ISP C is worse.

[Box plot: latency to Cloudflare for ISPs A, B, and C]

This box plot shows that latencies to Cloudflare for ISP C are 360% higher than for ISPs A and B.

We want all networks to be like ISP A, but that’s not always the case, and it’s something Cloudflare can’t control. The only thing Cloudflare can do to mitigate performance problems like these is to limit how much time you spend on these networks.

Shrinking the Last Mile gives better performance

By placing data centers close to our users, we reduce the amount of time spent on these Last Mile networks, and the latency between end users and Cloudflare goes down. A great example of this is how bringing up new locations in Africa affected the latency for the Internet-connected population there.  Blue shows the latency before these locations were added, and red shows after:

[Chart: latency to Cloudflare for Africa’s Internet-connected population, before (blue) and after (red) the new locations]

Our efforts globally have brought 95% of the Internet-connected population within 50ms of us:

[Chart: cumulative share of the Internet-connected population by latency to Cloudflare]

You will also notice that 80% of the Internet is within 30ms of us. The tail for Last Mile latencies is very long, and every data center we add helps bring that tail closer to great performance. As we expand into more locations and countries, more of the Internet will be even better connected.

But even when the Last Mile is shrunk by our infrastructure expansions, networks can still have issues that are difficult to detect, and existing logging and monitoring solutions don’t provide a good way to see what the problem is. Cloudflare has built a sophisticated set of tools to identify issues on Last Mile networks outside our control and to reduce time to resolution, and those tools have already found Last Mile problems for our customers.

Cloudflare has unique performance and insight into Last Mile networking

Running an application on the Internet requires customers to look at the whole Internet. Many cloud services optimize latency starting at the first mile and work their way out, because it’s easier to optimize for things they can control. Because the Last Mile is controlled by hundreds or thousands of ISPs, it is difficult to influence how the Last Mile behaves.

Cloudflare is focused on closing performance gaps everywhere, including close to your users and employees. Last mile performance and reliability is critically important to delivering content, keeping employees productive, and all the other things the world depends on the Internet to do. If a Last Mile provider is having a problem, then users connecting to the Internet through them will have a bad day.

Cloudflare’s efforts to provide better Last Mile performance and visibility allow customers to rely on Cloudflare to optimize the Last Mile, making it one less thing they have to think about. Through Last Mile Insights and network expansion efforts — available today in the Cloudflare Dashboard, in the Analytics tab under Edge Reachability — we want to provide you the ability to see what’s really happening on the Internet while knowing that Cloudflare is working on giving your users the best possible Internet experience.

Discovering what’s slowing down your website with Web Analytics

Post Syndicated from Joao Sousa Botto original https://blog.cloudflare.com/web-analytics-vitals-explorer/

Web Analytics is Cloudflare’s privacy-focused real user measurement solution. It leverages a lightweight JavaScript beacon and does not use any client-side state, such as cookies or localStorage, to collect usage metrics. Nor does it “fingerprint” individuals via their IP address, User Agent string, or any other data.

Cloudflare Web Analytics makes essential web analytics, such as the top-performing pages on your website and top referrers, available to everyone for free, and it’s becoming more powerful than ever.

Focusing on Performance

Earlier this year we merged Web Analytics with our Browser Insights product, which enabled customers proxying their websites through Cloudflare to evaluate visitors’ experience on their web properties through Core Web Vitals such as Largest Contentful Paint (LCP) and First Input Delay (FID).

It was important to bring the Core Web Vitals performance measurements into Web Analytics given the outsized impact that page load times have on bounce rates. An increase in page load time from 1s to 3s increases bounce rates by 32%, and from 1s to 6s increases them by 106% (source).

Now that you know the impact a slow-loading web page can have on your visitors, it’s time for us to make it a no-brainer to take action. Read on.

Becoming Action-Oriented

We believe that, to deliver the most value to our users, the product should facilitate the following process:

  1. Measure the real user experience
  2. Grade this experience — is it satisfactory or in need of improvement?
  3. Provide actionable insights — what part of the web page should be tweaked to improve the user experience?
  4. Repeat
[Figure: the measure, grade, act, repeat improvement cycle]

And it all starts with Web Analytics Vitals Explorer, which started rolling out today.

Introducing Web Analytics Vitals Explorer

Vitals Explorer enables you to easily pinpoint which elements on your pages are affecting users the most, with accurate measurements from the visitors' perspective and easy-to-read impact grading.

To do that, we have automatically updated the Web Analytics JavaScript beacon so that it collects the relevant vital measurements from the browser. As always, we are not collecting any information that would invade your visitors’ privacy.

Usage

Once this new beacon is updated on your sites — and again, the update happens transparently to you — you can navigate to the Core Web Vitals page in Web Analytics. There you will see three graphs grading the user experience for Largest Contentful Paint (LCP), First Input Delay (FID), and Cumulative Layout Shift (CLS). Below each graph is a debug section listing the five elements with the greatest negative impact on that metric. Finally, clicking on any of these elements in the data table shows its impact and exact paths, so you can easily decide whether it is worth keeping on your website in its current form.

[Screenshot: Core Web Vitals debugging in Web Analytics]

In addition to this new Core Web Vitals content, we have also added First Paint and First Contentful Paint to the Page Load Time page. When you navigate to this page you will now see the page load summary and a graph representing page load timing, allowing you to quickly identify regressions in these important performance metrics.

[Screenshot: Page Load Time page with First Paint and First Contentful Paint]

Measurement details

This additional debugging information for Core Web Vitals is measured during the lifespan of the page (until the user leaves the tab or closes the browser window, which updates visibilityState to a hidden state).

Here’s what we collect:

Common for all Core Web Vitals

  • Element is a CSS selector representing the DOM node. With this string, you can use `document.querySelector(<element_name>)` in your browser's dev console to find the DOM node that is negatively impacting your scores (see the example after this list).
  • Path is the URL path at the time the Core Web Vitals are captured.
  • Value is the metric value for each Core Web Vital. This value is in milliseconds for LCP and FID, and is a unitless score for CLS (Cumulative Layout Shift).
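
For example, if the debug section reported an element such as div#hero > img.banner (a hypothetical selector), you could inspect it from your browser's dev console:

// Hypothetical selector copied from the Web Analytics debug section.
const el = document.querySelector("div#hero > img.banner");
console.log(el); // logs the DOM node that is dragging down the metric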

Largest Contentful Paint

  • URL is the source URL (such as image, text, web fonts).
  • Size is the source object’s size in bytes.

First Input Delay

  • Name is the type of event (such as mousedown, keydown, pointerdown).

Cumulative Layout Shift

Layout information is a JSON value that includes width, height, x and y position, left, right, top, and bottom. By observing these values, you can see the layout shifts that happen on the page (see the illustration after this list).

  • CurrentRect is the largest source element's layout information after the shift. This JSON value is shown as Current under the Layout Shifts section in the Web Analytics UI.
  • PreviousRect is the largest source element's layout information before the shift. This JSON value is shown as Previous under the Layout Shifts section in the Web Analytics UI.
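
As an illustration, a single layout shift entry could look like the following. The field names follow the description above; the values are hypothetical:

{
  "currentRect":  { "width": 640, "height": 360, "x": 0, "y": 180, "left": 0, "right": 640, "top": 180, "bottom": 540 },
  "previousRect": { "width": 640, "height": 360, "x": 0, "y": 0, "left": 0, "right": 640, "top": 0, "bottom": 360 }
}

Comparing the two rectangles shows this element moved 180 pixels down the page during load.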

Paint Timings

Additionally, we have added two important paint timings:

  • First Paint is the time between navigation and when the browser renders the first pixels to the screen.
  • First Contentful Paint is the time when the browser renders the first bit of content from the DOM.

A lot of this is based on standard browser measurements, which you can read about in detail on this blog post from Google.

Moving forward

And we are by no means done. Moving forward, we will bring this structured approach of grading and actionable insights to as many Web Analytics measurements as possible, and keep guiding you through how to improve your visitors' experience. So stay tuned.
And in the meantime, do let us know what you think about this feature and ask questions on the community forums.

Optimizing images on the web

Post Syndicated from Greg Brimble original https://blog.cloudflare.com/optimizing-images/

Images are a massive part of the Internet. On the median web page, images account for 51% of the bytes loaded, so any improvement made to their speed or their size has a significant impact on performance.

Today, we are excited to announce Cloudflare’s Image Optimization Testing Tool. Simply enter your website’s URL, and we’ll run a series of automated tests to determine if there are any possible improvements you could make in delivering optimal images to visitors.

[Screenshot: Cloudflare's Image Optimization Testing Tool]

How users experience speed

Everyone who has ever browsed the web has experienced a website that was slow to load. Often, this is a result of poorly optimized images on that webpage that are either too large for purpose or that were embedded on the page with insufficient information.

Images on a page might take painfully long to load as pixels agonizingly fill in from top-to-bottom; or worse still, they might cause massive shifts of the page layout as the browser learns about their dimensions. These problems are a serious annoyance to users and as of August 2021, search engines punish pages accordingly.

Understandably, slow page loads have an adverse effect on a page's “bounce rate”, the percentage of visitors who quickly move off the page. On e-commerce sites in particular, pages are usually very image-heavy and the bounce rate typically has a direct monetary impact. It is critically important to optimize all the images on your webpages to reduce load on, and egress from, your origin, to improve your performance in search engine rankings and, ultimately, to provide a great experience for your users.

Measuring speed

Since the end of August 2021, Google has used the Core Web Vitals to quantify page performance when considering search results rankings. These metrics are three numbers: Largest Contentful Paint (LCP), First Input Delay (FID), and Cumulative Layout Shift (CLS). They approximate the experience of loading, interactivity and visual stability respectively.

CLS and LCP are the two metrics we can improve by optimizing images. A high CLS indicates that large parts of the page layout shift as the page loads. LCP measures the time it takes for the single largest image or text block in the viewport to render.

These can both be measured “in the field” with Real User Monitoring (RUM) analytics such as Cloudflare’s Web Analytics, or in a “lab environment” using Cloudflare’s Image Optimization Testing Tool.

How to optimize for speed

Dimensions

One of the most impactful performance improvements a website author can make is ensuring they deliver images with appropriate dimensions. Images taken on a modern camera can be truly massive, and some recent flagship phones have gigantic sensors. The Samsung Galaxy S21 Ultra, for example, has a 108 MP sensor which captures a 12,000 by 9,000 pixel image. That same phone has a screen width of only 1440 pixels. It is physically impossible to show every pixel of the photo on that device: for a landscape photo, only 12% of pixel columns can be displayed.

Embedding this image on a webpage presents the same problem, but this time, that image and all of its unused pixels are sent over the Internet. Ultimately, this creates unnecessary load on the server, higher egress costs, and longer loading times for visitors. This is exacerbated even further for visitors on mobile, since they are often on a slower connection and have limits on their data usage. On a fast 3G connection, that 108 MP photo might consume 26 MB of both the visitor's data plan and the website's egress bandwidth, and take more than two minutes to load!

It might be tempting to always deliver images with the highest possible resolution to avoid “blocky” or pixelated images, but when resizing is done correctly, this is not a problem. “Blocky” artifacts typically occur when an image is processed multiple times (for example, an image is repeatedly uploaded and downloaded by users on a platform which compresses that image). Pixelated images occur when an image has been shrunk to a size smaller than the screen it is rendered on.

So, how can website authors avoid these pitfalls and ensure a correctly sized image is delivered to visitors’ devices? There are two main approaches:

  • Media conditions with srcset and sizes

When embedding an image on a webpage, traditionally the author would simply pass a src attribute on an img tag:

<img src="hello_world_12000.jpg" alt="Hello, world!" />

Since 2017, all modern browsers have supported the more dynamic srcset attribute. This allows authors to set multiple image sources, depending on the matching media condition of the visitor’s browser:

<img srcset="hello_world_1500.jpg 1500w,
             hello_world_2000.jpg 2000w,
             hello_world_12000.jpg 12000w"
     sizes="(max-width: 1500px) 1500px,
            (max-width: 2000px) 2000px,
            12000px"
     src="hello_world_12000.jpg"
     alt="Hello, world!" />

Here, with the srcset attribute, we're informing the browser that there are three variants of the image, each with a different intrinsic width: 1,500 pixels, 2,000 pixels, and the original 12,000 pixels. The browser then evaluates the media conditions in the sizes attribute ((max-width: 1500px) and (max-width: 2000px)) in order to select the appropriate image variant from the srcset attribute. If the browser's viewport width is less than 1500px, the hello_world_1500.jpg variant will be loaded; if it is between 1500px and 2000px, the hello_world_2000.jpg variant will be loaded; and finally, if the viewport is wider than 2000px, the browser will fall back to loading the hello_world_12000.jpg variant.

Similar behavior is also possible with a picture element, using the source child element which supports a variety of other selectors.

  • Client Hints

Client Hints are a standard that some browsers have chosen to implement, and others have not. They are a set of HTTP request headers that tell the server about the client's device. For example, the browser can attach a Viewport-Width header when requesting an image, which informs the server of the width of that particular browser's viewport (note this header is currently in the process of being renamed to Sec-CH-Viewport-Width).

This simplifies the markup in the previous example greatly — in fact, no changes are required from the original simple HTML:

<img src="hello_world_12000.jpg" alt="Hello, world!" />

If Client Hints are supported, when the browser makes a request for hello_world_12000.jpg, it might attach the following header:

Viewport-Width: 1440

The server could then automatically serve a smaller image variant (e.g., hello_world_1500.jpg), despite the request originally asking for the hello_world_12000.jpg image.
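
As a sketch of that server-side logic, assuming a Cloudflare Worker in front of the images and the pre-generated variants from the earlier example, the selection might look like this:

// A minimal sketch, not a production implementation. It assumes the
// hello_world_*.jpg variants from the example above already exist.
const VARIANT_WIDTHS = [1500, 2000, 12000];

export default {
  async fetch(request: Request): Promise<Response> {
    // Viewport-Width is only sent by browsers that support Client Hints.
    const hint = parseInt(request.headers.get("Viewport-Width") ?? "", 10);
    // Pick the smallest variant that covers the viewport; fall back to full size.
    const width = Number.isNaN(hint)
      ? 12000
      : VARIANT_WIDTHS.find((w) => w >= hint) ?? 12000;
    return fetch(`https://example.com/hello_world_${width}.jpg`);
  },
};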

By enabling browsers to request an image with appropriate dimensions, we save bandwidth and time for both your server and for your visitors.

Format

JPEG, PNG, GIF, WebP, and now, AVIF. AVIF is the latest image format with widespread industry support, and it often outperforms its preceding formats. AVIF supports transparency with an alpha channel, it supports animations, and it is typically 50% smaller than comparable JPEGs (vs. WebP’s reduction of only 30%).

We added the AVIF format to Cloudflare's Image Resizing product last year, as soon as Google Chrome added support. Firefox 93 (scheduled for release on October 5, 2021) will be the first stable Firefox release to support AVIF, and with both Microsoft and Apple as members of AVIF's Alliance for Open Media, we hope to see support in Edge and Safari soon. Before these modern formats, we also saw innovative approaches to improving how an image loads on a webpage. BlurHash is a technique of embedding a very small representation of the image inside the HTML markup, which can be rendered immediately and acts as a placeholder until the final image loads. This small representation (hash) produces a blurry mix of colors similar to that of the final image, easing the loading experience for users.

Progressive JPEGs are similar in effect, but are a built-in feature of the image format itself. Instead of encoding the image bytes from top-to-bottom, bytes are ordered in increasing levels of image detail. This again produces a more subtle loading experience, where the user first sees a low quality image which progressively “enhances” as more bytes are loaded.

[Figure: a progressive JPEG rendering at increasing levels of detail]

Quality

Unlike their predecessor, JPEG, the newer image formats (WebP and AVIF) also support lossless compression. Lossless might be appropriate for some uses, but for the majority of websites speed is the priority, and the minor quality loss of lossy compression is worth the time and bytes saved.

Optimizing where to set the quality is a balancing act: too aggressive and artifacts become visible on your image; too little and the image is unnecessarily large. Butteraugli and SSIM are examples of algorithms which approximate our perception of image quality, but this is currently difficult to automate and is therefore best set manually. In general, however, we find that around 85% in most compression libraries is a sensible default.

Markup

All of the previous techniques reduce the number of bytes an image uses. This is great for improving the loading speed of those images and the Largest Contentful Paint (LCP) metric. However, to improve the Cumulative Layout Shift (CLS) metric, we must minimize changes to the page layout. This can be done by informing the browser of the image size ahead of time.

On a poorly optimized webpage, images will be embedded without their dimensions in the markup. The browser fetches those images, and only once it has received the header bytes of the image can it know about the dimensions of that image. The effect is that the browser first renders the page where the image takes up zero pixels, and then suddenly redraws with the dimensions of that image before actually loading the image pixels themselves. This is jarring to users and has a serious impact on usability.

It is important to include dimensions of the image inside HTML markup to allow the browser to allocate space for that image before it even begins loading. This prevents unnecessary redraws and reduces layout shift. It is even possible to set dimensions when dynamically loading responsive images: by informing the browser of the height and width of the original image, assuming the aspect ratio is constant, it will automatically calculate the correct height, even when using a width selector.

<img height="9000"
     width="12000"
     srcset="hello_world_1500.jpg 1500w,
             hello_world_2000.jpg 2000w,
             hello_world_12000.jpg 12000w"
     sizes="(max-width: 1500px) 1500px,
            (max-width: 2000px) 2000px,
            12000px"
     src="hello_world_12000.jpg"
     alt="Hello, world!" />

Finally, lazy-loading is a technique which reduces the work that the browser has to perform right at the onset of page loading. By deferring image loads until just before they're needed, the browser can prioritize more critical assets such as fonts, styles and JavaScript. By setting the loading property on an image to lazy, you instruct the browser to only load the image as it enters the viewport. For example, on an e-commerce site which renders a grid of products, this means the page loads faster for visitors and seamlessly fetches images below the fold as the user scrolls down. This is supported by all major browsers except Safari, which currently hides this feature behind an experimental flag.

<img loading="lazy" … />

Hosting

Finally, you can improve image loading by hosting all of a page's images together on the same first-party domain. If each image were hosted on a different domain, the browser would have to perform a DNS lookup, create a TCP connection, and perform the TLS handshake for every single image. When they are all co-located on a single domain (especially if that is the same domain as the page itself), the browser can re-use the connection, which improves the speed at which it can load those images.

Test your website

Today, we’re excited to announce the launch of Cloudflare’s Image Optimization Testing Tool. Simply enter your website URL, and we’ll run a series of automated tests to determine if there are any possible improvements you could make in delivering optimal images to visitors.

We use WebPageTest and Lighthouse to calculate the Core Web Vitals on two versions of your page: one as the original, and one with Cloudflare’s best-effort automatic optimizations. These optimizations are performed using a Cloudflare Worker in combination with our Image Resizing product, and will transform an image’s format, quality, and dimensions.
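
As a sketch of what such a Worker can look like, here is the image option that Workers can pass to Image Resizing on subrequests; the specific values are illustrative, not what the testing tool actually applies:

// Illustrative only: re-issues the incoming request with resizing options.
// The cf.image object is a Workers runtime extension backed by Image Resizing.
export default {
  async fetch(request: Request): Promise<Response> {
    return fetch(request, {
      cf: {
        image: {
          width: 800,     // cap dimensions to what the page actually needs
          quality: 85,    // the sensible default discussed earlier
          format: "avif", // serve a modern format where supported
        },
      },
    });
  },
};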

We report key summary metrics about your webpage's performance, including the aforementioned Cumulative Layout Shift (CLS) and Largest Contentful Paint (LCP), as well as a detailed breakdown of each image on your page and the optimizations that can be made.

Cloudflare Images

Cloudflare Images can help you to solve a number of the problems outlined in this post. By storing your images with Cloudflare and configuring a set of variants, we can deliver optimized images from our edge to your website or app. We automatically set the optimal image format and allow you to customize the dimensions and fit for your use-cases.

We’re excited to see what you build with Cloudflare Images, and you can expect additional features and integrations in the near future. Get started with Images today from $6/month.

Building Cloudflare Images in Rust and Cloudflare Workers

Post Syndicated from Yevgen Safronov original https://blog.cloudflare.com/building-cloudflare-images-in-rust-and-cloudflare-workers/

This post explains how we implemented the Cloudflare Images product with reusable Rust libraries and Cloudflare Workers. It covers the technical design of Cloudflare Image Resizing and Cloudflare Images. Using Rust and Cloudflare Workers helps us quickly iterate and deliver product improvements over the coming weeks and months.

Reuse of code in Rusty image projects

We developed Image Resizing in Rust. It’s a web server that receives HTTP requests for images along with resizing options, fetches the full-size images from the origin, applies resizing and other image processing operations, compresses, and returns the HTTP response with the optimized image.

Rust makes it easy to split projects into libraries (called crates). The image processing and compression parts of Image Resizing are usable as libraries.

We also have a product called Polish, a Golang-based service that recompresses images in our cache. Polish was initially designed to run command-line programs like jpegtran and pngcrush. We took the core of Image Resizing and wrapped it in a command-line executable, so when Polish needs to apply lossy compression or generate WebP images or animations, it can use Image Resizing via that command-line tool instead of a third-party one.

Reusing libraries has allowed us to easily unify processing between Image Resizing and Polish (for example, to ensure that both handle metadata and color profiles in the same way).

Cloudflare Images is another product we've built in Rust. It added support for a custom storage back-end, variants (size presets), signed URLs, and more. We built it as a collection of Rust crates, so we can reuse pieces of it in other services running anywhere in our network. Image Resizing provides image processing for Cloudflare Images and shares libraries with Images to understand the new URL scheme, access the storage back-end, and read the database of variants.

How Image Resizing works

[Diagram: how the Image Resizing service handles a request]

The Image Resizing service runs at the edge and is deployed on every server of the Cloudflare global network. Thanks to Cloudflare’s global Anycast network, the closest Cloudflare data center will handle eyeball image resizing requests. Image Resizing is tightly integrated with the Cloudflare cache and handles eyeball requests only on a cache miss.

There are two ways to use Image Resizing. The default URL scheme provides an easy, declarative way of specifying image dimensions and other options. The other way is to use a JavaScript API in a Worker. Cloudflare Workers give powerful programmatic control over every image resizing request.
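
For example, with the default URL scheme, resizing options are declared directly in the path of a /cdn-cgi/image/ URL; the options and path here are illustrative:

https://example.com/cdn-cgi/image/width=800,quality=85,format=auto/uploads/photo.jpg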

How Cloudflare Images work

[Diagram: the components of Cloudflare Images]

Cloudflare Images consists of the following components:

  • The Images core service, which powers the public API for managing image assets.
  • The Image Resizing service, responsible for image transformations and caching.
  • The image delivery Cloudflare Worker, responsible for serving images and passing the corresponding parameters through to the Image Resizing service.
  • Image storage, which provides access and storage for original image assets.

To support Cloudflare Images scenarios for image transformations, we made several changes to the Image Resizing service:

  • Added access to Cloudflare storage with original image assets.
  • Added access to variant definitions (size presets).
  • Added support for signing URLs.

Image delivery

The primary use case for Cloudflare Images is to provide a simple and easy-to-use way of managing image assets. To cover egress costs, we provide image delivery through the Cloudflare-managed imagedelivery.net domain. It is configured with Tiered Caching to maximize the cache hit ratio for image assets. imagedelivery.net provides image hosting without the need to configure a custom domain to proxy through Cloudflare.

A Cloudflare Worker powers image delivery. It parses image URLs and passes the corresponding parameters to the image resizing service.
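
A simplified sketch of that parsing step might look like this; the internal resizing endpoint shown is hypothetical:

// Sketch of an image delivery Worker, not the production code.
export default {
  async fetch(request: Request): Promise<Response> {
    // Expected path: /<encoded account id>/<image id>/<variant name>
    const parts = new URL(request.url).pathname.split("/").filter(Boolean);
    if (parts.length !== 3) {
      return new Response("Not found", { status: 404 });
    }
    const [accountId, imageId, variantName] = parts;
    // Pass the parsed parameters through to the resizing service
    // (hypothetical internal endpoint).
    return fetch(`https://resizing.internal/${accountId}/${imageId}?variant=${variantName}`);
  },
};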

How we store Cloudflare Images

There are several places we store information on Cloudflare Images:

  • image metadata in Cloudflare’s core data centers
  • variant definitions in Cloudflare’s edge data centers
  • original images in core data centers
  • optimized images in Cloudflare cache, physically close to eyeballs.

Image variant definitions are stored and delivered to the edge using Cloudflare’s distributed key-value store called Quicksilver. We use a single source of truth for variants. The Images core service makes calls to Quicksilver to read and update variant definitions.

The rest of the information about the image is stored in the image URL itself:
https://imagedelivery.net/<encoded account id>/<image id>/<variant name>

<image id> contains a flag indicating whether the image is publicly available or requires access verification. It's not feasible to store image metadata in Quicksilver, as the data volume would increase linearly with the number of images we host. Instead, we only allow a finite number of variants per account, so we use the available disk space at the edge responsibly. The downside of storing image metadata as part of <image id> is that <image id> changes whenever the image's access settings change.

How we keep Cloudflare Images up to date

The only way to access images is through the use of variants. Each variant is a named image resizing configuration. Once the image asset is fetched, we cache the transformed image in the Cloudflare cache. The critical question is how we keep processed images up to date. The answer is by purging the Cloudflare cache when necessary. There are two use cases:

  • access to the image is changed
  • the variant definition is updated

In the first instance, we purge the cache by calling a URL:
https://imagedelivery.net/<encoded account id>/<image id>

When the customer updates a variant, we issue a cache purge request by tag:
account-id/variant-name

To support cache purge by tag, the image resizing service adds the necessary tags for all transformed images.
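
For illustration, purging by tag through Cloudflare's public API (a feature of Enterprise plans) looks roughly like this; the zone ID and token are placeholders:

// Illustrative purge-by-tag request against the public API.
await fetch("https://api.cloudflare.com/client/v4/zones/<zone-id>/purge_cache", {
  method: "POST",
  headers: {
    "Authorization": "Bearer <api-token>",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ tags: ["account-id/variant-name"] }),
});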

How we restrict access to Cloudflare Images

The Image Resizing service supports restricting access to images by using URL signatures with expiration. URLs are signed with an HMAC-SHA-256 key. The steps to produce valid signatures are:

  1. Take the path and query string (the path starts with /).
  2. Compute the path’s SHA-256 HMAC with the query string, using the Images’ URL signing key as the secret. The key is configured in the Dashboard.
  3. If the URL is meant to expire, compute the Unix timestamp (number of seconds since 1970) of the expiration time, and append ?exp= and the timestamp as an integer to the URL.
  4. Append ? or & to the URL as appropriate (? if it had no query string; & if it had a query string).
  5. Append sig= and the HMAC as hex-encoded 64 characters.

A signed URL looks like this:

https://imagedelivery.net/<encoded account id>/<image id>/<variant name>?sig=<64 hex characters>

A signed URL with an expiration timestamp looks like this:

https://imagedelivery.net/<encoded account id>/<image id>/<variant name>?exp=<unix timestamp>&sig=<64 hex characters>

For example, the signature of the /hello/world path with the secret ‘this is a secret’ is 6293f9144b4e9adc83416d1b059abcac750bf05b2c5c99ea72fd47cc9c2ace34:

https://imagedelivery.net/hello/world?sig=6293f9144b4e9adc83416d1b059abcac750bf05b2c5c99ea72fd47cc9c2ace34
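
A minimal Node.js sketch (TypeScript) of the signing steps above reproduces that value:

import { createHmac } from "node:crypto";

// Step 1: take the path (there is no query string in this example).
const path = "/hello/world";
// Step 2: compute the HMAC-SHA-256 of the path, keyed with the signing key.
const sig = createHmac("sha256", "this is a secret").update(path).digest("hex");
// Steps 4 and 5: no query string yet, so append "?" then "sig=" and the digest.
console.log(`https://imagedelivery.net${path}?sig=${sig}`);
// Per the example above, sig should print as
// 6293f9144b4e9adc83416d1b059abcac750bf05b2c5c99ea72fd47cc9c2ace34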

Direct creator uploads with Cloudflare Worker and KV

Similar to Cloudflare Stream, Images supports direct creator uploads, which allow users to upload images without API tokens. Direct creator uploads are commonly used by web apps, client-side applications, or mobile apps where users upload content directly to Cloudflare Images.

Once again, we used our serverless platform to support direct creator uploads. A successful API call stores the account's information in Workers KV with the specified expiration date. A simple Cloudflare Worker handles the upload URL: it reads the KV value and grants upload access only when the KV lookup succeeds.
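
A minimal sketch of such a Worker, assuming a KV namespace binding named UPLOAD_TOKENS and a hypothetical storage endpoint:

// Sketch only: grants upload access when a one-time token exists in KV.
interface Env {
  UPLOAD_TOKENS: KVNamespace;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const token = new URL(request.url).pathname.split("/").pop() ?? "";
    // KV returns null for missing keys and for keys past their expiration.
    const account = await env.UPLOAD_TOKENS.get(token);
    if (account === null) {
      return new Response("Upload URL invalid or expired", { status: 403 });
    }
    // Forward the upload into storage on behalf of that account (hypothetical).
    return fetch(new Request(`https://storage.internal/${account}/upload`, request));
  },
};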

Future Work

The Cloudflare Images product has an exciting roadmap. Let's review what's possible with its current architecture.

Resizing hints on upload

At the moment, no image transformations happen on upload. That means we can serve the image globally once it is uploaded to Image storage. We are considering adding resizing hints on image upload. That won’t necessarily schedule image processing in all cases but could provide a valuable signal to resize the most critical image variants. An example could be to generate an AVIF variant for the most vital image assets.

Serving images from custom domains

We think serving images from a domain we manage (with Tiered Caching) is a great default option for many customers. The downside is that loading Cloudflare Images requires additional TLS negotiation on the client side, adding latency and impacting loading performance. On the other hand, serving Cloudflare Images from custom domains will be a viable option for customers who set up a website through Cloudflare. The good news is that we can support such functionality without radical changes to the current architecture.

Conclusion

The Cloudflare Images product runs on top of the Cloudflare global network. We built Cloudflare Images in Rust and Cloudflare Workers, which lets us reuse Rust libraries across several products, such as Cloudflare Images, Image Resizing, and Polish. Cloudflare's serverless platform is an indispensable tool for building Cloudflare products internally. If you are interested in building innovative products in Rust and Cloudflare Workers, we're hiring.

How Cloudflare Images can make your life easier

Post Syndicated from Rita Soares original https://blog.cloudflare.com/how-cloudflare-images-can-make-your-life-easier/

Imagine how a customer would feel if they got to your website and it took forever to load all the images you serve. That negative user experience can lead to lower overall site traffic and high bounce rates.

The good news is that regardless of whether you need to store and serve 100,000 or one million images, Cloudflare Images gives you the tools you need to build an entire image pipeline from scratch.

Customer pains

After speaking with many Cloudflare customers, we quickly understood that whether you are an e-commerce retailer, a blogger, or a platform for creators, everyone shares the same problems:

  • Egress fees. Each time an image needs to go from Product A (storage) to Product B (optimization) and on to Product C (delivery), there's a fee. Multiply this by the millions of images clients serve per day and it's easy to understand why their bills are so high.
  • Buckets everywhere. Our customers' current pipelines involve uploading images to and from services like AWS, using open source products to optimize them, and then storing them in yet another cloud bucket to serve, since CDNs don't offer long-term storage. There is a dependency on buckets at every step of the pipeline.
  • Load times. An image that is not correctly optimized can be much larger than needed, resulting in an unnecessarily long download time. This makes for a bad end-user experience that can mean a loss of overall site traffic.
  • High Maintenance. To maintain an image pipeline, companies need several AWS and open source products, plus an engineering team to build and maintain that pipeline. This takes engineering focus away from the product itself.

How can Cloudflare Images help?  

Zero Egress Costs

The majority of cloud providers allow you to store images for a small price, but the bill starts to grow every time you need to retrieve an image to optimize and deliver it. The good news is that Cloudflare Images customers don't need to worry about egress costs, since storage, optimization, and delivery are all part of the same tool.

The buckets stop with Cloudflare Images

One small step for humankind, one giant leap for image enthusiasts!

With Cloudflare Images the bucket pain stops now, and customers have two options:

  1. One image upload can generate up to 100 variants, which lets developers stop placing image sizes in URLs. If a site gets redesigned, there's no need to change all the image sizes, because the variants you need are already stored in Cloudflare Images.
  2. Give your users one-time permission to upload one file to your server. Developers don't need to write additional code to move files from users into a bucket — uploads land directly in your Cloudflare storage.

Minimal engineering effort

Have you ever dreamed about your team focusing entirely on product development instead of maintaining infrastructure? We understand, and this is why we created a straightforward set of APIs as well as a UI in the Cloudflare Dashboard. This allows your team to serve and optimize images without the need to set up and maintain a pipeline from scratch.

Once you get access to Cloudflare Images your team can start:

  • Uploading, deleting and updating images via API.
  • Setting up preferred variants.
  • Editing with Image Resizing both with the UI and API.
  • Serving an image with one click.

Process images on the fly

We all know that Google and many other search engines use loading speed as one of their ranking factors; Cloudflare Images helps you get to the top of that list. We automatically detect which browser your clients are using and serve the most optimized version of the image, so you don't need to worry about file extensions, configuring origins for your image sets, or even cache hit rates.

Curious to have a sneak peek at Cloudflare Images? Sign up now!

From AMP to Signed Exchanges, Or How Innovation Happens at Cloudflare

Post Syndicated from Matthew Prince original https://blog.cloudflare.com/from-amp-to-signed-exchanges-or-how-innovation-happens-at-cloudflare/

This is the story of how we decided to work with Google to build Signed Exchanges support at Cloudflare. But, more generally, it’s also a story of how Cloudflare thinks about building disruptive new products and how we’ve built an organization designed around continuous innovation and long-term thinking.

A Threat to the Open Web?

The story starts with me pretty freaked out. In May 2015, Facebook had announced a new format for the web called Instant Articles. The format allowed publishers to package up their pages and serve them directly from Facebook’s infrastructure. This was a threat to Google, so the company responded in October with Accelerated Mobile Pages (AMP). The idea was generally the same as Facebook’s but using Google’s infrastructure.

As a general Internet user, I found these initiatives pretty scary. If they were successful, the end game was that the entirety of the web would effectively be slurped into Facebook and Google's infrastructure.

But as the cofounder and CEO of Cloudflare, this presented an even more immediate risk. If everyone moved their infrastructure to Facebook and Google, there wasn’t much left for us to do. Our mission is to help build a better Internet, but we’ve always assumed there would be an Internet. If Facebook and Google were successful, there was real risk there would just be Facebook and Google.

That said, the rationale behind these initiatives was compelling. While they ended with giving Facebook and Google much more control, they started by trying to solve a real problem. The web was designed with the assumption that the devices connecting to it would be on a fixed, wired connection. As more of the web moved to being accessed over wireless, battery-powered, relatively low-power devices, many of the assumptions of the web were holding back its performance.

This is particularly true in the developing world. While a failed connection can happen anywhere, the further you get from where content is hosted, the more likely it is to happen. Facebook and Google both reasoned that if they could package up the web and serve complete copies of pages from their infrastructure, which spanned the developing world, they could significantly increase the usability of the web in areas where there was still an opportunity for Internet usage to grow. Again, this is a laudable goal. But, if successful, the results would have been dreadful for the Internet as we know it.

Seeds of Disruption

So that’s why I was freaked out. In our management meetings at Cloudflare I’d walk through how this was a risk to the Internet and our business, and we needed to come up with a strategy to address it. Everyone on our team listened and agreed but ultimately and reasonably said: that’s in the future, and we have immediate priorities of things our customers need, so we’ll need to wait until next quarter to prioritize it.

That’s all correct, and probably the right decision if you are forced to make one, but it’s also how companies end up getting disrupted. So, in 2016, we decided to fund a small team led by Dane Knecht, Cloudflare’s founding product manager, to set up a sort of skunkworks team in Austin, TX. The idea was to give the team space away from headquarters, so it could work on strategic projects with a long payoff time horizon.

Today, Dane’s team is known as the Emerging Technologies & Incubation (ETI) team. It was where products like Cloudflare for Teams, 1.1.1.1, and Workers were first dreamed up and prototyped. And it remains critical to how Cloudflare continues to be so innovative. Austin, since 2016, has also grown from a small skunkworks outpost to what will, before the end of this year, be our largest office. That office now houses members from every Cloudflare team, not just ETI. But, in some ways, it all started with trying to figure out how we should respond to Instant Articles and AMP.

We met with both Facebook and Google. Facebook’s view of the world was entirely centered around their app, and didn’t leave much room for partners. Google, on the other hand, was born out of the open web and still ultimately wanted to foster it. While there has been a lot of criticism of AMP, much of which we discussed with them directly, it’s important to acknowledge that it started from a noble goal: to make the web faster and easier to use for those with limited Internet resources.

We built a number of products to extend the AMP ecosystem and make it more open. Viewed on their own, those products have not been successes. But they catalyzed a number of other innovations. For instance, building a third party AMP cache on Cloudflare required a more programmable network. That directly resulted in us prototyping a number of different serverless computing strategies and finally settling on Workers. In fact, many of the AMP products we built were the first products built using Workers.

Part of the magic of our ETI team is that they are constantly trying new things. They’re set up differently, in order to take lots of “shots on goal.” Some won’t work, in which case we want them to fail fast. And, even for those that don’t, we are always learning, collaborating, and innovating. That’s how you create a culture of innovation that produces products at the rate we do at Cloudflare.

Signed Exchanges: Helping Build a Better Internet

Importantly also, working with the AMP team at Google helped us better collaborate on ideas around Internet performance. Cloudflare’s mission is to “help build a better Internet.” It’s not to “build a better Internet.” The word “help” is essential and something I’ll always correct if I hear someone leave it out. The Internet is inherently a collection of networks, and also a collection of work from a number of people and organizations. Innovation doesn’t happen in a vacuum but is catalyzed by collaboration and open standards. Working with other great companies who are aligned with democratizing performance optimization technology and speeding up the Internet is how we believe we can make significant and meaningful leaps in terms of performance.

And that’s what Signed Exchanges have the opportunity to be. They take the best parts of AMP — in terms of allowing pages to be preloaded to render almost instantly — but give back control over the content to the individual publishers. They don’t require you to exclusively use Google’s infrastructure and are extensible well beyond just traffic originating from search results. And they make the web incredibly fast and more accessible even in those areas where Internet access is slow or expensive.

We’re proud of the part we played in bringing this new technology to the Internet. We’re excited to see how people use it to build faster services available more broadly. And the ETI team is back at work looking over the innovation horizon and continuously asking the question: what’s next?

Improving Origin Performance for Everyone with Orpheus and Tiered Cache

Post Syndicated from David Tuber original https://blog.cloudflare.com/orpheus/

Cloudflare’s mission is to help build a better Internet for everyone. Building a better Internet means helping build more reliable and efficient services that everyone can use. To help realize this vision, we’re announcing the free distribution of two products, one old and one new:

  • Tiered Caching is now available to all customers for free. Tiered Caching reduces origin data transfer and improves performance, making web properties cheaper and faster to operate. Tiered Cache was previously a paid addition to Free, Pro, and Business plans as part of Argo.
  • Orpheus is now available to all customers for free. Orpheus routes around problems on the Internet to ensure that customer origin servers are reachable from everywhere, reducing the number of errors your visitors see.

Tiered Caching: improving website performance and economics for everyone

Tiered Cache uses the size of our network to reduce requests to customer origins by dramatically increasing cache hit ratios. With data centers around the world, Cloudflare caches content very close to end users, but if a piece of content is not in cache, the Cloudflare edge data centers must contact the origin server to receive the cacheable content. This can be slow and places load on an origin server compared to serving directly from cache.

Tiered Cache works by dividing Cloudflare’s data centers into a hierarchy of lower-tiers and upper-tiers. If content is not cached in lower-tier data centers (generally the ones closest to a visitor), the lower-tier must ask an upper-tier to see if it has the content. If the upper-tier does not have it, only the upper-tier can ask the origin for content. This practice improves bandwidth efficiency by limiting the number of data centers that can ask the origin for content, reduces origin load, and makes websites more cost-effective to operate.
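
In simplified terms, the lookup order works like this; this is a sketch with hypothetical cache handles, not Cloudflare's implementation:

// Hypothetical handles to the two cache tiers.
declare const lowerTierCache: Cache;
declare const upperTierCache: Cache;

async function handleRequest(request: Request): Promise<Response> {
  // 1. Try the data center closest to the visitor.
  const lower = await lowerTierCache.match(request);
  if (lower) return lower;

  // 2. Miss: ask the upper tier before touching the origin.
  const upper = await upperTierCache.match(request);
  if (upper) return upper;

  // 3. Only the upper tier contacts the origin, then caches the result.
  const fromOrigin = await fetch(request);
  await upperTierCache.put(request, fromOrigin.clone());
  return fromOrigin;
}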

Dividing data centers like this results in improved performance for visitors because distances and links traversed between Cloudflare data centers are generally shorter and faster than the links between data centers and origins. It also reduces load on origins, making web properties more economical to operate. Customers enabling Tiered Cache can achieve a 60% or greater reduction in their cache miss rate as compared to Cloudflare’s traditional CDN service.

Additionally, Tiered Cache concentrates connections to origin servers so they come from a small number of data centers rather than the full set of network locations. This results in fewer open connections using server resources.

[Diagram: Tiered Cache topology, with lower tiers asking an upper tier before the origin]

Tiered Cache is simple to enable:

  • Log into your Cloudflare account.
  • Navigate to the Caching tab in the dashboard.
  • Under Caching, select Tiered Cache.
  • Enable Tiered Cache.

From there, customers will automatically be enrolled in Smart Tiered Cache Topology without needing to make any additional changes. Enterprise Customers can select from different prefab topologies or have a custom topology created for their unique needs.

[Screenshot: Tiered Cache settings in the dashboard]

Smart Tiered Cache dynamically selects the single best upper tier for each of your website's origins, with no configuration required, using Cloudflare's performance and routing data. Cloudflare collects latency data for each request to an origin. Using this latency data, we can determine how well any upper-tier data center is connected to an origin and empirically select the data center with the lowest latency to act as the upper tier for that origin.

Today, Smart Tiered Cache is being offered to ALL Cloudflare customers for free, in contrast to other CDNs who may charge exorbitant fees for similar or worse functionality. Current Argo customers will get additional benefits described here. We think that this is a foundational improvement to the performance and economics of running a website.

But what happens if an upper-tier can’t reach an origin?

Orpheus: solving origin reachability problems for everyone

Cloudflare is a reverse proxy that receives traffic from end users and proxies requests back to customer servers or origins. To be successful, Cloudflare needs to be reachable by end users while simultaneously being able to reach origins. With end users around the world, Cloudflare needs to be able to reach origins from multiple points around the world at the same time. This is easier said than done! The Internet is not homogeneous, and diverse Cloudflare network locations do not necessarily take the same paths to a given customer origin at any given time. A customer origin may be reachable from some networks but not from others.

Cloudflare developed Argo to be the Waze of the Internet, allowing our network to react to changes in Internet traffic conditions and route around congestion and breakages in real-time, ensuring end users always have a good experience. Argo Smart Routing provides amazing performance and reliability improvements to our customers.

Enter Orpheus. Orpheus provides reachability benefits for customers by finding unreachable paths on the Internet in real time, and guiding traffic away from those paths, ensuring that Cloudflare will always be able to reach an origin no matter what is happening on the Internet.  

Today, we’re excited to announce that Orpheus is available to and being used by all our customers.

Fewer 522s

You may have seen this error at one time or another.

[Screenshot: Cloudflare error 522, connection timed out]

This error indicates that a user was unable to reach content because Cloudflare couldn’t reach the origin. Because of the unpredictability of the Internet described above, users may see this error even when an origin is up and able to receive traffic.

So why do you see this error? The 522 error occurs when network instability causes traffic sent by Cloudflare to fail either before it reaches the origin, or on the way back from the origin to Cloudflare. This is the equivalent of either Cloudflare or your origin sending a request and never getting a response. Both sides think that they’re fine, but the network path between them is not reachable at all. This causes customer pain.

Orpheus solves that pain, ensuring that no matter where users are or where the origin is, an Internet application will always be reachable from Cloudflare.

How it works

Orpheus builds and provisions routes from Cloudflare to origins by analyzing data from users on every path from Cloudflare and ordering them on a per-data center level with the goal of eliminating connection errors and minimizing packet loss. If Orpheus detects errors on the current path from Cloudflare back to a customer origin, Orpheus will steer subsequent traffic from the impacted network path to the healthiest path available.
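
Conceptually, the per-data-center selection can be pictured like this; an illustrative sketch, not the production system:

// Illustrative only: pick the healthiest known path back to an origin.
interface PathStats {
  nextHop: string;     // e.g. a transit provider or an interconnect
  errorRate: number;   // fraction of recent connections that failed
  packetLoss: number;  // observed packet loss on this path
}

function pickHealthiestPath(paths: PathStats[]): PathStats {
  // Prefer the fewest connection errors, then the lowest packet loss.
  return [...paths].sort(
    (a, b) => a.errorRate - b.errorRate || a.packetLoss - b.packetLoss
  )[0];
}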

[Diagram: Orpheus steering traffic from an impacted path onto a healthy path]

This is similar to how Argo works but with some key differences: Argo is always steering traffic down the fastest path, whereas Orpheus is reactionary and steers traffic down healthy (and not necessarily the fastest) paths when needed.

Improving origin reachability for customers

Let’s look at an example.

Barry has an origin for his daughter's band hosted on WordPress in Chicago. The zone primarily sees traffic from three locations: the data center closest to his daughter in Seattle, the one closest to him in Boston, and the one closest to his parents in Tampa, who check in on their granddaughter's site daily for updates.

One day, a link between Tampa and the Chicago origin gets cut by a wandering backhoe, and Tampa loses some connectivity back to the Chicago origin. As a result, Barry's parents start to see failures when connecting to the site, which shows up as decreased origin reachability. Orpheus helps here by finding alternate paths for Barry's parents, whether through Boston, Seattle, or any location in between that isn't impacted by the fiber cut in Tampa.

[Diagram: traffic rerouted around the Tampa fiber cut through another Cloudflare data center]

So even though there is packet loss between one of Cloudflare's data centers and Barry's origin, the traffic still succeeds, because it is steered down a non-lossy path through a different Cloudflare data center.

How much does Orpheus help my origin reachability?

In our rollout of Orpheus for customers, we observed that Orpheus improved origin reachability by 23%, from 99.87% to 99.90%: the rate of unreachable requests fell from 0.13% to 0.10%, a 23% reduction in failures. Here is a chart showing the improvement Orpheus provides (lower is better):

[Chart: 522 error rates with and without Orpheus (lower is better)]

We measure this reachability improvement by measuring 522 rates for every data center-origin pair and then comparing traffic that traversed Orpheus routes with traffic that went directly back to origin. Orpheus was especially helpful at improving reachability for slightly lossy paths that could present small amounts of failure over a long period of time, whereas direct to origin would see those failures.

Note that we’ll never get this number to 0% because, with or without Orpheus, some origins really are unreachable because they are down!

Orpheus makes Cloudflare products better

Orpheus pairs well with some of our products that are already designed to provide highly available services on an uncertain Internet. Let’s go over the interactions between Orpheus and three of our products: Load Balancing, Cloudflare Network Interconnect, and Tiered Cache.

Load Balancing

Orpheus and Load Balancing go together to provide high reachability for every origin endpoint. Load Balancing automatically selects endpoints based on health probes, ensuring that if an origin endpoint isn't working, traffic moves to one that is and your application stays available and operational. Orpheus finds reachable paths from Cloudflare to every origin. In tandem, these two products provide a highly available and reachable experience for customers.

Cloudflare Network Interconnect

Orpheus and Cloudflare Network Interconnect (CNI) combine to always provide a highly reachable path, no matter where in the world you are. Consider Acme, a company who is connected to the Internet by only one provider that has a lot of outages. Orpheus will do its best to steer traffic around the lossy paths, but if there’s only one path back to the customer, Orpheus won’t be able to find a less-lossy path. Cloudflare Network Interconnect solves this problem by providing a path that is separate from the transit provider that any Cloudflare data center can access. CNI provides a viable path back to Acme’s origin that will allow Orpheus to engage from any data center in the world if loss occurs.

Shields for All

Orpheus and Tiered Cache can combine to build an adaptive shield around an origin that caches as much as possible while improving traffic back to origin. Tiered Cache topologies allow for customers to deflect much of their static traffic away from their origin to reduce load, and Orpheus helps ensure that any traffic that has to go back to the origin traverses over highly available links.

Improving origin performance for everyone

The Internet is a growing, ever-changing ecosystem. With the release of Orpheus and Tiered Cache for everyone, we’ve given you the ability to navigate whatever the Internet has in store to provide the best possible experience to your customers.

Argo 2.0: Smart Routing Learns New Tricks

Post Syndicated from David Tuber original https://blog.cloudflare.com/argo-v2/

We launched Argo in 2017 to improve performance on the Internet. Argo uses real-time global network information to route around brownouts, cable cuts, packet loss, and other problems on the Internet. Argo makes the network that Cloudflare relies on—the Internet—faster, more reliable, and more secure on every hop around the world.

Without any complicated configuration, Argo is able to use real-time traffic data to pick the fastest path across the Internet, improving performance and delivering more satisfying experiences to your customers and users.

Today, Cloudflare is announcing several upgrades to Argo’s intelligent routing:

  • When it launched, Argo was entirely focused on the “middle mile,” speeding up connections from Cloudflare to our customers’ servers. Argo now delivers optimal routes from clients and users to Cloudflare, further reducing end-to-end latency while still providing the impressive edge to origin performance that Argo is known for. These last-mile improvements reduce end user round trip times by up to 40%.
  • We’re also adding support for accelerating pure IP workloads, allowing Magic Transit and Magic WAN customers to build IP networks to enjoy the performance benefits of Argo.

Starting today, all Free, Pro, and Business plan Argo customers will see improved performance with no additional configuration or charge. Enterprise customers have already enjoyed the last mile performance improvements described here for some time. Magic Transit and WAN customers can contact their account team to request Early Access to Argo Smart Routing for Packets.

What’s Argo?

Argo finds the best and fastest possible path for your traffic on the Internet. Every day, Cloudflare carries hundreds of billions of requests across our network and the Internet. Because our network, our customers, and their end users are well distributed globally, all of these requests flowing across our infrastructure paint a great picture of how different parts of the Internet are performing at any given time.

Just like Waze examines real data from real drivers to give you accurate, uncongested — and sometimes unorthodox — routes across town, Argo Smart Routing uses the timing data Cloudflare collects from each request to pick faster, more efficient routes across the Internet.

In practical terms, Cloudflare’s network is expansive in its reach. Some Internet links in a given region may be congested and cause poor performance (a literal traffic jam). By understanding this is happening and using alternative network locations and providers, Argo can put traffic on a less direct, but faster, route from its origin to its destination.

These benefits are not theoretical: enabling Argo Smart Routing shaves an average of 33% off HTTP time to first byte (TTFB).

One other thing we’re proud of: we’ve stayed super focused on making it easy to use. One click in the dashboard enables better, smarter routing, bringing the full weight of Cloudflare’s network, data, and engineering expertise to bear on making your traffic faster. Advanced analytics allow you to understand exactly how Argo is performing for you around the world.

You can read a lot more about how Argo works in our original launch blog post.

Even More Blazing Fast

We’ve continuously improved Argo since the day it was launched, making it faster, quicker to respond to changes on the Internet, and allowing more types of traffic to flow over smart routes.

Argo’s new performance optimizations improve last mile latencies and reduce time to first byte even further. Argo’s last mile optimizations can save up to 40% on last mile round trip time (RTT) with commensurate improvements to end-to-end latency.

Running benchmarks against an origin server in the central United States, with visitors coming from around the world, Argo delivered the following results:

(Chart: global benchmark results against the central-US origin.)

The Argo improvements on the last mile reduced overall time to first byte by 39%, and end-to-end latency by 5%:

(Chart: TTFB and end-to-end latency improvements.)

Faster, better caching

Argo customers don’t just see benefits to their dynamic traffic. Argo’s newfound skills provide benefits for static traffic as well. Because Argo now finds the best path to Cloudflare, client TTFB for cache hits sees the same last mile benefit as dynamic traffic.

Getting access to faster Argo

The best part about all these improvements? They’re already deployed and enabled for all Argo customers! These optimizations have been live for Enterprise customers for some time and were enabled for Free, Pro, and Business plans this week.

Moving Down the Stack: Argo Smart Routing for Packets

Customers use Magic Transit and Magic WAN to create their own IP networks on top of Cloudflare’s network, with access to a full suite of network functions (firewalls, DDoS mitigation, and more) delivered as a service. This allows customers to build secure, private, global networks without the need to purchase specialized hardware. Now, Argo Smart Routing for Packets allows these customers to create these IP networks with the performance benefits of Argo.

Consider a fictional gaming company, Golden Fleece Games. Golden Fleece deployed Magic Transit to mitigate attacks by malicious actors on the Internet, so they can keep providing a quality game to their users while staying up. But they also need their service to be as fast as possible: if the game lags, users won't play it, and even if the service is technically up, increased latency translates into fewer players. For Golden Fleece, being slow is just as bad as being down.

Financial customers have similar requirements in low-latency, high-security scenarios. Consider Jason Financial, a fictional Magic Transit customer using Argo Smart Routing for Packets. Jason Financial's employees connect to Cloudflare in New York, and their requests are routed to a data center in Singapore that is connected to Cloudflare through a Cloudflare Network Interconnect. For Jason Financial, reducing latency is extraordinarily important: if their network is slow, the latency penalties they incur can literally cost them millions of dollars, given how fast the stock market moves. Jason wants Magic Transit and other Cloudflare One products to secure their network and prevent attacks, but improving performance matters just as much.

Argo Smart Routing for Packets provides these customers with the security they need at higher speeds than before: the best of both worlds, security and performance. Let's talk a bit about how it works.

A bird’s eye view of the Internet

Argo Smart Routing for Packets picks the fastest possible path between two points. But how does Argo know that the chosen route is the fastest? As with all Argo products, the answer comes by analyzing a wealth of network data already available on the Cloudflare edge. In Argo for HTTP or Argo for TCP, Cloudflare is able to use existing timing data from traffic that’s already being sent over our edge to optimize routes. This allows us to improve which paths are taken as traffic changes and congestion on the Internet happens. However, to build Smart Routing for Packets, the game changed, and we needed to develop a new approach to collect latency data at the IP layer.

Let's go back to the Jason Financial case. The number of paths available from Cloudflare's data centers back to Jason's data center is roughly the number of data centers Cloudflare has multiplied by the number of distinct interconnections between each data center. By looking at the traffic to Singapore, Cloudflare can use existing Layer 4 traffic and network analytics to determine the best path. But Layer 4 is not Layer 3: when you move down the stack, you lose insight into metrics like round trip time (RTT) and the other components of time to first byte, because that data is only produced at higher levels of the application stack. It becomes harder to figure out what the best path actually is.

Optimizing performance at the IP layer can be more difficult than at higher layers. This is because protocol and application layers have additional headers and stateful protocols that allow for further optimization. For example, connection reuse is a performance improvement that can only be realized at higher layers of the stack because HTTP requests can reuse existing TCP connections. IP layers don’t have the concept of connections or requests at all: it’s just packets flowing over the wire.

To help bridge the gap, Cloudflare makes use of a data source that already exists for every Magic Transit customer today: health check probes. Every Magic Transit customer leverages existing health check probes from every single Cloudflare data center back to the customer origin. These probes are used to determine tunnel health for Magic Transit, so that Cloudflare knows which paths back to origin are healthy. They also contain a variety of information that can be used to improve performance. By examining health check probes and adding them to existing Layer 4 data, Cloudflare can get a better understanding of one-way latencies and can construct a map of all the interconnected data centers and how fast they are to each other. Once a customer gets a Cloudflare Network Interconnect, Argo can use the data center-to-data center probes to create an alternate path for the customer that's different from the public Internet.
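
To make the idea concrete, here's a minimal sketch of how probe measurements could be turned into a routing decision. The data shape and the use of Dijkstra's algorithm are illustrative assumptions, not Cloudflare's actual implementation:

// Sketch: build a latency map from probe measurements, then pick the
// fastest path between two data centers with Dijkstra's algorithm.
type Probe = { from: string; to: string; latencyMs: number };

function fastestPath(probes: Probe[], src: string, dst: string): string[] {
  // Adjacency list: data center -> (neighbor -> observed latency)
  const graph = new Map<string, Map<string, number>>();
  for (const p of probes) {
    if (!graph.has(p.from)) graph.set(p.from, new Map());
    graph.get(p.from)!.set(p.to, p.latencyMs);
  }

  const dist = new Map<string, number>([[src, 0]]);
  const prev = new Map<string, string>();
  const unvisited = new Set(graph.keys());

  while (unvisited.size > 0) {
    // Take the unvisited node with the smallest known total latency.
    let node: string | undefined;
    for (const n of unvisited) {
      if (dist.has(n) && (node === undefined || dist.get(n)! < dist.get(node)!)) node = n;
    }
    if (node === undefined) break; // remaining nodes are unreachable
    unvisited.delete(node);

    for (const [neighbor, latencyMs] of graph.get(node) ?? new Map<string, number>()) {
      const candidate = dist.get(node)! + latencyMs;
      if (candidate < (dist.get(neighbor) ?? Infinity)) {
        dist.set(neighbor, candidate);
        prev.set(neighbor, node);
      }
    }
  }

  // Walk back from the destination to recover the path.
  const path: string[] = [];
  for (let n: string | undefined = dst; n !== undefined; n = prev.get(n)) path.unshift(n);
  return path[0] === src ? path : []; // empty if unreachable
}

In practice the hard part is the measurement, not the graph search: one-way latencies, loss, and constantly shifting network conditions all have to be folded in, for every pair of data centers.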

(Diagram: the latency map of interconnected data centers.)

Using this map, Cloudflare can construct dynamic routes for each customer based on where their traffic enters Cloudflare’s network and where they need to go. This allows us to find the optimal route for Jason Financial and allows us to always pick the fastest path.

Packet-Level Latency Reductions

We’ve kind of buried the lede here! We’ve talked about how hard it is to optimize performance for IP traffic. The important bit: despite all these difficulties, Argo Smart Routing for Packets is able to provide a 10% average latency improvement worldwide in our internal testing!

Depending on your network topology, you may see latency reductions that are even higher!

How do I get Argo Smart Routing for Packets?

Argo Smart Routing for Packets is in closed beta and is available only for Magic Transit customers who have a Cloudflare Network Interconnect provisioned. If you are a Magic Transit customer interested in seeing the improved performance of Argo Smart Routing for Packets for yourself, reach out to your account team today! If you don't have Magic Transit but want these performance gains without compromising on network security, begin your Magic Transit onboarding process today!

What’s next for Argo

Argo’s roadmap is simple: get ever faster, for any type of traffic.

Argo’s recent optimizations will help customers move data across the Internet at as close to the speed of light as possible. Internally, “how fast are we compared to the speed of light” is one of our engineering team’s key success metrics. We’re not done until we’re even.

Data at Cloudflare just got a lot faster: Announcing Live-updating Analytics and Instant Logs

Post Syndicated from Jon Levine original https://blog.cloudflare.com/instant-logs/

Today, we’re excited to introduce Live-updating Analytics and Instant Logs. For Pro, Business, and Enterprise customers, our analytics dashboards now update live to show you data as it arrives. In addition to this, Enterprise customers can now view their HTTP request logs instantly in the Cloudflare dashboard.

Cloudflare’s data products are essential for our customers’ visibility into their network and applications. Having this data in real time makes it even more powerful — could you imagine trying to navigate using a GPS that showed your location a minute ago? That’s the power of real time data!

Real time data unlocks entirely new use cases for our customers. They can respond to threats and resolve errors as soon as possible, keeping their applications secure and minimizing disruption to their end users.

Lightning fast, in-depth analytics

Cloudflare products generate petabytes of log data daily and are designed for scale. To make sense of all this data, we summarize it using analytics — the ability to see time series data, top Ns, and slices and dices of the data generated by Cloudflare products. This allows customers to identify trends and anomalies and drill deep into problems.

We take it a step further than just showing you high-level metrics. With Cloudflare Analytics you can quickly drill down into the most important data: narrow in on a specific time period, then add a chain of filters to slice your data further and watch the analytics update to reflect them.

(Video: Cloudflare analytics showing live updating and drill-down capabilities.)

Let's say you're a developer who's made some recent changes to your website: you've deleted some old content and created new web pages. You want to know as soon as possible if these changes have led to any broken links, so you can quickly identify them and fix them. With live-updating analytics, you can monitor your traffic by status code. If you notice an uptick in 404 errors, add a filter to get details on all 404s and view the top referrers causing the errors. From there, take steps to resolve the problem, whether by creating a redirect page rule or fixing broken links on your own site.

Instant Logs at your fingertips

While Analytics are a great way to see data at an aggregate level, sometimes you need event level information, too. Logs are powerful because they record every single event that flows through a network, so you can figure out what occurred on a granular level.

Our Logpush system is already able to get logs from our global edge network to a customer’s storage destination or analytics provider within seconds. However, setting this up has a lot of overhead and often customers incur long processing times at their destination. We wanted logs to be instant — instant to set up, deliver and take action on.

There's nothing to configure: logs stream straight into the Cloudflare dashboard. It's that easy.

With Instant Logs, customers can actively monitor the traffic that’s flowing through their network and make key decisions that affect their applications now. Real time data unlocks totally new use cases:

  • For Security Engineers: Stop an attack as it's developing. For example, apply a Firewall rule and see its impact — get answers within seconds. If it's not what you intended, try another rule and check again.
  • For Developers: Roll out a config change — to Cloudflare, or to your origin — and have peace of mind watching your error rates stay flat (we hope!).
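
Both of those workflows boil down to filtering a live stream of events. Purely as an illustration (the endpoint below is invented, not the actual Instant Logs API), tailing such a stream over a WebSocket might look like this:

// Illustrative only: the URL is a made-up placeholder, not a real endpoint.
const socket = new WebSocket("wss://logs.example.com/instant-logs/session");

socket.addEventListener("message", (event) => {
  // Each message carries one HTTP request event; the field names here
  // follow Cloudflare's HTTP request log schema.
  const entry = JSON.parse(event.data as string);
  if (entry.EdgeResponseStatus >= 500) {
    console.log(`${entry.ClientRequestURI} -> ${entry.EdgeResponseStatus}`);
  }
});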

(By the way, if you’re a fan of Workers and want to see real time Workers logging, check out the recently released dashboard for Workers logs.)

Logs at the speed of sight

“Real time” or “instant” can mean different things to different people in different contexts. At Cloudflare, we’re striving to make it as close to the speed of sight as possible. For us, this means we wanted the “glass-to-glass” time — from when you hit “enter” in your browser until when the logs appear — to be under one second.

How did we do?

Today, Cloudflare’s Instant Logs have an average delay of two seconds, and we’re continuing to make improvements to drive that down.

“Real-time” is a very fuzzy term. Looking at other services we see Akamai talking about real-time data as “within minutes” or “latency of 10 minutes”, Amazon talks about “near real-time” for CloudWatch, Google Cloud Logging provides log tailing with a configurable buffer “up to 60 seconds” to deal with potential out-of-order log delivery, and we benchmarked Fastly logs at 25 seconds.

Our goal is to drive down the delay as much as possible (within the laws of physics). We’re happy to have shipped Instant Logs that arrive in two seconds, but we’re not satisfied and will continue to bring that number down.

In time-sensitive scenarios such as an attack or an outage, a few minutes or even 30 seconds of delay can have a big impact on customers. At Cloudflare, our goal is to get our customers' data into their hands as fast as possible — and we're just getting started.

How to get access?

Live-updating Analytics is available now on all Pro, Business, and Enterprise plans. Select the “Last 30 minutes” view of your traffic in the Analytics tab to start monitoring your analytics live.

We’ll be starting our Beta for Instant Logs in a couple of weeks. Join the waitlist to get notified about when you can get access!

If you’re eager for details on the inner workings of Instant Logs, check out our blog post about how we built Instant Logs.

What’s next

We’re hard at work to make Instant Logs available for all Enterprise customers — stay tuned after joining our waitlist. We’re also planning to bring all of our datasets to Instant Logs, including Firewall Events. In addition, we’re working on the next set of features like the ability to download logs from your session and compute running aggregates from logs.

For a peek into what we have our sights on next, we know how important it is to perform analysis on not only up-to-date data, but also historical data. We want to give customers the ability to analyze logs, draw insights and perform forensics straight from the Cloudflare platform.

If this sounds cool, we’re hiring engineers for our data team in Lisbon, London and San Francisco — would love to have you help us build the future of data at Cloudflare.

Cloudflare Workers: the Fast Serverless Platform

Post Syndicated from Rita Kozlov original https://blog.cloudflare.com/cloudflare-workers-the-fast-serverless-platform/

Just about four years ago, we announced Cloudflare Workers, a serverless platform that runs directly on the edge.

Throughout this week, we will talk about the many ways Cloudflare is helping make applications that already exist on the web faster. But if today is the day you decide to make your idea come to life, building your project on the Cloudflare edge and deploying it directly to the tubes of the Internet is the best way to guarantee your application will always be fast, for every user, regardless of their location.

It's been a few years since we talked about how Cloudflare Workers compares to other serverless platforms when it comes to performance, so we decided it was time for an update. Most of our work on the Workers platform over the past few years has gone into making it more powerful: introducing new features, APIs, storage, and debugging and observability tools. But performance has not been neglected.

Today, Workers is 30% faster than it was three years ago at P90. And it is 210% faster than Lambda@Edge, and 298% faster than Lambda.

Oh, and also, we eliminated cold starts.

How do you measure the performance of serverless platforms?

I’ve run hundreds of performance benchmarks between CDNs in the past — the formula is simple: we use a tool called Catchpoint, which makes requests from nodes all over the world to the same asset, and reports back on the time it took for each location to return a response.

Measuring serverless performance is a bit different — since the thing you’re comparing is the performance of compute, rather than a static asset, we wanted to make sure all functions performed the same operation.

In our 2018 blog on speed testing, we had each function simply return the current time. For the purposes of this test, "serverless" products that could not perform even this minimal task were disqualified. All products in this round of testing executed the same function, of identical computational complexity, to ensure accurate and fair results.

It's also important to note what it is that we're measuring. The reason performance matters is that it impacts the experience of actual end customers. It doesn't matter what the source of latency is: DNS, network congestion, cold starts… the customer doesn't care what the source is, they care about wasting time waiting for their application to load.

It is therefore important to measure performance in terms of the end user experience — end to end, which is why we use global benchmarks to measure performance.

The results below show tests run from 50 nodes all over the world, across North America, South America, Europe, Asia and Oceania.

Blue: Cloudflare Workers
Red: Lambda@Edge
Green: Lambda

(Link to results).

(Charts: global benchmark results.)

As you can see from the results, no matter where users are in the world, when it comes to speed, Workers can guarantee the best experience for customers.

In the case of Workers, getting the best performance globally requires no additional effort on the developers’ part. Developers do not need to do any additional load balancing, or configuration of regions. Every deployment is instantly live on Cloudflare’s extensive edge network.

Even if you're not seeking to address a global audience, and your customer base is conveniently located on the East Coast of the United States, Workers is able to guarantee the fastest response on all requests.

(Chart: results from Washington, DC.)

Above, we have the results just from Washington, DC, as close as we could get to us-east-1. And again, without any optimization, Workers is 34% faster.

Why is that?

What defines the performance of a serverless platform?

Other than the performance of the code itself, from the perspective of the end user, serverless application performance is fundamentally a function of two variables: distance an application executes from the user, and the time it takes the runtime itself to spin up. The realization that distance from the user is becoming a greater and greater bottleneck on application performance is causing many serverless vendors to push deeper and deeper into the edge. Running applications on the edge — closer to the end user — increases performance. As 5G comes online, this trend will only continue to accelerate.

However, many cloud vendors in the serverless space run into a critical problem when competing on performance: the legacy architecture they're using to build out their offerings doesn't work well with the inherent limitations of the edge.

Since the goal behind the serverless model is to intentionally abstract away the underlying architecture, not everyone is clear on how legacy cloud providers like AWS have created serverless offerings like Lambda. Legacy cloud providers deliver serverless offerings by spinning up a containerized process for your code. The provider auto-scales all the different processes in the background. Every time a container is spun up, the entire language runtime is spun up with it, not just your code.

To help address the first graph, measuring global performance, vendors are attempting to move away from large, centralized architectures (a few big data centers) to a distributed, edge-based world (a greater number of smaller data centers all over the world) to close the distance between applications and end users. But there's a problem with this approach: smaller data centers mean fewer machines and less memory. Each time vendors pursue a many-but-small data center strategy to operate closer to the edge, the likelihood of a cold start occurring on any individual process goes up.

This effectively creates a performance ceiling for serverless applications on container-based architectures. If legacy vendors move your application closer to the edge (and the users), there will be fewer servers and less memory, and an application is more likely to need a cold start. To reduce that likelihood, they're back to a more centralized model; but that means running your applications from one of a few big centralized data centers. These larger centralized data centers are, by definition, almost always going to be further away from your users.

You can see this at play in the graph above by looking at the results of the tests when running on Lambda@Edge — despite being closer to the end user, p90 performance is slower than Lambda's, as containers have to spin up more frequently.

Serverless architectures built on containers can move up and down the frontier, but ultimately, there’s not much they can do to shift that frontier curve.

What makes Workers so fast?

Workers was designed from the ground up for an edge-first serverless model. Cloudflare started with a distributed edge network rather than trying to push compute from large centralized data centers out into the edge, and working under those constraints forced us to innovate.

In a previous blog post, we discussed how this innovation translated into a paradigm shift: Workers' architecture is built on lightweight V8 isolates that can spin up quickly, without introducing a cold start on every request.
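
For a sense of the programming model, here's a complete Worker using the modules syntax (the type annotations assume @cloudflare/workers-types). There's no server or container to configure, which is what keeps startup overhead so low:

// A complete Worker. The same few lines run in V8 isolates in every
// Cloudflare data center; there is no container to spin up per request.
export default {
  async fetch(request: Request): Promise<Response> {
    const { pathname } = new URL(request.url);
    return new Response(`Hello from the edge! You asked for ${pathname}`, {
      headers: { "Content-Type": "text/plain" },
    });
  },
};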

Not only has running isolates given us an advantage out of the box, but as V8 gets better, so does our platform. For example, when V8 announced Liftoff, a compiler for WASM, all WASM Workers instantly got faster.

Similarly, whenever improvements are made to Cloudflare’s network (for example, when we add new data centers) or stack (e.g., supporting new, faster protocols like HTTP/3), Workers instantly benefits from it.

Additionally, we’re always seeking to make improvements to Workers itself to make the platform even faster. For example, last year, we released an improvement that helped eliminate cold starts for our customers.

One key advantage that helps Workers identify and address performance gaps is the scale at which it operates. Today, Workers serves hundreds of thousands of developers, ranging from hobbyists to enterprises all over the world, serving millions of requests per second. Whenever we make improvements for a single customer, the entire platform gets faster.

Performance that matters

The ultimate goal of the serverless model is to enable developers to focus on what they do best — build experiences for their users. Choosing a serverless platform that can offer the best performance out of the box means one less thing developers have to worry about. If you’re spending your time optimizing for cold starts, you’re not spending your time building the best feature for your customers.

Just like developers want to create the best experience for their users by improving the performance of their application, we’re constantly striving to improve the experience for developers building on Workers as well.

In the same way customers don’t want to wait for slow responses, developers don’t want to wait on slow deployment cycles.

This is where the Workers platform excels yet again.

Any deployment on Cloudflare Workers takes less than a second to propagate globally, so you don't have to spend time waiting on your code deploy, and users can see changes as quickly as possible.

Of course, it’s not just the deployment time itself that’s important, but the efficiency of the full development cycle, which is why we’re always seeking to improve it at every step: from sign up to debugging.

Don’t just take our word for it!

Needless to say, much as we try to remain neutral, we’re always going to be just a little biased. Luckily, you don’t have to take our word for it.

We invite you to sign up and deploy your first Worker today — it’ll just take a few minutes!

Vary for Images: Serve the Correct Images to the Correct Browsers

Post Syndicated from Alex Krivit original https://blog.cloudflare.com/vary-for-images-serve-the-correct-images-to-the-correct-browsers/

Today, we’re excited to announce support for Vary, an HTTP header that ensures different content types can be served to user-agents with differing capabilities.

At Cloudflare, we’re obsessed with performance. Our job is to ensure that content gets from our network to visitors quickly, and also that the correct content is served. Serving incompatible or unoptimized content burdens website visitors with a poor experience while needlessly stressing a website’s infrastructure. Lots of traffic served from our edge consists of image files, and for these requests and responses, serving optimized image formats often results in significant performance gains. However, as browser technology has advanced, so too has the complexity required to serve optimized image content to browsers all with differing capabilities — not all browsers support all image formats! Providing features to ensure that the correct images are served to the correct requesting browser, device, or screen is important!

Serving images on the modern web

In the web’s early days, if you wanted to serve a full color image, JPEGs reigned supreme and were universally supported. Since then, the state of the art in image encoding has advanced by leaps and bounds, and there are now increasingly more advanced and efficient codecs like WebP and AVIF that promise reduced file sizes and improved quality.

This sort of innovation is exciting, and delivers real improvements to user experience. However, it makes the job of web servers and edge networks more complicated. As an example, until very recently, WebP image files were not universally supported by commonly used browsers. A specific browser not supporting an image file becomes a problem when “intermediate caches”, like Cloudflare, are involved in delivering content.

Let’s say, for example, that a website wants to provide the best experience to whatever browser requests the site. A desktop browser sends a request to the website and the origin server responds with the website’s content including images. This response is cached by a CDN prior to getting sent back to the requesting browser.

Now let’s say a mobile browser comes along and requests that same website with those images. In the situation where a cached image is a WebP file, and WebP is not supported by the mobile browser, the website will not load properly because the content returned from cache is not supported by the mobile browser. That’s a problem.

To help solve this issue, today we’re excited to announce our support of the Vary header for images.

How Vary works

Vary is an HTTP response header that allows origins to serve variants of the same content from a single URL, and have intermediate caches serve the correct variant to each user-agent that comes along.

Smashing Magazine has an excellent deep dive on how Vary negotiation works here.

When browsers send a request for a website, they include a variety of request headers. A fairly common example might look something like:

GET /page.html HTTP/1.1
Host: example.com
Connection: keep-alive
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.164 Safari/537.36
Accept-Encoding: gzip, deflate, br

As we can see above, the browser sends a lot of information in these headers along with the GET request for the URL. What’s important for Vary for Images is the Accept header. The Accept header tells the origin what sort of content the browser is capable of handling (file types, etc.) and provides a list of content preferences.

When the origin gets the request, it sees the Accept header, which details the content preferences of the browser's request. In the origin's response, Vary tells the browser that the content returned differed depending on the value of the Accept header in the request. Thus, if a different browser comes along and sends a request with different Accept header values, this new browser can get a different response from the origin. An example origin response may look something like:

HTTP/1.1 200 OK
Content-Length: 123456
Vary: Accept
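
To make that concrete, here's a small sketch of the origin side of this negotiation. The file names and handler shape are illustrative, not a specific framework's API:

// Sketch: choose the best encoding the browser advertises in Accept,
// and mark the response as varying on that header.
type ImageVariant = { path: string; contentType: string };

function negotiateImage(acceptHeader: string): ImageVariant {
  if (acceptHeader.includes("image/avif")) {
    return { path: "hero.avif", contentType: "image/avif" };
  }
  if (acceptHeader.includes("image/webp")) {
    return { path: "hero.webp", contentType: "image/webp" };
  }
  return { path: "hero.jpg", contentType: "image/jpeg" }; // universal fallback
}

function responseHeaders(acceptHeader: string): Record<string, string> {
  return {
    "Content-Type": negotiateImage(acceptHeader).contentType,
    // Without Vary, an intermediate cache could hand the AVIF variant
    // to a browser that only understands JPEG.
    "Vary": "Accept",
  };
}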

How Vary works with Cloudflare’s cache

Now, let’s add Cloudflare to the mix. Cloudflare sits in between the browser and the origin in the above example. When Cloudflare receives the origin’s response, we cache the specific image variant so that subsequent requests from browsers with the same image preferences can be served from cache. This also means that serving multiple image variants for the same asset will create distinct cache entries.

Accept header normalization

Caching variants in intermediate caches can be difficult to get right. Naive caching of variants can cause problems by serving incorrect or unsupported image variants to browsers. Some solutions that reduce the potential for caching incorrect variants generally provide those safeguards at the expense of performance.

For example, through a process known as content negotiation, the correct variant is delivered to the requesting browser via multiple requests and responses. The browser could send a request to the origin asking for a list of available resource variants. When the origin responds with the list, the browser can make an additional request for the desired resource, which the server then responds to. These redundant round trips to narrow down which content the browser accepts and the server has available cause performance delays.

Vary for Images reduces the need for these redundant negotiations with an origin by parsing the request's Accept header and sending it on to the origin, so the origin knows exactly what content to deliver to the browser. Additionally, because the expected variant values can be set in Cloudflare's API (see below), we make an end-run around the negotiation process: we know exactly what to ask for and expect from the origin. This reduces needless back-and-forth between browsers and servers.

How to Enable Vary for Images

You can enable Vary for Images from Cloudflare’s API for Pro, Business, and Enterprise Customers.

Things to keep in mind when using Vary:

  • Vary for Images enables varying on the following file extensions: avif, bmp, gif, jpg, jpeg, jp2, jpg2, png, tif, tiff, webp. These extensions can have multiple variants served so long as the origin server sends the Vary: Accept response header.
  • If the origin server sends Vary: Accept but does not serve the expected variant, the response will not be cached. This will be indicated with the BYPASS cache status in the response headers.
  • The list of variant types the origin serves for each extension must be configured so that Cloudflare can decide which variant to serve without having to contact the origin server.

Enabling Vary in action

Enabling Vary functionality currently requires the use of the Cloudflare API. Here’s an example of how to enable variant support for a zone that wants to serve JPEGs in addition to WebP and AVIF variants for jpeg and jpg extensions.

Create a variants rule:

curl -X PATCH "https://api.cloudflare.com/client/v4/zones/023e105f4ecef8ad9ca31a8372d0c353/cache/variants" \
	-H "X-Auth-Email: user@example.com" \
	-H "X-Auth-Key: 3xamp1ek3y1234" \
	-H "Content-Type: application/json" \
	--data '{"value":{"jpeg":["image/webp","image/avif"],"jpg":["image/webp","image/avif"]}}'

Modify to only allow WebP variants:

curl -X PATCH "https://api.cloudflare.com/client/v4/zones/023e105f4ecef8ad9ca31a8372d0c353/cache/variants" \
	-H "X-Auth-Email: user@example.com" \
	-H "X-Auth-Key: 3xamp1ek3y1234" \
	-H "Content-Type: application/json" \
	--data '{"value":{"jpeg":["image/webp"],"jpg":["image/webp"]}}'

Delete the rule:

curl -X DELETE "https://api.cloudflare.com/client/v4/zones/023e105f4ecef8ad9ca31a8372d0c353/cache/variants" \
	-H "X-Auth-Email: user@example.com" \
	-H "X-Auth-Key: 3xamp1ek3y1234"

Get the rule:

curl -X GET "https://api.cloudflare.com/client/v4/zones/023e105f4ecef8ad9ca31a8372d0c353/cache/variants" \
	-H "X-Auth-Email: user@example.com" \
	-H "X-Auth-Key: 3xamp1ek3y1234"

Purging variants

Any purge of varied images will purge all content variants for that URL. That way, if the image changes, you can easily update the cache with a single purge versus chasing down how many potential out-of-date variants may exist. This behavior is true regardless of purge type (single file, tag, or hostname) used.
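
In API terms, that's one call to the standard purge endpoint. Here's a sketch using fetch, with the same placeholder zone ID and credentials as the examples above (run it inside an async context):

// One purge request; every cached variant of the URL is removed with it.
await fetch(
  "https://api.cloudflare.com/client/v4/zones/023e105f4ecef8ad9ca31a8372d0c353/cache/purge_cache",
  {
    method: "POST",
    headers: {
      "X-Auth-Email": "user@example.com",
      "X-Auth-Key": "3xamp1ek3y1234",
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ files: ["https://example.com/images/hero.jpg"] }),
  },
);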

Other image optimization tools available at Cloudflare

Providing an additional option for customers to optimize the delivery of images also allows Cloudflare to support more customer configurations. For other ways Cloudflare can help you serve images to visitors quickly and efficiently, you can check out:

  • Polish — Cloudflare’s automatic product that strips image metadata and applies compression. Polish accelerates image downloads by reducing image size.
  • Image Resizing — Cloudflare’s image resizing product works as a proxy on top of the Cloudflare edge cache to apply the adjustments to an image’s size and quality.  
  • Cloudflare for Images — Cloudflare’s all-in-one service to host, resize, optimize, and deliver all of your website’s images.

Try Vary for Images Out

Vary for Images provides options that ensure the best images are served to the browser based on the browser’s capabilities and preferences. If you’re looking for more control over how your images are delivered to browsers, we encourage you to try this new feature out.

Welcome to Speed Week and a Waitless Internet

Post Syndicated from John Graham-Cumming original https://blog.cloudflare.com/fastest-internet/

No one likes to wait. Internet impatience is something we all suffer from.

Waiting for an app to update to show when your lunch is arriving; a website that loads slowly on your phone; a movie that hasn’t started to play… yet.

But building a waitless Internet is hard. And that’s where Cloudflare comes in. We’ve built the global network for Internet applications, be they websites, IoT devices or mobile apps. And we’ve optimized it to cut the wait.

If you believe ISP advertising then you’d think that bandwidth (100Mbps! 1Gbps! 2Gbps!) is the be all and end all of Internet speed. That’s a small component of what it takes to deliver the always on, instant experience we want and need.

The reality is you need three things: ample bandwidth, content and applications close to the end user, and software that's as fast as possible. Simple really. Except not, because all three things require a lot of work at different layers.

In this blog post I’ll look at the factors that go into building our fast global network: bandwidth, latency, reliability, caching, cryptography, DNS, preloading, cold starts, and more; and how Cloudflare zeroes in on the most powerful number there is: zero.

I will focus on what happens when you visit a website but most of what I say below applies to the fitness tracker on your wrist sending information up to the cloud, your smart doorbell alerting you to a visitor, or an app getting you the weather forecast.

Faster than the speed of sight

Imagine for a moment you are about to type in the name of a website on your phone or computer. You’ve heard about an exciting new game “Silent Space Marine” and type in silentspacemarine.com.

The very first thing your computer does is translate that name into an IP address. Since computers do absolutely everything with numbers under the hood, this "DNS lookup" is the first necessary step.

It involves your computer asking a recursive DNS resolver for the IP address of silentspacemarine.com. That’s the first opportunity for slowness. If the lookup is slow everything else will be slowed down because nothing can start until the IP address is known.

The DNS resolver you use might be one provided by your ISP, or you might have changed it to one of the free public resolvers like Google’s 8.8.8.8. Cloudflare runs the world’s fastest DNS resolver, 1.1.1.1, and you can use it too. Instructions are here.

With fast DNS name resolution set up your computer can move on to the next step in getting the web page you asked for.

Aside: how fast is fast? One way to think about that is to ask yourself how fast you are able to perceive something change. Research says that the eye can make sense of an image in 13ms. High quality video shows at 60 frames per second (about 16ms per image). So the eye is fast!

What that means for the web is that we need to be working in tens of milliseconds not seconds otherwise users will start to see the slowness.

Slowly, desperately slowly it seemed to us as we watched

Why is Cloudflare’s 1.1.1.1 so fast? Not to downplay the work of the engineering team who wrote the DNS resolver software and made it fast, but two things help make it zoom: caching and closeness.

Caching means keeping a copy of data that hasn’t changed, so you don’t have to go ask for it. If lots of people are playing Silent Space Marine then a DNS resolver can keep its IP address in cache so that when a computer asks for the IP address the software can reply instantly. All good DNS resolvers cache information for speed.
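
The core of that logic is tiny. Here's a toy version, ignoring real-world concerns like negative caching and cache poisoning defenses; askAuthoritative stands in for the upstream lookup:

// Toy resolver cache: answer from memory until the record's TTL expires.
type Answer = { ip: string; ttlSeconds: number };
type CacheEntry = { ip: string; expiresAt: number };

const cache = new Map<string, CacheEntry>();

async function resolve(
  name: string,
  askAuthoritative: (name: string) => Promise<Answer>,
): Promise<string> {
  const hit = cache.get(name);
  if (hit && hit.expiresAt > Date.now()) {
    return hit.ip; // served from cache: zero extra round trips
  }
  const answer = await askAuthoritative(name); // the slow path
  cache.set(name, { ip: answer.ip, expiresAt: Date.now() + answer.ttlSeconds * 1000 });
  return answer.ip;
}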

But what happens if the IP address isn't in the resolver's cache? This happens the first time someone asks for it, or after a timeout period where the resolver needs to check that the IP address hasn't changed. In order to get the IP address the resolver asks an authoritative DNS server for the information. That server is 'authoritative' for a specific domain (like silentspacemarine.com) and knows the correct IP address.

Since DNS resolvers sometimes have to ask authoritative servers for IP addresses it’s also important that those servers are fast too. That’s one reason why Cloudflare runs one of the world’s largest and fastest authoritative DNS services. Slow authoritative DNS could be another reason an end user has to wait.

So much for caching; what about 'closeness'? Here's the problem: the speed of light is really slow. Yes, I know everyone tells you that the speed of light is really fast, but that's because we sentient water-filled carbon lifeforms can't move very fast.

But electrons shooting through wires, and lasers blasting data down fiber optic cables, send data at or close to light speed. And sadly light speed is slow. And this slowness shows up because in order to get anything on the Internet you need to go back and forth to a server (many, many times).

In the best case of asking for silentspacemarine.com and getting its IP address there’s one roundtrip:

“Hello, can you tell me the address of silentspacemarine.com?”
“Yes, it’s…”

Even if you made the DNS resolver software instantaneous, you'd pay the price of the speed of light. Sounds crazy, right? Here's a quick calculation. Let's imagine that at home I have fiber optic Internet, and the nearest DNS resolver to me is in a city 100 km away. And somehow my ISP has laid the straightest fiber cable from me to the DNS resolver.

The speed of light in fiber is roughly 200,000,000 meters per second. The round trip would be 200,000 meters, so in the best possible case a whole millisecond has been eaten up by the speed of light. Now imagine any worse case and the speed of light starts eating into the speed of sight.
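
The same arithmetic, spelled out in code:

// Best case: 100 km each way over fiber, at ~200,000 km/s.
const metersPerSecondInFiber = 200_000_000;
const roundTripMeters = 2 * 100_000;
const milliseconds = (roundTripMeters / metersPerSecondInFiber) * 1000;
console.log(milliseconds); // 1 — a full millisecond spent on light alone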

The solution is quite simple: move the DNS resolver as close to the end user as possible. That’s partly why Cloudflare has built out (and continues to grow) our network. Today it stands at 250 cities worldwide.

Aside: actually it's not "quite simple" because there's another wrinkle. You can put servers all over the globe, but you also have to hook them up to the Internet. The beauty of the Internet is that it's a network of networks. That also means that you don't just plug into the Internet and get the lowest latency; you need to connect to multiple ISPs, transit networks and more, so that end users, whatever network they use, get the waitless experience they want.

That’s one reason why Cloudflare’s network isn’t simply all over the world, it’s also one of the most interconnected networks.

So far, in building the waitless Internet, we’ve identified fast DNS resolvers and fast authoritative DNS as two needs. What’s next?

Hello. Hello. OK.

So your web browser knows the IP address of Silent Space Marine and got it quickly. Great. Next step is for it to ask the web server at that IP address for the web page. Not so fast! The first step is to establish a connection to that server.

This is almost always done using a protocol called TCP that was invented in the 1970s. The very first step is for your computer and the server to agree they want to communicate. This is done with something called a three-way handshake.

Your computer sends a message saying, essentially, “Hello”, the server replies “I heard you say Hello” (that’s one round trip) and then your computer replies “I heard you say you heard me say Hello, so now we can chat” (actually it’s SYN then SYN-ACK and then ACK).
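
You can observe the cost of that handshake yourself. This small Node.js snippet times how long it takes a TCP connection to example.com to become usable, which is roughly one round trip:

import { connect } from "node:net";

// Time from sending SYN to the socket being ready: one full round trip.
const start = process.hrtime.bigint();
const socket = connect(443, "example.com", () => {
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  console.log(`TCP handshake took ~${ms.toFixed(1)} ms`);
  socket.end();
});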

So, at least one speed-of-light troubled round trip has occurred. How do we fight the speed of light? We bring the server (in this case, web server) close to the end user. Yet another reason for Cloudflare’s massive global network and high interconnectedness.

Now the web browser can ask the web server for the web page of Silent Space Marine, right? Actually, no. The problem is we don't just need a fast Internet, we also need one that's secure, and so pretty much everything on the Internet uses an encryption protocol called TLS (which some old-timers will call SSL). So next, a secure connection has to be established.

Aside: astute readers might be wondering why I didn’t mention security in the DNS section above. Yep, you’re right, that’s a whole other wrinkle. DNS also needs to be secure (and fast) and resolvers like 1.1.1.1 support the encrypted DNS standards DoH and DoT. Those are built on top of… TLS. So in order to have fast, secure DNS you need the same thing as fast, secure web, and that’s fast TLS.

Oh, and by the way, you don’t want to get into some silly trade off between security and speed. You need both, which is why it’s helpful to use a service provider, like Cloudflare, that does everything.

Is this line secure?

TLS is quite a complicated protocol involving a web browser and a server establishing encryption keys and at least one of them (typically the web server) proving that they are who they purport to be (you wouldn't want a secure connection to your bank's website if you couldn't be sure it was actually your bank).

The back and forth of establishing the secure connection incurs more hits on the speed of light. And so, once again, having servers close to end users is vital. And having really fast encryption software is vital too. Especially since encryption will need to happen on a variety of devices (think an old phone vs. a brand new laptop).

So, staying on top of the latest TLS standard is vital (we’re currently on TLS 1.3), and implementing all the tricks that speed TLS up is important (such as session resumption and 0-RTT resumption), and making sure your software is highly optimized.

So far, getting to a waitless Internet has involved fast DNS resolvers, fast authoritative DNS, being close to end users for fast TCP handshakes, and optimized TLS using the latest protocols. And we haven't even asked the web server for the page yet.

If you’ve been counting round trips we’re currently standing at four: one for DNS, one for TCP, two for TLS. Lots of opportunity for the speed of light to be a problem, but also lots of opportunity for wider Internet problems to cause a slow-down.

Skybird, this is Dropkick with a red dash alpha message in two parts

Actually, before we let the web browser finally ask for the web page there are two things we need to worry about. And both are to do with when things go wrong. You may have noticed that sometimes the Internet doesn’t work right. Sometimes it’s slow.

The slowness is usually caused by two things: congestion and packet loss. Dealing with those is also vital to giving the end user the fastest experience possible.

In ancient times, long before the dawn of history, people used to use telephones that had physical wires connected to them. Those wires connected to exchanges and literal electrical connections were made between two phones over long distances. That scaled pretty well for a long time until a bunch of packet heads came along in the 1960s and said “you know you could create a giant shared network and break all communication up into packets and share the network”. The Internet.

But when you share something you can also get congestion and congestion control is a huge part of ensuring that the Internet is shared equitably amongst users. It’s one of the miracles of the Internet that theory done in the 1970s and implemented in the 1980s has allowed the network to support real time gaming and streaming video while allowing simultaneous chat and web browsing.

The flip side of congestion control is that in order to prevent a user from overwhelming the network you have to slow them down. And we’re trying to be as fast as possible! Actually, we need to be as fast as possible while remaining fair.

And congestion control is closely related to packet loss because one way that servers and browsers and computers know that there’s congestion is when their packets get lost.

We stay on top of the latest congestion control algorithms (such as BBR) so that users get the fastest, fairest possible experience. And we do something else: we actively try to work around packet loss.

Technologies like Argo and our private fiber backbone help us route around bad Internet weather that’s causing packet loss and send connections over dedicated fiber optic links that span the globe.

More on that in the coming week.

It’s happening!

And so, finally your web browser asks the web server for the web page with an innocent looking GET / command. And the web server responds with a big blob of HTML and just when you thought things were going to be simple, they are super complicated.

The complexity comes from two places: the HTTP protocol is now on its third major version, and the content of web pages is under the control of the designer.

First, protocols. HTTP/2 and HTTP/3 both provide significant speedups for web sites by introducing parallel request/response handling, better compression and ways to work around congestion and packet loss. Although HTTP/1.1 is still widely used, these newer protocols are the majority of traffic.

Cloudflare Radar shows HTTP/1.1 has dropped into the 20% range globally.

(Chart: Cloudflare Radar data on HTTP protocol version share.)

As people upgrade to recent browsers on their computers and devices the new protocols become more and more important. Staying on top of these, and optimizing them is vital as part of the waitless Internet.

And then comes the content of web pages. Images are a vital part of the web and delivering optimized images right-sized and right-formatted for the end user device plays a big part in a fast web.

But before the web browser can start loading the images it has to get and understand the HTML of the web page. This is wasteful as the browser could be downloading images (and other assets like fonts or JavaScript) while still processing the HTML, if it knew about them in advance. The solution is for the web server to send a hint about what's needed along with the HTML.

More on that in the coming week.

Imagical

One of the largest categories of content we deliver for our customers consists of static and animated images. And they are also a ripe target for optimization. Images tend to be large and take a while to download and there are a vast variety of end user devices. So getting the right size and format image to the end user really helps with performance.

Getting it there at the right time also means that images can be loaded lazily and only when the user scrolls them into visibility.

But, traditionally, handling different image formats (especially as new ones like WebP and AVIF get invented), different device types (think of all the different screen sizes out there), and different compression schemes has been a mess of services.

And chained services for different aspects of the image pipeline can be slow and expensive. What you really want is simple storage and an integrated way to deliver the right image to the end user tailored just for them.

More on that in the coming week.

Cache me if you can

As I mentioned in the section about DNS, a few thousand words ago, caching is really powerful and caching content near the end user is super powerful. Cloudflare makes extensive use of caching (particularly of images but also things like GraphQL) on its servers. This makes our customers’ websites fast as images can be delivered quickly from servers near the end user.

But it introduces a problem. If you have a lot of servers around the world then the caches need to be filled with content in order for it to be ready for end users. And the more servers you add the harder it gets to keep them all filled. You want the ‘cache hit ratio’ (how often content is served from cache without having to go back to the customer’s server) to be as high as possible.

But if you’ve got the content cached in Casablanca, and a user visits your website in Chennai they won’t have the fastest content delivery. To solve this some service providers make a deliberate decision not to have lots of servers near end users.

Sounds a bit crazy but their logic is “it’s hard to keep all those caches filled in lots of cities, let’s have only a few cities”. Sad. We think smart software can solve that problem and allow you to have your cache and eat it. We’ve used smart software to solve global load balancing problems and are doing the same for global cache. That way we get high cache hit ratios, super low latency to end users and low load on customer web servers.

More on that in the coming week.

Zero Cool

You know what’s cooler than a millisecond? Zero milliseconds.

Back in 2017 Cloudflare launched Workers, our serverless/edge computing platform. Four years on Workers is widely used and entire companies are being built on the technology. We added support for a variety of languages (such as COBOL and Rust), a distributed key-value store, Durable Objects, WebSockets, Cron Triggers and more.

But people were often concerned about cold start times because they were thinking about other serverless platforms that had significant spool up times for code that wasn’t ready to run.

Last year we announced that we eliminated cold starts from Workers. You don’t have to worry. And we’ll go deeper into why Cloudflare Workers is the fastest serverless platform out there.

More on that in the coming week.

And finally…

If you run a large global network and want to know if it’s really the fastest there is, and where you need to do work to keep it fast, the only way is to measure. Although there are third-party measurement tools available they can suffer from biases and their methodology is sometimes unclear.

We decided the only way we could understand our performance vs. other networks was to build our own like-for-like testing tool and measure performance across the Internet’s 70,000+ networks.

We’ll also talk about how we keep everything fast, from lightning quick configuration updates and code deploys to logs you don’t have to wait for to ludicrously fast cache purges to real time analytics.

More on that in the coming week.

Welcome to Speed Week*

*Can’t wait for tomorrow? Go play Silent Space Marine. It uses the technologies mentioned above.

What’s new with Cloudflare for SaaS?

Post Syndicated from Dina Kozlov original https://blog.cloudflare.com/whats-new-with-cloudflare-for-saas/

This past April, we announced the Cloudflare for SaaS Beta which makes our SSL for SaaS product available to everyone. This allows any customer — from first-time developers to large enterprises — to use Cloudflare for SaaS to extend our full product suite to their own customers. SSL for SaaS is the subset of Cloudflare for SaaS features that focus on a customer’s Public Key Infrastructure (PKI) needs.

Today, we’re excited to announce all the customizations that our team has been working on for our Enterprise customers — for both Cloudflare for SaaS and SSL for SaaS.

Let’s start with the basics — the common SaaS setup

If you’re running a SaaS company, your solution might exist as a subdomain of your SaaS website, e.g. template.<mysaas>.com, but ideally, your solution would allow the customer to use their own vanity hostname for it, such as example.com.

The most common way to begin using a SaaS company’s service is to point a CNAME DNS record to the subdomain that the SaaS provider has created for your application. This ensures traffic gets to the right place, and it allows the SaaS provider to make infrastructure changes without involving your end customers.
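
In DNS terms, the end customer adds a single record. The names here are placeholders following the example above:

; Customer's zone file: the vanity hostname points at the SaaS provider.
www.example.com.    300    IN    CNAME    template.mysaas.com.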

We kept this in mind when we built our SSL for SaaS a few years ago. SSL for SaaS takes away the burden of certificate issuance and management from the SaaS provider by proxying traffic through Cloudflare’s edge. All the SaaS provider needs to do is onboard their zone to Cloudflare and ask their end customers to create a CNAME to the SaaS zone — something they were already doing.

The big benefit of giving your customers a CNAME record (instead of a fixed IP address) is that it gives you, the SaaS provider, more flexibility and control. It allows you to seamlessly change the IP address of your server in the background. For example, if your IP gets blocked by ISPs, you can update that address on your customers' behalf via the CNAME record. With a fixed A record, you rely on each of your customers to make the change.

While the CNAME record works great for most customers, some came back and wanted to bypass the limitation that CNAME records present. RFC 1912 states that CNAME records cannot coexist with other DNS records, so in most cases, you cannot have a CNAME at the root of your domain, e.g. example.com. Instead, the CNAME needs to exist at the subdomain level, i.e. www.example.com. Some DNS providers (including Cloudflare) bypass this restriction through a method called CNAME flattening.

Since SaaS providers have no control over the DNS provider that their customers are using, the feedback they got from their customers was that they wanted to use the apex of their zone and not the subdomain. So when our SaaS customers came back asking us for a solution, we delivered. We call it Apex Proxying.

Apex Proxying

For our SaaS customers who want to allow their customers to proxy their apex to their zone, regardless of which DNS provider they are using, we give them the option of Apex Proxying. Apex Proxying is an SSL for SaaS feature that gives SaaS providers a pair of IP addresses to provide to their customers when CNAME records do not work for them.

Cloudflare starts by allocating a dedicated set of IPs for the SaaS provider. The SaaS provider then gives their customers these IPs that they can add as A or AAAA DNS records, allowing them to proxy traffic directly from the apex of their zone.
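
So instead of a CNAME, the end customer's apex records look something like the following. The addresses shown are documentation placeholders, not real Cloudflare allocations:

; Customer's zone file: the apex points straight at the allocated IPs.
example.com.    300    IN    A       192.0.2.1
example.com.    300    IN    AAAA    2001:db8::1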

While this works for most, some of our customers want more flexibility. They want to have multiple IPs that they can change, or they want to assign different sets of customers to different buckets of IPs. For those customers, we give them the option to bring their own IP range, so they control the IP allocation for their application.

Bring Your Own IPs

Last year, we announced Bring Your Own IP (BYOIP), which allows customers to bring their own IP range for Cloudflare to announce at our edge. One of the benefits of BYOIP is that it allows SaaS customers to allocate that range to their account and then, instead of having a few IPs that their customers can point to, they can distribute all the IPs in their range.

SaaS customers often require granular control of how their IPs are allocated to zones that belong to different customers. With the 256 IPs of a /24 range to use, you have the flexibility either to group customers together or to give them dedicated IPs. It’s up to you!

While we’re on the topic of grouping customers, let’s talk about how you might want to do this when sending traffic to your origin.

Custom Origin Support

When setting up Cloudflare for SaaS, you indicate your fallback origin, which defines the origin that all of your Custom Hostnames are routed to. This origin can be defined by an IP address or point to a load balancer defined in the zone. However, you might not want to route all customers to the same origin. Instead, you want to route different customers (or custom hostnames) to different origins — either because you want to group customers together or to help you scale the origins supporting your application.

Our Enterprise customers can now choose a custom origin that is not the default fallback origin for any of their Custom Hostnames. Traditionally, this has been done by emailing your account manager and requesting custom logic at Cloudflare’s edge, a very cumbersome and outdated practice. But now, customers can easily indicate this in the UI or in their API requests.
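
As a rough sketch, the API request might look like the call below; the custom_origin_server field name and the hostnames are assumptions to verify against the Custom Hostnames API documentation:

# Hypothetical sketch: create a Custom Hostname that routes to a
# custom origin instead of the zone's fallback origin.
curl -X POST "https://api.cloudflare.com/client/v4/zones/<zone_id>/custom_hostnames" \
  -H "Authorization: Bearer <api_token>" \
  -H "Content-Type: application/json" \
  --data '{
    "hostname": "app.customer-a.com",
    "custom_origin_server": "customer-a-origin.mysaas.com",
    "ssl": { "method": "http", "type": "dv" }
  }'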

Wildcard Support

Oftentimes, SaaS providers have customers who want not just their domain, but also the subdomains under it, to stay protected and encrypted.

We wanted to give our Enterprise customers the flexibility to extend this benefit to their end customers by offering wildcard support for Custom Hostnames.

Wildcard Custom Hostnames extend the Custom Hostname’s configuration from a specific hostname — e.g. “blog.example.com” — to the next level of subdomains of that hostname, e.g. “*.blog.example.com”.

To create a Custom Hostname with a wildcard, you can either select Enable wildcard support when creating the Custom Hostname in the Cloudflare dashboard, or set wildcard: true when creating the Custom Hostname through the API.
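
For instance, a sketch of that API call might look like this (the hostname and field values are illustrative):

# Hypothetical sketch: create a Custom Hostname that also covers
# *.blog.example.com via wildcard support.
curl -X POST "https://api.cloudflare.com/client/v4/zones/<zone_id>/custom_hostnames" \
  -H "Authorization: Bearer <api_token>" \
  -H "Content-Type: application/json" \
  --data '{
    "hostname": "blog.example.com",
    "wildcard": true,
    "ssl": { "method": "http", "type": "dv" }
  }'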

Now let’s switch gears to TLS certificate management and the improvements our team has been working on.

TLS Management for All

SSL for SaaS was built to reduce the burden of certificate management for SaaS providers. The initial functionality was meant to serve most customers and their need to issue, maintain, and renew certificates on their customers’ behalf. But one size does not fit all, and some of our customers have more specific needs for their certificate management — and we want to make sure we accommodate them.

CSR Support/Custom certs

One of the superpowers of SSL for SaaS is that it allows Cloudflare to manage all the certificate issuance and renewals on behalf of our customers and their customers. However, some customers want to allow their end customers to upload their own certificates.

For example, a bank may only trust certain certificate authorities (CAs) for their certificate issuance. Alternatively, the SaaS provider may have initially built out TLS support for their customers and now their customers expect that functionality to be available. Regardless of the reasoning, we have given our customers a few options that satisfy these requirements.

For customers who want to maintain control over their TLS private keys, or who want to give their customers the flexibility to use the certificate authority (CA) of their choice, we allow the SaaS provider to upload their customers’ certificates.

If you are a SaaS provider and one of your customers does not allow third parties to generate keys on their behalf, then you want to allow that customer to upload their own certificate. Cloudflare allows SaaS providers to upload their customers’ certificates to any of their custom hostnames — in just one API call!
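
As a sketch, that one call might look like the PATCH request below; the ssl.custom_certificate and ssl.custom_key field names are assumptions to check against the Custom Hostnames API documentation:

# Hypothetical sketch: attach a customer-provided certificate and key
# to an existing Custom Hostname in a single API call.
curl -X PATCH "https://api.cloudflare.com/client/v4/zones/<zone_id>/custom_hostnames/<hostname_id>" \
  -H "Authorization: Bearer <api_token>" \
  -H "Content-Type: application/json" \
  --data '{
    "ssl": {
      "method": "http",
      "type": "dv",
      "custom_certificate": "-----BEGIN CERTIFICATE-----\n...\n-----END CERTIFICATE-----",
      "custom_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----"
    }
  }'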

Some SaaS providers never want a person to handle private key material but still want to use the CA of their choice. They can do so by generating a Certificate Signing Request (CSR) for their Custom Hostnames, then either using those CSRs themselves to order certificates for their customers or relaying the CSRs to their customers so that they can provision their own certificates. In either case, once the certificate has been issued by the customer’s CA, the SaaS provider uploads it to the Custom Hostname for use at Cloudflare’s edge.

Custom Metadata

For our customers who need to customize their configuration to handle specific rules for their customer’s domains, they can do so by using Custom Metadata and Workers.

By adding metadata to an individual custom hostname and then deploying a Worker to read the data, you can use the Worker to customize per-hostname behavior.

Some customers use this functionality to add a customer_id field to each custom hostname that they then send in a request header to their origin. Another way to use this is to set headers like HTTP Strict Transport Security (HSTS) on a per-customer basis.
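
A minimal Worker sketch for that customer_id pattern might look like the code below; it assumes the metadata set on the custom hostname is exposed to the Worker as request.cf.hostMetadata, so verify the exact property name against the Custom Metadata documentation:

// Minimal sketch, assuming custom metadata on the hostname looks
// like { "customer_id": "1234" } and is readable on request.cf.hostMetadata.
addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

async function handleRequest(request) {
  const metadata = (request.cf && request.cf.hostMetadata) || {}
  const headers = new Headers(request.headers)
  if (metadata.customer_id) {
    // Forward the per-customer identifier to the origin.
    headers.set('X-Customer-Id', metadata.customer_id)
  }
  return fetch(new Request(request, { headers }))
}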

Saving the best for last: Analytics!

Tomorrow, we have a very special announcement about how you can now get more visibility into your customers’ traffic and — more importantly — how you can share this information back to them.

Interested? Reach out!

If you’re an Enterprise customer, and you’re interested in any of these features, reach out to your account team to get access today!

Quick Tunnels: Anytime, Anywhere

Post Syndicated from Rishabh Bector original https://blog.cloudflare.com/quick-tunnels-anytime-anywhere/

My name is Rishabh Bector, and this summer, I worked as a software engineering intern on the Cloudflare Tunnel team. One of the things I built was quick Tunnels, and before departing for the summer, I wanted to write a blog post about how I developed this feature.

Over the years, our engineering team has worked hard to continually improve the underlying architecture through which we serve our Tunnels. However, the core use case has stayed largely the same. Users can implement Tunnel to establish an encrypted connection between their origin server and Cloudflare’s edge.

This connection is initiated by installing a lightweight daemon on your origin, to serve your traffic to the Internet without the need to poke holes in your firewall or create intricate access control lists. Though we’ve always centered around the idea of being a connector to Cloudflare, we’ve also made many enhancements behind the scenes to the way in which our connector operates.

Typically, users run into a few speed bumps before being able to use Cloudflare Tunnel. Before they can create or route a tunnel, users need to authenticate their unique token against a zone on their account. This means in order to simply spin up a Tunnel testing environment, users need to first create an account, add a website, change their nameservers, and wait for DNS propagation.

Starting today, we’re excited to fix that. Cloudflare Tunnel now supports a free version that includes all the latest features and does not require any onboarding to Cloudflare. With today’s change, you can begin experimenting with Tunnel in five minutes or less.

Introducing Quick Tunnels

When administrators start using Cloudflare Tunnel, they need to perform four specific steps:

  1. Create the Tunnel
  2. Configure the Tunnel and what services it will represent
  3. Route traffic to the Tunnel
  4. And finally… run the Tunnel!

These steps give you control over how your services connect to Cloudflare, but they are also a chore. Today’s change, which we are calling quick Tunnels, not only removes those onboarding requirements, it also condenses the four steps above into a single one.

If you have a service running locally that you want to share with teammates or an audience, you can use this single command to connect your service to Cloudflare’s edge. First, you need to install the Cloudflare connector, a lightweight daemon called cloudflared. Once installed, you can run the command below.

cloudflared tunnel

When run, cloudflared will generate a URL consisting of a random subdomain of trycloudflare.com and point traffic to localhost port 8080. If you have a web service running at that address, users who visit the generated subdomain will be able to reach your web service through Cloudflare’s network.
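
For example, one quick way to try it, assuming you have Python available to serve a throwaway directory, is:

# Serve the current directory on localhost:8080, then expose it.
python3 -m http.server 8080 &
cloudflared tunnel
# cloudflared prints a URL like https://<random-words>.trycloudflare.com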

Configuring Quick Tunnels

We built this feature with the single command in mind, but if you have services that are running at different default locations, you can optionally configure your quick Tunnel to support that.

One example is if you’re building a multiplayer game that you want to share with friends. If that game is available locally on your origin, or even your laptop, at localhost:3000, you can run the command below.

cloudflared tunnel --url localhost:3000

You can do this with IP addresses or URLs, as well. Anything that cloudflared can reach can be made available through this service.
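
A couple of illustrative variants (the addresses below are placeholders):

# Point the quick Tunnel at a service on another machine on your network.
cloudflared tunnel --url http://192.168.1.50:8000

# Or at a local service listening on a different port over HTTPS.
cloudflared tunnel --url https://localhost:8443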

How does it work?

Cloudflare quick Tunnels are powered by Cloudflare Workers, giving us a serverless compute deployment that puts Tunnel management in a Cloudflare data center close to you instead of in a centralized location.

When you run the command cloudflared tunnel, your instance of cloudflared initiates an outbound-only connection to Cloudflare. Since that connection was initiated without any account details, we treat it as a quick Tunnel.

A Cloudflare Worker, which we call the quick Tunnel Worker, receives a request that a new quick Tunnel should be created. The Worker generates the random subdomain and returns that to the instance of cloudflared. That instance of cloudflared can now establish a connection for that subdomain.

Meanwhile, a complementary service running on Cloudflare’s edge receives that subdomain and the identification number of the instance of cloudflared. That service uses that information to create a DNS record in Cloudflare’s authoritative DNS which maps the randomly-generated hostname to the specific Tunnel you created.

The deployment also relies on the Workers Cron Trigger feature to perform clean up operations. On a regular interval, the Worker looks for quick Tunnels which have been disconnected for more than five minutes. Our Worker classifies these Tunnels as abandoned and proceeds to delete them and their associated DNS records.
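
We haven't published the quick Tunnel Worker itself, but a simplified sketch of that cleanup pass might look like the code below. The listDisconnectedTunnels, deleteTunnel, and deleteDnsRecord helpers are hypothetical stand-ins for internal APIs, not public ones:

// Hypothetical sketch of the Cron Trigger cleanup pass.
const ABANDONED_AFTER_MS = 5 * 60 * 1000

addEventListener('scheduled', event => {
  event.waitUntil(cleanupAbandonedTunnels())
})

async function cleanupAbandonedTunnels() {
  const now = Date.now()
  // Hypothetical helper: fetch quick Tunnels with no active connection.
  const tunnels = await listDisconnectedTunnels()
  for (const tunnel of tunnels) {
    if (now - tunnel.disconnectedAt > ABANDONED_AFTER_MS) {
      // Hypothetical helpers: remove the Tunnel and its DNS record.
      await deleteTunnel(tunnel.id)
      await deleteDnsRecord(tunnel.hostname)
    }
  }
}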

What about Zero Trust policies?

By default, all the quick Tunnels that you create are available on the public Internet at the randomly generated URL. While this might be fine for some projects and tests, other use cases require more security.

If you need to add additional Zero Trust rules to control who can reach your services, you can use Cloudflare Access alongside Cloudflare Tunnel. That use case does require creating a Cloudflare account and adding a zone to Cloudflare, but we’re working on ideas to make that easier too.

Where should I notice improvements?

We first launched a version of Cloudflare Tunnel that did not require accounts over two years ago. While we’ve been thrilled to see customers use it for their projects, Cloudflare Tunnel has evolved significantly since then. Specifically, Cloudflare Tunnel now relies on a new architecture that is more redundant and stable than the one behind that older launch. While all Tunnels that migrated to this new architecture, which we call Named Tunnels, enjoyed those benefits, users of the account-free option were left behind.

Today’s announcement brings that stability to quick Tunnels. Tunnels are now designed to be long-lived, persistent objects. Unless you delete them, Tunnels can live for months, an improvement over the older architecture, where the average lifespan before connectivity issues disrupted a Tunnel was measured in hours.

Quick Tunnels run on this same resilient architecture, not only expediting time-to-value but also improving overall Tunnel reliability.

What’s next?

Today’s quick Tunnels add a powerful feature to Cloudflare Tunnel: the ability to create a reliable, resilient tunnel in a single command, without the hassle of creating an account first. We’re excited to help your team build and connect services to Cloudflare’s network and on to your audience or teammates. If you have additional questions, please share them in this community post.

Introducing logs from the dashboard for Cloudflare Workers

Post Syndicated from Ashcon Partovi original https://blog.cloudflare.com/workers-dashboard-logs/

If you’re writing code: what can go wrong, will go wrong.

Many developers know the feeling: “It worked in the local testing suite, it worked in our staging environment, but… it’s broken in production?” Testing can reduce mistakes and debugging can help find them, but logs give us the tools to understand and improve what we are creating.

if (this === undefined) {
  console.log("there’s no way… right?") // Narrator: there was.
}

While logging can help you understand when the seemingly impossible is actually possible, it’s something that no developer really wants to set up or maintain on their own. That’s why we’re excited to launch a new addition to the Cloudflare Workers platform: logs and exceptions from the dashboard.

Starting today, you can view and filter the console.log output and exceptions from a Worker… at no additional cost with no configuration needed!

View logs, just a click away

When you view a Worker in the dashboard, you’ll now see a “Logs” tab which you can click on to view a detailed stream of logs and exceptions. Here’s what it looks like in action:

Each log entry contains an event with a list of logs, exceptions, and request headers if it was triggered by an HTTP request. We also automatically redact sensitive URLs and headers such as Authorization, Cookie, or anything else that appears to have a sensitive name.

If you are in the Durable Objects open beta, you will also be able to view the logs and requests sent to each Durable Object. This is a great tool to help you understand and debug the interactions between your Worker and a Durable Object.

For now, we support filtering by event status and type, though you can expect more filters to be added to the dashboard very soon. Today, advanced filtering is supported with the wrangler CLI, which is discussed later in this post.

console.log(), and you’re all set

It’s really simple to get started with logging for Workers. Simply invoke one of the standard console APIs, such as console.log(), and we handle the rest. That’s it! There’s no extra setup, no configuration needed, and no hidden logging fees.

function logRequest (request) {
  const { cf, headers } = request
  const { city, region, country, colo, clientTcpRtt  } = cf
  
  console.log("Detected location:", [city, region, country].filter(Boolean).join(", "))
  if (clientTcpRtt) {
     console.debug("Round-trip time from client to", colo, "is", clientTcpRtt, "ms")
  }

  // You can also pass an object, which will be interpreted as JSON.
  // This is great if you want to define your own structured log schema.
  console.log({ headers })
}

In fact, you don’t even need to use console.log to view an event from the dashboard. If your Worker doesn’t generate any logs or exceptions, you will still be able to see the request headers from the event.

Advanced filters, from your terminal

If you need more advanced filters you can use wrangler, our command-line tool for deploying Workers. We’ve updated the wrangler tail command to support sampling and a new set of advanced filters. You also no longer need to install or configure cloudflared to use the command. Not to mention, it’s much faster: no more waiting around for logs to appear. Here are a few examples:

# Filter by your own IP address, and if there was an uncaught exception.
wrangler tail --format=pretty --ip-address=self --status=error

# Filter by HTTP method, then apply a 10% sampling rate.
wrangler tail --format=pretty --method=GET --sampling-rate=0.1

# Filter using a generic search query.
wrangler tail --format=pretty --search="TypeError"

We recommend using the “pretty” format, since wrangler will output your logs in a colored, human-readable format. (We’re also working on a similar display for the dashboard.)

However, if you want to access structured logs, you can use the “json” format. This is great if you want to pipe your logs to another tool, such as jq, or save them to a file. Here are a few more examples:

# Parses each log event, but only outputs the url.
wrangler tail --format=json | jq .event.request?.url

# You can also specify --once to disconnect the tail after receiving the first log.
# This is useful if you want to run tests in a CI/CD environment.
wrangler tail --format=json --once > event.json

Try it out!

Both logs from the dashboard and wrangler tail are available and free for existing Workers customers. If you would like more information or a step-by-step guide, check out any of the resources below.

Data protection controls with Cloudflare Browser Isolation

Post Syndicated from Tim Obezuk original https://blog.cloudflare.com/data-protection-browser/

Starting today, your team can use Cloudflare’s Browser Isolation service to protect sensitive data inside the web browser. Administrators can define Zero Trust policies to control who can copy, paste, and print data in any web based application.

In March 2021, for Security Week, we announced the general availability of Cloudflare Browser Isolation as an add-on within the Cloudflare for Teams suite of Zero Trust application access and browsing services. Browser Isolation protects users from browser-borne malware and zero-day threats by shifting the risk of executing untrusted website code from their local browser to a secure browser hosted on our edge.

And currently, we’re democratizing browser isolation for any business by including it with our Teams Enterprise Plan at no additional charge.1

A different approach to zero trust browsing

The web browser, the same tool that connects users to critical business applications, is one of the most common attack vectors and one of the hardest to control.

Browsers started as simple tools intended to share academic documents over the Internet and, over time, have become sophisticated platforms that have replaced virtually every desktop application in the workplace. The dominance of web-based applications in the workplace has created a challenge for security teams, who race to patch zero-day vulnerabilities and protect sensitive data stored in self-hosted and SaaS-based applications.

In an attempt to protect users and applications from web-based attacks, administrators have historically relied on DNS or HTTP inspection to prevent threats from reaching the browser. These tools, while useful for protecting against known threats, are difficult to tune without false positives (negatively impacting user productivity and increasing IT support burden) and are ineffective against zero-day vulnerabilities.

Browser isolation technologies mitigate risk by shifting the execution of foreign code from the endpoint to a secure environment. Historically, administrators have had to compromise between performance and security when adopting such a solution. They could either:

  • Prioritize security by choosing a solution that relies on pixel pushing techniques to serve a visual representation to users. This comes at the cost of performance by introducing latency, graphical artifacts and heavy bandwidth usage.

OR

  • Prioritize performance by choosing a solution that relies on code scrubbing techniques to unpack, inspect, and repack the webpage. This model is fragile (repacking often fails, leading to a broken webpage) and insecure, allowing undetected threats to compromise users.

At Cloudflare, we know that security products do not need to come at the expense of performance. We developed a third option that delivers a remote browsing experience without compromising on either performance or security.

  • Prioritize security by never sending foreign code to the endpoint and executing it in a secure remote environment.
  • Prioritize performance by sending lightweight vector instructions (rather than pixels) over the wire and minimizing remote latency on our global edge network.

This unique approach delivers an isolated browser without the security or performance challenges faced by legacy solutions.

Data control through the browser

Malware and zero-day threats are not the only security challenges administrators face with web browsers. The mass adoption of SaaS products has made the web browser the primary tool used to access data. Lack of control over both the application and the browser has left administrators little control over their data once it is delivered to an endpoint.

Data loss prevention tools typically rely on pattern recognition to partially or completely redact the transmission of sensitive data values. This model is useful for protecting against an unexpected breach of PII and PCI data, such as locations and financial information, but comes at the cost of visibility.

The redaction model falls short when sensitive data does not fit easily recognizable patterns and end users require visibility to do their jobs. In industries such as health care, redacting sensitive data is not feasible, as medical professionals require visibility of patient notes and appointment data.

Once data lands in the web browser, it is trivial for a user to copy-paste or print sensitive data into another website, application, or physical location. These seemingly innocent actions can lead to data being misplaced by well-meaning users, resulting in a data breach. Administrators have had limited options to protect data in the browser, some even going so far as to deploy virtual desktop services to control access to a SaaS-based customer relationship management (CRM) tool. This increased operating costs and frustrated users, who had to learn how to use a computer-in-a-computer just to use a website.

One-click to isolate data in the browser

Cloudflare Browser Isolation executes all website code (including HTML) in the remote browser. Since page content remains in the remote browser and only draw instructions are sent to the user’s local browser, Cloudflare Browser Isolation is in a powerful position to protect sensitive data on any website or SaaS application.

Administrators can now control copy-paste and printing functionality with per-rule granularity, with one click in the Cloudflare for Teams Dashboard. For example, administrators can build rules that prevent users from copying information from your CRM, or that stop team members from printing data from your ERP, without blocking their attempts to print from external websites where printing does not present a data loss risk.

From the user’s perspective, websites look and behave normally until the user performs a restricted action.

Copy-paste and printing control can be configured for both new and existing HTTP policies in the Teams Dashboard.

  1. Navigate to the Cloudflare for Teams dashboard.
  2. Navigate to Gateway → Policies → HTTP.
  3. Create/update an HTTP policy with an Isolate action (docs).
  4. Configure policy settings.

Administrators have flexibility with data protection controls and can enable/disable browser behaviours based on application, hostname, user identity and security risk.

What’s next?

We’re just getting started with Zero Trust browsing controls. We’re hard at work building controls to protect against phishing attacks, to further protect data by controlling file uploads and downloads without the need to craft complex network policies, and to support a fully clientless browser isolation experience.

Democratizing browser isolation for any business

Historically, only large enterprises could justify the cost of adding remote browser isolation to their existing security deployments, and the resulting loosely-integrated solutions fell short of achieving Zero Trust due to poor end-user experiences. Cloudflare has already solved these challenges, so businesses can achieve full Zero Trust security, including browser-based data protection controls, without performance tradeoffs.

Yet even that is not always enough to put Zero Trust browser isolation within reach of every business, so we’re currently including it with our Teams Enterprise Plan at no additional charge.1 Get started here.

1. For up to 2000 seats until 31 Dec 2021