Tag Archives: Network Performance Update

Network performance update: Birthday Week 2025

Post Syndicated from Lai Yi Ohlsen original https://blog.cloudflare.com/network-performance-update-birthday-week-2025/

We are committed to being the fastest network in the world because improvements in our performance translate to improvements for the own end users of your application. We are excited to share that Cloudflare continues to be the fastest network for the most peered networks in the world.

We relentlessly measure our own performance and our performance against peers. We publish those results routinely, starting with our first update in June 2021 and most recently with our last post in September 2024.

Today’s update breaks down where we have improved since our update last year and what our priorities are going into the next year. While we are excited to be the fastest in the greatest number of last-mile ISPs, we are never done improving and have more work to do.

How do we measure this metric, and what are the results?

We measure network performance by attempting to capture what the experience is like for Internet users across the globe. To do that we need to simulate what their connection is like from their last-mile ISP to our networks.

We start by taking the 1,000 largest networks in the world based on estimated population. We use that to give ourselves a representation of real users in nearly every geography.

We then measure performance itself with TCP connection time. TCP connection time is the time it takes for an end user to connect to the website or endpoint they are trying to reach. We chose this metric because we believe this most closely approximates what users perceive to be Internet speed, as opposed to other metrics which are either too scientific (ignoring real world challenges like congestion or distance) or too broad.

We take the trimean measurement of TCP connection times to calculate our metric. The trimean is a weighted average of three statistical values: the first quartile, the median, and the third quartile. This approach allows us to reduce some of the noise and outliers and get a comprehensive picture of quality.

For this year’s update, we examined the trimean of TCP connection times measured from August 6 to September 4, Cloudflare is the #1 provider in 40% of the top 1000 networks. In our September 2024 update, we shared that we were the #1 provider in 44% of the top 1000 networks.


The TCP Connection Time (Trimean) graph shows that we are the fastest TCP connection time in 383 networks, but that would make us the fastest in 38% of the top 1,000. We exclude networks that aren’t last-mile ISPs, such as transit networks, since they don’t reflect the end user experience, which brings the number of measured networks to 964 and makes Cloudflare the fastest in 40% of measured ISPs and the fastest across the top networks.

How do we capture this data? 

A Cloudflare-branded error page does more than just display an error; it kicks off a real-world speed test. Behind the scenes, on a selection of our error pages, we use Real User Measurements (RUM), which involves a browser retrieving a small file from multiple networks, including Cloudflare, Amazon CloudFront, Google, Fastly and Akamai.

Running these tests lets us gather performance data directly from the user’s perspective, providing a genuine comparison of different network speeds. We do this to understand where our network is fastest and, more importantly, where we can make further improvements. For a deeper dive into the technical details, the Speed Week blog post covers the full methodology.

By using RUM data, we track key metrics like TCP Connection Time, Time to First Byte (TTFB), and Time to Last Byte (TTLB). These are widely recognized, industry-standard metrics that allow us to objectively measure how quickly and efficiently a website loads for actual users. By monitoring these benchmarks, we can objectively compare our performance against other networks.

We specifically chose the top 1000 networks by estimated population from APNIC, excluding those that aren’t last-mile ISPs. Consistency is key: by analyzing the same group of networks in every cycle, we ensure our measurements and reporting remain reliable and directly comparable over time.

How do the results compare across countries?

The map below shows the fastest providers per country and Cloudflare is fastest in dozens of countries. 


The color coding is generated by grouping all the measurements we generate by which country the measurement originates from. Then we look at the trimean measurements for each provider to identify who is the fastest… Akamai was measured as well, but providers are only represented in the map if they ranked first in a country which Akamai does not anywhere in the world.

These slim margins mean that the fastest provider in a country is often determined by latency differences so small that the fastest provider is often only faster by less than 5%. As an example, let’s look at India, a country where we are currently the second-fastest provider.

India (IN)

Rank

Entity 

Connect Time (Trimean)

#1 Diff

#1

CloudFront

107 ms

#2

Cloudflare

113 ms

+4.81% (+5.16 ms)

#3

Google

117 ms

+8.74% (+9.39 ms)

#4 

Fastly

133 ms

+24% (+26 ms)

#5

Akamai

144 ms

+34% (+37 ms)

In India, Cloudflare is 5ms behind Cloudfront, the #1 provider (To put milliseconds into perspective, the average human eye blink lasts between 100ms and 400ms). The competition for the number one spot in many countries is fierce and often shifts day by day. For example, in Mexico on Tuesday, August 5th, Cloudflare was the second-fastest provider by 0.73 ms but then on Tuesday, August 12th, Cloudflare was the fastest provider by 3.72 ms. 

Mexico (MX)

Date

Rank

Entity 

Connect Time (Trimean)

#1 Diff

August 5, 2025

#1

CloudFront

116 ms

#2

Cloudflare

116 ms

+0.63% (+0.73 ms)

August 12, 2025

#1

Cloudflare

106 ms

#2

CloudFront

109 ms

+3.52% (+3.72 ms)

Because ranking reorderings are common, we also review country and network level rankings to evaluate and benchmark our performance. 

Focusing on where we are not the fastest yet

As mentioned above, in September 2024, Cloudflare was fastest in 44% of measured ISPs. These values can shift as providers constantly make improvements to their networks. One way we focus in on how we are prioritizing improving is to not just observe where we are not the fastest but to measure how far we are from the leader.

In these locations we tend to pace extremely close to the fastest provider, giving us an opportunity to capture the spot as we relentlessly improve. In networks where Cloudflare is 2nd, over 50% of those networks have a less than 5% difference (10ms or less) with the top provider.

Country

ASN

#1

Cloudflare Rank

#1 Diff (ms)

#1 Diff (%)

US

AS36352

Google

2

25 ms

32%

US

AS46475

Google

2

35 ms

29%

US

AS29802

Google

2

8.03 ms

21%

US

AS20473

Google

2

15 ms

13%

US

AS7018

CloudFront

2

23 ms

13%

US

AS4181

CloudFront

2

8.19 ms

11%

US

AS62240

Google

2

18 ms

9.77%

US

AS22773

CloudFront

2

12 ms

9.48%

US

AS6167

CloudFront

2

13 ms

7.55%

US

AS11427

Google

2

9.33 ms

5.27%

US

AS6614

CloudFront

2

6.68 ms

4.12%

US

AS4922

Google

2

3.38 ms

3.86%

US

AS11492

Fastly

2

3.73 ms

3.33%

US

AS11351

Google

2

5.14 ms

3.04%

US

AS396356

Google

2

4.12 ms

2.23%

US

AS212238

Google

2

3.42 ms

1.35%

US

AS20055

Fastly

2

1.22 ms

1.33%

US

AS40021

CloudFront

2

2.06 ms

0.91%

US

AS12271

Fastly

2

1.26 ms

0.89%

US

AS141039

CloudFront

2

1.26 ms

0.88%

In networks where Cloudflare is 3rd, 50% of those networks are less than a 10% difference with the top provider (10ms or less). Margins are small and suggest that in instances where Cloudflare isn’t number one across networks, we’re extremely close to our competitors and the top networks change day over day. 

Country

ASN

#1

Cloudflare Rank

#1 Diff (ms)

#1 Diff (%)

US

AS6461

Google

3

33 ms

39%

US

AS81

Fastly

3

43 ms

35%

US

AS14615

Google

3

24 ms

24%

US

AS13977

CloudFront

3

21 ms

19%

US

AS33363

Google

3

29 ms

18%

US

AS63949

Google

3

9.56 ms

14%

US

AS14593

Fastly

3

17 ms

13%

US

AS23089

CloudFront

3

7.4 ms

11%

US

AS16509

Fastly

3

10 ms

9.48%

US

AS209

CloudFront

3

9.69 ms

6.87%

US

AS27364

CloudFront

3

8.76 ms

6.61%

US

AS11404

CloudFront

3

6.11 ms

6.16%

US

AS46690

CloudFront

3

5.91 ms

5.43%

US

AS136787

CloudFront

3

8.23 ms

5.18%

US

AS6079

Fastly

3

5.45 ms

4.49%

US

AS5650

Google

3

3.91 ms

3.35%

Countries with an abundance of networks, like the United States, have a lot of noise we need to calibrate against. For example, the graph below represents the performance of all providers for a major ISP like AS701 (Verizon Business).

AS701 (Verizon Business) Connect Time (P95) between 2025-08-09 and 2025-09-09


In this chart, the “P95” value, or 95th percentile, refers to one point of a percentile distribution. The P95 shows the value below which 95% of the data points fall and is specifically good at helping identify the slowest or worst-case user experiences, such as those on poor networks or older devices. Additionally, we review the other numbers lower on the percentile chain in the table below, which tell us how performance varies across the full range of data. When we do so, the picture becomes more nuanced.

AS701 (Verizon Business) Provider Rankings for Connect Time at P95, P75 and P50

Rank

Entity 

Connect Time (P95)

Connect Time (P75)

Connect Time (P50)

#1

Fastly

128 ms

66 ms

48 ms

#2

Google

134 ms

72 ms

54 ms

#3

CloudFront

139 ms

67 ms

47 ms

#4 

Cloudflare

141 ms

68 ms

49 ms

#5

Akamai

160 ms

84 ms

61 ms

At the 95th percentile for AS701, Cloudflare ranks 4th but at the 75th and 50th, Cloudflare is only 2 milliseconds slower than the fastest provider. In other words, when reviewing more than one point along the distribution at the network level, Cloudflare is keeping up with the top providers for the less extreme samples. To capture these details, it’s important to look at the range of outcomes, not just one percentile.

To better reflect the full spectrum of user experiences, we started using the trimean in July 2025 to rank providers. This metric combines values from across the distribution of data – specifically the 75th, 50th and 25th percentiles – which gives a more balanced representation of overall performance, rather than only focusing on the extremes. Summarizing user experience with a single number is always challenging, but the trimean helps us compare providers in a way that better reflects how users actually experience the Internet.

Cloudflare is the fastest provider in 40% of networks in the majority of real-world conditions, not just in worst-case scenarios. Still, the 95th percentile remains key to understanding how performance holds up in challenging conditions and where other providers might fall behind in performance. When we review the 95th percentile across the same date range for all the networks, not just AS701, Cloudflare is fastest across roughly the same amount of networks but by 103 more networks than the next fastest provider. Being faster in such a wide margin of networks tells us that Cloudflare is particularly strong in the challenging, long-tail cases that other providers struggle with.


Our performance data shows that even when we are not the top-ranked provider, we remain exceptionally competitive, often trailing the leader by a mere handful of percentage points. Our strength at the 95th percentile also highlights our superior performance in the most challenging scenarios. Cloudflare’s ability to outperform other providers, in the worst-case, is a testament to the resilience and efficiency of our network.

Moving forward, we’ll continue to share multiple metrics and continue to make improvements to our network —and we’ll use this data to do it! Let’s talk about how. 

How does Cloudflare use this data to improve?

Cloudflare applies this data to identify regions and networks that need prioritization. If we are consistently slower than other providers in a network, we want to know why, so we can fix it.

For example, the graph below shows the 95th percentile of Connect Time for AS8966. Prior to June 13, 2025, our performance was suffering, and we were the slowest provider for the network. By referencing our own measurement data, we prioritized partner data centers in the region and almost immediately performance improved for users connecting through AS8966.

Cloudflare’s partner data centers consist of collaborations with local service providers who host Cloudflare’s equipment within their own facilities. This allows us to expand our network to new locations and get closer to users more quickly. In the case of AS8966, adding a new partner data center took us from being ranked last to ranked first and improved latency by roughly 150ms in one day. By using a data-driven approach, we made our network faster and most importantly, improved the end user experience.

TCP Connect Time (P95) for AS8966


What’s next?

We are always working to build a faster network and will continue sharing our process as we go. Our approach is straightforward: identify performance bottlenecks, implement fixes, and report the results. We believe in being transparent about our methods and are committed to a continuous cycle of improvement to achieve the best possible performance. Follow our blog for the latest performance updates as we continue to optimize our network and share our progress.

Network performance update: Developer Week 2025

Post Syndicated from Emily Music original https://blog.cloudflare.com/network-performance-update-developer-week-2025/

As the Internet has become enmeshed in our everyday lives, so has our need for speed. No one wants to wait when adding shoes to our shopping carts, or accessing corporate assets from across the globe. And as the Internet supports more and more of our critical infrastructure, speed becomes more than just a measure of how quickly we can place a takeout order. It becomes the connective tissue between the systems that keep us safe, healthy, and organized. Governments, financial institutions, healthcare ecosystems, transit — they increasingly rely on the Internet. This is why at Cloudflare, building the fastest network is our north star. 

We’re happy to announce that we are the fastest network in 48% of the top 1000 networks by 95th percentile TCP connection time between November 2024, and March 2025, up from 44% in September 2024.

In this post, we’re going to share with you how our network performance has changed since our last post in September 2024, and talk about what makes us faster than other networks.  But first, let’s talk a little bit about how we get this data.

How does Cloudflare get this data?

It’s happened to all of us — you casually click on a site, and suddenly you’ve reached a Cloudflare-branded error page. While you are shaking your fist at the sky, something interesting is happening on the back end. Cloudflare is using Real User Monitoring (RUM) to collect the data used to compare our performance against other networks. The monitoring we do is slightly different than the RUM Cloudflare offers to customers. When the error page loads, a 100 KB file is fetched and loaded. This file is hosted on networks like Cloudflare, Akamai, Amazon CloudFront, Fastly, and Google Cloud CDN. Your browser processes the performance data, and sends it to Cloudflare, where we use it to get a clear view of how these different networks stack up in terms of speed. 

We’ve been collecting and refining this data since June 2021.  You can read more about how we collect that data here, and we regularly track our performance during Innovation Weeks to hold ourselves accountable to you that we are always in pursuit of being the fastest network in the world.

How are we doing?

In order to evaluate Cloudflare’s speed relative to others, we measure performance across the top 1000 “eyeball” networks using the list provided by the Asia Pacific Network Information Centre (APNIC). So-called “eyeball” networks are those with a large concentration of subscribers/end users.  This information is important, because it gives us signals for where we can expand our presence or peering, or optimize our traffic engineering. When benchmarking, we assess the 95th percentile TCP connection time. This is the time it takes a user to establish a TCP connection to the server they are trying to reach. This metric helps us illustrate how Cloudflare’s network makes your traffic faster by serving your customers as locally as possible. 

When we look at Cloudflare’s performance across the top 1000 networks, we can see that we’re fastest in 487, or over 48%, of these networks, between November 2024 and March 2025:


In September 2024, we ranked #1 in 44% of these networks:


So why did we jump?  To get a better understanding of why, let’s take a look at the countries where we improved, which will give us a better sense of where to dive in.  This is what our network map looked like in September 2024 (grey countries mean we do not have enough data or users to derive insights):


(September 2024)

Today, using those same 95th percentile TCP connect times, we rank #1 in 48% of networks and the network map looks like this:


(March 2025)

We made most of our gains in Africa, where countries that previously didn’t have enough samples saw an increase in samples, and Cloudflare pulled ahead. This could mean that there was either an increase in Cloudflare users, or an increase in error pages shown. These countries got faster almost exclusively due to the presence of our Edge Partner deployments, which are Cloudflare locations embedded in last mile networks.  In next-generation markets like many African countries, these locations are crucial towards being faster as connectivity to end users tends to fall back to places like South Africa or London if in-country peering does not exist.

But let’s take a look at a couple of other places and see why we got faster.

In Canada, we were not the fastest in September 2024, but we are the fastest today. Today, we are the fastest in 40% of networks, which is the most out of all of our competitors:


But when you look at the overall country numbers, we see that the race for the fastest network is quite close:

Canada 95th Percentile TCP Connect Time by Provider

Rank

Entity

Connect Time (P95)

#1 Diff

1

Cloudflare

179 ms

2

Fastly

180 ms

+0.48% (+0.87 ms)

3

Google

180 ms

+0.74% (+1.32 ms)

4

CloudFront

182 ms

+1.74% (+3.11 ms)

5

Akamai

215 ms 

+20% (+36 ms)

The difference between Cloudflare and the third-fastest network is a little over a millisecond!  As we’ve pointed out previously, such fluctuations are quite common, especially at higher percentiles.  But there is still a significant difference between us and the slowest network; we’re around 20% faster.

However, looking at a place like Japan where were not the fastest in September 2024 but are now the fastest, there is a significant difference between Cloudflare and the number two network:

Japan 95th Percentile TCP Connect Time by Provider

Rank

Entity

Connect Time (P95)

#1 Diff

1

Cloudflare

116 ms

2

Fastly

122 ms

+5.23% (+6.08 ms)

3

Google

124 ms

+6.21% (+7.22 ms)

4

CloudFront

127 ms

+8.91% (+10 ms)

5

Akamai

153 ms 

+32% (+37 ms)

Why is this? We are in more locations in Japan than our competitors and added more Edge Partner deployments in these locations, bringing us even closer to end-users. Edge Partner deployments are collaborations with ISPs, where we take space in their data centers, and peer with them directly. 

Why?

Why do we track our network performance like this? The answer is simple: to improve user experience. This data allows us to track a key performance metric for Cloudflare and the other networks. When we see that we’re lagging in a region, it serves as a signal to dig deeper into our network. 

This data is a gold mine for the teams tasked with improving Cloudflare’s network. When there are countries where Cloudflare is behind, it gives us signals for where we should expand or investigate. If we’re slow, we may need to invest in additional peering. If a region we have invested in heavily is slower, we may need to investigate our hardware.  The example from Japan shows exactly how this can benefit: we took a location where we were previously on par with our competitors, added peering in new locations, and we pulled ahead. 

On top of this map, we have autonomous system (ASN) level granularity on how we are performing on each one of the top 1000 eyeball networks, and we continuously optimize our traffic flow with each of them.  This allows us to track individual networks that may lag and improve the customer experience in those networks through turning up peering, or even adding new deployments in those regions. 

What’s next?

We’re sharing our updates on our journey to become #1 everywhere so that you can see what goes into running the fastest network in the world. From here, our plan is the same as always: identify where we’re slower, fix it, and then tell you how we’ve gotten faster.

Network performance update: Birthday Week 2024

Post Syndicated from Emily Music original https://blog.cloudflare.com/network-performance-update-birthday-week-2024

When it comes to the Internet, everyone wants to be the fastest. At Cloudflare, we’re no different. We want to be the fastest network in the world, and are constantly working towards that goal. Since June 2021, we’ve been measuring and ranking our network performance against the top global networks. We use this data to improve our performance, and to share the results of those initiatives. 

In this post, we’re going to share with you how our network performance has changed since our last post in March 2024, and discuss the tools and processes we are using to assess network performance. 

Digging into the data

Cloudflare has been measuring network performance across these top networks from the top 1,000 ISPs by estimated population (according to the Asia Pacific Network Information Centre (APNIC)), and optimizing our network for ISPs and countries where we see opportunities to improve. For performance benchmarking, we look at TCP connection time. This is the time it takes an end user to connect to the website or endpoint they are trying to reach. We chose this metric to show how our network helps make your websites faster by serving your traffic where your customers are. Back in June 2021, Cloudflare was ranked #1 in 33% of the networks.

As of September 2024, examining 95th percentile (p95) TCP connect times measured from September 4 to September 19, Cloudflare is the #1 provider in 44% of the top 1000 networks:


This graph shows that we are fastest in 410 networks, but that would only make us the fastest in 41% of the top 1,000. To make sure we’re looking at the networks that eyeballs connect from, we exclude networks like transit networks that aren’t last-mile ISPs. That brings the number of measured networks to 932, which makes us fastest in 44% of ISPs.

Now let’s take a look at the fastest provider by country. The map below shows the top network by 95th percentile TCP connection time, and Cloudflare is fastest in many countries. For those where we weren’t, we were within a few milliseconds of our closest competitor.


This color coding is generated by grouping all the measurements we generate by which country the measurement originates from, and then looking at the 95th percentile measurements for each provider to see who is the fastest. This is in contrast to how we measure who is fastest in individual networks, which involves grouping the measurements by ISP and measuring which provider is fastest. Cloudflare is still the fastest provider in 44% of the measured networks, which is consistent with our March report. Cloudflare is also the fastest in many countries, but the map is less orange than it was when we published our measurements from March 2024:


This can be explained by the fact that the fastest provider in a country is often determined by latency differences so small it is often less than 5% faster than the second-fastest provider. As an example, let’s take a look at India, a country where we are now the fastest:

India performance by provider

Rank

Entity

95th percentile TCP Connect (ms)

Difference from #1

1

Cloudflare

290 ms

2

Google

291 ms

+0.28% (+0.81 ms)

3

CloudFront

304 ms

+4.61% (+13 ms)

4

Fastly

325 ms

+12% (+35 ms)

5

Akamai

373 ms

+29% (+83 ms)

In India, we are the fastest network, but we are beating the runner-up by less than a millisecond, which shakes out to less than 1% difference between us and the number two! The competition for the number one spot in many countries is fierce and sometimes can be determined by what days you’re looking at the data, because variance in connectivity or last-mile outages can materially impact this data.

For example, on September 17, there was an outage on a major network in India, which impacted many users’ ability to access the Internet. People using this network were significantly impacted in their ability to connect to Cloudflare, and you can actually see that impact in the data. Here’s what the data looked like on the day of the outage, as we dropped from the number one spot that day:

India performance by provider

Rank

Entity

95th percentile TCP Connect (ms)

Difference from #1

1

Google

219 ms

2

CloudFront

230 ms

+5% (+11 ms)

3

Cloudflare

236 ms

+7.47% (+16 ms)

4

Fastly

261 ms

+19% (+41 ms)

5

Akamai

286 ms

+30% (+67 ms)

We were impacted more than other providers here because our automated traffic management systems detected the high packet loss as a result of the outage and aggressively moved all of our traffic away from the impacted provider. After review internally, we have identified opportunities to improve our traffic management to be more fine-grained in our approach to outages of this type, which would help us continue to be fast despite changes in the surrounding ecosystem. These unplanned occurrences can happen to any network, but these events also provide us opportunities to improve and mitigate the randomness we see on the Internet.

The phenomenon of providers having fluctuating latencies can also work against us. Consider Poland, a country where we were the fastest provider in March, but are no longer the fastest provider today. Digging into the data a bit more, we can see that even though we are no longer the fastest, we’re slower by less than a millisecond, giving us confidence that our architecture is optimized for performance in the region:

Poland performance by provider

Rank

Entity

95th percentile TCP Connect (ms)

Difference from #1

1

Google

246 ms

2

Cloudflare

246 ms

+0.15% (+0.36 ms)

3

CloudFront

250 ms

+1.7% (+4.17 ms)

4

Akamai

272 ms

+11% (+26 ms)

5

Fastly

295 ms

+20% (+50 ms)

These nuances in the data can make us look slower in more countries than we actually are. From a numbers’ perspective we’re neck-and-neck with our competitors and still fastest in the most networks around the world. Going forward, we’re going to take a longer look at how we’re visualizing our network performance to paint a clearer picture for you around performance. But let’s go into more about how we actually get the underlying data we use to measure ourselves.

How we measure performance

When you see a Cloudflare-branded error page, something interesting happens behind the scenes. Every time one of these error pages is displayed, Cloudflare gathers Real User Measurements (RUM) by fetching a tiny file from various networks, including Cloudflare, Akamai, Amazon CloudFront, Fastly, and Google Cloud CDN. Your browser sends back performance data from the end-user’s perspective, helping us get a clear view of how these different networks stack up in terms of speed. The main goal? Figure out where we’re fast, and more importantly, where we can make Cloudflare even faster. If you’re curious about the details, the original Speed Week blog post dives deeper into the methodology.

Using this RUM data, we track key performance metrics such as TCP Connection Time, Time to First Byte (TTFB), and Time to Last Byte (TTLB) for Cloudflare and the other networks. 

Starting from March, we fixed the list of networks we look at to be the top 1000 networks by estimated population as determined by APNIC, and we removed networks that weren’t last-mile ISPs. This change makes our measurements and reporting more consistent because we look at the same set of networks for every reporting cycle.

How does Cloudflare use this data?

Cloudflare uses this data to improve our network performance in lagging regions. For example, in 2022 we recognized that performance on a network in Finland was not as fast as some comparable regions. Users were taking 300+ ms to connect to Cloudflare at the 95th percentile:

Performance for Finland network

Rank

Entity

95th percentile TCP Connect (ms)

Difference from #1

1

Fastly

15 ms

2

CloudFront

19 ms

+19% (+3 ms)

3

Akamai

20 ms

+28% (+4.3 ms)

4

Google

72 ms

+363% (+56 ms)

5

Cloudflare

368 ms

+2378% (+353 ms)

After investigating, we recognized that one major network in Finland was seeing high latency due to issues resulting from congestion. Simply put, we were using all the capacity we had. We immediately planned an expansion, and within two weeks of that expansion completion, our latency decreased, and we became the fastest provider in the region, as you can see in the map above.

We are constantly improving our network and infrastructure to better serve our customers. Data like this helps us identify where we can be most impactful, and improve service for our customers. 

What’s next 

We’re sharing our updates on our journey to become as fast as we can be everywhere so that you can see what goes into running the fastest network in the world. From here, our plan is the same as always: identify where we’re slower, fix it, and then tell you how we’ve gotten faster.

Network performance update: Security Week 2024

Post Syndicated from David Tuber original https://blog.cloudflare.com/network-performance-update-security-week-2024


We constantly measure our own network’s performance against other networks, look for ways to improve our performance compared to them, and share the results of our efforts. Since June 2021, we’ve been sharing benchmarking results we’ve run against other networks to see how we compare.

In this post we are going to share the most recent updates since our last post in September, and talk about how we are getting as fast as we are.

How we stack up

Since June 2021, we’ve been taking a close look at the most reported eyeball-facing ISPs and taking actions for the specific networks where we have some room for improvement. Cloudflare was already the fastest provider for TCP Connection time at the 95th percentile for 44% of the networks around the world (we define a network as country and AS number pair). We chose this metric to show how our network helps make your websites faster by getting you to where your customers are. Taking a look at the numbers, in July 2022, Cloudflare was ranked #1 in 33% of the networks and was within 2 ms (95th percentile TCP Connection Time) or 5% of the #1 provider for 8% of the networks that we measured. For reference, our closest competitor was the fastest for 20% of networks.

As of August 30, 2023, Cloudflare was the fastest provider for 44% of networks — and was within 2 ms (95th percentile TCP Connection Time) or 5% of the fastest provider for 10% of the networks that we measured—whereas our closest competitor (Amazon Cloudfront) was the fastest for 19% of networks. As of February 15, 2024, we are still #1 in 44% of networks for 95th percentile TCP Connection Time. Let’s dig into the data.

Lightning fast

Looking at 95th percentile TCP connect times from November 18, 2023, to February 15, 2024, Cloudflare is the #1 provider in 44% of the top 1000 networks:

Our P95 TCP Connection time has been trending down since November, and we are consistently 50ms faster at P95 than our closest competitor (Amazon CloudFront):

Connect time comparisons between providers at 50th and 95th percentile
P50 Connect (ms) P95 Connect (ms)
Cloudflare 130 579
Amazon 145 637
Google 190 772
Akamai 195 774
Fastly 189 734

These graphs show that day over day, Cloudflare was consistently the fastest provider. They also show the gaps between Cloudflare and the other competitors.  When you look at the 95th percentile, Cloudflare is almost 200ms faster than Akamai across the world for connect times. This shows that our network reaches more places and allows users to get their content faster than Akamai on a consistent basis.

When we aggregate this data over the whole time period, Cloudflare is the fastest in the most networks. For that whole time span of November 18, 2023, to February 15, 2024, Cloudflare was number 1 in 73% of networks for mean TCP connection time:

Looking at a map plotting by 95th percentile TCP connect time, Cloudflare is the fastest in the most countries, and you can see this by the fact that most of the map is orange:

For comparison, here’s what the map looked like in September 2023:

These numbers show that we’re reducing the overall TCP connection time around the world while simultaneously staying ahead of the competition. Let’s talk about how we get these numbers and what we’re doing to make you even faster.

Measuring What Matters

As a quick reminder, here’s how we get the data for our measurements: when users receive a Cloudflare-branded error page, we use Real User Measurements (RUM) and fetch a small file from Cloudflare, Akamai, Amazon CloudFront, Fastly, and Google Cloud CDN. Browsers around the world report the performance of those providers from the perspective of the end-user network they are on. The goal is to provide an accurate picture of where different providers are faster, and more importantly, where Cloudflare can improve. You can read more about the methodology in the original Speed Week blog post.

Using the RUM data, we measure various performance metrics, such as TCP Connection Time, Time to First Byte (TTFB), and Time to Last Byte (TTLB), for ourselves and other providers.

If we only collect data from a browser when we return an error page, you could see how variable the data can get: if one network or website is having a problem in a certain country, that country could overreport, meaning those networks would be more highly weighted in the calculations because more users reported from that network during a given time period.

For example, if a lot of users connecting over a small Brazilian network were generating reports because their websites were throwing errors more frequently, that could make this small network look a lot bigger to us. This small network in Brazil could have as many reports as Claro, a major network in the region, despite them being totally different when you look at the number of subscribers.  If we only look at the networks that report to us the most, it could cause smaller networks with fewer subscribers to be treated as more important because of point-in-time error conditions.

This phenomenon could cause the networks we look at to change week over week. Going back to the Brazil example, if the website that was throwing a bunch of errors fixed their problem, and we no longer saw measurements from that network, they may not show up as a “most reported network” depending on when we look at the data. This means that the networks we look at to consider where we are fastest are dependent on which networks are sending us the most reports at any given time, which is not optimal if we’re trying to get faster in these networks. We need to be able to get a consistent signal on these networks to understand where we’re faster and where we’re not.

We’ve addressed this issue by creating a fixed list of the networks we want to look at. We did this by looking at public stats on user population by network and then comparing that with our sample sizes by network until we identified the 1000 networks we want to examine.  This ensures that day over day, the networks we look at are the same.

Now let’s talk about what makes us faster in more places than other networks: HTTP/3.

Blazing fast speeds with HTTP/3

One reason why Cloudflare is the fastest in the most networks is because we’ve been leading the charge with adoption and usage of HTTP/3 on our platform.  HTTP/3 allows for faster connectivity behavior which means we can get connections established faster and get data flowing. HTTP/3 is currently used by around 31% of Internet traffic:

To show that HTTP/3 improves connection times, we looked at two different Cloudflare endpoints that these tests ran against: one with HTTP/3 enabled and one with HTTP/3 disabled. The performance difference between the two is night and day.  Here’s a table showing the performance difference for 95th percentile connect time between Cloudflare zones when one zone has HTTP/3 enabled:

P50 connect (ms) P95 connect (ms)
Cloudflare HTTP/3 130 579
Cloudflare non-HTTP/3 174 695

At P95, Cloudflare is 116 ms faster for connection times when HTTP/3 is enabled. This performance gain helps us be the fastest in the most networks.

But why does HTTP/3 help make us faster? HTTP/3 allows for faster connection setup times, which lets us take greater advantage of our global network footprint to be the fastest in the most networks. HTTP/3 is built on top of the QUIC protocol, which multiplexes UDP packets to allow for parallel streams to be sent at the same time. This means that TLS encryption can happen in parallel with connection establishment, shortening the amount of time that is needed to set up a secure connection. Paired with Cloudflare’s network that is incredibly close to end-users, this makes for significant latency reductions on user Connect times. All major browsers have HTTP/3 enabled by default, so you too can realize these latency improvements by enabling HTTP/3 on your website today.

What’s next

We’re sharing our updates on our journey to become #1 everywhere so that you can see what goes into running the fastest network in the world. From here, our plan is the same as always: identify where we’re slower, fix it, and then tell you how we’ve gotten faster.

Network performance update: Birthday Week 2023

Post Syndicated from David Tuber original http://blog.cloudflare.com/network-performance-update-birthday-week-2023/

Network performance update: Birthday Week 2023

Network performance update: Birthday Week 2023

We constantly measure our own network’s performance against other networks, look for ways to improve our performance compared to them, and share the results of our efforts. Since June 2021, we’ve been sharing benchmarking results we’ve run against other networks to see how we compare.

In this post we are going to share the most recent updates since our last post in June, and tell you about our tools and processes that we use to monitor and improve our network performance.

How we stack up

Since June 2021, we’ve been taking a close look at every single network and taking actions for the specific networks where we have some room for improvement. Cloudflare was already the fastest provider for most of the networks around the world (we define a network as country and AS number pair). Taking a closer look at the numbers; in July 2022, Cloudflare was ranked #1 in 33% of the networks and was within 2 ms (95th percentile TCP Connection Time) or 5% of the #1 provider for 8% of the networks that we measured. For reference, our closest competitor on that front was the fastest for 20% of networks.

As of August 30, 2023, Cloudflare is the fastest provider for 44% of networks—and was within 2 ms (95th percentile TCP Connection Time) or 5% of the fastest provider for 10% of the networks that we measured—whereas our closest competitor is now the fastest for 19% of networks.

Below is the change in percentage of networks in which each provider is the fastest plotted over time.

Network performance update: Birthday Week 2023

Cloudflare is maintaining our steady growth in the percentage of networks where we’re the fastest. Despite the slight tick down the past couple of months, the trendline is still positive and with a higher rate of increase than other networks.

Now that we’ve reviewed how we stack up compared to other networks, let’s dig a little more into the other metrics we use to make us the fastest.

Our tooling

To provide insight into network performance, we use Real User Measurements (RUM) and fetch a small file from Cloudflare, Akamai, Amazon CloudFront, Fastly and Google Cloud CDN. Browsers around the world report the performance of those providers from the perspective of the end-user network they are on. The goal is to provide an accurate picture of where different providers are faster, and more importantly, where Cloudflare can improve. You can read more about the methodology in the original Speed Week blog post here.

Using the RUM data, we are able to measure various performance metrics, such as TCP Connection Time, Time to First Byte (TTFB), Time to Last Byte (TTLB), for ourselves and other networks.

Let’s take a look at some of the metrics we monitor and what’s changed since our last blog in June.

The first metric we closely monitor is the percent of networks that we are ranked #1 in terms of TCP Connection Time. That's a key performance indicator that we evaluate ourselves against. This first line of the table shows that Cloudflare was ranked #1 in 45% of networks in June 2023 and 44% in August 2023. Here’s the full picture of how we looked in June versus how we look today.

Cloudflare’s rank by TCP connection time % of networks in June 2023 % of networks in August 2023
1 45 44
2 26 24
3 16 16
4 9 10
5 4 6

Overall, these metrics align with what we saw above: Cloudflare is still the fastest provider in the most last mile networks, and while there has been slight changes in the month-to-month fluctuations, the overall trend shows us as being the fastest.

The second metric we monitor is our overall performance in each country. This gives us visibility into the countries or regions that we need to pay closer attention to and take action towards improving our performance. Those actions will be listed later. Orange indicates the countries that Cloudflare is the fastest provider based on the TCP Connection Time. Here’s how we look as of September 2023:

Network performance update: Birthday Week 2023

For comparison, this is what that map looks like from June 2023:

Network performance update: Birthday Week 2023

We’ve become faster in Iran and Paraguay, and in the few cases where we are no longer number 1, we are within 2ms of the fastest provider. In Brazil and Norway for example, we trail Fastly by only 1ms. In various countries in Africa, Amazon CloudFront pulled ahead but only by 2ms. We aim to fix that in the coming weeks and months and return to the #1 spot there also.

The third set of metrics we use are TCP Connection Time and TTLB. The number of networks where we are #1 in terms of 95th percentile TCP Connection Time is one of our key performance indicators. We actively monitor and work on improving that metric so that we are #1 in the most metrics for 95th percentile TCP Connection Time. For September 2023, we are still #1 in the most networks for TCP Connection Time, more than double the next best provider.

Network performance update: Birthday Week 2023

Provider # of networks where the provider is fastest for 95th percentile TCP connection time
Cloudflare 826
Google 392
Fastly 348
Cloudfront 337
Akamai 52

The way we achieve these great results is by having our engineering teams constantly investigate the underlying reasons for degraded performance if there are any, and we track open work items until they are resolved.

What’s next

We’re sharing our updates on our journey to become #1 everywhere so that you can see what goes into running the fastest network in the world. From here, our plan is the same as always: identify where we’re slower, fix it, and then tell you how we’ve gotten faster.

Network performance update: Speed Week 2023

Post Syndicated from Onur Karaagaoglu original http://blog.cloudflare.com/speed-week-2023-network-performance-update/

Network performance update: Speed Week 2023

Network performance update: Speed Week 2023

We constantly measure our own and other networks' performance, and look for ways to improve our performance; and share our results.

In this post we are going to share the most recent updates, and tell you about our tools and processes that we use to monitor and improve our network performance.

First, the results

In July, 2022, we started taking a more granular look down to every single network and taking actions for the specific networks where we have some room for improvement. Cloudflare was already the fastest provider for most of the networks around the world (we define a network as country and AS number pair). Taking a closer look at the numbers, Cloudflare was ranked #1 in 33% of the networks and was within 2 ms or 5% of the #1 provider for 8% of the networks that we measured in terms of the 95th percentile TCP Connection Time. For reference, our closest competitor on that front was the fastest for 20% of networks.

As of May 31, 2023 those numbers have improved significantly. Today, Cloudflare is the fastest provider for 46% of networks—and was within 2 ms (95th percentile TCP Connection Time) or 5% of the fastest provider for 10% of the networks that we measured—whereas our closest competitor is now the fastest for 18% of networks.

Below is the change in percentage of networks that each provider is the fastest over time for Cloudflare and other services.

Network performance update: Speed Week 2023

Our tooling and process

We use Real User Measurements (RUM) and fetch a small file from Cloudflare, Akamai, CloudFront, Fastly and Google Cloud CDN. Browsers around the world report the performance of those providers from the perspective of the end-user network they are on. The goal is to provide an accurate picture of where different providers are faster, and more importantly, where Cloudflare can improve. You can read more about the methodology in the original Speed Week blog post here.

Using the RUM data, we are able to measure various performance metrics, such as TCP Connection Time, Time to First Byte (TTFB), Time to Last Byte (TTLB), for ourselves and other networks.

One of the most important tools that we use for measuring and improving our performance is what we call Performance Benchmarks Dashboard. That's the dashboard where we can analyze the data that we collect in different dimensions.

Here are the metrics that we monitor based on some of the dimensions.

The first metric we closely monitor is the percent of networks that we are ranked #1 in terms of TCP Connection Time. That's a key performance indicator that we evaluate ourselves against.

Network performance update: Speed Week 2023

The second metric we monitor is our overall performance in each country. This gives us visibility into the countries or regions that we need to pay closer attention to and take action towards improving our performance. Those actions will be listed later. Orange indicates the countries that Cloudflare is the fastest provider based on the TCP Connection Time.

Network performance update: Speed Week 2023

The third set of metrics we use are TCP Connection Time and TTLB. The number of networks where we are #1 in terms of 95th percentile TCP Connection Time is one of our key performance indicators. We actively monitor and work on improving that metric. More on that later.

Network performance update: Speed Week 2023

Using all the metrics listed above, the Performance Benchmarks Dashboard helps us to find networks where we can improve our performance. Our engineering teams monitor that dashboard and investigate the underlying reasons for degraded performance if there are any and the action items are displayed on the dashboard until they are resolved.

Network performance update: Speed Week 2023

Once we identify a particular network to improve, we investigate the root cause and document the action items to improve our performance. Those actions generally fall under three categories.

The first category is establishing peering with that network in a specific location so that users can take the optimum path. That’s a critical component of a better Internet! Here is our more detailed blog post about that from earlier this week.

The second category is expanding our compute capacity in a specific data center so that we can serve the users at the closest data center.

And finally, we apply traffic engineering actions to make sure that the network is served in the optimum way. Traffic engineering actions are generally manual configurations that we apply, in case the path that’s chosen by the routing protocols is not the most performant path.

What’s next

The data we collect gives us a granular view of every network that connects to Cloudflare and we constantly optimize our infrastructure to improve Cloudflare’s performance. We won’t rest until we’re #1 everywhere.

Spotlight on Zero Trust: We’re fastest and here’s the proof

Post Syndicated from David Tuber original http://blog.cloudflare.com/spotlight-on-zero-trust/

Spotlight on Zero Trust: We're fastest and here's the proof

Spotlight on Zero Trust: We're fastest and here's the proof

In January and in March we posted blogs outlining how Cloudflare performed against others in Zero Trust. The conclusion in both cases was that Cloudflare was faster than Zscaler and Netskope in a variety of Zero Trust scenarios. For Speed Week, we’re bringing back these tests and upping the ante: we’re testing more providers against more public Internet endpoints in more regions than we have in the past.

For these tests, we tested three Zero Trust scenarios: Secure Web Gateway (SWG), Zero Trust Network Access (ZTNA), and Remote Browser Isolation (RBI). We tested against three competitors: Zscaler, Netskope, and Palo Alto Networks. We tested these scenarios from 12 regions around the world, up from the four we’d previously tested with. The results are that Cloudflare is the fastest Secure Web Gateway in 42% of testing scenarios, the most of any provider. Cloudflare is 46% faster than Zscaler, 56% faster than Netskope, and 10% faster than Palo Alto for ZTNA, and 64% faster than Zscaler for RBI scenarios.

In this blog, we’ll provide a refresher on why performance matters, do a deep dive on how we’re faster for each scenario, and we’ll talk about how we measured performance for each product.

Performance is a threat vector

Performance in Zero Trust matters; when Zero Trust performs poorly, users disable it, opening organizations to risk. Zero Trust services should be unobtrusive when the services become noticeable they prevent users from getting their job done.

Zero Trust services may have lots of bells and whistles that help protect customers, but none of that matters if employees can’t use the services to do their job quickly and efficiently. Fast performance helps drive adoption and makes security feel transparent to the end users. At Cloudflare, we prioritize making our products fast and frictionless, and the results speak for themselves. So now let’s turn it over to the results, starting with our secure web gateway.

Cloudflare Gateway: security at the Internet

A secure web gateway needs to be fast because it acts as a funnel for all of an organization’s Internet-bound traffic. If a secure web gateway is slow, then any traffic from users out to the Internet will be slow. If traffic out to the Internet is slow, users may see web pages load slowly, video calls experience jitter or loss, or generally unable to do their jobs. Users may decide to turn off the gateway, putting the organization at risk of attack.

In addition to being close to users, a performant web gateway needs to also be well-peered with the rest of the Internet to avoid slow paths out to websites users want to access. Many websites use CDNs to accelerate their content and provide a better experience. These CDNs are often well-peered and embedded in last mile networks. But traffic through a secure web gateway follows a forward proxy path: users connect to the proxy, and the proxy connects to the websites users are trying to access. If that proxy isn’t as well-peered as the destination websites are, the user traffic could travel farther to get to the proxy than it would have needed to if it was just going to the website itself, creating a hairpin, as seen in the diagram below:

Spotlight on Zero Trust: We're fastest and here's the proof

A well-connected proxy ensures that the user traffic travels less distance making it as fast as possible.

To compare secure web gateway products, we pitted the Cloudflare Gateway and WARP client against Zscaler, Netskope, and Palo Alto which all have products that perform the same functions. Cloudflare users benefit from Gateway and Cloudflare’s network being embedded deep into last mile networks close to users, being peered with over 12,000 networks. That heightened connectivity shows because Cloudflare Gateway is the fastest network in 42% of tested scenarios:

Spotlight on Zero Trust: We're fastest and here's the proof

Number of testing scenarios where each provider is fastest for 95th percentile HTTP Response time (higher is better)
Provider Scenarios where this provider is fastest
Cloudflare 48
Zscaler 14
Netskope 10
Palo Alto Networks 42

This data shows that we are faster to more websites from more places than any of our competitors. To measure this, we look at the 95th percentile HTTP response time: how long it takes for a user to go through the proxy, have the proxy make a request to a website on the Internet, and finally return the response. This measurement is important because it’s an accurate representation of what users see. When we look at the 95th percentile across all tests, we see that Cloudflare is 2.5% faster than Palo Alto Networks, 13% faster than Zscaler, and 6.5% faster than Netskope.

95th percentile HTTP response across all tests
Provider 95th percentile response (ms)
Cloudflare 515
Zscaler 595
Netskope 550
Palo Alto Networks 529

Cloudflare wins out here because Cloudflare’s exceptional peering allows us to succeed in places where others were not able to succeed. We are able to get locally peered in hard-to-reach places on the globe, giving us an edge. For example, take a look at how Cloudflare performs against the others in Australia, where we are 30% faster than the next fastest provider:

Spotlight on Zero Trust: We're fastest and here's the proof

Cloudflare establishes great peering relationships in countries around the world: in Australia we are locally peered with all of the major Australian Internet providers, and as such we are able to provide a fast experience to many users around the world. Globally, we are peered with over 12,000 networks, getting as close to end users as we can to shorten the time requests spend on the public Internet. This work has previously allowed us to deliver content quickly to users, but in a Zero Trust world, it shortens the path users take to get to their SWG, meaning they can quickly get to the services they need.

Previously when we performed these tests, we only tested from a single Azure region to five websites. Existing testing frameworks like Catchpoint are unsuitable for this task because performance testing requires that you run the SWG client on the testing endpoint. We also needed to make sure that all of the tests are running on similar machines in the same places to measure performance as well as possible. This allows us to measure the end-to-end responses coming from the same location where both test environments are running.

In our testing configuration for this round of evaluations, we put four VMs in 12 cloud regions side by side: one running Cloudflare WARP connecting to our gateway, one running ZIA, one running Netskope, and one running Palo Alto Networks. These VMs made requests every five minutes to the 11 different websites mentioned below and logged the HTTP browser timings for how long each request took. Based on this, we are able to get a user-facing view of performance that is meaningful. Here is a full matrix of locations that we tested from, what websites we tested against, and which provider was faster:

Endpoints
SWG Regions Shopify Walmart Zendesk ServiceNow Azure Site Slack Zoom Box M365 GitHub Bitbucket
East US Cloudflare Cloudflare Palo Alto Networks Cloudflare Palo Alto Networks Cloudflare Palo Alto Networks Cloudflare
West US Palo Alto Networks Palo Alto Networks Cloudflare Cloudflare Palo Alto Networks Cloudflare Palo Alto Networks Cloudflare
South Central US Cloudflare Cloudflare Palo Alto Networks Cloudflare Palo Alto Networks Cloudflare Palo Alto Networks Cloudflare
Brazil South Cloudflare Palo Alto Networks Palo Alto Networks Palo Alto Networks Zscaler Zscaler Zscaler Palo Alto Networks Cloudflare Palo Alto Networks Palo Alto Networks
UK South Cloudflare Palo Alto Networks Palo Alto Networks Palo Alto Networks Palo Alto Networks Palo Alto Networks Palo Alto Networks Cloudflare Palo Alto Networks Palo Alto Networks Palo Alto Networks
Central India Cloudflare Cloudflare Cloudflare Palo Alto Networks Palo Alto Networks Cloudflare Cloudflare Cloudflare
Southeast Asia Cloudflare Cloudflare Cloudflare Cloudflare Palo Alto Networks Cloudflare Cloudflare Cloudflare
Canada Central Cloudflare Cloudflare Palo Alto Networks Cloudflare Cloudflare Palo Alto Networks Palo Alto Networks Palo Alto Networks Zscaler Cloudflare Zscaler
Switzerland North netskope Zscaler Zscaler Cloudflare netskope netskope netskope netskope Cloudflare Cloudflare netskope
Australia East Cloudflare Cloudflare netskope Cloudflare Cloudflare Cloudflare Cloudflare Cloudflare
UAE Dubai Zscaler Zscaler Cloudflare Cloudflare Zscaler netskope Palo Alto Networks Zscaler Zscaler netskope netskope
South Africa North Palo Alto Networks Palo Alto Networks Palo Alto Networks Zscaler Palo Alto Networks Palo Alto Networks Palo Alto Networks Palo Alto Networks Zscaler Palo Alto Networks Palo Alto Networks

Blank cells indicate that tests to that particular website did not report accurate results or experienced failures for over 50% of the testing period. Based on this data, Cloudflare is generally faster, but we’re not as fast as we’d like to be. There are still some areas where we need to improve, specifically in South Africa, UAE, and Brazil. By Birthday Week in September, we want to be the fastest to all of these websites in each of these regions, which will bring our number up from fastest in 54% of tests to fastest in 79% of tests.

To summarize, Cloudflare’s Gateway is still the fastest SWG on the Internet. But Zero Trust isn’t all about SWG. Let’s talk about how Cloudflare performs in Zero Trust Network Access scenarios.

Instant (Zero Trust) access

Access control needs to be seamless and transparent to the user: the best compliment for a Zero Trust solution is for employees to barely notice it’s there. Services like Cloudflare Access protect applications over the public Internet, allowing for role-based authentication access instead of relying on things like a VPN to restrict and secure applications. This form of access management is more secure, but with a performant ZTNA service, it can even be faster.

Cloudflare outperforms our competitors in this space, being 46% faster than Zscaler, 56% faster than Netskope, and 10% faster than Palo Alto Networks:

Spotlight on Zero Trust: We're fastest and here's the proof

Zero Trust Network Access P95 HTTP Response times
Provider P95 HTTP response (ms)
Cloudflare 1252
Zscaler 2388
Netskope 2974
Palo Alto Networks 1471

For this test, we created applications hosted in three different clouds in 12 different locations: AWS, GCP, and Azure. However, it should be noted that Palo Alto Networks was the exception, as we were only able to measure them using applications hosted in one cloud from two regions due to logistical challenges with setting up testing: US East and Singapore.

For each of these applications, we created tests from Catchpoint that accessed the application from 400 locations around the world. Each of these Catchpoint nodes attempted two actions:

  • New Session: log into an application and receive an authentication token
  • Existing Session: refresh the page and log in passing the previously obtained credentials

We like to measure these scenarios separately, because when we look at 95th percentile values, we would almost always be looking at new sessions if we combined new and existing sessions together. For the sake of completeness though, we will also show the 95th percentile latency of both new and existing sessions combined.

Cloudflare was faster in both US East and Singapore, but let’s spotlight a couple of regions to delve into. Let’s take a look at a region where resources are heavily interconnected equally across competitors: US East, specifically Ashburn, Virginia.

In Ashburn, Virginia, Cloudflare handily beats Zscaler and Netskope for ZTNA 95th percentile HTTP Response:

95th percentile HTTP Response times (ms) for applications hosted in Ashburn, VA
AWS East US Total (ms) New Sessions (ms) Existing Sessions (ms)
Cloudflare 2849 1749 1353
Zscaler 5340 2953 2491
Netskope 6513 3748 2897
Palo Alto Networks
Azure East US
Cloudflare 1692 989 1169
Zscaler 5403 2951 2412
Netskope 6601 3805 2964
Palo Alto Networks
GCP East US
Cloudflare 2811 1615 1320
Zscaler
Netskope 6694 3819 3023
Palo Alto Networks 2258 894 1464

You might notice that Palo Alto Networks looks to come out ahead of Cloudflare for existing sessions (and therefore for overall 95th percentile). But these numbers are misleading because Palo Alto Networks’ ZTNA behavior is slightly different than ours, Zscaler’s, or Netskope’s. When they perform a new session, it does a full connection intercept and returns a response from its processors instead of directing users to the login page of the application they are trying to access.

This means that Palo Alto Networks' new session response times don’t actually measure the end-to-end latency we’re looking for. Because of this, their numbers for new session latency and total session latency are misleading, meaning we can only meaningfully compare ourselves to them for existing session latency. When we look at existing sessions, when Palo Alto Networks acts as a pass-through, Cloudflare still comes out ahead by 10%.

This is true in Singapore as well, where Cloudflare is 50% faster than Zscaler and Netskope, and also 10% faster than Palo Alto Networks for Existing Sessions:

95th percentile HTTP Response times (ms) for applications hosted in Singapore
AWS Singapore Total (ms) New Sessions (ms) Existing Sessions (ms)
Cloudflare 2748 1568 1310
Zscaler 5349 3033 2500
Netskope 6402 3598 2990
Palo Alto Networks
Azure Singapore
Cloudflare 1831 1022 1181
Zscaler 5699 3037 2577
Netskope 6722 3834 3040
Palo Alto Networks
GCP Singapore
Cloudflare 2820 1641 1355
Zscaler 5499 3037 2412
Netskope 6525 3713 2992
Palo Alto Networks 2293 922 1476

One critique of this data could be that we’re aggregating the times of all Catchpoint nodes together at P95, and we’re not looking at the 95th percentile of Catchpoint nodes in the same region as the application. We looked at that, too, and Cloudflare’s ZTNA performance is still better. Looking at only North America-based Catchpoint nodes, Cloudflare performs 50% better than Netskope, 40% better than Zscaler, and 10% better than Palo Alto Networks at P95 for warm connections:

Spotlight on Zero Trust: We're fastest and here's the proof

Zero Trust Network Access 95th percentile HTTP Response times for warm connections with testing locations in North America
Provider P95 HTTP response (ms)
Cloudflare 810
Zscaler 1290
Netskope 1351
Palo Alto Networks 871

Finally, one thing we wanted to show about our ZTNA performance was how well Cloudflare performed per cloud per region. This below chart shows the matrix of cloud providers and tested regions:

Fastest ZTNA provider in each cloud provider and region by 95th percentile HTTP Response
AWS Azure GCP
Australia East Cloudflare Cloudflare Cloudflare
Brazil South Cloudflare Cloudflare N/A
Canada Central Cloudflare Cloudflare Cloudflare
Central India Cloudflare Cloudflare Cloudflare
East US Cloudflare Cloudflare Cloudflare
South Africa North Cloudflare Cloudflare N/A
South Central US N/A Cloudflare Zscaler
Southeast Asia Cloudflare Cloudflare Cloudflare
Switzerland North N/A N/A Cloudflare
UAE Dubai Cloudflare Cloudflare Cloudflare
UK South Cloudflare Cloudflare netskope
West US Cloudflare Cloudflare N/A

There were some VMs in some clouds that malfunctioned and didn’t report accurate data. But out of 30 available cloud regions where we had accurate data, Cloudflare was the fastest ZT provider in 28 of them, meaning we were faster in 93% of tested cloud regions.

To summarize, Cloudflare also provides the best experience when evaluating Zero Trust Network Access. But what about another piece of the puzzle: Remote Browser Isolation (RBI)?

Remote Browser Isolation: a secure browser hosted in the cloud

Remote browser isolation products have a very strong dependency on the public Internet: if your connection to your browser isolation product isn’t good, then your browser experience will feel weird and slow. Remote browser isolation is extraordinarily dependent on performance to feel smooth and seamless to the users: if everything is fast as it should be, then users shouldn’t even notice that they’re using browser isolation.

For this test, we’re again pitting Cloudflare against Zscaler. While Netskope does have an RBI product, we were unable to test it due to it requiring a SWG client, meaning we would be unable to get full fidelity of testing locations like we would when testing Cloudflare and Zscaler. Our tests showed that Cloudflare is 64% faster than Zscaler for remote browsing scenarios: Here’s a matrix of fastest provider per cloud per region for our RBI tests:

Fastest RBI provider in each cloud provider and region by 95th percentile HTTP Response
AWS Azure GCP
Australia East Cloudflare Cloudflare Cloudflare
Brazil South Cloudflare Cloudflare Cloudflare
Canada Central Cloudflare Cloudflare Cloudflare
Central India Cloudflare Cloudflare Cloudflare
East US Cloudflare Cloudflare Cloudflare
South Africa North Cloudflare Cloudflare
South Central US Cloudflare Cloudflare
Southeast Asia Cloudflare Cloudflare Cloudflare
Switzerland North Cloudflare Cloudflare Cloudflare
UAE Dubai Cloudflare Cloudflare Cloudflare
UK South Cloudflare Cloudflare Cloudflare
West US Cloudflare Cloudflare Cloudflare

This chart shows the results of all of the tests run against Cloudflare and Zscaler to applications hosted on three different clouds in 12 different locations from the same 400 Catchpoint nodes as the ZTNA tests. In every scenario, Cloudflare was faster. In fact, no test against a Cloudflare-protected endpoint had a 95th percentile HTTP Response of above 2105 ms, while no Zscaler-protected endpoint had a 95th percentile HTTP response of below 5000 ms.

To get this data, we leveraged the same VMs to host applications accessed through RBI services. Each Catchpoint node would attempt to log into the application through RBI, receive authentication credentials, and then try to access the page by passing the credentials. We look at the same new and existing sessions that we do for ZTNA, and Cloudflare is faster in both new sessions and existing session scenarios as well.

Gotta go fast(er)

Our Zero Trust customers want us to be fast not because they want the fastest Internet access, but because they want to know that employee productivity won’t be impacted by switching to Cloudflare. That doesn’t necessarily mean that the most important thing for us is being faster than our competitors, although we are. The most important thing for us is improving our experience so that our users feel comfortable knowing we take their experience seriously. When we put out new numbers for Birthday Week in September and we’re faster than we were before, it won’t mean that we just made the numbers go up: it means that we are constantly evaluating and improving our service to provide the best experience for our customers. We care more that our customers in UAE have an improved experience with Office365 as opposed to beating a competitor in a test. We show these numbers so that we can show you that we take performance seriously, and we’re committed to providing the best experience for you, wherever you are.

Developer Week Performance Update: Spotlight on R2

Post Syndicated from David Tuber original http://blog.cloudflare.com/r2-is-faster-than-s3/

Developer Week Performance Update: Spotlight on R2

Developer Week Performance Update: Spotlight on R2

For developers, performance is everything. If your app is slow, it will get outclassed and no one will use it. In order for your application to be fast, every underlying component and system needs to be as performant as possible. In the past, we’ve shown how our network helps make your apps faster, even in remote places. We’ve focused on how Workers provides the fastest compute, even in regions that are really far away from traditional cloud datacenters.

For Developer Week 2023, we’re going to be looking at one of the newest Cloudflare developer offerings and how it compares to an alternative when retrieving assets from buckets: R2 versus Amazon Simple Storage Service (S3). Spoiler alert: we’re faster than S3 when serving media content via public access. Our test showed that on average, Cloudflare R2 was 20-40% faster than Amazon S3. For this test, we used 95th percentile Response tests, which measures the time it takes for a user to make a request to the bucket, and get the entirety of the response. This test was designed with the goal of measuring end-user performance when accessing content in public buckets.

In this blog we’re going to talk about why your object store needs to be fast, how much faster R2 is, why that is, and how we measured it.

Storage performance is user-facing

Storage performance is critical to a snappy user experience. Storage services are used for many scenarios that directly impact the end-user experience, particularly in the case where the data stored doesn’t end up in a cache (uncacheable or dynamic content). Compute and database services can rely on storage services, so if they’re not fast, the services using them won’t be either. Even the basic content fetching scenarios that use a CDN require storage services to be fast if the asset is either uncacheable or was not cached on the request: if the storage service is slow or far away, users will be impacted by that performance. And as every developer knows, nobody remembers the nine fast API calls if the last call was slow. Users don’t care about API calls, they care about overall experience. One slow API call, one slow image, one slow anything can muck up the works and provide users with a bad experience.

Because there are lots of different ways to use storage services, we’re going to focus on a relatively simple scenario: fetching static images from these services. Let’s talk about R2 and how it compares to one of the alternatives in this scenario: Amazon S3.

Benchmarking storage performance

When looking at uncached raw file delivery for users in North America retrieving content from a bucket in Ashburn, Virginia (US-East) and examining 95th percentile Response, R2 is 38% faster than S3:

Developer Week Performance Update: Spotlight on R2

Storage Performance: Response in North America (US-East)
95th percentile (ms)
Cloudflare R2 1,262
Amazon S3 2,055

For content hosted in US-East (Ashburn, VA) and only looking at North America-based eyeballs, R2 beats S3 by 30% in response time. When we look at why this is the case, the answer lies in our closeness to users and highly optimized HTTP stacks. Let’s take a look at the TCP connect and SSL times for these tests, which are the times it takes to reach the storage bucket (TCP connect) and the time to complete TLS handshake (SSL time):

Developer Week Performance Update: Spotlight on R2

Storage Performance: Connect and SSL in North America (US-East)
95th percentile connect (ms) 95th percentile SSL (ms)
Cloudflare R2 32 59
Amazon S3 78 180

Cloudflare’s cumulative connect + SSL time is almost 1/2 of Amazon’s SSL time alone. Being able to be fast on connection establishment gives us an edge right off the bat, especially in North America where cloud and storage providers tend to optimize for performance, and connect times tend to be low because ISPs have good peering with cloud and storage providers. But this isn’t just true in North America. Let’s take a look at Europe (EMEA) and Asia (APAC), where Cloudflare also beats out AWS in 95th percentile response time when we look at eyeballs in region for both regions:

Developer Week Performance Update: Spotlight on R2

Storage Performance: Response in EMEA (EU-East)
95th percentile (ms)
Cloudflare R2 1,303
Amazon S3 1,729

Cloudflare beats Amazon by 20% in EMEA. And when you look at the SSL times, you’ll see the same trends that were present in North America: faster Connect and SSL times:

Storage Performance: Connect and SSL in EMEA (EU-East)
95th percentile connect (ms) 95th percentile SSL (ms)
Cloudflare R2 57 94
Amazon S3 80 178

Again, the separator is how optimized Cloudflare is at setting up connections to deliver content. This is also true in APAC, where objects stored in Tokyo are served about 1.5 times faster on Cloudflare than for AWS:

Developer Week Performance Update: Spotlight on R2

Storage Performance: Response in APAC (Tokyo)
95th percentile (ms)
Cloudflare R2 4,057
Amazon S3 6,850

Focus on cross-region

Up until this point, we’ve been looking at scenarios where users are accessing data that is stored in the same region as they are. But what if that isn’t the case? What if a user in Germany is accessing content in Ashburn? In those cases, Cloudflare also pulls ahead. This is a chart showing 95th percentile response times for cases where users outside the United States are accessing content hosted in Ashburn, VA, or US-East:

Developer Week Performance Update: Spotlight on R2

Storage Performance: Response for users outside of US connecting to US-East
95th percentile (ms)
Cloudflare R2 3,224
Amazon S3 6,387

Cloudflare wins again, at almost 2x faster than S3 at P95. This data shows that not only do our in-region calls win out, but we win across the world. Even if you don’t have the money to buy storage everywhere in the world, R2 can still give you world-class performance because not only is R2 faster cross-region, R2’s default no-region setup ensures your data will be close to your users as often as possible.

Testing methodology

To measure these tests, we set up over 400 Catchpoint backbone nodes embedded in last mile ISPs around the world to retrieve a 1 MB file from R2 and S3 in specific locations: Ashburn, Tokyo, and Frankfurt. We recognize that many users will store larger files than the one we tested with, and we plan to test with larger files next showing that we’re faster delivering larger files as well.

We had these 400 nodes retrieve the file uncached from each storage service every 30 minutes for four days. We configured R2 to disable caching. This allows us to make sure that we aren’t reaping any benefits from our CDN pipeline and are only retrieving uncached files from the storage services.

Finally, we had to fix where the public buckets were stored in R2 to get an equivalent test compared to S3. You may notice that when configuring R2, you aren’t able to select specific datacenter locations like you can in AWS. Instead, you can provide a location hint to a general region. Cloudflare will store data anywhere in that region.

Developer Week Performance Update: Spotlight on R2

This feature is designed to make it easier for developers to deploy storage that benefits larger ranges of users as opposed to needing to know where specific datacenters are. However, that makes performance comparisons difficult, so for this test we configured R2 to store data in those specific locations (consistent with the S3 placement) on the backend as opposed to in any location in that region to ensure we would get better apples-to-apples results.

Putting the pieces together

Storage services like R2 are only part of the equation. Developers will often use storage services in conjunction with other compute services for a complete end-to-end application experience. Previously, we performed comparisons of Workers and other compute products such as Fastly’s Compute@Edge and AWS’s Lambda@Edge. We’ve rerun the numbers, and for compute times, Workers is still the fastest compute around, beating AWS Lambda@Edge and Fastly’s Compute@Edge for end-to-end performance for Rust hard tests:

Developer Week Performance Update: Spotlight on R2

Cloudflare is faster than Fastly for both JavaScript and Rust tests, while also being faster than AWS at JavaScript, which is the only test Lambda@Edge supports

To run these tests, we perform two tests against each provider: a complex JavaScript function, and a complex Rust function. These tests run as part of our network benchmarking tests that run from real user browsers around the world. For a more in-depth look at how we collect this data for Workers scenarios, check our previous Developer Week posts.

Here are the functions for both complex functions in JavaScript and Rust:

JavaScript complex function:

function testHardBusyLoop() {
  let value = 0;
  let offset = Date.now();

  for (let n = 0; n < 15000; n++) {
    value += Math.floor(Math.abs(Math.sin(offset + n)) * 10);
  }

  return value;
}

Rust complex function:

fn test_hard_busy_loop() -> i32 {
  let mut value = 0;
  let offset = Date::now().as_millis();

  for n in 0..15000 {
    value += (((offset + n) as f64).sin().abs() * 10.0) as i32;
  }

  value
}

By combining Workers and R2, you get a much simpler developer experience and a much faster user experience than you would get with any of the competition.

Storage, sped up and simplified

R2 is a unique storage service that doesn’t require the knowledge of specific locations, has a more global footprint, and integrates easily with existing Cloudflare Developer Platform products for a simple, performant experience for both users and developers. However, because it’s built on top of Cloudflare, it comes with performance baked in, and that’s evidenced by R2 being faster than its primary alternatives.

At Cloudflare, we believe that developers shouldn’t have to think about performance, you have so many other things to think about. By choosing Cloudflare, you should be able to rest easy knowing that your application will be faster because it’s built on Cloudflare, not because you’re manipulating Cloudflare to be faster for you. And by using R2 and the rest of our developer platform, we’re happy to say that we’re delivering on our vision to make performance easy for you.

Cloudflare Access is the fastest Zero Trust proxy

Post Syndicated from David Tuber original https://blog.cloudflare.com/network-performance-update-security-week-2023/

Cloudflare Access is the fastest Zero Trust proxy

Cloudflare Access is the fastest Zero Trust proxy

During every Innovation Week, Cloudflare looks at our network’s performance versus our competitors. In past weeks, we’ve focused on how much faster we are compared to reverse proxies like Akamai, or platforms that sell serverless compute that compares to our Supercloud, like Fastly and AWS. This week, we’d like to provide an update on how we compare to other reverse proxies as well as an update to our application services security product comparison against Zscaler and Netskope. This product is part of our Zero Trust platform, which helps secure applications and Internet experiences out to the public Internet, as opposed to our reverse proxy which protects your websites from outside users.

In addition to our previous post showing how our Zero Trust platform compared against Zscaler, we also have previously shared extensive network benchmarking results for reverse proxies from 3,000 last mile networks around the world. It’s been a while since we’ve shown you our progress towards being #1 in every last mile network. We want to show that data as well as revisiting our series of tests comparing Cloudflare Access to Zscaler Private Access and Netskope Private Access. For our overall network tests, Cloudflare is #1 in 47% of the top 3,000 most reported networks. For our application security tests, Cloudflare is 50% faster than Zscaler and 75% faster than Netskope.

In this blog we’re going to talk about why performance matters for our products, do a deep dive on what we’re measuring to show that we’re faster, and we’ll talk about how we measured performance for each product.

Why does performance matter?

We talked about it in our last blog, but performance matters because it impacts your employees’ experience and their ability to get their job done. Whether it’s accessing services through access control products, connecting out to the public Internet through a Secure Web Gateway, or securing risky external sites through Remote Browser Isolation, all of these experiences need to be frictionless.

A quick summary: say Bob at Acme Corporation is connecting from Johannesburg out to Slack or Zoom to get some work done. If Acme’s Secure Web Gateway is located far away from Bob in London, then Bob’s traffic may go out of Johannesburg to London, and then back into Johannesburg to reach his email. If Bob tries to do something like a voice call on Slack or Zoom, his performance may be painfully slow as he waits for his emails to send and receive. Zoom and Slack both recommend low latency for optimal performance. That extra hop Bob has to take through his gateway could decrease throughput and increase his latency, giving Bob a bad experience.

As we’ve discussed before, if these products or experiences are slow, then something worse might happen than your users complaining: they may find ways to turn off the products or bypass them, which puts your company at risk. A Zero Trust product suite is completely ineffective if no one is using it because it’s slow. Ensuring Zero Trust is fast is critical to the effectiveness of a Zero Trust solution: employees won’t want to turn it off and put themselves at risk if they barely know it’s there at all.

Much like Zscaler, Netskope may outperform many older, antiquated solutions, but their network still fails to measure up to a highly performant, optimized network like Cloudflare’s. We’ve tested all of our Zero Trust products against Netskope equivalents, and we’re even bringing back Zscaler to show you how Zscaler compares against them as well. So let’s dig into the data and show you how and why we’re faster in a critical Zero Trust scenario, comparing Cloudflare Access to Zscaler Private Access and Netskope Private Access.

Cloudflare Access: the fastest Zero Trust proxy

Access control needs to be seamless and transparent to the user: the best compliment for a Zero Trust solution is employees barely notice it’s there. These services allow users to cache authentication information on the provider network, ensuring applications can be accessed securely and quickly to give users that seamless experience they want. So having a network that minimizes the number of logins required while also reducing the latency of your application requests will help keep your Internet experience snappy and reactive.

Cloudflare Access does all that 75% faster than Netskope and 50% faster than Zscaler, ensuring that no matter where you are in the world, you’ll get a fast, secure application experience:

Cloudflare Access is the fastest Zero Trust proxy

Cloudflare measured application access across ourselves, Zscaler and Netskope from 300 different locations around the world connecting to 6 distinct application servers in Hong Kong, Toronto, Johannesburg, São Paulo, Phoenix, and Switzerland. In each of these locations, Cloudflare’s P95 response time was faster than Zscaler and Netskope. Let’s take a look at the data when the application is hosted in Toronto, an area where Zscaler and Netskope should do well as it’s in a heavily interconnected region: North America.

Cloudflare Access is the fastest Zero Trust proxy

ZT Access – Response time (95th Percentile) – Toronto
95th Percentile Response (ms)
Cloudflare 2,182
Zscaler 4,071
Netskope 6,072

Cloudflare really stands out in regions with more diverse connectivity options like South America or Asia Pacific, where Zscaler compares better to Netskope than it does Cloudflare:

Cloudflare Access is the fastest Zero Trust proxy

When we look at application servers hosted locally in South America, Cloudflare stands out:

ZT Access – Response time (95th Percentile) – South America
95th Percentile Response (ms)
Cloudflare 2,961
Zscaler 9,271
Netskope 8,223

Cloudflare’s network shines here, allowing us to ingress connections close to the users. You can see this by looking at the Connect times in South America:

ZT Access – Connect time (95th Percentile) – South America
95th Percentile Connect (ms)
Cloudflare 369
Zscaler 1,753
Netskope 1,160

Cloudflare’s network sets us apart here because we’re able to get users onto our network faster and find the optimal routes around the world back to the application host. We’re twice as fast as Zscaler and three times faster than Netskope because of this superpower. Across all the different tests, Cloudflare’s Connect times is consistently faster across all 300 testing nodes.

Cloudflare Access is the fastest Zero Trust proxy

In our last blog, we looked at two distinct scenarios that need to be measured individually when we compared Cloudflare and Zscaler. The first scenario is when a user logs into their application and has to authenticate. In this case, the Zero Trust Access service will direct the user to a login page, the user will authenticate, and then be redirected to their application.

This is called a new session, because no authentication information is cached or exists on the Access network. The second scenario is called an existing session, when a user has already been authenticated and that authentication information can be cached. This scenario is usually much faster, because it doesn’t require an extra call to an identity provider to complete.

We like to measure these scenarios separately, because when we look at 95th percentile values, we would almost always be looking at new sessions if we combined new and existing sessions together. But across both scenarios, Cloudflare is consistently faster in every region. Let’s go back and look at an application hosted in Toronto, where users connecting to us connect faster than Zscaler and Netskope for both new and existing sessions.

ZT Access – Response Time (95th Percentile) – Toronto
New Sessions (ms) Existing Sessions (ms)
Cloudflare 1,276 1,022
Zscaler 2,415 1,797
Netskope 5,741 1,822

You can see that new sessions are generally slower as expected, but Cloudflare’s network and optimized software stack provides a consistently fast user experience. In scenarios where end-to-end connectivity can be more challenging, Cloudflare stands out even more. Let’s take a look at users in Asia connecting through to an application in Hong Kong.

ZT Access – Response Time (95th Percentile) – Hong Kong
New Sessions (ms) Existing Sessions (ms)
Cloudflare 2,582 2,075
Zscaler 4,956 3,617
Netskope 5,139 3,902

One interesting thing that stands out here is that while Cloudflare’s network is hyper-optimized for performance, Zscaler more closely compares to Netskope on performance than they do to Cloudflare. Netskope also performs poorly on new sessions, which indicates that their service does not react well when users are establishing new sessions.

We like to separate these new and existing sessions because it’s important to look at similar request paths to do a proper comparison. For example, if we’re comparing a request via Zscaler on an existing session and a request via Cloudflare on a new session, we could see that Cloudflare was much slower than Zscaler because of the need to authenticate. So when we contracted a third party to design these tests, we made sure that they took that into account.

For these tests, Cloudflare configured five application instances hosted in Toronto, Los Angeles, Sao Paulo, and Hong Kong. Cloudflare then used 300 different Catchpoint nodes from around the world to mimic a browser login as follows:

  • User connects to the application from a browser mimicked by a Catchpoint instance – new session
  • User authenticates against their identity provider
  • User accesses resource
  • User refreshes the browser page and tries to access the same resource but with credentials already present – existing session

This allows us to look at Cloudflare versus all the other products for application performance for both new and existing sessions, and we’ve shown that we’re faster. As we’ve mentioned, a lot of that is due to our network and how we get close to our users. So now we’re going to talk about how we compare to other large networks and how we get close to you.

Network effects make the user experience better

Getting closer to users improves the last mile Round Trip Time (RTT). As we discussed in the Access comparison, having a low RTT improves customer performance because new and existing sessions don’t have to travel very far to get to Cloudflare’s Zero Trust network. Embedding ourselves in these last mile networks helps us get closer to our users, which doesn’t just help Zero Trust performance, it helps web performance and developer performance, as we’ve discussed in prior blogs.

To quantify network performance, we have to get enough data from around the world, across all manner of different networks, comparing ourselves with other providers. We used Real User Measurements (RUM) to fetch a 100kb file from several different providers. Users around the world report the performance of different providers. The more users who report the data, the higher fidelity the signal is. The goal is to provide an accurate picture of where different providers are faster, and more importantly, where Cloudflare can improve. You can read more about the methodology in the original Speed Week 2021 blog post here.

We are constantly going through the process of figuring out why we were slow — and then improving. The challenges we faced were unique to each network and highlighted a variety of different issues that are prevalent on the Internet. We’re going to provide an overview of some of the efforts we use to improve our performance for our users.

But before we do, here are the results of our efforts since Developer Week 2022, the last time we showed off these numbers. Out of the top 3,000 networks in the world (by number of IPv4 addresses advertised), here’s a breakdown of the number of networks where each provider is number one in p95 TCP Connection Time, which represents the time it takes for a user on a given network to connect to the provider:

Cloudflare Access is the fastest Zero Trust proxy

Here’s what those numbers look like as of this week, Security Week 2023:

Cloudflare Access is the fastest Zero Trust proxy

As you can see, Cloudflare has extended its lead in being faster in more networks, while other networks that previously were faster like Akamai and Fastly lost their lead. This translates to the effects we see on the World Map. Here’s what that world map looked like in Developer Week 2022:

Cloudflare Access is the fastest Zero Trust proxy

Here’s how that world map looks today during Security Week 2023:

Cloudflare Access is the fastest Zero Trust proxy

As you can see, Cloudflare has gotten faster in Brazil, many countries in Africa including South Africa, Ethiopia, and Nigeria, as well as Indonesia in Asia, and Norway, Sweden, and the UK in Europe.

A lot of these countries benefited from the Edge Partner Program that we discussed in the Impact Week blog. A quick refresher: the Edge Partner Program encourages last mile ISPs to partner with Cloudflare to deploy Cloudflare locations that are embedded in the last mile ISP. This improves the last mile RTT and improves performance for things like Access. Since we last showed you this map, Cloudflare has deployed more partner locations in places like Nigeria, and Saudi Arabia, which have improved performance for users in all scenarios. Efforts like the Edge Partner Program help improve not just the Zero Trust scenarios like we described above, but also the general web browsing experience for end users who use websites protected by Cloudflare.

Next-generation performance in a Zero Trust world

In a non-Zero Trust world, you and your IT teams were the network operator — which gave you the ability to control performance. While this control was comforting, it was also a huge burden on your IT teams who had to manage middle mile connections between offices and resources. But in a Zero Trust world, your network is now… well, it’s the public Internet. This means less work for your teams — but a lot more responsibility on your Zero Trust provider, which has to manage performance for every single one of your users. The better your Zero Trust provider is at improving end-to-end performance, the better an experience your users will have and the less risk you expose yourself to. For real-time applications like authentication and secure web gateways, having a snappy user experience is critical.

A Zero Trust provider needs to not only secure your users on the public Internet, but it also needs to optimize the public Internet to make sure that your users continuously stay protected. Moving to Zero Trust doesn’t just reduce the need for corporate networks, it also allows user traffic to flow to resources more naturally. However, given your Zero Trust provider is going to be the gatekeeper for all your users and all your applications, performance is a critical aspect to evaluate to reduce friction for your users and reduce the likelihood that users will complain, be less productive, or turn the solutions off. Cloudflare is constantly improving our network to ensure that users always have the best experience, through programs like the Edge Partner Program and constantly improving our peering and interconnectivity. It’s this tireless effort that makes us the fastest Zero Trust provider.

Network Performance Update: Developer Week 2022

Post Syndicated from David Tuber original https://blog.cloudflare.com/network-performance-update-developer-week/

Network Performance Update: Developer Week 2022

Network Performance Update: Developer Week 2022

Cloudflare is building the fastest network in the world. But we don’t want you to just take our word for it. To demonstrate it, we are continuously testing ourselves versus everyone else to make sure we’re the fastest. Since it’s Developer Week, we wanted to provide an update on how our Workers products perform against the competition, as well as our overall network performance.

Earlier this year, we compared ourselves to Fastly’s Compute@Edge and overall we were faster. This time, not only did we repeat the tests, but we also added AWS Lambda@Edge to help show how we stack up against more and more competitors. The summary: we offer the fastest developer platform on the market. Let’s talk about how we build our network to help make you faster, and then we’ll get into how that translates to our developer platform.

Latest update on network performance

We have two updates on data: a general network performance update, and then data on how Workers compares with Compute@Edge and Lambda@Edge.

To quantify global network performance, we have to get enough data from around the world, across all manner of different networks, comparing ourselves with other providers. We used Real User Measurements (RUM) to fetch a 100kB file from different providers. Users around the world report the performance of different providers. The more users who report the data, the higher fidelity the signal is. The goal is to provide an accurate picture of where different providers are faster, and more importantly, where Cloudflare can improve. You can read more about the methodology in the original Speed Week blog post here.

During Cloudflare One Week (June 2022), we shared that we were faster in more of the most reported networks than our competitors. Out of the top 3,000 networks in the world (by number of IPv4 addresses advertised), here’s a breakdown of the number of networks where each provider is number one in p95 TCP Connection Time, which represents the time it takes for a user on a given network to connect to the provider. This data is from Cloudflare One Week (June 2022):

Network Performance Update: Developer Week 2022

Here is what the distribution looks like for the top 3,000 networks for Developer Week (November 2022):

Network Performance Update: Developer Week 2022

In addition to being the fastest across popular networks, Cloudflare is also committed to being the fastest provider in every country.

Using data on the top 3,000 networks from Cloudflare One Week (June 2022), here’s what the world map looks like (Cloudflare is in orange):

Network Performance Update: Developer Week 2022

And here’s what the world looks like while looking at the top 3,000 networks for Developer Week (November 2022):

Network Performance Update: Developer Week 2022

Cloudflare became #1 in more countries in Europe and Asia, specifically Russia, Ukraine, Kazakhstan, India, and China, further delivering on our mission to be the fastest network in the world. So let’s talk about how that network helps power the Supercloud to be the fastest developer platform around.

How we’re comparing developer platforms

It’s been six months since we published our initial tests, but here’s a quick refresher. We make comparisons by measuring time to connect to the network, time spent completing requests, and overall time to respond. We call these numbers connect, wait, and response. We’ve chosen these numbers because they are critical components of a request that need to be as fast as possible in order for users to see a good experience. We can reduce the connect times by peering as close as possible to the users. We can reduce the wait times by optimizing code execution to be as fast as possible. If we optimize those two processes, we’ve optimized the response, which represents the end-to-end latency of a request.

Test methodology

To measure connect, wait, and response, we perform three tests against each provider: a simple no-op JavaScript function, a complex JavaScript function, and a complex Rust function.  We don’t do a simple Rust function because we expect it to take almost no time at all, and we already have a baseline for end-to-end functionality in the no-op JavaScript function since many providers will often compile both down to WebAssembly.

Here are the functions for each of them:

JavaScript no-op:

async function getErrorResponse(event, message, status) {
  return new Response(message, {status: status, headers: {'Content-Type': 'text/plain'}});
}

JavaScript hard function:

function testHardBusyLoop() {
  let value = 0;
  let offset = Date.now();

  for (let n = 0; n < 15000; n++) {
    value += Math.floor(Math.abs(Math.sin(offset + n)) * 10);
  }

  return value;
}

Rust hard function:

fn test_hard_busy_loop() -> i32 {
  let mut value = 0;
  let offset = Date::now().as_millis();

  for n in 0..15000 {
    value += (((offset + n) as f64).sin().abs() * 10.0) as i32;
  }

  value
}

We’re trying to test how good each platform is at optimizing compute in addition to evaluating how close each platform is to end-users. However, for this test, we did not run a Rust test on Lambda@Edge because it did not natively support our Rust function without uploading a WASM binary that you compile yourself. Since Lambda@Edge does not have a true first-class developer platform and tooling to run Rust, we decided to exclude the Rust scenarios for Lambda@Edge. So when we compare numbers for Lambda@Edge, it will only be for the JavaScript simple and JavaScript hard tests.

Measuring Workers performance from real users

To collect data, we use two different methods: one from a third party service called Catchpoint, and a second from our own network performance benchmarking tests. First, we used Catchpoint to gather a set of data from synthetic probes. Catchpoint is an industry standard “synthetic” testing tool, and measurements collected from real users distributed around the world. Catchpoint is a monitoring platform that has around 2,000 total endpoints distributed around the world that can be configured to fetch specific resources and time for each test. Catchpoint is useful for network providers like us as it provides a consistent, repeatable way to measure end-to-end performance of a workload, and delivers a best-effort approximation for what a user sees.

Catchpoint has backbone nodes that are embedded in ISPs around the world. That means that these nodes are plugged into ISP routers just like you are, and the traffic goes through the ISP network to each endpoint they are monitoring. These can approximate a real user, but they will never truly replicate a real user. For example, the bandwidth for these nodes is 100% dedicated for platform monitoring, as opposed to your home Internet connection, where your Internet experience will be a mixed bag of different use cases, some of which won’t talk to Workers applications at all.

For this new test, we chose 300 backbone nodes that are embedded in last mile ISPs around the world. We filtered out nodes in cloud providers, or in metro areas with multiple transit options, trying to remove duplicate paths as much as possible.

We cross-checked these tests with our own data set, which is collected from users connecting to free websites when they are served 1xxx error pages, just like how we collect data for global network performance. When a user sees this error page, that page that will execute these tests as a part of rendering and upload performance metrics on these calls to Cloudflare.

We also changed our test methodology to use paid accounts for Fastly, Cloudflare, and AWS.

Workers vs Compute@Edge vs Lambda@Edge

This time, let’s start off with the response times to show how we’re doing end-to-end:

Network Performance Update: Developer Week 2022

Test 95th percentile response (ms)
Cloudflare JavaScript no-op 479
Fastly JavaScript no-op 634
AWS JavaScript no-op 1,400
Cloudflare JavaScript hard 471
Fastly JavaScript hard 683
AWS JavaScript hard 1,411
Cloudflare Rust hard 472
Fastly Rust hard 638

We’re fastest in all cases. Now let’s look at connect times, which show us how fast users connect to the compute platform before doing any actual compute:

Network Performance Update: Developer Week 2022

Test 95th percentile connect (ms)
Cloudflare JavaScript no-op 82
Fastly JavaScript no-op 94
AWS JavaScript no-op 295
Cloudflare JavaScript hard 82
Fastly JavaScript hard 94
AWS JavaScript hard 297
Cloudflare Rust hard 79
Fastly Rust hard 94

Note that we don’t expect these times to differ based on the code being run, but we extract them from the same set of tests, so we’ve broken them out here.

But what about wait times? Remember, wait times represent time spent computing the request, so who has optimized their platform best? Again, it’s Cloudflare, although Fastly still has a slight edge on the hard Rust test (which we plan to beat by further optimization):

Network Performance Update: Developer Week 2022

Test 95th percentile wait (ms)
Cloudflare JavaScript no-op 110
Fastly JavaScript no-op 122
AWS JavaScript no-op 362
Cloudflare JavaScript hard 115
Fastly JavaScript hard 178
AWS JavaScript hard 367
Cloudflare Rust hard 125
Fastly Rust hard 122

To verify these results, we compared the Catchpoint results to our own data set. Here is the p95 TTFB for the JavaScript and Rust hard loops for Fastly, AWS, and Cloudflare from our data:

Network Performance Update: Developer Week 2022

Cloudflare is faster on JavaScript and Rust calls. These numbers also back up the slight compute advantage for Fastly on Rust calls.

The big takeaway from this is that in addition to Cloudflare being faster for the time spent processing requests in nearly every test, Cloudflare’s network and performance optimizations as a whole set us apart and make our Workers platform even faster for everything. And, of course, we plan to keep it that way.

Your application, but faster

Latency is an important component of the user experience, and for developers, being able to ensure their users can do things as fast as possible is critical for the success of an application. Whether you’re building applications in Workers, D1, and R2, hosting your documentation in Pages, or even leveraging Workers as part of your SaaS platform, having your code run in the SuperCloud that is our global network will ensure that your users see the best experience they possibly can.

Our network is hyper-optimized to make your code as fast as possible. By using Cloudflare’s network to run your applications, you can focus on making the best possible application possible and rest easy knowing that Cloudflare is providing you the best user experience possible. This is because Cloudflare’s developer platform is built on top of the world’s fastest network. So go out and build your dreams, and know that we’ll make your dreams as fast as they can possibly be.

Network performance update: Cloudflare One Week June 2022

Post Syndicated from David Tuber original https://blog.cloudflare.com/network-performance-update-cloudflare-one-week-june-2022/

Network performance update: Cloudflare One Week June 2022

Network performance update: Cloudflare One Week June 2022

In September 2021, we shared extensive benchmarking results of 1,000 networks all around the world. The results showed that on a range of tests (TCP connection time, time to first byte, time to last byte), and on different measures (p95, mean), Cloudflare was the fastest provider in 49% of the top 1,000 networks around the world.

Since then, we’ve expanded our testing to cover not just 1,000 but 3,000 networks, and we’ve worked to continuously improve performance, with the ultimate goal of being the fastest everywhere and an intermediate goal to grow the number of networks where we’re the fastest by at least 10% every Innovation Week. We met that goal Platform Week May 2022), and we’re carrying the work over to Cloudflare One Week (June 2022).

We’re excited to share that Cloudflare has the fastest provider in 1,290 of the top 3,000 most reported networks, up from 1,280 even one month ago during Platform Week.

Measuring what matters

To quantify global network performance, we have to get enough data from around the world, across all manner of different networks, comparing ourselves with other providers. We use Real User Measurements (RUM) to fetch a 100kB file from different providers. Users around the world report the performance of different providers.

The more users who report the data, the higher fidelity the signal is. The goal is to provide an accurate picture of where different providers are faster, and more importantly, where Cloudflare can improve. You can read more about the methodology in the original Speed Week blog post here.

Latest data on network performance

Here’s how the breakdown of the fastest networks looked during Platform Week (May 2022):

Network performance update: Cloudflare One Week June 2022

Here’s how that graph looks now during Cloudflare One Week (June 2022):

Network performance update: Cloudflare One Week June 2022

In addition to being the fastest across popular networks, Cloudflare is also committed to being the fastest provider in every country.

Here’s how the map of the fastest provider by country looked during Platform Week (May 2022):

Network performance update: Cloudflare One Week June 2022

And here’s how that map looks during Cloudflare One Week (June 2022):

Network performance update: Cloudflare One Week June 2022

Cloudflare became faster in more Eastern European countries during this time specifically.

Network performance in a Zero Trust world

A zero trust provider needs to not only secure your users on the public Internet, but it also needs to optimize the public Internet. Moving to Zero Trust doesn’t just reduce the need for corporate networks, it also allows user traffic to flow to resources more naturally.

However, given your Zero Trust provider is going to be the gatekeeper for all your users and all your applications, performance is a critical aspect to evaluate Cloudflare is constantly improving our network to ensure that users always have the best experience, and this comes not just from routing fixes, but also through expanding peering arrangements and adding new locations.

This tireless effort helps make us faster in more networks than anyone else, and allows us to deliver all of our services with high performance that customers expect. We know many organizations are just starting their Zero Trust journey, and that a priority of that project is to improve user experience, and we’re excited to keep obsessing over the performance in our network to make sure your teams have a seamless experience in any location.

Interested in learning more about how our Zero Trust products benefit from these improvements? Check out the full roster of our announcements from Cloudflare One Week.

Network performance update: Platform Week

Post Syndicated from David Tuber original https://blog.cloudflare.com/network-performance-update-platform-week/

Network performance update: Platform Week

Network performance update: Platform Week

In September 2021, we shared extensive benchmarking results of 1,000 networks all around the world. The results showed that on a range of tests (TCP connection time, time to first byte, time to last byte), and on different measures (p95, mean), Cloudflare was the fastest provider in 49% of networks around the world. Since then, we’ve worked to continuously improve performance, with the ultimate goal of being the fastest everywhere and an intermediate goal to grow the number of networks where we’re the fastest by at least 10% every Innovation Week. We met that goal during Security Week (March 2022), and we’re carrying the work over to Platform Week (May 2022).

We’re excited to update you on the latest results, but before we do: after running with this benchmark for nine months, we’ve also been looking for ways to improve the benchmark itself — to make it even more representative of speeds in the real world. To that end, we’re expanding our measured networks from 1,000 to 3,000, to give an even more accurate sense of real world performance across the globe.

In terms of results: using the old benchmark of 1,000 networks, we’re the fastest in 69% of them. In this new expanded view of 3,000 networks, we’re the fastest in 42% of them. We’ve demonstrated a consistent ability to improve our performance against what we measure, and we’re excited to optimize our performance and lift our ranking in these smaller networks all around the world.

In addition to sharing a general update on where our network performance stands, we’re also sharing updated performance metrics on our Workers platform (given that it’s Platform Week!). We’ve done an extensive benchmark of Cloudflare Workers vs Fastly’s Compute@Edge.

We’ve got the results below, but before we get to that, we want to spend a bit of time on our revamped measurements.

Revamped measurements

A few months ago, we discussed the performance of Cloudflare Workers, as compared to other similar offerings out there. We compared our performance to Fastly’s Compute@Edge, showing that Workers was significantly faster.

After we published our results, there were questions and suggestions on how to improve our testing methodology (including sharing more detail about where and how we ran the tests). As we re-ran tests for this iteration, we made some small changes, and also worked to address the suggestions from the community. Let’s talk about what’s changed and why.

Measuring what matters

To quantify global network performance, we have to get enough data from around the world, across all manner of different networks, comparing ourselves with other providers. We used Real User Measurements (RUM) to fetch a 100kB file from different providers. Users around the world report the performance of different providers. The more users who report the data, the higher fidelity the signal is. The goal is to provide an accurate picture of where different providers are faster, and more importantly, where Cloudflare can improve. You can read more about the methodology in the original Speed Week blog post here.

In the process of quantifying network performance, it became clear where we needed to expand our scope in measuring performance. After Security Week, we were fastest in 71% of networks, and we decided that we wanted to expand the pool of networks from the top 1,000 most reported networks to the top 3,000 most reported networks.

We’ve shown the graph detailing the number of networks where Cloudflare is #1 in network performance, but from here on out, we will only be showing this graph in percentages of networks where we are number one out of 100%, since the denominator will be changing going forward as we set ourselves harder and harder challenges.

Benchmarking for everyone

At the time we published our earlier set of benchmarks, we had an unnecessary “no benchmarking” clause in our Terms of Service. It has since been removed. It’s been a while since we’ve worried about such things, and the clause lived past its intended life in our ToS.

We’ve done the work to show where we are the fastest provider, and it’s important that everyone else be able to validate that independently. We’re also confident that well run benchmarks will only help further improve performance for Workers, other Cloudflare products, and the Internet as a whole. Game on.

Measuring Workers performance from real users

To run our tests, we used Catchpoint, an industry standard “synthetic” testing tool, and measurements collected from real users distributed around the world. Catchpoint is a monitoring platform that has around 2,000 total endpoints distributed around the world that can be configured to fetch specific resources and time each test. Catchpoint is useful for network providers like us as it provides a consistent, repeatable way to measure end-to-end performance of a workload, and delivers a best-effort approximation for what a user sees.

Catchpoint has a series of backbone nodes that are embedded in ISPs around the world. That means that these nodes are plugged into ISP routers just like you are, and the traffic goes through the ISP network to each endpoint they are monitoring. These can approximate a real user, but they will never truly approximate a real user. For example, the bandwidth for these nodes is 100% dedicated for platform monitoring, as opposed to your home Internet connection, where your Internet experience will be a mixed bag of different use cases, some of which won’t talk to Workers applications at all.

For this new test, we chose 300 backbone nodes that are embedded in last mile ISPs around the world. We filtered out nodes in cloud providers, or in metro areas with multiple transit options, trying to remove duplicate paths as much as possible.

We cross-checked these tests with our own data set, which is collected from users connecting to free websites when they are served 1xxx error pages, similar to how we collect data for global network performance. When a user sees this error page, that page will contain a call that will execute these tests and upload performance metrics on these calls to Cloudflare. Users would run these calls independently of the error page, ensuring that Cloudflare did not get a head start in these tests.

We also changed our test methodology to use paid accounts for both Fastly and Cloudflare.

Changing the numbers we look at

For this new test, we decided to look at Wait time in addition to Time to First Byte. Time to First Byte is the time it takes for a web server to establish a connection, fetch the content, and deliver the first byte of a response to a user, and Wait time is the time the client spends waiting for the server to send back that first byte after the connection is established. We are using this particular measurement to look at the actual time it takes for the server to compute the request on the machine. Wait time is a subcomponent of TTFB, and contains the machine processing time, the time it takes the server to give the response to the socket to be sent back to the user, and the time it takes the response to reach the user. There could be latency in passing the request to the socket on the machine, but we measured the actual time it took the server to send the request for all tests and found it to be zero:

Network performance update: Platform Week

So we would calculate the wait time as the amount of time spent on the box plus the time it took the server to send the packet over the wire to the client.

However, others have noted that using Time to First Byte to measure performance of serverless computing solutions could potentially be misleading because user performance measurements can be influenced by more than just the time spent computing functions on the machine. For example, things like the time to connect to the server, DNS resolution, and cache times can impact Time to First Byte. Time to connect to the server (connect) can be impacted by how well peered a network is (or how poorly). DNS resolution can be impacted by client behavior, local DNS behavior, or by the performance of the provider’s DNS. Cache times can be driven by the performance of the cache on the server itself.

We agree that it is difficult to tease out computing time from Time to First Byte. To look at that particular aspect of serverless computing, we look at Wait, which is a value that contains the least amount of variables: we aren’t caching anything during our tests, DNS isn’t part of the wait time, and the only thing impacting Wait aside from the time spent on the machine is the Connect time.

However, since we’re trying to measure the end user experience and not just the amount of time spent on a server, we want to use both the Wait and TTFB so that we can show the time spent both on the server and how that impacts end-to-end performance.

What’s in the test?

For our test this time around, we decided to measure three things: a simple JavaScript function like last time, a complex JavaScript function, and a complex Rust function. Here are the functions for each of them:

async function getErrorResponse(event, message, status) {
  return new Response(message, {status: status, headers: {'Content-Type': 'text/plain'}});
}

function testHardBusyLoop() {
  let value = 0;
  let offset = Date.now();

  for (let n = 0; n < 15000; n++) {
    value += Math.floor(Math.abs(Math.sin(offset + n)) * 10);
  }

  return value;
}

fn test_hard_busy_loop() -> i32 {
  let mut value = 0;
  let offset = Date::now().as_millis();

  for n in 0..15000 {
    value += (((offset + n) as f64).sin().abs() * 10.0) as i32;
  }

  value
}

The goals of each of these tests is simple: test the ability of Workers and Compute@Edge to perform compute actions.

Latest update on network performance

We have two updates on data: a general network performance update, and then data on how Workers compares with Compute@Edge.

At Security Week (March 2022), we shared that we were faster in more of the most reported networks than our competitors. Out of the top 1,000 networks in the world (by number of IPv4 addresses advertised), here’s a breakdown of how many providers are number one in p95 TCP Connection Time, which represents the time it takes for a user to connect to the provider. This data is from Security Week (March 2022):

Network performance update: Platform Week

Recognizing that we are now looking at different numbers of networks, here is what the distribution looks like for the top 3,000 networks for Platform Week (May 2022):

Network performance update: Platform Week

In addition to being the fastest across popular networks, Cloudflare is also committed to being the fastest provider in every country.

Using data on the top 1,000 networks from Full Stack Week (March 2022), here’s what the world map looks like:

Network performance update: Platform Week

And here’s what the world looks like while looking at the top 3,000 networks:

Network performance update: Platform Week

Cloudflare became #1 in more countries in Africa and Europe, some of which previously did not have enough samples to officially be counted to one provider or another.

Workers vs Compute@Edge

Moving on from general network results, what does performance look like when comparing serverless compute products across providers — in this case, Cloudflare Workers performance to Compute@Edge? Looking at the Catchpoint data, the first thing we noticed was Cloudflare is faster than Fastly in all tests at the 95th percentile for Time to First Byte:

Network performance update: Platform Week

Test 95th percentile TTFB (ms)
Cloudflare JS no-op 469
Fastly JS no-op 596
Cloudflare JS hard 481
Fastly JS hard 631
Cloudflare Rust hard 493
Fastly Rust hard 577

Cloudflare is significantly faster at all tests compared to Fastly. But let’s dig into why that is. Looking at p95 Wait, we can see that Cloudflare does have an edge in most tests related to compute on-box:

Network performance update: Platform Week

Test 95th Percentile Wait (ms)
Cloudflare JS no-op 123
Fastly JS no-op 123
Cloudflare JS hard 136
Fastly JS hard 170
Cloudflare Rust hard 160
Fastly Rust hard 121

Looking at the Wait times, you can see that Cloudflare does have a significant edge in on-box performance, except in Rust, where Fastly claims their workloads are the most optimized. This data backs up that claim. But why is Fastly so much slower on Time to First Byte? The answer lies in the rest of the request. While latency spent in compute matters, it only matters in conjunction with the rest of the network performance. And Cloudflare has an advantage over Fastly in Connect times and SSL establishment times:

Network performance update: Platform Week

Test 95th Percentile Connect (ms) 95th Percentile SSL (ms)
Cloudflare JS no-op 81 289
Fastly JS no-op 88 293
Cloudflare JS hard 81 286
Fastly JS hard 88 291
Cloudflare Rust hard 81 288
Fastly Rust hard 88 295

Cloudflare’s hyper-optimized web stack allows us to process all requests faster, meaning that Workers code gets started faster. Having that extra head start in Connect and SSL allows us to further increase the distance between us and Compute@Edge.

To verify these results, we compared the Catchpoint results to our own data set. Here is the p95 TTFB for the JavaScript and Rust hard loops for both Fastly and Cloudflare from our data:

Network performance update: Platform Week

As you can see, Cloudflare is faster on JavaScript and Rust calls. And when you look at wait time, you will see that outside of Catchpoint results, Cloudflare even beats Fastly in the Rust hard tests:

Network performance update: Platform Week

The big takeaway from this is that in addition to Cloudflare being faster for the time spent processing requests, Cloudflare’s network and performance optimizations as a whole set us apart and make our Workers platform even faster.

A fast network makes for a faster developer platform

Cloudflare’s commitment to building the fastest network allows us to deliver unparalleled performance for all applications, including our developer tools. Whether it’s by accelerating Cloudflare Pages performance by hosting on every single Cloudflare server, or by deploying Workers in 275 cities without developers needing to configure a thing, Cloudflare’s developer tools are all built on top of Cloudflare’s global network. We’re committed to making our network faster so that our developer products are as performant as they could possibly be.

And it’s not just the developer platform. Cloudflare runs an integrated and optimized stack that includes DDoS protection, WAF, rate limiting, bot management, caching, SSL, smart routing, and more. By having a single software stack we are able to offer the widest range of features while ensuring that performance (no matter which of our products you use) remains excellent. We don’t want you to have to compromise performance to get security or vice versa.

Network performance update: Security Week

Post Syndicated from David Tuber original https://blog.cloudflare.com/network-performance-update-security-week/

Network performance update: Security Week

Network performance update: Security Week

Almost a year ago, we shared extensive benchmarking results of last mile networks all around the world. The results showed that on a range of tests (TCP connection time, time to first byte, time to last byte), and on different measures (p95, mean), Cloudflare was the fastest provider in 49% of networks around the world. Since then, we’ve worked to continuously improve performance towards the ultimate goal of being the fastest everywhere. We set a goal to grow the number of networks where we’re the fastest by 10% every Innovation Week. We met that goal last year, and we’re carrying the work over to 2022.

Today, we’re proud to report we are the fastest provider in 71% of the top 1,000 most reported networks around the world. Of course, we’re not done yet, but we wanted to share the latest results and explain how we did it.

Measuring what matters

To quantify network performance, we have to get enough data from around the world, across all manner of different networks, comparing ourselves with other providers. We used Real User Measurements (RUM) to fetch a 100kb file from several different providers. Users around the world report the performance of different providers. The more users who report the data, the higher fidelity the signal is. The goal is to provide an accurate picture of where different providers are faster, and more importantly, where Cloudflare can improve. You can read more about the methodology in the original Speed Week blog post here.

In the process of quantifying network performance, it became clear where we were not the fastest everywhere. After Full Stack Week, we found 596 country/network pairs where we were more than 100ms behind the leading provider (where a country/network pair is defined as the performance of a network within a particular country).

We are constantly going through the process of figuring out why we were slow — and then improving. The challenges we faced were unique to each network and highlighted a variety of different issues that are prevalent on the Internet. We’re going to deep dive into a couple of networks, and show how we diagnosed and then improved performance.

But before we do, here are the results of our efforts since Full Stack Week.

Network performance update: Security Week

*Performance is defined by p95 TCP connection time across top 1,000 networks in the world by number of IPv4 addresses advertised

Network performance update: Security Week

*Performance is defined by p95 TCP connection time across top 1,000 networks in the world by number of IPv4 addresses advertised

Curing congestion in Canada

In the spirit of Security Week, we want to highlight how a Magic Transit (Cloudflare’s network layer DDoS security) customer’s network problems provided line of sight into their internal congestion issues, and how our network was able to mitigate the problem in the short term.

One Magic Transit customer saw congestion in Canada due to insufficient peering with the Internet at key interconnection points. Congestion for customers means bad performance for users: for games, it can lead to lag and jittery gameplay, for video streaming, it can lead to buffering and poor resolution, and for video/VoIP applications, it can lead to calls dropping, garbled video/voice, and sections of calls missing entirely. Fixing congestion in this case means improving the way this customer connects to the rest of the Internet to make the user experience better for both the customer and users.

When customers connect to the Internet, they can do so in several ways: through an ISP that connects to other networks, through an Internet Exchange which houses many different providers interconnecting at a singular point, or point-to-point connections with other providers.

Network performance update: Security Week

In the case of this customer, they had direct connections to other providers and to Internet exchanges. They ran out of bandwidth with their point-to-point connections, meaning that they had too much traffic for the size of the links they had bought, which meant that the excess traffic had to go over Internet Exchanges that were out of the way, creating suboptimal network paths which increased latency.

We were able to use our network to help solve this problem. In the short term, we spread the traffic away from the congestion points. This removed hairpins to immediately improve the user experience. This restored service for this customer and all of their users.

Then, we went into action by accelerating previously planned upgrades of all of our Internet Exchange (IX) ports across Canada to ensure that we had enough capacity to handle them, even though the congestion wasn’t happening on our network. Finally, we reached out to the customer’s provider and quickly set up direct peering with them in Canada to provide direct interconnection close to the customer, so that we could provide them with a much better Internet experience. These actions made us the fastest provider on networks in Canada as well.

Keeping traffic in Australia

Next, we turn to a network that had poor performance in Australia. Users for that network were going all the way out of the country before going to Cloudflare. This created what is called a network hairpin. A network hairpin is caused by suboptimal connectivity in certain locations, which can cause users to traverse a network path that takes longer than it should. This hairpin effect made Cloudflare one of the slower providers for this network in Australia.

Network performance update: Security Week

To fix this, Cloudflare set up peering with this network in Sydney, and this allowed traffic from this network to go to Cloudflare within the country the network was based in. This reduced our connection time from 65ms to 45ms, catapulting us to be the #1 provider for this network in the region.

Update on Full Stack Week

All of this work and more has helped us optimize our network even further. At Full Stack Week, we announced that we were faster in more of the most reported networks than our competitors.  Out of the top 1,000 networks in the world (by number of IPv4 addresses advertised), here’s a breakdown of how many providers are number 1 in p95 TCP Connection Time, which represents the time it takes for a user to connect to the provider.  This data is from Full Stack Week (November 2021):

Network performance update: Security Week

As of Security Week, we improved our position to be faster in 19 new networks:

Network performance update: Security Week

Cloudflare is also committed to being the fastest provider in every country. This is a world map using the data that was to show the countries with the fastest network provider during Full Stack Week (November 2021):

Network performance update: Security Week

Here’s how the map of the world looks during Security Week (March 2022):

Network performance update: Security Week

We moved to number 1 in all of South America, more countries in Africa, the UK, Sweden, Iceland, and also more countries in the Asia Pacific region.

A fast network means fast security products

Cloudflare’s commitment to building the fastest network allows us to deliver unparalleled performance for all applications, including our security applications. There’s an adage in the industry that you have to sacrifice performance for security, but Cloudflare believes that you should be able to have your security and performance without having to sacrifice either. We’ve unveiled a ton of awesome new products and features for Security Week and all of them are built on top of our lightning-fast network. That means that all of these products will continue to get faster as we relentlessly pursue our goal of being the fastest network everywhere.

Network Performance Update: Full Stack Week

Post Syndicated from David Tuber original https://blog.cloudflare.com/network-performance-update-full-stack-week/

Network Performance Update: Full Stack Week

This blog was published on November 20, 2021. As we continue to optimize our network we’re publishing regular updates, which are available here.

Network Performance Update: Full Stack Week

A little over two months ago, we shared extensive benchmarking results of last mile networks all around the world. The results showed that on a range of tests (TCP connection time, time to first byte, time to last byte), and on a range of measurements (p95, mean), that Cloudflare was the fastest provider in 49% of networks around the world. Since then, we’ve worked to continuously improve performance until we’re the fastest everywhere. We set a goal to grow the number of networks where we’re the fastest by 10% every Innovation Week. We met that goal during Birthday Week (September 2021).

Today, we’re proud to report we blew the goal away for Full Stack Week (November 2021). Cloudflare measured our performance against the top 1,000 networks in the world (by number of IPv4 addresses advertised). Out of those, Cloudflare has become the fastest provider in 79 new networks, an increase of 14% of these 1,000 networks. Of course, we’re not done yet, but we wanted to share the latest results and explain how we did it.

However, before we go into more detail on our network performance, we wanted to share new performance metrics on our Workers platform (given it’s Full Stack Week!). We’ve crunched the numbers of Cloudflare Workers vs Fastly’s Compute@Edge, and the results are in: Workers is 196% faster.

Faster Network Means Faster Stack

A few months ago, we also discussed the performance of Cloudflare Workers, as compared to other similar offerings out there. We compared our performance to Lambda and Lambda@Edge, where Cloudflare Workers outperformed at 210% and 298% respectively.

At the time, we wanted to see how we measured up against all comparable offerings, but not all offerings were generally available. As a result, we weren’t able to report on how Workers compared to another solution: Fastly’s Compute@Edge.

Today, we’re excited to report that Cloudflare Workers is 196% faster than Fastly’s Compute@Edge based on the time to first byte from the tests we ran on 50 nodes using Catchpoint’s data from across the world.

As we have done in the past, we executed a function that simply returns the current time and measured wait time to first byte (the length of time between a client making an HTTP request to when the client receives the first byte of the request’s response, after DNS, connection, and TLS handshake). The tests were performed on November 8, 2021, using a free tier account for both Cloudflare Workers and Fastly’s Compute@Edge.

The code we ran on both providers was exactly identical — a small function that returns all request headers:

addEventListener('fetch', event => event.respondWith(handleRequest(event)));


async function handleRequest(event) {
  let requestHeaders = Object.fromEntries(event.request.headers)

  return new Response(JSON.stringify(requestHeaders), {status: 200})
};

Blue: Cloudflare Workers
Green: Compute@Edge

Network Performance Update: Full Stack Week
Network Performance Update: Full Stack Week
Network Performance Update: Full Stack Week

If you want to explore the results on your own, here is a link to the data.

By building on our global network that we’re constantly accelerating, leveraging isolates, and driving cold starts to zero, we’re able to offer our customers ludicrously fast speeds across the board.

Now, let’s move on to an update on how Cloudflare’s broader network performance has continued to improve!

Measuring What Matters

To quantify network performance, we have to get enough data from around the world across all manner of different networks comparing ourselves with other providers. We used Real User Measurements (RUM) to fetch a 100kb file from several different providers. Users around the world report the performance of different providers.  The more users who report the data, the higher fidelity the signal is.  The goal is to provide an accurate picture of where different providers are faster, and more importantly, where Cloudflare can improve. You can read more about the methodology in the original Speed Week blog post here.

In the process of quantifying network performance, it became clear where we were not the fastest everywhere. After Birthday Week, we found 601 country/network pairs where we were more than 100ms behind the leading provider (where a country/network pair is defined as the performance of a network within a particular country).

We are constantly going through the process of figuring out why we were slow — and then improve. The challenges we faced were unique to each network and highlighted a variety of different issues that are prevalent on the Internet. We’re going to deep dive into a couple of networks, and show how we diagnosed and then improved performance.

But before we do, here are the results of our efforts in the past two weeks.

Cloudflare has become number one in TCP Connection Time in 79 new networks. This graph shows the number of networks where we ranked number 1 for TCP Connection Time during Full Stack Week compared to Birthday Week:

Network Performance Update: Full Stack Week
*Performance is defined by TCP connection time across top 1000 networks in the world by number of IPv4 addresses advertised

We’ve become faster in 79 more networks thanks to our efforts, which have represented a growth of 14% in networks where we were the fastest.  Here’s a chart showing our ranking distribution comparing Birthday Week and Full Stack Week:

Network Performance Update: Full Stack Week

Now that we’ve talked about how we’ve improved, let’s share our stories about chasing peak performance across the world — each with a different set of challenges.

Placing Traffic Properly in Peru

Our first stop for improving network performance was in Peru. We observed that a lot of users in Lima were actually getting sent to Chile to be served. Cloudflare has multiple locations in Peru, so this shouldn’t have happened. Sending traffic to Chile caused us to be ranked fourth on that particular network in the country. Our engineers knew that the best way to get to number one was to ensure that all the Lima traffic stayed within the country, so we decided to look at why so much of our traffic was getting routed outside the country.

The reason that so much traffic was being routed outside the country was due to the network provider distributing traffic to Cloudflare unevenly, and too many users were sent to one specific location. Our network has a series of checks and fail-safes that allow us to ensure that even if this happens, our users will continue to see a good experience. The checks were being engaged here because of the uneven distribution of traffic to our locations in Lima; however, the traffic was being sent out of the country.

To fix the situation in the short term, we decided to do a bit of manual load balancing across our locations in Lima while building automation to remove the need for manual actions in the future. We took one of the locations that was seeing the most traffic and stopped advertising some prefixes from that location. The hypothesis we had was that the traffic would simply flow to the other Lima locations instead of Chile, and everything would balance out, improving the TCP connect time for everyone while keeping the traffic in the country. We started to make the change on a small portion of our free traffic, and our hypothesis  proved correct.  At that point, we deployed the change in a larger scope, and the P90 Client TCP RTT dropped from 240ms to 60ms.

Network Performance Update: Full Stack Week

As a result, Cloudflare is now number one in network performance in Peru.

Slimming Down Latencies in Sri Lanka

Our next example takes us halfway around the world to Sri Lanka, where we found a network provider who was routing requests from their users to Newark.

1 * * *
2 100.85.0.1 3.061ms 2.522ms 2.728ms
3 198.51.100.146 AS29766 3.651ms 1.855ms 2.715ms
4 198.51.100.145 AS29766 3.438ms 3.225ms 2.805ms
5 222.165.177.150 AS9329 2.233ms 2.272ms 2.843ms
6 222.165.177.145 AS9329 2.703ms 2.862ms 2.291ms
7 103.87.125.253 AS45489 3.658ms 3.708ms 3.613ms
8 103.87.124.245 AS45489 120.027ms 120.665ms 120.471ms
9 103.87.124.146 AS45489 115.597ms 115.863ms 115.178ms
10 50.208.235.157 be-107-2008-pe01.60hudson.ny.ibone.comcast.net AS7922 249.884ms 249.475ms 250.063ms -> going from Sri Lanka to New York
11 96.110.41.145 be-4101-cs01.newyork.ny.ibone.comcast.net AS7922 267.839ms 267.979ms 268.719ms
12 96.110.34.34 be-3112-pe12.111eighthave.ny.ibone.comcast.net AS7922 262.647ms 261.272ms 262.272ms
13 66.208.233.106 AS7922 262.378ms 258.948ms 258.057ms
14 172.70.108.4 AS13335 268.974ms 280.475ms 268.158ms
15 172.67.182.209 AS13335 267.329ms 266.466ms 266.593ms

This understandably caused significant latency problems, and Cloudflare was ranked fourth in Sri Lanka on this network as a result. Even though Colombo is a relatively small location, we moved as much traffic as possible and advertised through the location to improve the user experience and reduce the potential amount of traffic sent to Newark.

Once this was done, we noticed that the P90 Client TCP RTT dropped from 150ms to 50ms.

Network Performance Update: Full Stack Week

However, even though we were advertising all of our ranges through Colombo and our performance in aggregate improved, this provider was still sending traffic for some Cloudflare prefixes to Newark. We reached out to the provider and let them know about this user-impacting change they made.

After doing all of these things, Cloudflare moved from fourth in Sri Lanka to number one.

Update on Birthday Week

All of these network changes and more have allowed Cloudflare to become number one in network performance in more networks than before. During Birthday Week, we announced that we were faster in more networks than our competitors. Out of the top 1,000 networks in the world (by number of IPv4 addresses advertised), here’s how Cloudflare performed during Birthday Week (September 2021):

Network Performance Update: Full Stack Week

As of Full Stack Week (November 2021), we further improved our position to be faster in 79 new networks:

Network Performance Update: Full Stack Week

But we haven’t just increased our performance on the last mile, we’ve even gotten better on Time to Last Byte as well. Here’s how the landscape looked leading up to Birthday Week (September 2021):

Network Performance Update: Full Stack Week

And here’s the network landscape now (November 2021):

Network Performance Update: Full Stack Week

Cloudflare is also committed to being the fastest provider in every country. Network performance by country is a moving target, and is largely driven by users who are accessing at any given day. Also, looking at network performance in aggregate across countries for long time frames can leave a lot of data out. That being said, this is a world map using the data that was to show the countries with the fastest network provider during Birthday Week (Sept 2021):

Network Performance Update: Full Stack Week

Here’s how it looks two months later during Full Stack Week (November 2021):

Network Performance Update: Full Stack Week

Long Tail Latency

The running theme of these performance updates has always been the long tail of issues to solve. Ironing out these kinks on our network is critical to ensure that we provide premiere performance as we grow.

Our team has put in a lot of effort and yielded some great results, but we’re constantly trying to be faster. We’ve automated the discovery of performance issues like these, and we’re looking to build automation that will detect and remediate different classes of these issues to stay on top of network performance in the future.

Tracking performance like this doesn’t just make one number faster; it helps improve the performance of your entire stack, making everything lightning-fast.

We have one more innovation week coming in 2021, and we’ll be back to report on further progress on optimizing our performance globally.