Tag Archives: Cache

Introducing: Smarter Tiered Cache Topology Generation

Post Syndicated from Alex Krivit original https://blog.cloudflare.com/introducing-smarter-tiered-cache-topology-generation/

Caching is a magic trick. Instead of a customer’s origin responding to every request, Cloudflare’s 200+ data centers around the world respond with content that is cached geographically close to visitors. This dramatically improves page load performance and decreases bandwidth costs, because Cloudflare, rather than the origin, answers each request it can serve from cache.

However, if content is not in cache, Cloudflare data centers must contact the origin server to receive the content. This isn’t as fast as delivering content from cache. It also places load on an origin server, and is more costly compared to serving directly from cache. These issues can be amplified depending on the geographic distribution of a website’s visitors, the number of data centers contacting the origin, and the available origin resources for responding to requests.

To decrease the number of times our network of data centers communicates with an origin, we organize data centers into tiers so that only upper-tier data centers request content from an origin; they then spread that content to the lower tiers. The result is content that loads faster for visitors, costs less to serve, and consumes fewer origin resources.

Today, I’m thrilled to announce a fundamental improvement to Argo Tiered Cache we’re calling Smart Tiered Cache Topology. When enabled, Argo Tiered Cache will now dynamically select the single best upper tier for each of your website’s origins while providing tiered cache analytics showing how your custom topology is performing.

Smarter Tiered Cache Topology Generation

Tiered Cache is part of Argo, a constellation of products that analyzes and optimizes routing decisions across the global Internet in real time. By processing information from every Cloudflare request, Argo determines which routes to an origin are fast, which are slow, and what the optimum path from visitor to content is at any given moment. Previously, Argo Tiered Cache used a static collection of upper-tier data centers to communicate with the origin. With the improvements we’re announcing today, Tiered Cache can now dynamically find the single best upper tier for an origin using Argo performance and routing data. When Argo is enabled and a request for particular content is made, we collect latency data for each request to pick the optimal path. Using this latency data, we can determine how well any upper-tier data center is connected with an origin and can empirically select the data center with the lowest latency to be the upper tier for that origin.

Argo Tiered Cache

Taking one step back, tiered caching is a practice where Cloudflare’s global network of data centers is subdivided into a hierarchy of upper tiers and lower tiers. To control bandwidth and the number of connections between an origin and Cloudflare, only upper tiers are permitted to request content from an origin, and they propagate that content to the lower tiers. In this way, Cloudflare data centers first talk to each other to find content before asking the origin. This practice improves bandwidth efficiency by limiting the number of data centers that can ask the origin for content, reduces origin load, and makes websites more cost-effective to operate. Argo Tiered Cache customers only pay for data transfer between the client and edge, and we take care of the rest. Tiered caching also improves performance for visitors, because the distances and links traversed between Cloudflare data centers are generally shorter and faster than the links between data centers and origins.

Previously, when Argo Tiered Cache was enabled for a website, several of Cloudflare’s largest and most-connected data centers were designated as upper tiers and could pull content from an origin on a cache MISS. While a topology consisting of numerous upper-tier data centers may be globally performant, we found that cost-sensitive customers generally wanted the single best upper tier for their origin to ensure efficient data transfer of their content to Cloudflare’s network. We built Smart Tiered Cache Topology for this reason.

How to enable Smart Tiered Cache Topology

When you enable Argo Tiered Cache, Cloudflare now by default concentrates connections to origin servers so they come from a single data center. This is done without needing to work with our Customer Success or Solutions Engineering organization to custom configure the best single upper tier. Argo customers can generate this topology by:

  • Logging into your Cloudflare account.
  • Navigating to the Traffic tab in the dashboard.
  • Ensuring you have Argo enabled.
From there, non-Enterprise Argo customers are automatically enrolled in Smart Tiered Cache Topology without needing to make any additional changes.

Enterprise customers can select the type of topology they’d like to generate.

Self-serve Argo customers are automatically enrolled in Smart Tiered Cache Topology

Enterprise customers can determine the tiered cache topology that works best for them.

More data, fewer problems

Once enabled, in addition to performance and cost improvements, Smart Tiered Cache Topology also delivers summary analytics on how your upper tiers are performing, so that you can monitor the cost and performance benefits your website is receiving. These analytics are available in the Tiered Cache section of the dashboard’s Cache tab. The “Primary Data Center” and “Secondary Data Center” fields show which data centers were determined to be the best upper tier and failover for your origin. “Cached Hits” and “Hit Ratio” show the proportion of requests served by the upper tier and how many had to be forwarded to the origin for a response. “Bytes Saved” indicates the total transfer from the upper-tier data center to the lower tiers, showing the bandwidth saved by having Cloudflare’s lower-tier data centers ask the upper tier for content instead of the origin.

Smart Tiered Cache Topology works with Cloudflare’s existing products to deliver a seamless, easy, and performant experience that saves you money and provides useful information about how your upper tiers are working with your origins. Smart Tiered Cache Topology stands on the shoulders of some of the most resilient and useful products at Cloudflare to provide even more benefits to webmasters.

If you’re interested in seeing how Argo and Smart Tiered Cache Topology can benefit your web property, please log in to your Cloudflare account and find more information in the Traffic tab of the dashboard.

Tiered Cache Smart Topology

Post Syndicated from Brian Bradley original https://blog.cloudflare.com/tiered-cache-smart-topology/

A few years ago, we released Argo to help make the Internet faster and more efficient. Argo observes network conditions and finds the optimal route across the Internet for origin server requests, avoiding congestion along the way.

Tiered Cache is an Argo feature that reduces the number of data centers responsible for requesting assets from the origin. With Tiered Cache active, a request in South Africa won’t go directly to an origin in North America; instead, it will first check whether the requested data is cached in a large, nearby data center. The number and location of the data centers used by Tiered Cache are controlled by a piece of configuration called the topology. By default, we use a generic topology for every customer that strikes a balance between cache hit ratios and latency that is suitable for most users.

Today we’re introducing Smart Topology, which maximizes cache hit ratios by building on Argo’s internal infrastructure to identify the single best data center for making requests to the origin.

Standard Cache

The standard method for caching assets is to let each data center be a reverse proxy for the origin server. In this scheme, a miss in any data center causes a request to the origin for an asset. A request to the origin for one asset could be made as many times as there are data centers.

A cache miss in any data center will result in a request being sent to the origin server even if the asset is cached in some other data center. This is because the data centers are completely oblivious of each other.

Theoretically, each request would have to be checked against every data center in order to reduce cache misses to the minimum possible. However, sending every request to every data center is not practical.

The minimum possible cache hit latency is achieved if the asset is moved into the nearest cache before the request for it is made, but this kind of prediction is generally not possible. Instead, a good heuristic is to move the asset into the nearest cache after the first cache miss.

However, the asset has to be copied from somewhere and it isn’t possible to know where in the network it might be without querying each data center.

To avoid querying each data center, a copy of the asset has to be stored in a known location after the first cache miss so it is available to other data centers. This is precisely what Tiered Cache does.

Tiered Cache

Tiered Cache improves cache hit ratios by allowing some data centers to serve as caches for others, before those others have to make a request to the origin. With Tiered Cache, certain data centers are reverse proxies to the origin for other data centers.

If the proxied data centers make requests for the same asset, the asset will already be cached in the proxying data center and can be retrieved from there rather than from the origin. Fewer requests to the origin are made overall.
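To make the flow concrete, here is a conceptual sketch in Java of the lookup order a lower-tier data center follows; the class and interface names are hypothetical, and the real system is of course far more involved:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class LowerTierCache {

    // Hypothetical handle to the designated proxying (upper-tier) data center.
    public interface UpperTier {
        byte[] fetch(String assetKey);
    }

    private final Map<String, byte[]> localCache = new ConcurrentHashMap<>();
    private final UpperTier upperTier;

    public LowerTierCache(UpperTier upperTier) {
        this.upperTier = upperTier;
    }

    public byte[] fetch(String assetKey) {
        // 1. Local hit: serve directly from this data center.
        byte[] asset = localCache.get(assetKey);
        if (asset == null) {
            // 2. Local miss: ask the upper tier, which serves from its own
            //    cache or requests the asset from the origin on our behalf.
            asset = upperTier.fetch(assetKey);
            localCache.put(assetKey, asset);
        }
        return asset;
    }
}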

Custom Topology

In Tiered Cache, the topology describes which data center should serve as a proxy for others.

For customers, devising an optimal topology is a challenge requiring optimization and continuous maintenance. The best topology is a configuration based on information that is privately held by the customer and other information held only by Cloudflare.

For instance, knowing the desired balance of latency versus cache hit ratio is information only the customer has, but how to best make use of the Internet is something we would know. Enterprise customers usually have dedicated infrastructure teams that work with our solutions engineers to manually optimize and maintain their tiered cache topology.

Not every customer would want to personalize their topology. For this reason, a generic topology exists.

Generic Topology

The generic topology is designed to achieve good latency and cache efficiency for any origin, regardless of location, by striking a balance between two constraints: cache efficiency and latency.

The generic topology has multiple proxying data centers that are distributed around the world in order to ensure that requests that result in a cache miss do not take a very long detour before going to the origin. There is a balance between the number of proxying data centers and the cache hit ratio, because the proxying data centers are oblivious to each other.

If a proxying data center is taken offline, the proxied data centers either use a fallback (if the fallback is online) or revert to behaving like Tiered Cache is disabled.

To achieve the best balance for general usage, the generic topology instructs the smaller data centers to be proxied by the larger data centers in the same geographic region.

Smart Topology

Smart Topology assumes the origin is in one place and then automatically configures itself to be optimal once the customer flips a switch in the dashboard. To actually do this, Cloudflare needs to be able to determine which data center has the lowest latency to the origin without making the customer tell Cloudflare where the origin is.

Methods for Latency Determination

There are a few ways to determine which data center has the lowest latency with respect to the origin.

IP geolocation
Physical distance can be used as an approximation for latency, but Smart Topology was not built this way for a couple of reasons. First, even the best commercial IP geo database doesn’t have the required coverage and accuracy. Second, even with perfect accuracy, physical distance is a questionable approximation of Internet latency.

Probing
Latency to an IP address can be determined exactly by probing that address. The probe can just be the time required to perform the TCP handshake. Each data center probes the origin so that the latencies can be directly measured and the minimum can be found. Except for edge cases involving Anycast and TCP termination, we can assume that the latency to an IP address is the same as the latency to the origin server behind that IP address.
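As a rough illustration, a minimal handshake probe in Java could time a plain TCP connect; the host and port here are hypothetical, and a real prober would repeat the measurement and aggregate the results:

import java.net.InetSocketAddress;
import java.net.Socket;

public class HandshakeProbe {

    // Time, in milliseconds, to complete a TCP handshake with the address.
    public static long probeMillis(String host, int port, int timeoutMillis) throws Exception {
        long start = System.nanoTime();
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws Exception {
        // "origin.example.com" stands in for a customer origin.
        System.out.println(probeMillis("origin.example.com", 443, 2_000) + " ms");
    }
}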

Topology Selection Algorithm

The goal of the topology selection algorithm is to minimize cache misses and latency. The topology chooses a single proxying data center in order to maximize the cache hit ratio. The proxying data center is chosen to be close to the origin so that the latencies of cache misses in the proxied data centers are not much worse than they would be with tiered cache turned off.

The choice should eventually become stable. Stability is important because each time the choice changes, cache misses in proxied data centers are likely to cause cache misses in the new proxying data center. Capacity is important because when a data center goes offline, it can cause a large number of cache misses. Minimizing latency to the origin is important to ensure that the network is used efficiently.

The data center selection algorithm is rather like a leaderboard of the fastest data center for each origin. As data is collected, a faster data center can knock others off a given origin’s leaderboard. This competition is based on the 24-hour median latency and is held each hour. Only a subset of data centers deemed large enough are permitted to compete.

Eventually, the choice for proxying data centers becomes stable. Over time, data centers produce competing records for each origin and less competitive records in the leaderboard are replaced as necessary. Thus, latencies for any origin on the leaderboard can only monotonically decrease. There are always physical limits in the real world, so eventually the ideal data center will set a record that is too good to beat.

Also, the leaderboard actually includes both the lowest latency data center and the second lowest latency data center. The second lowest latency data center serves as a fallback if the preferred data center is taken offline for maintenance.
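A highly simplified sketch of such a leaderboard might look like the following; this is an assumption-laden illustration (it ignores capacity, data center eligibility, and record expiry), not Cloudflare’s implementation:

import java.util.HashMap;
import java.util.Map;

public class OriginLeaderboard {

    record Entry(String dataCenter, double medianLatencyMillis) {}

    private final Map<String, Entry> preferred = new HashMap<>();
    private final Map<String, Entry> fallback = new HashMap<>();

    // Called each round with a data center's 24-hour median latency to an origin.
    public void submit(String origin, String dataCenter, double medianLatencyMillis) {
        Entry candidate = new Entry(dataCenter, medianLatencyMillis);
        Entry best = preferred.get(origin);
        if (best == null || candidate.medianLatencyMillis() < best.medianLatencyMillis()) {
            if (best != null) {
                fallback.put(origin, best); // the old record holder becomes the fallback
            }
            preferred.put(origin, candidate);
        } else {
            Entry second = fallback.get(origin);
            if (second == null || candidate.medianLatencyMillis() < second.medianLatencyMillis()) {
                fallback.put(origin, candidate);
            }
        }
    }

    public Entry preferredFor(String origin) { return preferred.get(origin); }
    public Entry fallbackFor(String origin)  { return fallback.get(origin); }
}

Since an entry is only ever replaced by a lower latency, the recorded latency for any origin can only decrease, which matches the monotonic behavior described above.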

Anycast Networks
We are measuring the latency to the origin IP address and assuming that it represents the latency to the origin server, but this can break down in certain cases. A few cloud providers other than Cloudflare also use Anycast technology to provide their services. In Anycast, multiple machines can share an IP address regardless of where they are connected to the Internet, and the Internet will typically route packets destined for that address to the closest machine. If an Anycast network is used to proxy an origin server, then the apparent latency to the IP address for the origin server is actually the latency to the edge of the Anycast network rather than the latency to the origin server. The real latency to the origin server cannot be determined by probing.

The algorithm would fail to select the single best proxying data center if the latencies are not representative of the actual latency between data center and origin. Selecting the wrong data center would adversely affect latencies for requests to the origin, and could be expensive.

For instance, imagine a cloud provider provides an IP address that actually routes to multiple data centers all over the world. Packets are routed through private infrastructure to the correct destination once they enter the network. The lowest latency data center to this Anycast IP address could potentially even be on a different continent than the actual origin server. Therefore, the apparent latency cannot actually be trusted as a true measure of actual latency to the origin.

The data center selection algorithm assumes that the origin is in a single geographic location and can be probed to determine latency from each data center. These networks break one or both of these assumptions, so a procedure had to be developed to detect them. First, it is assumed that the IP appears in a single geographic location and is not proxied by such a network. The latency to the origin is bounded by the speed of light through fiber. Although the distance between any data center and the origin server is not known, the distances between data centers are known to Cloudflare.

Imagine the origin server as a pit stop on the journey between two data centers. The theoretical minimum pair of latencies between the origin server and those two data centers can then be computed. We have the latency probe data from both of these data centers to the origin, so we can check whether the observed latencies are lower than what is physically possible.

The original assumption was that the origin IP address identifies an origin server in one location and that the latency to that IP address is the latency to the origin server. If the observed latencies are faster than light allows, the assumption is clearly false. Smart Topology falls back to the generic topology when the original assumption does not hold. To be extra sure, we check this constraint across many data centers around the world and fall back if there is even a single physically impossible observation.
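In Java, the core of that feasibility check might look like this sketch; the fiber propagation speed is a standard approximation (about 200,000 km/s), and the latencies are assumed to be one-way (roughly half the measured handshake round trip):

public class PlausibilityCheck {

    // Light travels through fiber at roughly 200,000 km/s, i.e. 200 km/ms.
    private static final double FIBER_KM_PER_MS = 200.0;

    // If the origin sits on a path between data centers A and B, the sum of
    // the one-way latencies to it cannot be less than the time light needs
    // to cover the known distance between A and B.
    public static boolean isPhysicallyPossible(double oneWayAMillis,
                                               double oneWayBMillis,
                                               double distanceABKm) {
        return oneWayAMillis + oneWayBMillis >= distanceABKm / FIBER_KM_PER_MS;
    }

    public static void main(String[] args) {
        // Two data centers ~10,000 km apart both observing ~5 ms to the same
        // IP is impossible (the fiber minimum is ~50 ms), which suggests an
        // Anycast network is terminating the connections nearby.
        System.out.println(isPhysicallyPossible(5.0, 5.0, 10_000.0)); // false
    }
}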

The Big Picture

When Smart Topology is enabled, many Cloudflare systems work together to ensure the correct data center is eventually used to request assets from the origin.

When the customer enables Tiered Cache Smart Topology, one of a few things can happen from the perspective of the origin. If a proxying data center has already been assigned to the CIDR block that encompasses the origin IP, the preferred or fallback data center is used to request assets from the origin. Otherwise, the generic topology is used to determine which proxying data centers to use to pull assets from the origin. The latency to the proxying data center should only decrease as the choice for proxying data center is updated over time.
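As a simplified sketch of that decision, assuming, purely for illustration, that assignments are tracked per /24 IPv4 block (the real CIDR granularity is not described here):

import java.util.HashMap;
import java.util.Map;

public class TopologyLookup {

    record Assignment(String preferredDataCenter, String fallbackDataCenter) {}

    private final Map<String, Assignment> assignmentsByBlock = new HashMap<>();

    private static String blockOf(String ipv4) {
        return ipv4.substring(0, ipv4.lastIndexOf('.')) + ".0/24";
    }

    // Returns the upper tier to use for an origin IP, or a marker for the
    // generic topology if no proxying data center has been assigned yet.
    public String upperTierFor(String originIp, boolean preferredOnline) {
        Assignment assignment = assignmentsByBlock.get(blockOf(originIp));
        if (assignment == null) {
            return "generic-topology";
        }
        return preferredOnline ? assignment.preferredDataCenter()
                               : assignment.fallbackDataCenter();
    }
}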

Conclusion

Developing this technology offered a lot of opportunities to exercise great engineering and build an impactful product. It was not done in a vacuum; we used infrastructure that Cloudflare had already built, and we moved along that exponential gradient of using existing progress to make more progress. Building this framework opens a lot of doors to future progress too; for instance, in the future, we can explore ways to select the ideal proxying data center even for origins behind Anycast networks that hide the true latency to the origin.

Introducing Cache Analytics

Post Syndicated from Jon Levine original https://blog.cloudflare.com/introducing-cache-analytics/

Today, I’m delighted to announce Cache Analytics: a new tool that gives deeper exploration capabilities into what Cloudflare’s caching and content delivery services are doing for your web presence.

Caching is the most effective way to improve the performance and economics of serving your website to the world. Unsurprisingly, customers consistently ask us how they can optimize their cache performance to get the most out of Cloudflare.

With Cache Analytics, it’s easier than ever to learn how to speed up your website, and reduce traffic sent to your origin. Some of my favorite capabilities include:

  • See what resources are missing from cache, expired, or never eligible for cache in the first place
  • Slice and dice your data as you see fit: filter by hostnames, or see a list of top URLs that miss cache
  • Switch between views of requests and data transfer to understand both performance and cost
An overview of Cache Analytics

Cache Analytics is available today for all customers on our Pro, Business, and Enterprise plans.

In this blog post, I’ll explain why we built Cache Analytics and how you can get the most out of it.

Why do we need analytics focused on caching?

If you want to scale the delivery of a fast, high-performance website, then caching is critical. Caching has two main goals:

First, caching improves performance. Cloudflare data centers are within 100ms of 90% of the planet; putting your content in Cloudflare’s cache gets it physically closer to your customers and visitors, meaning that visitors will see your website faster when they request it! (Plus, reading assets on our edge SSDs is really fast, rather than waiting for origins to generate a response.)

Second, caching helps reduce bandwidth costs associated with operating a presence on the Internet. Origin data transfer is one of the biggest expenses of running a web service, so serving content out of Cloudflare’s cache can significantly reduce costs incurred by origin infrastructure.

Because it’s not safe to cache all content (we wouldn’t want to cache your bank balance by default), Cloudflare relies on customers to tell us what’s safe to cache with HTTP Cache-Control headers and page rules. But even with page rules, it can be hard to understand what’s actually getting cached — or more importantly, what’s not getting cached, and why. Is a resource expired? Or was it even eligible for cache in the first place?
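For example, a standard HTTP response header like the following (generic HTTP, not a Cloudflare-specific setting) tells Cloudflare and other shared caches that a resource may be stored for an hour:

    Cache-Control: public, max-age=3600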

Faster or cheaper? Why not both!

Cache Analytics was designed to help users understand how Cloudflare’s cache is performing, but it can also be used as a general-purpose analytics tool. Here I’ll give a quick walkthrough of the interface.

First, at the top-left, you should decide if you want to focus on requests or data transfer.

Cache Analytics enables you to toggle between views of requests and data transfer.

As a rule of thumb, requests (the default view) is more useful for understanding performance, because every request that misses cache results in a performance hit. Data transfer is useful for understanding cost, because most hosts charge for every byte that leaves their network — every gigabyte served by Cloudflare translates into money saved at the origin.

You can always toggle between these two views while keeping filters enabled.

A filter for every occasion

Let’s say you’re focused on improving the performance of a specific subdomain on your zone. Cache Analytics allows flexible filtering of the data that’s important to you:

Cache Analytics enables flexible filtering of data.

Filtering is essential for zooming in on the chunk of traffic that you’re most interested in. You can filter by cache status, hostname, path, content type, and more. This is helpful, for example, if you’re trying to reduce data transfer for a specific subdomain, or are trying to tune the performance of your HTML pages.

Seeing the big picture

When analyzing traffic patterns, it’s essential to understand how things change over time. Perhaps you just applied a configuration change and want to see the impact, or just launched a big sale on your e-commerce site.

“Served by Cloudflare” indicates traffic that we were able to serve from our edge without reaching your origin server. “Served by Origin” indicates traffic that was proxied back to origin servers. (It can be really satisfying to add a page rule and see the amount of traffic “Served by Cloudflare” go up!)

Note that this graph will change significantly when you switch between “Requests” and “Data Transfer.” Revalidated requests are particularly interesting; because Cloudflare checks with the origin before returning a result from cache, these count as “Served by Cloudflare” for the purposes of data transfer, but as “Served by Origin” for the purposes of requests.

Slicing the pie

After the high-level summary, we show an overview of cache status, which explains why traffic might be served from Cloudflare or from origin. We also show a breakdown of cache status by Content-Type to give an overview on how different components of your website perform:

Cache statuses are also essential for understanding what you need to do to optimize cache ratios. For example:

  • Dynamic indicates that a request was never eligible for cache, and went straight to origin. This is the default for many file types, including HTML. Learn more about making more content eligible for cache using page rules. Fixing this is one of the fastest ways to reduce origin data transfer cost.
  • Revalidated indicates content that was expired, but after Cloudflare checked the origin, it was still fresh! If you see a lot of revalidated content, it’s a good sign you should increase your Edge Cache TTLs through a page rule or max-age origin directive. Updating TTLs is one of the easiest ways to make your site faster.
  • Expired resources are ones that were in our cache, but were expired. Consider if you can extend TTLs on these, or at least support revalidation at your origin.
  • A miss indicates that Cloudflare has not seen that resource recently. These can be tricky to optimize, but there are a few potential remedies: enable Argo Tiered Caching to check another data center’s cache before going to origin, or use a Custom Cache Key to make multiple URLs match the same cached resource (for example, by ignoring the query string).

For a full explanation of each cache status, see our help center.

To the Nth dimension

Finally, Cache Analytics shows a number of what we call “Top Ns” — various ways to slice and dice the above data on useful dimensions.

It’s often helpful to apply filters (for example, to a specific cache status) before looking at these lists. For example, when trying to tune performance, I often filter to just “expired” or “revalidated,” then see if there are a few URLs that dominate these stats.

But wait, there’s more

Cache Analytics is available now for customers on our Pro, Business, and Enterprise plans. Pro customers have access to up to 3 days of analytics history. Business and Enterprise customers have access to up to 21 days, with more coming soon.

This is just the first step for Cache Analytics. We’re planning to add more dimensions to drill into the data. And we’re planning to add even more essential statistics — for example, about how cache keys are being used.

Finally, I’m really excited about Cache Analytics because it shows what we have in store for Cloudflare Analytics more broadly. We know that you’ve asked for many features, like per-hostname analytics or the ability to see top URLs, for a long time, and we’re hard at work on bringing these to Zone Analytics. Stay tuned!

Using Multiple Dynamic Caches With Spring

Post Syndicated from Bozho original https://techblog.bozho.net/using-multiple-dynamic-caches-with-spring/

In a third post about cache managers in Spring (spread over a long period of time), I’d like to expand on the previous two by showing how to configure multiple cache managers that dynamically create caches.

Spring has CompositeCacheManager which, in theory, should allow using more than one cache manager. It works by asking the underlying cache managers whether they have a cache with the requested name. The problem arises when you need dynamically created caches based on some global configuration. And that’s the common scenario: you don’t want to manually define caches, but instead want to just add @Cacheable and have Spring (and the underlying cache manager) create the cache for you with some reasonable defaults.

That’s great until you need more than one cache manager, for example one for a local cache and one for a distributed cache. In many cases a distributed cache is needed; however, not all method calls need to be distributed: some can be local to the instance that handles them, and you don’t want to burden your distributed cache with entries that can be kept locally. It may be possible to configure a distributed cache provider to designate some caches as local even though they are handled by the distributed provider, but I don’t guarantee it will be trivial.

So, faced with that issue, I had to devise a simple mechanism for designating some caches as “distributed” and some as “local”. Using CompositeCacheManager alone would not do it, so I extended the distributed cache manager (in this case Hazelcast, but it can be done with any provider):

import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.spring.cache.HazelcastCacheManager;
import org.springframework.cache.Cache;

/**
 * Hazelcast cache manager that handles only cache names with a specified
 * prefix, designating those caches as distributed.
 */
public class OptionalHazelcastCacheManager extends HazelcastCacheManager {

    private static final String DISTRIBUTED_CACHE_PREFIX = "d:";

    public OptionalHazelcastCacheManager(HazelcastInstance hazelcast) {
        super(hazelcast);
    }

    @Override
    public Cache getCache(String name) {
        // Returning null makes CompositeCacheManager fall through to the
        // next cache manager in its list (Caffeine, in this configuration).
        if (name == null || !name.startsWith(DISTRIBUTED_CACHE_PREFIX)) {
            return null;
        }

        return super.getCache(name);
    }
}

And the corresponding composite cache manager configuration:

    <bean id="cacheManager" class="org.springframework.cache.support.CompositeCacheManager">
        <property name="cacheManagers">
            <list>
                <bean id="hazelcastCacheManager" class="com.yourcompany.util.cache.OptionalHazelcastCacheManager">
                    <constructor-arg ref="hazelcast" />
                </bean>

                <bean id="caffeineCacheManager" class="com.yourcompany.util.cache.FlexibleCaffeineCacheManager">
                    <property name="cacheSpecification" value="expireAfterWrite=10m"/>
                    <property name="cacheSpecs">
                        <map>
                            <entry key="statistics" value="expireAfterWrite=1h"/>
                        </map>
                    </property>
                </bean>
            </list>
        </property>
    </bean>

That basically means that any cache with a name starting with d: (for “distributed”) is handled by the distributed cache manager. Otherwise, the next cache manager (Caffeine in this case) takes over. So when you want to define a method with a cacheable result, you have to decide whether it’s @Cacheable("d:cachename") or just @Cacheable("cachename").
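For example, a hypothetical service using both kinds of caches might look like this:

import org.springframework.cache.annotation.Cacheable;
import org.springframework.stereotype.Service;

@Service
public class ReportService {

    // "d:" prefix: handled by the Hazelcast manager, shared across instances.
    @Cacheable("d:reports")
    public String loadSharedReport(String id) {
        return expensiveLookup(id);
    }

    // No prefix: falls through to the local Caffeine manager.
    @Cacheable("localReports")
    public String loadLocalReport(String id) {
        return expensiveLookup(id);
    }

    private String expensiveLookup(String id) {
        // Stand-in for a slow computation or remote call.
        return "report-" + id;
    }
}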

That’s probably one of many ways to approach that issue, but I like it for its simplicity. Caching is hard (distributed caching even more so), and while we are lucky to have Spring abstract most of that, we sometimes have to handle special cases ourselves.

The post Using Multiple Dynamic Caches With Spring appeared first on Bozho's tech blog.

Technical Details of Why Cloudflare Chose AMD EPYC for Gen X Servers

Post Syndicated from Nitin Rao original https://blog.cloudflare.com/technical-details-of-why-cloudflare-chose-amd-epyc-for-gen-x-servers/

From the very beginning Cloudflare used Intel CPU-based servers (and, also, Intel components for things like NICs and SSDs). But we’re always interested in optimizing the cost of running our service so that we can provide products at a low cost and high gross margin.

We’re also mindful of events like the Spectre and Meltdown vulnerabilities and have been working with outside parties on research into mitigation and exploitation which we hope to publish later this year.

We looked very seriously at ARM-based CPUs and continue to keep our software up to date for the ARM architecture so that we can use ARM-based CPUs when their requests-per-watt performance is interesting to us.

In the meantime, we’ve deployed AMD’s EPYC processors as part of our Gen X server platform and, for the first time, are not using any Intel components at all. This week, we announced details of this tenth generation of servers. Below is a recap of why we’re excited about the design, specifications, and performance of our newest hardware.

Servers for an Accelerated Future

Every server can run every service. This architectural decision has helped us achieve higher efficiency across the Cloudflare network. It has also given us more flexibility to build new software or adopt the newest available hardware.

Notably, Intel is not inside. We are not using their hardware for any major server components such as the CPU, board, memory, storage, network interface card (or any type of accelerator).

This time, AMD is inside.

Compared with our prior server (Gen 9), our Gen X server processes as much as 36% more requests while costing substantially less. Additionally, it enables a roughly 50% decrease in L3 cache miss rate and up to a 50% decrease in NGINX p99 latency, powered by a CPU rated at 25% lower TDP (thermal design power) per core.

Gen X CPU benchmarks

To identify the most efficient CPU for our software stack, we ran several benchmarks for key workloads such as cryptography, compression, regular expressions, and LuaJIT. Then, we simulated typical requests we see, before testing servers in live production to measure requests per watt.    

Based on our findings, we selected the single socket 48-core AMD 2nd Gen EPYC 7642.

Impact of Cache Locality

The single AMD EPYC 7642 performed very well during our lab testing, beating our Gen 9 server with dual Intel Xeon Platinum 6162 with the same total number of cores. Key factors we noticed were its large L3 cache, which led to a low L3 cache miss rate, as well as a higher sustained operating frequency.

Gen X Performance Tuning

Partnering with AMD, we tuned the 2nd Gen EPYC 7642 processor to achieve an additional 6% performance. We achieved this by using power determinism and configuring the CPU’s Thermal Design Power (TDP).

Securing Memory at EPYC Scale

Finally, we described how we use Secure Memory Encryption (SME), an interesting security feature within the System on a Chip architecture of the AMD EPYC line. We were impressed by how we could achieve RAM encryption without significant decrease in performance. This reduces the worry that any data could be exfiltrated from a stolen server.

We enjoy designing hardware that improves the security, performance and reliability of our global network, trusted by over 26 million Internet properties.

Want to help us evaluate new hardware? Join us!