Tag Archives: HTTP2

Connection coalescing with ORIGIN Frames: fewer DNS queries, fewer connections

Post Syndicated from Suleman Ahmad original http://blog.cloudflare.com/connection-coalescing-with-origin-frames-fewer-dns-queries-fewer-connections/

This post summarizes a Cloudflare research paper, presented at the ACM Internet Measurement Conference (IMC), that measures and prototypes connection coalescing with ORIGIN Frames.

Some readers might be surprised to hear that a single visit to a web page can cause a browser to make tens, sometimes even hundreds, of web connections. Take this very blog as an example. If it is your first visit to the Cloudflare blog, or it has been a while since your last visit, your browser will make multiple connections to render the page. The browser will make DNS queries to find IP addresses corresponding to blog.cloudflare.com and then subsequent requests to retrieve any necessary subresources on the web page needed to successfully render the complete page. How many? At the time of writing, there are 32 different hostnames used to load the Cloudflare blog. That means 32 DNS queries and at least 32 TCP (or QUIC) connections, unless the client is able to reuse (or coalesce) some of those connections.

Each new web connection not only introduces additional load on a server's processing capabilities – potentially leading to scalability challenges during peak usage hours – but also exposes client metadata to the network, such as the plaintext hostnames being accessed by an individual. Such meta information can potentially reveal a user’s online activities and browsing behaviors to on-path network adversaries and eavesdroppers!

In this blog we’re going to take a closer look at “connection coalescing”. Since our initial look at IP-based coalescing in 2021, we have done further large-scale measurements and modeling across the Internet, to understand and predict if and where coalescing would work best. Since IP coalescing is difficult to manage at large scale, last year we implemented and experimented with a promising standard called the HTTP/2 ORIGIN Frame extension that we leveraged to coalesce connections to our edge without worrying about managing IP addresses.

All told, there are opportunities being missed by many large providers. We hope that this blog (and our publication at ACM IMC 2022 with full details) offers a first step that helps servers and clients take advantage of the ORIGIN Frame standard.

Setting the stage

At a high level, as a user navigates the web, the browser renders web pages by retrieving dependent subresources to construct the complete web page. This process bears a striking resemblance to the way physical products are assembled in a factory. In this sense, a modern web page can be considered like an assembly plant. It relies on a ‘supply chain’ of resources that are needed to produce the final product.

An assembly plant in the physical world can place a single order for different parts and get a single shipment from the supplier (similar to the kitting process for maximizing value and minimizing response time); no matter the manufacturer of those parts or where they are made — one ‘connection’ to the supplier is all that is needed. Any single truck from a supplier to an assembly plant can be filled with parts from multiple manufacturers.

The design of the web causes browsers to typically do the opposite. To retrieve the images, JavaScript, and other resources on a web page (the parts), web clients (assembly plants) have to make at least one connection to every hostname (the manufacturers) defined in the HTML that is returned by the server (the supplier). It makes no difference whether the connections to those hostnames go to the same server or not; for example, they could all terminate at a reverse proxy like Cloudflare. For each manufacturer, a ‘new’ truck would be needed to transfer the materials to the assembly plant from the same supplier; or, more formally, a new connection would need to be made to request a subresource from a hostname on the same web page.

[Figure: Without connection coalescing]

The number of connections used to load a web page can be surprisingly high. It is also common for the subresources to need yet other sub-subresources, and so new connections emerge as a result of earlier ones. Remember, too, that HTTP connections to hostnames are often preceded by DNS queries! Connection coalescing allows us to use fewer connections, or ‘reuse’ the same set of trucks to carry parts from multiple manufacturers from a single supplier.

[Figure: With connection coalescing]

Connection coalescing in principle

Connection coalescing was introduced in HTTP/2 and carried over into HTTP/3. We’ve blogged about connection coalescing previously (we encourage reading that post for a detailed primer). While the idea is simple, implementing it can present a number of engineering challenges. For example, recall from above that there are 32 hostnames (at the time of writing) to load the web page you are reading right now. Among the 32 hostnames are 16 unique domains (defined as “Effective TLD+1”). Can we create fewer connections or ‘coalesce’ existing connections for each unique domain? The answer is ‘Yes, but it depends’.

The exact number of connections to load the blog page is not at all obvious, and hard to know. There may be 32 hostnames attached to 16 domains but, counter-intuitively, this does not mean the answer to “how many unique connections?” is 16. The true answer could be as few as one connection if all the hostnames are reachable at a single server; or as many as 32 independent connections if a different and distinct server is needed to access each individual hostname.

Connection reuse comes in many forms, so it’s important to define “connection coalescing” in the HTTP space. For example, the reuse of an existing TCP or TLS connection to a hostname to make multiple requests for subresources from that same hostname is connection reuse, but not coalescing.

Coalescing occurs when an existing TLS channel for some hostname can be repurposed or used for connecting to a different hostname. For example, upon visiting blog.cloudflare.com, the HTML points to subresources at cdnjs.cloudflare.com. To reuse the same TLS connection for the subresources, it is necessary for both hostnames to appear together in the TLS certificate's “Subject Alternative Name (SAN)” list, but this step alone is not sufficient to convince browsers to coalesce. After all, the cdnjs.cloudflare.com service may or may not be reachable at the same server as blog.cloudflare.com, despite being on the same certificate. So how can the browser know? Coalescing only works if servers set up the right conditions, but clients have to decide whether to coalesce or not – thus, browsers require a signal to coalesce beyond the SANs list on the certificate. Revisiting our analogy, the assembly plant may order a part from a manufacturer directly, not knowing that the supplier already has the same part in its warehouse.

There are two explicit signals a browser can use to decide whether connections can be coalesced: one is IP-based, the other ORIGIN Frame-based. The former requires server operators to tightly bind DNS records to the HTTP resources available on the server. This is difficult to manage and deploy, and actually creates a risky dependency, because you have to place all the resources behind a specific set of IP addresses or a single IP address. The way IP addresses influence coalescing decisions varies among browsers, with some choosing to be more conservative and others more permissive. Alternatively, the HTTP ORIGIN Frame is an easier signal for servers to orchestrate; it’s also flexible and fails gracefully with no interruption to service (for a specification-compliant implementation).

A foundational difference between these two coalescing signals: IP-based signals are implicit, even accidental, and force clients to infer coalescing possibilities that may or may not exist. None of this is surprising, since IP addresses are designed to have no real relationship with names! In contrast, the ORIGIN Frame is an explicit signal from servers to clients that coalescing is available, no matter what DNS says for any particular hostname.
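To make the contrast concrete, here is a minimal sketch in Go of the check a client might perform before coalescing a request onto an existing connection. The Conn type is a simplified stand-in for a browser's pooled connection state, not a real browser or library API; the certificate and DNS calls are Go's standard library.

package coalesce

import (
	"crypto/x509"
	"net"
)

// Conn is a simplified stand-in for a client's pooled connection state.
type Conn struct {
	Certificate *x509.Certificate
	OriginSet   map[string]bool // hostnames announced via ORIGIN frames
	RemoteAddr  net.IP
}

// canCoalesce sketches the decision a client makes before reusing an
// existing connection for a request to hostname.
func canCoalesce(conn *Conn, hostname string) bool {
	// The certificate must cover the hostname in its SAN list.
	if conn.Certificate.VerifyHostname(hostname) != nil {
		return false
	}
	// Explicit signal: the server announced the hostname in an ORIGIN frame.
	if conn.OriginSet[hostname] {
		return true
	}
	// Implicit IP-based signal: a fresh DNS answer for the hostname must
	// match the address the connection is already using.
	ips, err := net.LookupIP(hostname)
	if err != nil {
		return false
	}
	for _, ip := range ips {
		if ip.Equal(conn.RemoteAddr) {
			return true
		}
	}
	return false
}

Note how the ORIGIN Frame path needs no DNS lookup at all, which is exactly where the saved queries come from.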

We have experimented with IP-based coalescing previously; for the purpose of this blog we will take a deeper look at ORIGIN Frame-based coalescing.

What is the ORIGIN Frame standard?

The ORIGIN Frame is an extension to the HTTP/2 and HTTP/3 specifications: a special frame sent on stream 0 (HTTP/2) or the control stream (HTTP/3) of the connection. The frame allows a server to send an ‘origin-set’ to clients on an existing established TLS connection, listing hostnames that it is authoritative for; requests for them on this connection will not incur HTTP 421 (Misdirected Request) errors. Hostnames in the origin-set MUST also appear in the certificate SAN list for the server, even if those hostnames are announced on different IP addresses via DNS.

Specifically, two different steps are required:

  1. Web servers must send a list enumerating the Origin Set (the hostnames that a given connection might be used for) in the ORIGIN Frame extension.
  2. The TLS certificate returned by the web server must cover the additional hostnames announced in the ORIGIN Frame in its SAN DNS-name entries.

At a high level, ORIGIN Frames are a supplement to the TLS certificate that operators can attach to say, “Psst! Hey, client, here are the names in the SANs that are available on this connection — you can coalesce!” Since the ORIGIN Frame is not part of the certificate itself, its contents can change independently. No new certificate is required. There is also no dependency on IP addresses. For a coalesceable hostname, existing TCP/QUIC+TLS connections can be reused without requiring new connections or DNS queries.
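On the wire, RFC 8336 keeps this simple: the frame's payload is just a sequence of length-prefixed ASCII origins. As a minimal sketch, here is one way to encode an origin-set payload in Go (the frame header, type 0x0c, and sending it on stream 0 are handled separately by the HTTP/2 framing layer):

package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// encodeOriginSet builds an HTTP/2 ORIGIN frame payload (RFC 8336):
// each entry is a 16-bit big-endian length followed by the ASCII
// serialization of the origin.
func encodeOriginSet(origins []string) []byte {
	var buf bytes.Buffer
	for _, o := range origins {
		binary.Write(&buf, binary.BigEndian, uint16(len(o)))
		buf.WriteString(o)
	}
	return buf.Bytes()
}

func main() {
	payload := encodeOriginSet([]string{
		"https://blog.cloudflare.com",
		"https://cdnjs.cloudflare.com",
	})
	fmt.Printf("ORIGIN payload: %d bytes\n", len(payload))
}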

Many websites today rely on content which is served by CDNs, like Cloudflare's CDN service. Using external CDN services offers websites speed and reliability, and reduces the load their origin servers must carry. When both the website and its resources are served by the same CDN, despite being different hostnames owned by different entities, some very interesting opportunities open up for CDN operators: they control both certificate management and connection handling, so they can allow connections to be reused and coalesced by sending ORIGIN frames on behalf of the real origin server.

Unfortunately, there has been no way to turn the possibilities enabled by ORIGIN Frame into practice. To the best of our knowledge, until today, there has been no server implementation that supports ORIGIN Frames. Among browsers, only Firefox supports ORIGIN Frames. Since IP coalescing is challenging and ORIGIN Frame has no deployed support, is the engineering time and energy to better support coalescing worth the investment? We decided to find out with a large-scale Internet-wide measurement to understand the opportunities and predict the possibilities, and then implemented the ORIGIN Frame to experiment on production traffic.

Experiment #1: What is the scale of required changes?

In February 2021, we collected data for 500K of the most popular websites on the Internet, using a modified Web Page Test on 100 virtual machines. An automated Chrome (v88) browser instance was launched for every visit to a web page to eliminate caching effects (because we wanted to understand coalescing, not caching). On successful completion of each session, Chrome developer tools were used to retrieve and write the page load data as an HTTP Archive format (HAR) file with a full timeline of events, as well as additional information about certificates and their validation. Additionally, we parsed the certificate chains for the root web page and new TLS connections triggered by subresource requests to (i) identify certificate issuers for the hostnames, (ii) inspect the presence of the Subject Alternative Name (SAN) extension, and (iii) validate that DNS names resolve to the IP address used. Further details about our methodology and results can be found in the technical paper.

The first step was to understand what resources are requested by web pages to successfully render the page contents, and where these resources were located on the Internet. Connection coalescing becomes possible when subresource domains are co-located. We approximated the location of a domain by finding its corresponding autonomous system (AS). For example, the domain attached to cdnjs is reachable via AS 13335 in the BGP routing table, and that AS number belongs to Cloudflare. The figure below describes the percentage of web pages and the number of unique ASes needed to fully load a web page.

[Figure: number of unique ASes needed to fully load a web page]

Around 14% of the web pages need two ASes to fully load, i.e. pages that depend on one additional AS for subresources. More than 50% of the web pages need to contact no more than six ASes to obtain all the necessary subresources. This finding, shown in the plot above, implies that a relatively small number of operators serve the subresource content necessary for a majority (~50%) of websites, and that ORIGIN Frames would need only a few changes to have their intended impact. The potential for connection coalescing can therefore be optimistically approximated by the number of unique ASes needed to retrieve all subresources in a web page. In practice, however, this may be superseded by operational factors such as SLAs, or helped by flexible mappings between sockets, names, and IP addresses, which we worked on previously at Cloudflare.
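As a toy illustration of that approximation, counting a page's coalescing potential reduces to a set count over a hostname-to-ASN mapping. The map here is hypothetical, standing in for data derived from DNS resolutions and a BGP routing table snapshot:

// uniqueASes approximates a page's coalescing potential: the number of
// distinct autonomous systems needed to fetch all of its subresources.
// asnOf is a hypothetical hostname->ASN map built from DNS answers and
// BGP routing data.
func uniqueASes(hostnames []string, asnOf map[string]int) int {
	seen := make(map[int]bool)
	for _, h := range hostnames {
		if asn, ok := asnOf[h]; ok {
			seen[asn] = true
		}
	}
	return len(seen)
}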

We then tried to understand the impact of coalescing on connection metrics. The measured and ideal number of DNS queries and TLS connections needed to load a web page are summarized by their CDFs in the figure below.

[Figure: CDFs of the measured vs. ideal DNS queries and TLS connections needed per page]

Through modeling and extensive analysis, we identified that connection coalescing through ORIGIN Frames could reduce the number of DNS queries and TLS connections made by browsers by over 60% at the median. We performed this modeling by identifying the number of times clients requested DNS records, and combining them with the ideal ORIGIN Frames to serve.

Many multi-origin servers, such as those operated by CDNs, tend to reuse certificates and serve the same certificate with multiple DNS SAN entries. This allows operators to manage fewer certificates through their creation and renewal cycles. While theoretically one can have millions of names in a certificate, creating such certificates is unreasonable and a challenge to manage effectively. By continuing to rely on existing certificates, our modeling highlights the scale of changes required to enable perfect coalescing, as shown in the figure below.

[Figure: scale of certificate changes needed for perfect coalescing]

We identified that over 60% of the certificates served by websites do not need any modifications and could benefit from ORIGIN Frames, while with no more than 10 additions to the DNS SAN names in certificates we were able to successfully coalesce connections to over 92% of the websites in our measurement. The most effective changes could be made by CDN providers, by adding three or four of their most requested hostnames to each certificate.

Experiment #2: ORIGIN Frames in action

In order to validate our modeling expectations, we then took a more active approach in early 2022. Our next experiment focused on 5,000 websites that make extensive use of cdnjs.cloudflare.com as a subresource. By modifying our experimental TLS termination endpoint, we deployed HTTP/2 ORIGIN Frame support as defined in the RFC. This involved changing our internal fork of Go's net and http modules, which we have open sourced (see here, and here).

During the experiments, connecting to a website in the experiment set would return cdnjs.cloudflare.com in the ORIGIN frame, while the control set returned an arbitrary (unused) hostname. All existing edge certificates for the 5,000 websites were also modified. For the experimental group, the corresponding certificates were renewed with cdnjs.cloudflare.com added to the SAN. To ensure integrity between the control and experimental sets, control-group certificates were also renewed, with a valid third-party domain of identical size (used by none of the control domains) added to the SAN. This kept the relative size change of the certificates constant, avoiding potential bias due to differing certificate sizes. Our results were striking!

[Figure: new TLS connections per second, experiment set vs. control set]

Sampling 1% of the requests we received from Firefox to the websites in the experiment, we identified an over 50% reduction in new TLS connections per second, indicating fewer cryptographic verification operations on the client and reduced compute overheads on the server. As expected, there were no differences in the control set, confirming the effectiveness of connection reuse as seen by CDN or server operators.

Discussion and insights

While our modeling indicated that we could anticipate some performance improvements, in practice performance was not significantly better, suggesting that ‘no worse’ is the appropriate mental model. The subtle interplay between resource object sizes, competing connections, and congestion control is subject to network conditions. Bottleneck-share capacity, for example, diminishes as fewer connections compete for bottleneck resources on network links. It would be interesting to revisit these measurements as more operators deploy support for ORIGIN Frames on their servers.

Apart from performance, one major benefit of ORIGIN frames is privacy. How? Each coalesced connection hides client metadata that would otherwise leak from non-coalesced connections. Certain resources on a web page are loaded depending on how one interacts with the website. This means that for every new connection made to retrieve some resource from the server, TLS plaintext metadata like the SNI (in the absence of Encrypted Client Hello), and at least one plaintext DNS query (if transmitted over UDP or TCP on port 53), are exposed to the network. Coalescing connections removes the need for browsers to open new TLS connections and to issue extra DNS queries, minimizing the cleartext signals available to on-path network eavesdroppers.

While browsers benefit from the reduced cryptographic computation needed to verify multiple certificates, a major advantage is that coalescing opens up very interesting future opportunities for resource scheduling at the endpoints (the browsers and the origin servers), such as prioritization, or recent proposals like HTTP Early Hints, providing client experiences where connections are not overloaded or competing for resources. When coupled with the CERTIFICATE Frames IETF draft, we could further eliminate the need for manual certificate modifications, as a server could prove its authority over hostnames after connection establishment, without any additional SAN entries on the website’s TLS certificate.

Conclusion and call to action

In summary, the current Internet ecosystem has a lot of opportunity for connection coalescing, with only a few changes needed to certificates and server infrastructure. Servers can reduce the number of TLS handshakes by roughly 50%, and the number of render-blocking DNS queries by over 60%. Clients additionally reap privacy benefits by reducing cleartext DNS exposure to network onlookers.

To help make this a reality we are currently planning to add support for both HTTP/2 and HTTP/3 ORIGIN Frames for our customers. We also encourage other operators that manage third-party resources to adopt ORIGIN Frame support to improve the Internet ecosystem.

Our paper was accepted to the ACM Internet Measurement Conference 2022 and is available for download. If you’d like to work on projects like this, where you get to see the rubber meet the road for new standards, visit our careers page!

The state of HTTP in 2022

Post Syndicated from Mark Nottingham original https://blog.cloudflare.com/the-state-of-http-in-2022/

At over thirty years old, HTTP is still the foundation of the web and one of the Internet’s most popular protocols—not just for browsing, watching videos and listening to music, but also for apps, machine-to-machine communication, and even as a basis for building other protocols, forming what some refer to as a “second waist” in the classic Internet hourglass diagram.

What makes HTTP so successful? One answer is that it hits a “sweet spot” for most applications that need an application protocol. “Building Protocols with HTTP” (published in 2022 as a Best Current Practice RFC by the HTTP Working Group) argues that HTTP’s success can be attributed to factors like:

– familiarity by implementers, specifiers, administrators, developers, and users;
– availability of a variety of client, server, and proxy implementations;
– ease of use;
– availability of web browsers;
– reuse of existing mechanisms like authentication and encryption;
– presence of HTTP servers and clients in target deployments; and
– its ability to traverse firewalls.

Another important factor is the community of people using, implementing, and standardising HTTP. We work together to maintain and develop the protocol actively, to assure that it’s interoperable and meets today’s needs. If HTTP stagnates, another protocol will (justifiably) replace it, and we’ll lose all the community’s investment, shared understanding and interoperability.

Cloudflare and many others do this by sending engineers to participate in the IETF, where most Internet protocols are discussed and standardised. We also attend and sponsor community events like the HTTP Workshop to have conversations about what problems people have and what they need, and to understand what changes might help them.

So what happened at all of those working group meetings, specification documents, and side events in 2022? What are implementers and deployers of the web’s protocol doing? And what’s coming next?

New specification: HTTP/3

Specification-wise, the biggest thing to happen in 2022 was the publication of HTTP/3, because it was an enormous step towards keeping up with the requirements of modern applications and sites by using the network more efficiently to unblock web performance.

Way back in the 90s, HTTP/0.9 and HTTP/1.0 used a new TCP connection for each request—an astoundingly inefficient use of the network. HTTP/1.1 introduced persistent connections (which were backported to HTTP/1.0 with the `Connection: Keep-Alive` header). This was an improvement that helped servers and the network cope with the explosive popularity of the web, but even back then, the community knew it had significant limitations—in particular, head-of-line blocking (where one outstanding request on a connection blocks others from completing).

That didn’t matter so much in the 90s and early 2000s, but today’s web pages and applications place demands on the network that make these limitations performance-critical. Pages often have hundreds of assets that all compete for network resources, and HTTP/1.1 wasn’t up to the task. After some false starts, the community finally addressed these issues with HTTP/2 in 2015.

However, removing head-of-line blocking in HTTP exposed the same problem one layer lower, in TCP. Because TCP is an in-order, reliable delivery protocol, loss of a single packet in a flow can block access to those after it—even if they’re sitting in the operating system’s buffers. This turns out to be a real issue for HTTP/2 deployment, especially on less-than-optimal networks.

The answer, of course, was to replace TCP—the venerable transport protocol that so much of the Internet is built upon. After much discussion and many drafts in the QUIC Working Group, QUIC version 1 was published as that replacement in 2021.

HTTP/3 is the version of HTTP that uses QUIC. While the working group effectively finished it in 2021 along with QUIC, its publication was held until 2022 to synchronise with the publication of other documents (see below). 2022 was also a milestone year for HTTP/3 deployment; Cloudflare saw increasing adoption and confidence in the new protocol.

While there was only a brief gap of a few years between HTTP/2 and HTTP/3, there isn’t much appetite in the community for working on HTTP/4 soon. QUIC and HTTP/3 are both new, and the world is still learning how best to implement the protocols, operate them, and build sites and applications using them. While we can’t rule out a limitation that will force a new version in the future, the IETF built these protocols based upon broad industry experience with modern networks, and they have significant extensibility available to ease any necessary changes.

New specifications: HTTP “core”

The other headline event for HTTP specs in 2022 was the publication of its “core” documents — the heart of HTTP’s specification. The core comprises:

– HTTP Semantics – things like methods, headers, status codes, and the message format;
– HTTP Caching – how HTTP caches work;
– HTTP/1.1 – mapping semantics to the wire, using the format everyone knows and loves.

Additionally, HTTP/2 was republished to properly integrate with the Semantics document, and to fix a few outstanding issues.

This is the latest in a long series of revisions for these documents—in the past, we’ve had the RFC 723x series, the (perhaps most well-known) RFC 2616, RFC 2068, and the grandparent of them all, RFC 1945. Each revision has aimed to improve readability, fix errors, explain concepts better, and clarify protocol operation. Poorly specified (or implemented) features are deprecated; new features that improve protocol operation are added. See the ‘Changes from…’ appendix in each document for the details. And, importantly, always refer to the latest revisions linked above; the older RFCs are now obsolete.

Deploying Early Hints

HTTP/2 included server push, a feature designed to allow servers to “push” a request/response pair to clients when they knew the client was going to need something, so it could avoid the latency penalty of making a request and waiting for the response.

After HTTP/2 was finalised in 2015, Cloudflare and many other HTTP implementations soon rolled out server push in anticipation of big performance wins. Unfortunately, it turned out that’s harder than it looks; server push effectively requires the server to predict the future—not only what requests the client will send but also what the network conditions will be. And, when the server gets it wrong (“over-pushing”), the pushed requests directly compete with the real requests that the browser is making, representing a significant opportunity cost with real potential for harming performance, rather than helping it. The impact is even worse when the browser already has a copy in cache, so it doesn’t need the push at all.

As a result, Chrome removed HTTP/2 server push in 2022. Other browsers and servers might still support it, but the community seems to agree that it’s only suitable for specialised uses currently, like the browser notification-specific Web Push Protocol.

That doesn’t mean that we’re giving up, however. The 103 (Early Hints) status code was published as an Experimental RFC by the HTTP Working Group in 2017. It allows a server to send hints to the browser in a non-final response, before the “real” final response. That’s useful if you know that the content is going to include some links to resources that the browser will fetch, but you need more time to get the final response to the client (because it will take more time to generate, or because the server needs to fetch it from somewhere else, like a CDN does).
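As a sketch of what this looks like server-side: Go's net/http (1.19 and later) can emit informational responses by calling WriteHeader with a 1xx code before the final status. The Link values and the sleep below are illustrative stand-ins for real resources and real page generation:

package main

import (
	"io"
	"net/http"
	"time"
)

// earlyHintsHandler sends a 103 Early Hints response with preload Link
// headers, letting the browser start fetching subresources while the
// final response is still being produced.
func earlyHintsHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Add("Link", "</style.css>; rel=preload; as=style")
	w.Header().Add("Link", "</app.js>; rel=preload; as=script")
	w.WriteHeader(http.StatusEarlyHints) // 103: hints only, not final

	time.Sleep(100 * time.Millisecond) // stand-in for slow page generation
	w.WriteHeader(http.StatusOK)
	io.WriteString(w, "<html>…</html>")
}

func main() {
	http.HandleFunc("/", earlyHintsHandler)
	http.ListenAndServe("127.0.0.1:8080", nil)
}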

Early Hints can be used in many situations that server push was designed for, for example, when you have CSS and JavaScript that a page is going to need to load. In theory, they’re not as optimal as server push, because they only allow hints to be sent when there’s an outstanding request, and because getting the hints to the client and having them acted upon adds some latency.

In practice, however, Cloudflare and our partners (like Shopify and Google) spent 2022 experimenting with Early Hints and finding them much safer to use, with promising performance benefits that include significant reductions in key web performance metrics.

We’re excited about the potential that Early Hints show; so excited that we’ve integrated them into Cloudflare Pages. We’re also evaluating new ways to improve performance using this new capability in the protocol.

Privacy-focused intermediation

For many, the most exciting HTTP protocol extensions in 2022 focused on intermediation—the ability to insert proxies, gateways, and similar components into the protocol to achieve specific goals, often focused on improving privacy.

The MASQUE Working Group, for example, is adding new tunneling capabilities to HTTP, so that an intermediary can pass the tunneled traffic along to another server.

While CONNECT has enabled TCP tunnels for a long time, MASQUE enables UDP tunnels, allowing more protocols to be tunneled more efficiently, including QUIC and HTTP/3.

At Cloudflare, we’re enthusiastic to be working with Apple to use MASQUE to implement iCloud Private Relay and enhance their customers’ privacy without relying solely on one company. We’re also very interested in the Working Group’s future work, including IP tunneling that will enable MASQUE-based VPNs.

Another intermediation-focused specification is Oblivious HTTP (or OHTTP). OHTTP uses sets of intermediaries to prevent the server from using connections or IP addresses to track clients, giving greater privacy assurances for things like collecting telemetry or other sensitive data. This specification is just finishing the standards process, and we’re using it to build an important new product, Privacy Gateway, to protect the privacy of our customers’ customers.

We and many others in the Internet community believe that this is just the start, because intermediation can partition communication, a valuable tool for improving privacy.

Protocol security

Finally, 2022 saw a lot of work on security-related aspects of HTTP. The Digest Fields specification is an update to the now-ancient `Digest` header field, allowing integrity digests to be added to messages. The HTTP Message Signatures specification enables cryptographic signatures on requests and responses — something that has widespread ad hoc deployment, but until now has lacked a standard. Both specifications are in the final stages of standardisation.

A revision of the Cookie specification also saw a lot of progress in 2022, and should be final soon. Since it’s not possible to get rid of cookies completely any time soon, much work has taken place to limit how they operate to improve privacy and security, including a new `SameSite` attribute.
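For illustration, here is how the `SameSite` attribute surfaces in Go's net/http; the cookie name and value here are arbitrary:

package cookies

import "net/http"

// setSessionCookie marks the session cookie SameSite=Lax, so browsers
// send it only on same-site requests and top-level navigations, which
// blunts CSRF and some cross-site tracking.
func setSessionCookie(w http.ResponseWriter) {
	http.SetCookie(w, &http.Cookie{
		Name:     "session",
		Value:    "opaque-token", // illustrative value
		Path:     "/",
		Secure:   true,
		HttpOnly: true,
		SameSite: http.SameSiteLaxMode,
	})
}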

Another set of security-related specifications that Cloudflare has invested in for many years is Privacy Pass, also known as “Private Access Tokens”. These are cryptographic tokens that can assure servers that clients are real people, not bots, without using an intrusive CAPTCHA, and without tracking the user’s activity online. In HTTP, they take the form of a new authentication scheme.

While Privacy Pass is still not quite through the standards process, 2022 saw its broad deployment by Apple, a huge step forward. And since Cloudflare uses it in Turnstile, our CAPTCHA alternative, your users can have a better experience today.

What about 2023?

So, what’s next? Besides the specifications above that aren’t quite finished, the HTTP Working Group has a few other works in progress, including a QUERY method (think GET but with a body), Resumable Uploads (based on tus), Variants (an improved Vary header for caching), improvements to Structured Fields (including a new Date type), and a way to retrofit existing headers into Structured Fields. We’ll write more about these as they progress in 2023.

At the 2022 HTTP Workshop, the community also talked about what new work we can take on to improve the protocol. Some ideas discussed included improving our shared protocol testing infrastructure (right now we have a few resources, but it could be much better), improving (or replacing) Alternative Services to allow more intelligent and correct connection management, and more radical changes, like alternative, binary serialisations of headers.

There’s also a continuing discussion in the community about whether HTTP should accommodate pub/sub, or whether it should be standardised to work over WebSockets (and soon, WebTransport). Although it’s hard to say now, adjacent work on Media over QUIC that just started might provide an opportunity to push this forward.

Of course, that’s not everything, and what happens to HTTP in 2023 (and beyond) remains to be seen. HTTP is still evolving, even as it stays compatible with the largest distributed hypertext system ever conceived—the World Wide Web.

Delivering HTTP/2 upload speed improvements

Post Syndicated from Junho Choi original https://blog.cloudflare.com/delivering-http-2-upload-speed-improvements/

Cloudflare recently shipped improved upload speeds across our network for clients using HTTP/2. This post describes our journey from troubleshooting an issue to fixing it and delivering faster upload speeds to the global Internet.

We launched speed.cloudflare.com in May 2020 to give our users insight into how well their networks perform. The test provides download, upload and latency tests. Soon after release, we received reports from a small number of users that their upload speeds were sometimes underreported. Our investigation determined that this seemed to happen for end users with high upload bandwidth available (several-hundred-Mbps-class cable modem or fiber service). Our speed tests are performed via browser JavaScript, and most browsers use HTTP/2 by default. We found that HTTP/2 upload speeds were sometimes much slower than HTTP/1.1 (both over TLS) when the user had high available upload bandwidth.

Upload speed is more important than ever, especially for people using home broadband connections. As many people have been forced to work from home, they’re using their broadband connections differently than before. Prior to the pandemic, broadband traffic was very asymmetric (you downloaded way more than you uploaded… think listening to music, or streaming a movie), but now we’re seeing an increase in uploading as people video conference from home or create content from their home office.

Initial Tests

User reports were focused on particularly fast home networks. I set up a dummynet network simulator to test upload speed in a controlled environment. I launched a Linux VM running our code inside my MacBook Pro and set up a dummynet between the VM and the Mac host. Measuring upload speed is simple: create a file and upload it using curl to an endpoint which accepts a request body. I ran the same test 20 times and took the median upload speed (Mbps).

# Create a 10MB file of random data
% dd if=/dev/urandom of=test.dat bs=1M count=10
# Upload it over HTTP/1.1 and then HTTP/2, printing the measured upload speed
% curl --http1.1 -w '%{speed_upload}\n' -sf -o/dev/null --data-binary @test.dat https://edge/upload-endpoint
% curl --http2 -w '%{speed_upload}\n' -sf -o/dev/null --data-binary @test.dat https://edge/upload-endpoint

Stepping up to uploading a 10MB object over a network with 200Mbps available bandwidth and 40ms RTT, the result was surprising. Using our production configuration, HTTP/2 upload speed tested at almost half that of HTTP/1.1 under the same conditions (higher is better).

[Figure: upload speed, HTTP/1.1 vs HTTP/2 (200Mbps, 40ms RTT)]

The result may differ depending on your network, but the gap is bigger when the network is fast. On a slow network, like my home cable connection (5Mbps upload and 20ms RTT), HTTP/2 upload speed was almost identical to the performance observed with HTTP/1.1.

Receiver Flow Control

Before we get into more detail on this topic, I'll note that my intuition suggested the issue was related to receiver flow control. Usually the client (browser or any HTTP client) is the receiver of data, but when the client is uploading content to the server, the server is the receiver. And the receiver needs some type of flow control for its receive buffer.

How we handle receiver flow control differs between HTTP/1.1 and HTTP/2. For example, HTTP/1.1 doesn’t define protocol-level receiver flow control since there is no multiplexing of requests in the connection; it’s left to the TCP layer, which handles receiving data. Note that most modern OS TCP stacks have auto-tuning of the receive buffer (we will revisit that later), and they tune based on the current BDP (bandwidth-delay product).

In the case of HTTP/2, there is a stream-level flow control mechanism because the protocol supports multiplexing of streams. Each HTTP/2 stream has its own flow control window and there is connection level flow control for all streams in the connection. If it’s too tight, the sender will be blocked by the flow control. If it’s too loose we may end up wasting memory for buffering. So keeping it optimal is important when implementing flow control and the most optimal strategy is to keep the receive buffer matching the current BDP. BDP represents the maximum bytes in flight in the given network and can be used as an optimal buffer size.
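To make that concrete with the test network used above: a 200Mbps link with a 40ms RTT has a BDP of 200,000,000 bit/s × 0.04 s = 8,000,000 bits, or roughly 1MB. Once the BDP outgrows the receive buffer, flow control, rather than the link, becomes the limit on upload throughput.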

How NGINX handles the request body buffer

Initially I tried to find a parameter which controls NGINX upload buffering, to see whether tuning its value improved the result. A couple of directives relate to buffering a request body, such as proxy_buffering and client_body_buffer_size, and one, http2_body_preread_size, is HTTP/2 specific.

Cloudflare does not use the proxy_buffering directive, so it can be immediately discounted. client_body_buffer_size is the size of the request body buffer, which is used regardless of the protocol, so it applies to both HTTP/1.1 and HTTP/2.

When looking into the code, here is how it works:

[Figure: how NGINX buffers the request body for HTTP/1.1 and HTTP/2]

  • HTTP/1.1: use the client_body_buffer_size buffer as a buffer between the upstream and the client, simply repeating reads and writes through this buffer.
  • HTTP/2: since we need flow control window updates for HTTP/2 DATA frames, there are two parameters:
    • http2_body_preread_size: the size of the initial request body read before it starts to send to the upstream.
    • client_body_buffer_size: the size of the request body buffer.
    • Those two parameters are used for allocating a request body buffer during uploading. Here is a brief summary of how unbuffered upload works (see the configuration sketch after this list):
      • Allocate a single request body buffer whose size is the maximum of http2_body_preread_size and client_body_buffer_size. This means if http2_body_preread_size is 64KB and client_body_buffer_size is 128KB, then a 128KB buffer is allocated. We use 128KB for client_body_buffer_size.
      • The HTTP/2 SETTINGS_INITIAL_WINDOW_SIZE of the stream is set to http2_body_preread_size, and we use 64KB as a default (the RFC 7540 default value).
      • The HTTP/2 module reads up to http2_body_preread_size before sending to the upstream.
      • After flushing the preread buffer, NGINX keeps reading from the client, writing to the upstream, and sending WINDOW_UPDATE frames back to the client when necessary, until the request body is fully received.
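For reference, here is roughly how those two directives would appear in an NGINX configuration; the values mirror the 64KB default and the 128KB production value discussed above:

http2_body_preread_size 64k;   # initial stream flow control window for request bodies
client_body_buffer_size 128k;  # request body buffer, used for any protocol version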

To summarise what this means: HTTP/1.1 simply uses a single buffer, so TCP socket buffers do the flow control. However with HTTP/2, the application layer also has receiver flow control and NGINX uses a fixed size buffer for the receiver. This limits upload speed when the current link has a BDP larger than the current request body buffer size. So the bottleneck is HTTP/2 flow control when the buffer size is too tight.

We’re going to need a bigger buffer?

In theory, bigger buffer sizes should avoid upload bottlenecks, so I tried a few out by running my tests again. The previous chart result is now labelled “prod” and plotted alongside HTTP/2 tests with client_body_buffer_size set to 256KB, 512KB and 1024KB:

[Figure: upload speed with client_body_buffer_size of 256KB, 512KB and 1024KB (40ms RTT)]

It appears 512KB is an optimal value for client_body_buffer_size.

What if I test with other network parameters? Here is the result when RTT is 10ms; in this case, 256KB looks optimal.

[Figure: upload speed with varying buffer sizes (10ms RTT)]

Both cases look much better than the current 128KB and achieve performance similar to HTTP/1.1, or even better. However, the optimal buffer size is a moving target, and having too large a buffer can hurt performance: we need a smart way to find the optimal buffer size.

Autotuning request body buffer size

One of the ideas that can help this kind of situation is autotuning. For example, modern TCP stacks autotune their receive buffers automatically. In production, our edge also has TCP receiver buffer autotuning enabled by default.

net.ipv4.tcp_moderate_rcvbuf = 1

But in the case of HTTP/2, TCP buffer autotuning is not very effective because the HTTP/2 layer is doing its own flow control, and the existing 128KB was too small for a high-BDP link. At this point, I decided to pursue autotuning the HTTP/2 receive buffer size as well, similar to what TCP does.

The basic idea is that NGINX doubles the size of HTTP/2 request body buffer size based on its BDP. Here is an algorithm currently implemented in our version of NGINX:

  • Allocate a request body buffer as explained above.
  • For every RTT (using Linux tcp_info), update the current BDP.
  • Double the request body buffer size when the current BDP > (receiver window / 4), as sketched below.
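The production implementation lives in our fork of NGINX (in C), but the core decision is small enough to sketch. This Go version is purely illustrative; bandwidthBps and rtt stand in for values read from the kernel's tcp_info for the connection:

package autotune

import "time"

// nextBodyBufferSize doubles the HTTP/2 request body buffer whenever the
// connection's bandwidth-delay product outgrows a quarter of the current
// receiver window, up to a configured maximum.
func nextBodyBufferSize(cur, max int, bandwidthBps float64, rtt time.Duration) int {
	bdp := int(bandwidthBps / 8 * rtt.Seconds()) // bytes in flight on the link
	if bdp > cur/4 && cur*2 <= max {
		return cur * 2
	}
	return cur
}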

Test Result

Lab Test

Here is a test result with HTTP/2 autotune upload enabled (still using client_body_buffer_size 128KB). You can see “h2 autotune” is doing pretty well – similar to, or even slightly faster than, HTTP/1.1 speed (that was the initial goal). It might be slightly worse than a hand-picked optimal buffer size for given conditions, but you can see that NGINX now picks an optimal buffer size automatically based on network conditions.

[Figures: lab upload speed with HTTP/2 autotune vs. HTTP/1.1, 40ms and 10ms RTT]

Production test

After we deployed this feature, I ran similar tests against our production edge, uploading a 10MB file from well-connected client nodes to our edge. I created a Linux VM instance in Google Cloud and ran the upload test where the network is very fast (a few Gbps) and latency is low (<10ms).

Here is the result when I ran the test from the Google Cloud Belgium region to our CDG (Paris) PoP, which has a 7ms RTT. This looks very good, with an almost 3x improvement.

[Figure: upload speed, Google Cloud Belgium to CDG PoP (7ms RTT)]

I also tested between the Google Cloud Tokyo region and our NRT (Tokyo) PoP, which had a 2.3ms RTT. Although this is not realistic for home users, the results are interesting. A 128KB fixed buffer performs well, but HTTP/2 with buffer autotune outperforms even HTTP/1.1.

[Figure: upload speed, Google Cloud Tokyo to NRT PoP (2.3ms RTT)]

Summary

HTTP/2 upload buffer autotuning is now fully deployed at the Cloudflare edge. Customers should now benefit from improved upload performance for all HTTP/2 connections, including speed tests on speed.cloudflare.com. Autotuning the upload buffer size works well for most cases, and HTTP/2 upload is now much faster than before! When we think about performance we usually tend to think about download speed or latency reduction, but faster uploading can also help users working from home when they need to upload large amounts of data, as with photo/video sharing apps, content creation, video conferencing, or self-broadcasting.

Many thanks to Lucas Pardue and Rustam Lalkaka for great feedback and suggestions on the article.