QUIC is a new Internet transport protocol for secure, reliable and multiplexed communications. HTTP/3 builds on top of QUIC, leveraging the new features to fix performance problems such as Head-of-Line blocking. This enables web pages to load faster, especially over troublesome networks.
QUIC and HTTP/3 are open standards that have been under development in the IETF for almost exactly 4 years. On October 21, 2020, following two rounds of Working Group Last Call, draft 32 of the family of documents that describe QUIC and HTTP/3 were put into IETF Last Call. This is an important milestone for the group. We are now telling the entire IETF community that we think we’re almost done and that we’d welcome their final review.
Speaking personally, I’ve been involved with QUIC in some shape or form for many years now. Earlier this year I was honoured to be asked to help co-chair the Working Group. I’m pleased to help shepherd the documents through this important phase, and grateful for the efforts of everyone involved in getting us there, especially the editors. I’m also excited about future opportunities to evolve on top of QUIC v1 to help build a better Internet.
There are two aspects to protocol development. One aspect involves writing and iterating upon the documents that describe the protocols themselves. Then, there’s implementing, deploying and testing libraries, clients and/or servers. These aspects operate hand in hand, helping the Working Group move towards satisfying the goals listed in its charter. IETF Last Call marks the point that the group and their responsible Area Director (in this case Magnus Westerlund) believe the job is almost done. Now is the time to solicit feedback from the wider IETF community for review. At the end of the Last Call period, the stakeholders will take stock, address feedback as needed and, fingers crossed, go onto the next step of requesting the documents be published as RFCs on the Standards Track.
Although specification and implementation work hand in hand, they often progress at different rates, and that is totally fine. The QUIC specification has been mature and deployable for a long time now. HTTP/3 has been generally available on the Cloudflare edge since September 2019, and we’ve been delighted to see support roll out in user agents such as Chrome, Firefox, Safari, curl and so on. Although draft 32 is the latest specification, the community has for the time being settled on draft 29 as a solid basis for interoperability. This shouldn’t be surprising, as foundational aspects crystallize the scope of changes between iterations decreases. For the average person in the street, there’s not really much difference between 29 and 32.
So today, if you visit a website with HTTP/3 enabled—such as https://cloudflare-quic.com—you’ll probably see response headers that contain Alt-Svc: h3-29=”… . And in a while, once Last Call completes and the RFCs ship, you’ll start to see websites simply offer Alt-Svc: h3=”… (note, no draft version!).
Need a deep dive?
We’ve collected a bunch of resource links at https://cloudflare-quic.com. If you’re more of an interactive visual learner, you might be pleased to hear that I’ve also been hosting a series on Cloudflare TV called “Levelling up Web Performance with HTTP/3”. There are over 12 hours of content including the basics of QUIC, ways to measure and debug the protocol in action using tools like Wireshark, and several deep dives into specific topics. I’ve also been lucky to have some guest experts join me along the way. The table below gives an overview of the episodes that are available on demand.
Understanding the role of congestion control in QUIC. Featuring Junho Choi.
So does Last Call mean QUIC is “done”? Not by a long shot. The new protocol is a giant leap for the Internet, because it enables new opportunities and innovation. QUIC v1 is basically the set of documents that have gone into Last Call. We’ll continue to see people gain experience deploying and testing this, and no doubt cool blog posts about tweaking parameters for efficiency and performance are on the radar. But QUIC and HTTP/3 are extensible, so we’ll see people interested in trying new things like multipath, different congestion control approaches, or new ways to carry data unreliably such as the DATAGRAM frame.
We’re also seeing people interested in using QUIC for other use cases. Mapping other application protocols like DNS to QUIC is a rapid way to get its improvements. We’re seeing people that want to use QUIC as a substrate for carrying other transport protocols, hence the formation of the MASQUE Working Group. There’s folks that want to use QUIC and HTTP/3 as a “supercharged WebSocket”, hence the formation of the WebTransport Working Group.
Whatever the future holds for QUIC, we’re just getting started, and I’m excited.
HTTP/3 is the third major version of the Hypertext Transfer Protocol, which takes the bold step of moving away from TCP to the new transport protocol QUIC in order to provide performance and security improvements.
During Cloudflare’s Birthday Week 2019, we were delighted to announce that we had enabled QUIC and HTTP/3 support on the Cloudflare edge network. This was joined by support from Google Chrome and Mozilla Firefox, two of the leading browser vendors and partners in our effort to make the web faster and more reliable for all. A big part of developing new standards is interoperability, which typically means different people analysing, implementing and testing a written specification in order to prove that it is precise, unambiguous, and actually implementable.
At the time of our announcement, Chrome Canary had experimental HTTP/3 support and we were eagerly awaiting a release of Firefox Nightly. Now that Firefox supports HTTP/3 we thought we’d share some instructions to help you enable and test it yourselves.
How do I enable HTTP/3 for my domain?
Simply go to the Cloudflare dashboard and flip the switch from the “Network” tab manually:
Using Firefox Nightly as an HTTP/3 client
Firefox Nightly has experimental support for HTTP/3. In our experience things are pretty good but be aware that you might experience some teething issues, so bear that in mind if you decide to enable and experiment with HTTP/3. If you’re happy with that responsibility, you’ll first need to download and install the latest Firefox Nightly build. Then open Firefox and enable HTTP/3 by visiting “about:config” and setting “network.http.http3.enabled” to true. There are some other parameters that can be tweaked but the defaults should suffice.
Once HTTP/3 is enabled, you can visit your site to test it out. A straightforward way to check if HTTP/3 was negotiated is to check the Developer Tools “Protocol” column in the “Network” tab (on Windows and Linux the Developer Tools keyboard shortcut is Ctrl+Shift+I, on macOS it’s Command+Option+I). This “Protocol” column might not be visible at first, so to enable it right-click one of the column headers and check “Protocol” as shown below.
Then reload the page and you should see that “HTTP/3” is reported.
The aforementioned teething issues might cause HTTP/3 not to show up initially. When you enable HTTP/3 on a zone, we add a header field such as alt-svc: h3-27=":443"; ma=86400, h3-28=":443"; ma=86400, h3-29=":443"; ma=86400 to all responses for that zone. Clients see this as an advertisement to try HTTP/3 out and will take up the offer on the next request. So to make this happen you can reload the page but make sure that you bypass the local browser cache (via the “Disable Cache” checkbox, or use the Shift-F5 key combo) or else you’ll just see the protocol used to fetch the resource the first time around. Finally, Firefox provides the “about:networking” page which provides a list of visited zones and the HTTP version that was used to load them; for example, this very blog.
Sometimes browsers can get sticky to an existing HTTP connection and will refuse to start an HTTP/3 connection, this is hard to detect by humans, so sometimes the best option is to close the app completely and reopen it. Finally, we’ve also seen some interactions with Service Workers that make it appear that a resource was fetched from the network using HTTP/1.1, when in fact it was fetched from the local Service Worker cache. In such cases if you’re keen to see HTTP/3 in action then you’ll need to deregister the Service Worker. If you’re in doubt about what is happening on the network it is often useful to verify things independently, for example capturing a packet trace and dissecting it with Wireshark.
After more than three and a half years and substantial discussion, all 845 of the design issues raised against the QUIC protocol drafts have gained consensus or have a proposed resolution. In that time the protocol has been considerably transformed; it has become more secure, much more widely implemented, and has been shown to be interoperable. Both the Chairs and the Editors feel that it is ready to proceed in standardisation.
The coming months will see the specifications settle and we anticipate that implementations will continue to improve their QUIC and HTTP/3 support, eventually enabling it in their stable channels. We’re pleased to continue working with industry partners such as Mozilla to help build a better Internet together.
In this post, we will talk about QUIC congestion control and loss recovery briefly and CUBIC and HyStart++ in the quiche congestion control module. We will also discuss lab test results and how to visualize those using qlog which was recently added to the quiche library as well.
QUIC Congestion Control and Loss Recovery
In the network transport area, congestion control is how to decide how much data the connection can send into the network. It has an important role in networking so as not to overrun the link but also at the same time it needs to play nice with other connections in the same network to ensure that the overall network, the Internet, doesn’t collapse. Basically congestion control is trying to detect the current capacity of the link and tune itself in real time and it’s one of the core algorithms for running the Internet.
QUIC congestion control has been written based on many years of TCP experience, so it is little surprise that the two have mechanisms that bear resemblance. It’s based on the CWND (congestion window, the limit of how many bytes you can send to the network) and the SSTHRESH (slow start threshold, how to start a new collection slowly so as not to overwhelm the network). Congestion control mechanisms can have complicated edge cases and can be hard to tune. Since QUIC is a new transport protocol that people are implementing from scratch, the current draft recommends Reno as a relatively simple mechanism to get people started. However, it has known limitations and so QUIC is designed to have pluggable congestion control; it’s up to implementers to adopt any more advanced ones of their choosing.
Since Reno became the standard for TCP congestion control, many congestion control algorithms have been proposed by academia and industry. Largely there are two categories: loss-based congestion control such as Reno and CUBIC, where the congestion control responds to a packet loss event, and delay-based congestion control, such as Vegas and BBR , which the algorithm tries to find a balance between the bandwidth and RTT increase and tune the packet send rate.
You can port TCP based congestion control algorithms to QUIC without much change by implementing a few hooks. quiche provides a modular API to add a new congestion control module easily.
Loss detection is how to detect packet loss at the sender side. It’s usually separated from the congestion control algorithm but helps the congestion control to quickly respond to the congestion. Packet loss can be a result of the congestion on the link, but the link layer may also drop a packet without congestion due to the characteristics of the physical layer, such as on a WiFi or mobile network.
Traditionally TCP uses 3 DUP ACKs for ACK based detection, but delay-based loss detection such as RACK has also been used over the years. QUIC combines the lesson from TCP into two categories . One is based on the packet threshold (similar to 3 DUP ACK detection) and the other is based on a time threshold (similar to RACK). QUIC also has ACK Ranges similar to TCP SACK to provide a hole in the receive buffer but ACK Ranges can keep a longer list of received packets in the ACK frame than TCP SACK. This simplifies the implementation overall and helps provide quick recovery when there is multiple loss.
Reno is easy to understand and doesn’t need additional memory to store the state so can be implemented in low spec hardware too. However, its slow start can be very aggressive because it keeps increasing the CWND quickly until it sees congestion. In other words, it doesn’t stop until it sees the packet loss.
Note that there are multiple states for Reno; Reno starts from “slow start” mode which increases the CWND very aggressively, roughly 2x for every RTT until the congestion is detected or CWND > SSTHRESH. When packet loss is detected, it enters into the “recovery” mode until packet loss is recovered.
When it exits from recovery (no lost ranges) and CWND > SSTHRESH, it enters into the “congestion avoidance” mode where the CWND grows slowly (roughly a full packet per RTT) and tries to converge on a stable CWND. As a result you will see a “sawtooth” pattern when you make a graph of the CWND over time.
Here is an example of Reno congestion control CWND graph. See the “Congestion Window” line.
CUBIC was announced in 2008 and became the default congestion control in the Linux kernel. Currently it’s defined in RFC8312 and implemented in many OS including Linux, BSD and Windows. quiche’s CUBIC implementation follows RFC8312 with a fix made by Google in the Linux kernel .
What makes the difference from Reno is during congestion avoidance its CWND growth is based on CUBIC function as follows:
Wmax is the value of CWND when the congestion is detected. Then it will reduce the CWND by 30% and then the CWND starts to grow again using a CUBIC function as in the graph, approaching Wmax aggressively in the beginning in the first half but slowly converging to Wmax later. This makes sure that CWND growth approaches the previous point carefully and once we pass Wmax, it starts to grow aggressively again after some time to find a new max CWND (this is called “Max Probing”).
Also it has a “TCP-friendly” (actually a Reno-friendly) mode to make sure CWND growth is always bigger than Reno. When the congestion event happens, CUBIC reduces its CWND by 30%, where Reno cuts down CWND by 50%. This makes CUBIC a little more aggressive on packet loss.
Note that the original CUBIC only defines how to update the CWND during congestion avoidance. Slow start mode is exactly the same as Reno.
The authors of CUBIC made a separate effort to improve slow start because CUBIC only changed the way the CWND grows during congestion avoidance. They came up with the idea of HyStart .
HyStart is based on two ideas and basically changes how the CWND is updated during slow start:
RTT delay samples: when the RTT is increased during slow start and over the threshold, it exits slow start early and enters into congestion avoidance.
ACK train: When ACK inter-arrival time gets higher and over the threshold, it exits slow start early and enters into congestion avoidance.
However in the real world, ACK train is not very useful because of ACK compression (merging multiple ACKs into one). Also RTT delay may not work well when the network is unstable.
To improve such situations there is a new IETF draft proposed by Microsoft engineers named HyStart++ . HyStart++ is included in the Windows 10 TCP stack with CUBIC.
It’s a little different from original HyStart:
No ACK Train, only RTT sampling.
Add a LSS (Limited Slow Start) phase after exiting slow start. LSS grows the CWND faster than congestion avoidance but slower than Reno slow start. Instead of going into congestion avoidance directly, slow start exits to LSS and LSS exits to congestion avoidance when packet loss happens.
In quiche, HyStart++ is turned on by default for both Reno and CUBIC congestion control and can be configured via API.
Here is a test result using the test lab . The test condition is as follows:
5Mbps bandwidth, 60ms RTT with a different packet loss from 0% to 8%
In these tests, TCP-CUBIC (blue bars) is the baseline to which we compare the performance of QUIC congestion control variants. We include QUIC-RENO (red and yellow bars) because that is the default QUIC baseline. Reno is simpler so we expect it to perform worse than TCP-CUBIC. QUIC-CUBIC (green and orange bars) should perform the same or better than TCP-CUBIC.
You can see with 0% packet loss TCP and QUIC are almost doing the same (but QUIC is slightly slower). As packet loss increases QUIC CUBIC performs better than TCP CUBIC. QUIC loss recovery looks to work well, which is great news for real-world networks that do encounter loss.
With HyStart++, overall performance doesn’t change but that is to be expected, because the main goal of HyStart++ is to prevent overshooting the network. We will see that in the next section.
The impact of HyStart++
HyStart++ may not improve the download time but it will reduce packet loss while maintaining the same performance without it. Since slow start will exit to congestion avoidance when packet loss is detected, we focus on 0% packet loss where only network congestion creates packet loss.
For each test, the number of detected packets lost (not the retransmit count) is shown in the following chart. The lost packets number is the average of 20 runs for each test.
As shown above, you can see that HyStart++ reduces a lot of packet loss.
Note that compared with Reno, CUBIC can create more packet loss in general. This is because the CUBIC CWND can grow faster than Reno during congestion avoidance and also reduces the CWND less (30%) than Reno (50%) at the congestion event.
Visualization using qlog and qvis
qvis is a visualization tool based on qlog . Since quiche has implemented qlog support , we can take qlogs from a QUIC connection and use the qvis tool to visualize connection stats. This is a very useful tool for protocol development. We already used qvis for the Reno graph but let’s see a few more examples to understand how HyStart++ works.
CUBIC without HyStart++
Here is a qvis congestion chart for a 16MB transfer in the same lab test conditions, with 0% packet loss. You can see a high peak of CWND in the beginning due to slow start. After some time, it starts to show the CUBIC window growth pattern (concave function).
When we zoom into the slow start section (the first 1.5 seconds), we can see there is a linear increase of CWND during slow start. This continues until we see a packet lost around 770ms and enters into congestion avoidance, as you can see in the following chart:
CUBIC with HyStart++
Let’s see the same graph when HyStart++ is enabled. You can see the slow start peak is much smaller than without HyStart++, which will lead to less overshooting and packet loss:
When we zoom in the slow start part again, now we can see that the slow start exits to Limited Slow Start (LSS) around 430ms and exit to congestion avoidance at the congestion event around 620ms.
As a result you can see the slope is less steep until congestion is detected. It will lead to less packet loss due to less overshooting the network and faster convergence to a stable CWND.
Conclusions and Future Tasks
The QUIC draft spec already has integrated a lot of experience from TCP congestion control and loss recovery. It recommends the simple Reno mechanism as a means to get people started implementing the protocol but is under no illusion that there are better performing ones out there. So QUIC is designed to be pluggable in order for it to adopt mechanisms that are being deployed in state-of-the-art TCP implementations.
CUBIC and HyStart++ are known implementations in the TCP world and give better performance (faster download and less packet loss) than Reno. We’ve made quiche pluggable and have added CUBIC and HyStart++ support. Our lab testing shows that QUIC is a clear performance winner in lossy network conditions, which is the very thing it is designed for.
In the future, we also plan to work on advanced features in quiche, such as packet pacing, advanced recovery and BBR congestion control for better QUIC performance. Using quiche you can switch among multiple congestion control algorithms using the config API at the connection level, so you can play with it and choose the best one depending on your need. qlog endpoint logging can be visualized to provide high accuracy insight into how QUIC is behaving, greatly helping understanding and development.
We announced support for HTTP/3, the successor to HTTP/2 during Cloudflare’s birthday week last year. Our goal is and has always been to help build a better Internet. Collaborating on standards is a big part of that, and we’re very fortunate to do that here.
Even though HTTP/3 is still in draft status, we’ve seen a lot of interest from our users. So far, over 113,000 zones have activated HTTP/3 and, if you are using an experimental browser those zones can be accessed using the new protocol! It’s been great seeing so many people enable HTTP/3: having real websites accessible through HTTP/3 means browsers have more diverse properties to test against.
When we launched support for HTTP/3, we did so in partnership with Google, who simultaneously launched experimental support in Google Chrome. Since then, we’ve seen more browsers add experimental support: Firefox to their nightly builds, other Chromium-based browsers such as Opera and Microsoft Edge through the underlying Chrome browser engine, and Safari via their technology preview. We closely follow these developments and partner wherever we can help; having a large network with many sites that have HTTP/3 enabled gives browser implementers an excellent testbed against which to try out code.
So, what’s the status and where are we now?
The IETF standardization process develops protocols as a series of document draft versions with the ultimate aim of producing a final draft version that is ready to be marked as an RFC. The members of the QUIC Working Group collaborate on analyzing, implementing and interoperating the specification in order to find things that don’t work quite right. We launched with support for Draft-23 for HTTP/3 and have since kept up with each new draft, with 27 being the latest at time of writing. With each draft the group improves the quality of the QUIC definition and gets closer to “rough consensus” about how it behaves. In order to avoid a perpetual state of analysis paralysis and endless tweaking, the bar for proposing changes to the specification has been increasing with each new draft. This means that changes between versions are smaller, and that a final RFC should closely match the protocol that we’ve been running in production.
One of the main touted advantages of HTTP/3 is increased performance, specifically around fetching multiple objects simultaneously. With HTTP/2, any interruption (packet loss) in the TCP connection blocks all streams (Head of line blocking). Because HTTP/3 is UDP-based, if a packet gets dropped that only interrupts that one stream, not all of them.
In addition, HTTP/3 offers 0-RTT support, which means that subsequent connections can start up much faster by eliminating the TLS acknowledgement from the server when setting up the connection. This means the client can start requesting data much faster than with a full TLS negotiation, meaning the website starts loading earlier.
The following illustrates the packet loss and its impact: HTTP/2 multiplexing two requests . A request comes over HTTP/2 from the client to the server requesting two resources (we’ve colored the requests and their associated responses green and yellow). The responses are broken up into multiple packets and, alas, a packet is lost so both requests are held up.
The above shows HTTP/3 multiplexing 2 requests. A packet is lost that affects the yellow response but the green one proceeds just fine.
Improvements in session startup mean that ‘connections’ to servers start much faster, which means the browser starts to see data more quickly. We were curious to see how much of an improvement, so we ran some tests. To measure the improvement resulting from 0-RTT support, we ran some benchmarks measuring time to first byte (TTFB). On average, with HTTP/3 we see the first byte appearing after 176ms. With HTTP/2 we see 201ms, meaning HTTP/3 is already performing 12.4% better!
Interestingly, not every aspect of the protocol is governed by the drafts or RFC. Implementation choices can affect performance, such as efficient packet transmission and choice of congestion control algorithm. Congestion control is a technique your computer and server use to adapt to overloaded networks: by dropping packets, transmission is subsequently throttled. Because QUIC is a new protocol, getting the congestion control design and implementation right requires experimentation and tuning.
In order to provide a safe and simple starting point, the Loss Detection and Congestion Control specification recommends the Reno algorithm but allows endpoints to choose any algorithm they might like. We started with New Reno but we know from experience that we can get better performance with something else. We have recently moved to CUBIC and on our network with larger size transfers and packet loss, CUBIC shows improvement over New Reno. Stay tuned for more details in future.
For our existing HTTP/2 stack, we currently support BBR v1 (TCP). This means that in our tests, we’re not performing an exact apples-to-apples comparison as these congestion control algorithms will behave differently for smaller vs larger transfers. That being said, we can already see a speedup in smaller websites using HTTP/3 when compared to HTTP/2. With larger zones, the improved congestion control of our tuned HTTP/2 stack shines in performance.
For a small test page of 15KB, HTTP/3 takes an average of 443ms to load compared to 458ms for HTTP/2. However, once we increase the page size to 1MB that advantage disappears: HTTP/3 is just slightly slower than HTTP/2 on our network today, taking 2.33s to load versus 2.30s.
Synthetic benchmarks are interesting, but we wanted to know how HTTP/3 would perform in the real world.
To measure, we wanted a third party that could load websites on our network, mimicking a browser. WebPageTest is a common framework that is used to measure the page load time, with nice waterfall charts. For analyzing the backend, we used our in-house Browser Insights, to capture timings as our edge sees it. We then tied both pieces together with bits of automation.
As a test case we decided to use this very blog for our performance monitoring. We configured our own instances of WebPageTest spread over the world to load these sites over both HTTP/2 and HTTP/3. We also enabled HTTP/3 and Browser Insights. So, every time our test scripts kickoff a webpage test with an HTTP/3 supported browser loading the page, browser analytics report the data back. Rinse and repeat for HTTP/2 to be able to compare.
The following graph shows the page load time for a real world page — blog.cloudflare.com, to compare the performance of HTTP/3 and HTTP/2. We have these performance measurements running from different geographical locations.
As you can see, HTTP/3 performance still trails HTTP/2 performance, by about 1-4% on average in North America and similar results are seen in Europe, Asia and South America. We suspect this could be due to the difference in congestion algorithms: HTTP/2 on BBR v1 vs. HTTP/3 on CUBIC. In the future, we’ll work to support the same congestion algorithm on both to get a more accurate apples-to-apples comparison.
Overall, we’re very excited to be allowed to help push this standard forward. Our implementation is holding up well, offering better performance in some cases and at worst similar to HTTP/2. As the standard finalizes, we’re looking forward to seeing browsers add support for HTTP/3 in mainstream versions. As for us, we continue to support the latest drafts while at the same time looking for more ways to leverage HTTP/3 to get even better performance, be it congestion tuning, prioritization or system capacity (CPU and raw network throughput).
In the meantime, if you’d like to try it out, just enable HTTP/3 on our dashboard and download a nightly version of one of the major browsers. Instructions on how to enable HTTP/3 can be found on our developer documentation.
At Cloudflare, we develop protocols at multiple layers of the network stack. In the past, we focused on HTTP/1.1, HTTP/2, and TLS 1.3. Now, we are working on QUIC and HTTP/3, which are still in IETF draft, but gaining a lot of interest.
QUIC is a secure and multiplexed transport protocol that aims to perform better than TCP under some network conditions. It is specified in a family of documents: a transport layer which specifies packet format and basic state machine, recovery and congestion control, security based on TLS 1.3, and an HTTP application layer mapping, which is now called HTTP/3.
Let’s focus on the transport and recovery layer first. This layer provides a basis for what is sent on the wire (the packet binary format) and how we send it reliably. It includes how to open the connection, how to handshake a new secure session with the help of TLS, how to send data reliably and how to react when there is packet loss or reordering of packets. Also it includes flow control and congestion control to interact well with other transport protocols in the same network. With confidence in the basic transport and recovery layer, we can take a look at higher application layers such as HTTP/3.
To develop such a transport protocol, we need multiple stages of the development environment. Since this is a network protocol, it’s best to test in an actual physical network to see how works on the wire. We may start the development using localhost, but after some time we may want to send and receive packets with other hosts. We can build a lab with a couple of virtual machines, using Virtualbox, VMWare or even with Docker. We also have a local testing environment with a Linux VM. But sometimes these have a limited network (localhost only) or are noisy due to other processes in the same host or virtual machines.
Next step is to have a test lab, typically an isolated network focused on protocol analysis only consisting of dedicated x86 hosts. Lab configuration is particularly important for testing various cases – there is no one-size-fits-all scenario for protocol testing. For example, EDGE is still running in production mobile networks but LTE is dominant and 5G deployment is in early stages. WiFi is very common these days. We want to test our protocol in all those environments. Of course, we can’t buy every type of machine or have a very expensive network simulator for every type of environment, so using cheap hardware and an open source OS where we can configure similar environments is ideal.
The QUIC Protocol Testing lab
The goal of the QUIC testing lab is to aid transport layer protocol development. To develop a transport protocol we need to have a way to control our network environment and a way to get as many different types of debugging data as possible. Also we need to get metrics for comparison with other protocols in production.
The QUIC Testing Lab has the following goals:
Help with multiple transport protocol development: Developing a new transport layer requires many iterations, from building and validating packets as per protocol spec, to making sure everything works fine under moderate load, to very harsh conditions such as low bandwidth and high packet loss. We need a way to run tests with various network conditions reproducibly in order to catch unexpected issues.
Debugging multiple transport protocol development: Recording as much debugging info as we can is important for fixing bugs. Looking into packet captures definitely helps but we also need a detailed debugging log of the server and client to understand the what and why for each packet. For example, when a packet is sent, we want to know why. Is this because there is an application which wants to send some data? Or is this a retransmit of data previously known as lost? Or is this a loss probe which is not an actual packet loss but sent to see if the network is lossy?
Performance comparison between each protocol: We want to understand the performance of a new protocol by comparison with existing protocols such as TCP, or with a previous version of the protocol under development. Also we want to test with varying parameters such as changing the congestion control mechanism, changing various timeouts, or changing the buffer sizes at various levels of the stack.
Finding a bottleneck or errors easily: Running tests we may see an unexpected error – a transfer that timed out, or ended with an error, or a transfer was corrupted at the client side – each test needs to make sure every test is run correctly, by using a checksum of the original file to compare with what is actually downloaded, or by checking various error codes at the protocol of API level.
When we have a test lab with separate hardware, we have benefits, as follows:
Can configure the testing lab without public Internet access – safe and quiet.
Handy access to hardware and its console for maintenance purpose, or for adding or updating hardware.
Try other CPU architectures. For clients we use the Raspberry Pi for regular testing because this is ARM architecture (32bit or 64bit), similar to modern smartphones. So testing with ARM architecture helps for compatibility testing before going into a smartphone OS.
We can add a real smartphone for testing, such as Android or iPhone. We can test with WiFi but these devices also support Ethernet, so we can test them with a wired network for better consistency.
Here is a diagram of our QUIC Protocol Testing Lab:
This is a conceptual diagram and we need to configure a switch for connecting each machine. Currently, we have Raspberry Pis (2 and 3) as an Origin and a Client. And small Intel x86 boxes for the Traffic Shaper and Edge server plus Ethernet switches for interconnectivity.
Origin is simply serving HTTP and HTTPS test objects using a web server. Client may download a file from Origin directly to simulate a download direct from a customer’s origin server.
Client will download a test object from Origin or Edge, using a different protocol. In typical a configuration Client connects to Edge instead of Origin, so this is to simulate an edge server in the real world. For TCP/HTTP we are using the curl command line client and for QUIC, quiche’s http3_client with some modification.
Edge is running Cloudflare’s web server to serve HTTP/HTTPS via TCP and also the QUIC protocol using quiche. Edge server is installed with the same Linux kernel used on Cloudflare’s production machines in order to have the same low level network stack.
Traffic Shaper is sitting between Client and Edge (and Origin), controlling network conditions. Currently we are using FreeBSD and ipfw + dummynet. Traffic shaping can also be done using Linux’ netem which provides additional network simulation features.
The goal is to run tests with various network conditions, such as bandwidth, latency and packet loss upstream and downstream. The lab is able to run a plaintext HTTP test but currently our focus of testing is HTTPS over TCP and HTTP/3 over QUIC. Since QUIC is running over UDP, both TCP and UDP traffic need to be controlled.
Test Automation and Visualization
In the lab, we have a script installed in Client, which can run a batch of testing with various configuration parameters – for each test combination, we can define a test configuration, including:
Network Condition – Bandwidth, Latency, Packet Loss (upstream and downstream)
For example using netem traffic shaper we can simulate LTE network as below,(RTT=50ms, BW=22Mbps upstream and downstream, with BDP queue size)
Number of runs and number of requests in a single connection
The test script outputs a CSV file of results for importing into other tools for data processing and visualization – such as Google Sheets, Excel or even a jupyter notebook. Also it’s able to post the result to a database (Clickhouse in our case), so we can query and visualize the results.
Sometimes a whole test combination takes a long time – the current standard test set with simulated 2G, 3G, LTE, WiFi and various object sizes repeated 10 times for each request may take several hours to run. Large object testing on a slow network takes most of the time, so sometimes we also need to run a limited test (e.g. testing LTE-like conditions only for a sanity check) for quick debugging.
Chart using Google Sheets:
The following comparison chart shows the total transfer time in msec for TCP vs QUIC for different network conditions. The QUIC protocol used here is a development version one.
Debugging and performance analysis using of a smartphone
Mobile devices have become a crucial part of our day to day life, so testing the new transport protocol on mobile devices is critically important for mobile app performance. To facilitate that, we need to have a mobile test app which will proxy data over the new transport protocol under development. With this we have the ability to analyze protocol functionality and performance in mobile devices with different network conditions.
Adding a smartphone to the testbed mentioned above gives an advantage in terms of understanding real performance issues. The major smartphone operating systems, iOS and Android, have quite different networking stack. Adding a smartphone to testbed gives the ability to understand these operating system network stacks in depth which aides new protocol designs.
The above figure shows the network block diagram of another similar lab testbed used for protocol testing where a smartphone is connected both wired and wirelessly. A Linux netem based traffic shaper sits in-between the client and server shaping the traffic. Various networking profiles are fed to the traffic shaper to mimic real world scenarios. The client can be either an Android or iOS based smartphone, the server is a vanilla web server serving static files. Client, server and traffic shaper are all connected to the Internet along with the private lab network for management purposes.
The above lab has mobile devices for both Android or iOS installed with a test app built with a proprietary client proxy software for proxying data over the new transport protocol under development. The test app also has the ability to make HTTP requests over TCP for comparison purposes.
The Android or iOS test app can be used to issue multiple HTTPS requests of different object sizes sequentially and concurrently using TCP and QUIC as underlying transport protocol. Later, TTOTAL (total transfer time) of each HTTPS request is used to compare TCP and QUIC performance over different network conditions. One such comparison is shown below,
The table above shows the total transfer time taken for TCP and QUIC requests over an LTE network profile fetching different objects with different concurrency levels using the test app. Here TCP goes over native OS network stack and QUIC goes over Cloudflare QUIC stack.
Debugging network performance issues is hard when it comes to mobile devices. By adding an actual smartphone into the testbed itself we have the ability to take packet captures at different layers. These are very critical in analyzing and understanding protocol performance.
It’s easy and straightforward to capture packets and analyze them using the tcpdump tool on x86 boxes, but it’s a challenge to capture packets on iOS and Android devices. On iOS device ‘rvictl’ lets us capture packets on an external interface. But ‘rvictl’ has some drawbacks such as timestamps being inaccurate. Since we are dealing with millisecond level events, timestamps need to be accurate to analyze the root cause of a problem.
We can capture packets on internal loopback interfaces on jailbroken iPhones and rooted Android devices. Jailbreaking a recent iOS device is nontrivial. We also need to make sure that autoupdate of any sort is disabled on such a phone otherwise it would disable the jailbreak and you have to start the whole process again. With a jailbroken phone we have root access to the device which lets us take packet captures as needed using tcpdump.
Packet captures taken using jailbroken iOS devices or rooted Android devices connected to the lab testbed help us analyze performance bottlenecks and improve protocol performance.
iOS and Android devices different network stacks in their core operating systems. These packet captures also help us understand the network stack of these mobile devices, for example in iOS devices packets punted through loopback interface had a mysterious delay of 5 to 7ms.
Cloudflare is actively involved in helping to drive forward the QUIC and HTTP/3 standards by testing and optimizing these new protocols in simulated real world environments. By simulating a wide variety of networks we are working on our mission of Helping Build a Better Internet. For everyone, everywhere.
Would like to thank SangJo Lee, Hiren Panchasara, Lucas Pardue and Sreeni Tellakula for their contributions.
QUIC, the new Internet transport protocol designed to accelerate HTTP traffic, is delivered on top of UDP datagrams, to ease deployment and avoid interference from network appliances that drop packets from unknown protocols. This also allows QUIC implementations to live in user-space, so that, for example, browsers will be able to implement new protocol features and ship them to their users without having to wait for operating systems updates.
But while a lot of work has gone into optimizing TCP implementations as much as possible over the years, including building offloading capabilities in both software (like in operating systems) and hardware (like in network interfaces), UDP hasn’t received quite as much attention as TCP, which puts QUIC at a disadvantage. In this post we’ll look at a few tricks that help mitigate this disadvantage for UDP, and by association QUIC.
For the purpose of this blog post we will only be concentrating on measuring throughput of QUIC connections, which, while necessary, is not enough to paint an accurate overall picture of the performance of the QUIC protocol (or its implementations) as a whole.
The client and server are run on the same host (my laptop) running Linux 5.3, so the numbers don’t necessarily reflect what one would see in a production environment over a real network, but it should still be interesting to see how much of an impact each of the techniques have.
Currently the code that implements QUIC in NGINX uses the sendmsg() system call to send a single UDP packet at a time.
ssize_t sendmsg(int sockfd, const struct msghdr *msg,
The struct msghdr carries a struct iovec which can in turn carry multiple buffers. However, all of the buffers within a single iovec will be merged together into a single UDP datagram during transmission. The kernel will then take care of encapsulating the buffer in a UDP packet and sending it over the wire.
The throughput of this particular implementation tops out at around 80-90 MB/s, as measured by h2load when performing 10 sequential requests for a 100 MB resource.
Due to the fact that sendmsg() only sends a single UDP packet at a time, it needs to be invoked quite a lot in order to transmit all of the QUIC packets required to deliver the requested resources, as illustrated by the following bpftrace command:
Each of those system calls causes an expensive context switch between the application and the kernel, thus impacting throughput.
But while sendmsg() only transmits a single UDP packet at a time for each invocation, its close cousin sendmmsg() (note the additional “m” in the name) is able to batch multiple packets per system call:
int sendmmsg(int sockfd, struct mmsghdr *msgvec,
unsigned int vlen, int flags);
Multiple struct mmsghdr structures can be passed to the kernel as an array, each in turn carrying a single struct msghdr with its own struct iovec , with each element in the msgvec array representing a single UDP datagram.
Let’s see what happens when NGINX is updated to use sendmmsg() to send QUIC packets:
The number of system calls went down dramatically, which translates into an increase in throughput, though not quite as big as the decrease in syscalls:
UDP segmentation offload
With sendmsg() as well as sendmmsg(), the application is responsible for separating each QUIC packet into its own buffer in order for the kernel to be able to transmit it. While the implementation in NGINX uses static buffers to implement this, so there is no overhead in allocating them, all of these buffers need to be traversed by the kernel during transmission, which can add significant overhead.
Linux supports a feature, Generic Segmentation Offload (GSO), which allows the application to pass a single "super buffer" to the kernel, which will then take care of segmenting it into smaller packets. The kernel will try to postpone the segmentation as much as possible to reduce the overhead of traversing outgoing buffers (some NICs even support hardware segmentation, but it was not tested in this experiment due to lack of capable hardware). Originally GSO was only supported for TCP, but support for UDP GSO was recently added as well, in Linux 4.18.
This feature can be controlled using the UDP_SEGMENT socket option:
Where gso_size is the size of each segment that form the "super buffer" passed to the kernel from the application. Once configured, the application can provide one contiguous large buffer containing a number of packets of gso_size length (as well as a final smaller packet), that will then be segmented by the kernel (or the NIC if hardware segmentation offloading is supported and enabled).
GSO can also be combined with sendmmsg() to deliver an even bigger improvement. The idea being that each struct msghdr can be segmented in the kernel by setting the UDP_SEGMENT option using ancillary data, allowing an application to pass multiple “super buffers”, each carrying up to 64 segments, to the kernel in a single system call.
The improvement is again fairly significant:
Evolving from AFAP
Transmitting packets as fast as possible is easy to reason about, and there’s much fun to be had in optimizing applications for that, but in practice this is not always the best strategy when optimizing protocols for the Internet
Bursty traffic is more likely to cause or be affected by congestion on any given network path, which will inevitably defeat any optimization implemented to increase transmission rates.
Packet pacing is an effective technique to squeeze out more performance from a network flow. The idea being that adding a short delay between each outgoing packet will smooth out bursty traffic and reduce the chance of congestion, and packet loss. For TCP this was originally implemented in Linux via the fq packet scheduler, and later by the BBR congestion control algorithm implementation, which implements its own pacer.
Due to the nature of current QUIC implementations, which reside entirely in user-space, pacing of QUIC packets conflicts with any of the techniques explored in this post, because pacing each packet separately during transmission will prevent any batching on the application side, and in turn batching will prevent pacing, as batched packets will be transmitted as fast as possible once received by the kernel.
However Linux provides some facilities to offload the pacing to the kernel and give back some control to the application:
SO_MAX_PACING_RATE: an application can define this socket option to instruct the fq packet scheduler to pace outgoing packets up to the given rate. This works for UDP sockets as well, but it is yet to be seen how this can be integrated with QUIC, as a single UDP socket can be used for multiple QUIC connections (unlike TCP, where each connection has its own socket). In addition, this is not very flexible, and might not be ideal when implementing the BBR pacer.
SO_TXTIME / SCM_TXTIME: an application can use these options to schedule transmission of specific packets at specific times, essentially instructing fq to delay packets until the provided timestamp is reached. This gives the application a lot more control, and can be easily integrated into sendmsg() as well as sendmmsg(). But it does not yet support specifying different times for each packet when GSO is used, as there is no way to define multiple timestamps for packets that need to be segmented (each segmented packet essentially ends up being sent at the same time anyway).
While the performance gains achieved by using the techniques illustrated here are fairly significant, there are still open questions around how any of this will work with pacing, so more experimentation is required.
One of the more interesting features introduced by TLS 1.3, the latest revision of the TLS protocol, was the so called “zero roundtrip time connection resumption”, a mode of operation that allows a client to start sending application data, such as HTTP requests, without having to wait for the TLS handshake to complete, thus reducing the latency penalty incurred in establishing a new connection.
The basic idea behind 0-RTT connection resumption is that if the client and server had previously established a TLS connection between each other, they can use information cached from that session to establish a new one without having to negotiate the connection’s parameters from scratch. Notably this allows the client to compute the private encryption keys required to protect application data before even talking to the server.
However, in the case of TLS, “zero roundtrip” only refers to the TLS handshake itself: the client and server are still required to first establish a TCP connection in order to be able to exchange TLS data.
Zero means zero
QUIC goes a step further, and allows clients to send application data in the very first roundtrip of the connection, without requiring any other handshake to be completed beforehand.
Unfortunately, 0-RTT connection resumption is not all smooth sailing, and it comes with caveats and risks, which is why Cloudflare does not enable 0-RTT connection resumption by default. Users should consider the risks involved and decide whether to use this feature or not.
For starters, 0-RTT connection resumption does not provide forward secrecy, meaning that a compromise of the secret parameters of a connection will trivially allow compromising the application data sent during the 0-RTT phase of new connections resumed from it. Data sent after the 0-RTT phase, meaning after the handshake has been completed, would still be safe though, as TLS 1.3 (and QUIC) will still perform the normal key exchange algorithm (which is forward secret) for data sent after the handshake completion.
More worryingly, application data sent during 0-RTT can be captured by an on-path attacker and then replayed multiple times to the same server. In many cases this is not a problem, as the attacker wouldn’t be able to decrypt the data, which is why 0-RTT connection resumption is useful, but in some cases this can be dangerous.
For example, imagine a bank that allows an authenticated user (e.g. using HTTP cookies, or other HTTP authentication mechanisms) to send money from their account to another user by making an HTTP request to a specific API endpoint. If an attacker was able to capture that request when 0-RTT connection resumption was used, they wouldn’t be able to see the plaintext and get the user’s credentials, because they wouldn’t know the secret key used to encrypt the data; however they could still potentially drain that user’s bank account by replaying the same request over and over:
Of course this problem is not specific to banking APIs: any non-idempotent request has the potential to cause undesired side effects, ranging from slight malfunctions to serious security breaches.
In order to help mitigate this risk, Cloudflare will always reject 0-RTT requests that are obviously not idempotent (like POST or PUT requests), but in the end it’s up to the application sitting behind Cloudflare to decide which requests can and cannot be allowed with 0-RTT connection resumption, as even innocuous-looking ones can have side effects on the origin server.
To help origins detect and potentially disallow specific requests, Cloudflare also follows the techniques described in RFC8470. Notably, Cloudflare will add the Early-Data: 1 HTTP header to requests received during 0-RTT resumption that are forwarded to origins.
Origins able to understand this header can then decide to answer the request with the 425 (Too Early) HTTP status code, which will instruct the client that originated the request to retry sending the same request but only after the TLS or QUIC handshake have fully completed, at which point there is no longer any risk of replay attacks. This could even be implemented as part of a Cloudflare Worker.
This makes it possible for origins to allow 0-RTT requests for endpoints that are safe, such as a website’s index page which is where 0-RTT is most useful, as that is typically the first request a browser makes after establishing a connection, while still protecting other endpoints such as APIs and form submissions. But if an origin does not provide any of those non-idempotent endpoints, no action is required.
One stop shop for all your 0-RTT needs
Just like we previously did for TLS 1.3, we now support 0-RTT resumption for QUIC as well. In honor of this event, we have dusted off the user-interface controls that allow Cloudflare users to enable this feature for their websites, and introduced a dedicated toggle to control whether 0-RTT connection resumption is enabled or not, which can be found under the “Network” tab on the Cloudflare dashboard:
When TLS 1.3 and/or QUIC (via the HTTP/3 toggle) are enabled, 0-RTT connection resumption will be automatically offered to clients that support it, and the replay mitigation mentioned above will also be applied to the connections making use of this feature.
In addition, if you are a user of our open-source HTTP/3 patch for NGINX, after updating the patch to the latest version, you’ll be able to enable support for 0-RTT connection resumption in your own NGINX-based HTTP/3 deployment by using the built-in “ssl_early_data” option, which will work for both TLS 1.3 and QUIC+HTTP/3.
Just a few weeks ago we announced the availability on our edge network of HTTP/3, the new revision of HTTP intended to improve security and performance on the Internet. Everyone can now enable HTTP/3 on their Cloudflare zone and experiment with it using Chrome Canary as well as curl, among other clients.
We are now happy to announce that our implementation of HTTP/3 and QUIC can be integrated into your own installation of NGINX as well. This is made available as a patch to NGINX, that can be applied and built directly with the upstream NGINX codebase.
It’s important to note that this is not officially supported or endorsed by the NGINX project, it is just something that we, Cloudflare, want to make available to the wider community to help push adoption of QUIC and HTTP/3.
The above command instructs the NGINX build system to enable the HTTP/3 support ( --with-http_v3_module) by using the quiche library found in the path it was previously downloaded into ( --with-quiche=../quiche), as well as TLS and HTTP/2. Additional build options can be added as needed.
We are excited to make this available for everyone to to experiment and play with HTTP/3, but it’s important to note that the implementation is still experimental and it’s likely to have bugs as well as limitations in functionality. Feel free to submit a ticket to the quiche project if you run into problems or find any bug.
During last year’s Birthday Week we announced preliminary support for QUIC and HTTP/3 (or “HTTP over QUIC” as it was known back then), the new standard for the web, enabling faster, more reliable, and more secure connections to web endpoints like websites and APIs. We also let our customers join a waiting list to try QUIC and HTTP/3 as soon as they became available.
Since then, we’ve been working with industry peers through the Internet Engineering Task Force, including Google Chrome and Mozilla Firefox, to iterate on the HTTP/3 and QUIC standards documents. In parallel with the standards maturing, we’ve also worked on improving support on our network.
We are now happy to announce that QUIC and HTTP/3 support is available on the Cloudflare edge network. We’re excited to be joined in this announcement by Google Chrome and Mozilla Firefox, two of the leading browser vendors and partners in our effort to make the web faster and more reliable for all.
In the words of Ryan Hamilton, Staff Software Engineer at Google, “HTTP/3 should make the web better for everyone. The Chrome and Cloudflare teams have worked together closely to bring HTTP/3 and QUIC from nascent standards to widely adopted technologies for improving the web. Strong partnership between industry leaders is what makes Internet standards innovations possible, and we look forward to our continued work together.”
What does this mean for you, a Cloudflare customer who uses our services and edge network to make your web presence faster and more secure? Once HTTP/3 support is enabled for your domain in the Cloudflare dashboard, your customers can interact with your websites and APIs using HTTP/3. We’ve been steadily inviting customers on our HTTP/3 waiting list to turn on the feature (so keep an eye out for an email from us), and in the coming weeks we’ll make the feature available to everyone.
What does this announcement mean if you’re a user of the Internet interacting with sites and APIs through a browser and other clients? Starting today, you can use Chrome Canary to interact with Cloudflare and other servers over HTTP/3. For those of you looking for a command line client, curl also provides support for HTTP/3. Instructions for using Chrome and curl with HTTP/3 follow later in this post.
The Chicken and the Egg
Standards innovation on the Internet has historically been difficult because of a chicken and egg problem: which needs to come first, server support (like Cloudflare, or other large sources of response data) or client support (like browsers, operating systems, etc)? Both sides of a connection need to support a new communications protocol for it to be any use at all.
Cloudflare has a long history of driving web standards forward, from HTTP/2 (the version of HTTP preceding HTTP/3), to TLS 1.3, to things like encrypted SNI. We’ve pushed standards forward by partnering with like-minded organizations who share in our desire to help build a better Internet. Our efforts to move HTTP/3 into the mainstream are no different.
Throughout the HTTP/3 standards development process, we’ve been working closely with industry partners to build and validate client HTTP/3 support compatible with our edge support. We’re thrilled to be joined by Google Chrome and curl, both of which can be used today to make requests to the Cloudflare edge over HTTP/3. Mozilla Firefox expects to ship support in a nightly release soon as well.
Bringing this all together: today is a good day for Internet users; widespread rollout of HTTP/3 will mean a faster web experience for all, and today’s support is a large step toward that.
More importantly, today is a good day for the Internet: Chrome, curl, and Cloudflare, and soon, Mozilla, rolling out experimental but functional, support for HTTP/3 in quick succession shows that the Internet standards creation process works. Coordinated by the Internet Engineering Task Force, industry partners, competitors, and other key stakeholders can come together to craft standards that benefit the entire Internet, not just the behemoths.
Eric Rescorla, CTO of Firefox, summed it up nicely: “Developing a new network protocol is hard, and getting it right requires everyone to work together. Over the past few years, we’ve been working with Cloudflare and other industry partners to test TLS 1.3 and now HTTP/3 and QUIC. Cloudflare’s early server-side support for these protocols has helped us work the interoperability kinks out of our client-side Firefox implementation. We look forward to advancing the security and performance of the Internet together.”
It all started back in 1996 with the publication of the HTTP/1.0 specification which defined the basic HTTP textual wire format as we know it today (for the purposes of this post I’m pretending HTTP/0.9 never existed). In HTTP/1.0 a new TCP connection is created for each request/response exchange between clients and servers, meaning that all requests incur a latency penalty as the TCP and TLS handshakes are completed before each request.
Worse still, rather than sending all outstanding data as fast as possible once the connection is established, TCP enforces a warm-up period called “slow start”, which allows the TCP congestion control algorithm to determine the amount of data that can be in flight at any given moment before congestion on the network path occurs, and avoid flooding the network with packets it can’t handle. But because new connections have to go through the slow start process, they can’t use all of the network bandwidth available immediately.
The HTTP/1.1 revision of the HTTP specification tried to solve these problems a few years later by introducing the concept of “keep-alive” connections, that allow clients to reuse TCP connections, and thus amortize the cost of the initial connection establishment and slow start across multiple requests. But this was no silver bullet: while multiple requests could share the same connection, they still had to be serialized one after the other, so a client and server could only execute a single request/response exchange at any given time for each connection.
Finally, more than a decade later, came SPDY and then HTTP/2, which, among other things, introduced the concept of HTTP “streams”: an abstraction that allows HTTP implementations to concurrently multiplex different HTTP exchanges onto the same TCP connection, allowing browsers to more efficiently reuse TCP connections.
But, yet again, this was no silver bullet! HTTP/2 solves the original problem — inefficient use of a single TCP connection — since multiple requests/responses can now be transmitted over the same connection at the same time. However, all requests and responses are equally affected by packet loss (e.g. due to network congestion), even if the data that is lost only concerns a single request. This is because while the HTTP/2 layer can segregate different HTTP exchanges on separate streams, TCP has no knowledge of this abstraction, and all it sees is a stream of bytes with no particular meaning.
The role of TCP is to deliver the entire stream of bytes, in the correct order, from one endpoint to the other. When a TCP packet carrying some of those bytes is lost on the network path, it creates a gap in the stream and TCP needs to fill it by resending the affected packet when the loss is detected. While doing so, none of the successfully delivered bytes that follow the lost ones can be delivered to the application, even if they were not themselves lost and belong to a completely independent HTTP request. So they end up getting unnecessarily delayed as TCP cannot know whether the application would be able to process them without the missing bits. This problem is known as “head-of-line blocking”.
This is where HTTP/3 comes into play: instead of using TCP as the transport layer for the session, it uses QUIC, a new Internet transport protocol, which, among other things, introduces streams as first-class citizens at the transport layer. QUIC streams share the same QUIC connection, so no additional handshakes and slow starts are required to create new ones, but QUIC streams are delivered independently such that in most cases packet loss affecting one stream doesn’t affect others. This is possible because QUIC packets are encapsulated on top of UDP datagrams.
Using UDP allows much more flexibility compared to TCP, and enables QUIC implementations to live fully in user-space — updates to the protocol’s implementations are not tied to operating systems updates as is the case with TCP. With QUIC, HTTP-level streams can be simply mapped on top of QUIC streams to get all the benefits of HTTP/2 without the head-of-line blocking.
QUIC also combines the typical 3-way TCP handshake with TLS 1.3‘s handshake. Combining these steps means that encryption and authentication are provided by default, and also enables faster connection establishment. In other words, even when a new QUIC connection is required for the initial request in an HTTP session, the latency incurred before data starts flowing is lower than that of TCP with TLS.
But why not just use HTTP/2 on top of QUIC, instead of creating a whole new HTTP revision? After all, HTTP/2 also offers the stream multiplexing feature. As it turns out, it’s somewhat more complicated than that.
While it’s true that some of the HTTP/2 features can be mapped on top of QUIC very easily, that’s not true for all of them. One in particular, HTTP/2’s header compression scheme called HPACK, heavily depends on the order in which different HTTP requests and responses are delivered to the endpoints. QUIC enforces delivery order of bytes within single streams, but does not guarantee ordering among different streams.
This behavior required the creation of a new HTTP header compression scheme, called QPACK, which fixes the problem but requires changes to the HTTP mapping. In addition, some of the features offered by HTTP/2 (like per-stream flow control) are already offered by QUIC itself, so they were dropped from HTTP/3 in order to remove unnecessary complexity from the protocol.
HTTP/3, powered by a delicious quiche
QUIC and HTTP/3 are very exciting standards, promising to address many of the shortcomings of previous standards and ushering in a new era of performance on the web. So how do we go from exciting standards documents to working implementation?
We announced quiche a few months ago and since then have added support for the HTTP/3 protocol, on top of the existing QUIC support. We have designed quiche in such a way that it can now be used to implement HTTP/3 clients and servers or just plain QUIC ones.
How do I enable HTTP/3 for my domain?
As mentioned above, we have started on-boarding customers that signed up for the waiting list. If you are on the waiting list and have received an email from us communicating that you can now enable the feature for your websites, you can simply go to the Cloudflare dashboard and flip the switch from the “Network” tab manually:
We expect to make the HTTP/3 feature available to all customers in the near future.
Once enabled, you can experiment with HTTP/3 in a number of ways:
Using Google Chrome as an HTTP/3 client
In order to use the Chrome browser to connect to your website over HTTP/3, you first need to download and install the latest Canary build. Then all you need to do to enable HTTP/3 support is starting Chrome Canary with the “–enable-quic” and “–quic-version=h3-23” command-line arguments.
Once Chrome is started with the required arguments, you can just type your domain in the address bar, and see it loaded over HTTP/3 (you can use the Network tab in Chrome’s Developer Tools to check what protocol version was used). Note that due to how HTTP/3 is negotiated between the browser and the server, HTTP/3 might not be used for the first few connections to the domain, so you should try to reload the page a few times.
If this seems too complicated, don’t worry, as the HTTP/3 support in Chrome will become more stable as time goes on, enabling HTTP/3 will become easier.
This is what the Network tab in the Developer Tools shows when browsing this very blog over HTTP/3:
Note that due to the experimental nature of the HTTP/3 support in Chrome, the protocol is actually identified as “http2+quic/99” in Developer Tools, but don’t let that fool you, it is indeed HTTP/3.
In the coming months we’ll be working on improving and optimizing our QUIC and HTTP/3 implementation, and will eventually allow everyone to enable this new feature without having to go through a waiting list. We’ll continue updating our implementation as standards evolve, which may result in breaking changes between draft versions of the standards.
Here are a few new features on our roadmap that we’re particularly excited about:
One important feature that QUIC enables is seamless and transparent migration of connections between different networks (such as your home WiFi network and your carrier’s mobile network as you leave for work in the morning) without requiring a whole new connection to be created.
This feature will require some additional changes to our infrastructure, but it’s something we are excited to offer our customers in the future.
We are excited to support HTTP/3 and allow our customers to experiment with it while efforts to standardize QUIC and HTTP/3 are still ongoing. We’ll continue working alongside other organizations, including Google and Mozilla, to finalize the QUIC and HTTP/3 standards and encourage broad adoption.
Here’s to a faster, more reliable, more secure web experience for all.
The collective thoughts of the interwebz
The cookie settings on this website are set to "allow cookies" to give you the best browsing experience possible. If you continue to use this website without changing your cookie settings or you click "Accept" below then you are consenting to this.