
The state of the post-quantum Internet

Post Syndicated from Bas Westerbaan original https://blog.cloudflare.com/pq-2024


Today, nearly two percent of all TLS 1.3 connections established with Cloudflare are secured with post-quantum cryptography. We expect to see double-digit adoption by the end of 2024. Apple announced in February 2024 that it will secure iMessage with post-quantum cryptography before the end of the year, and Signal chats are already secured. What once was the topic of futuristic tech demos will soon be the new security baseline for the Internet.

A lot has been happening in the field over the last few years, from mundane name changes (ML-KEM is the new name for Kyber), to new proposed algorithms in the signatures onramp, to the catastrophic attack on SIKE. Plenty of what was written merely three years ago now feels quite out of date. Thus, it is high time for an update: in this blog post we’ll take stock of where we are in early 2024, what to expect for the coming years, and what you can do today.

Fraction of TLS 1.3 connections established with Cloudflare that are secured with post-quantum cryptography.

The quantum threat

First things first: why are we migrating our cryptography? It’s because of quantum computers. These marvelous devices, instead of restricting themselves to zeroes and ones, compute using more of what nature actually affords us: quantum superposition, interference, and entanglement. This allows quantum computers to excel at certain very specific computations, notably simulating nature itself, which will be very helpful in developing new materials.

Quantum computers are not going to replace regular computers, though: they’re actually much worse than regular computers at most tasks. Think of them as graphics cards — specialized devices for specific computations.

Unfortunately, quantum computers also excel at breaking the public key cryptography that’s in common use today. Thus, we will have to move to post-quantum cryptography: cryptography designed to be resistant against quantum attack. We’ll discuss the exact impact on the different types of cryptography later on. For now, quantum computers are rather anemic: they’re simply not good enough today to crack any real-world cryptographic keys.

That doesn’t mean we shouldn’t worry yet: encrypted traffic can be harvested today, and decrypted with a quantum computer in the future.

Quantum numerology

When will they be good enough? Like clockwork, every year there are news stories of new quantum computers with record-breaking numbers of qubits. This focus on counting qubits is quite misleading. To start, quantum computers are analogue machines, and there is always some noise interfering with the computation.

There are big differences between the different types of technology used to build quantum computers: silicon-based quantum computers seem to scale well and are quick to execute instructions, but have very noisy qubits. This does not mean they’re useless: with quantum error correcting codes one can effectively turn tens of millions of noisy silicon qubits into a few thousand high-fidelity ones, which could be enough to break RSA. Trapped-ion quantum computers, on the other hand, have much less noise, but have been harder to scale. Only a few hundred thousand trapped-ion qubits could potentially draw the curtain on RSA.

State of the art in quantum computing measured by qubit count and noise in 2021, 2022, and 2023. Once the shaded gray area hits the left-most red line, we’re in trouble. The red line is expected to move to the left. Compiled by Samuel Jaques of the University of Waterloo.

We’re only scratching the surface with the number of qubits and noise. For instance, a quirk of many quantum computers is that only adjacent qubits can interact — something that most estimates do not take into account. On the other hand, for a specific quantum computer, a tailored algorithm can perform much better than a generic one. We can only guess what a future quantum computer will look like, and today’s estimates are most likely off by at least an order of magnitude.

When will quantum computers break real-world cryptography?

So, when do we expect the demise of RSA-2048 which is in common use today? In a 2022 survey, over half the interviewed experts thought it’d be more probable than not that by 2037 such a cryptographically relevant quantum computer would’ve been built.

We can also look at the US government’s timeline for the migration to post-quantum cryptography. The National Security Agency (NSA) aims to finish its migration before 2033, and will start to prefer post-quantum ready vendors for many products in 2025. The US government has a similarly ambitious timeline for the country as a whole: the aim is to be done by 2035.

NSA timeline for migrating third-party software to post-quantum cryptography.

More anecdotally, at industry conferences on the post-quantum migration, I see particularly high participation from the automotive industry. Not that surprising, considering that the median age of a car on the road is 14 years, a lot of money is on the line, and not all cryptography used in cars can be upgraded easily once on the road.

So when will it arrive? Whether it’s 2034 or 2050, it will be too soon. The immense success of cryptography means it’s all around us now, from dishwashers to pacemakers to satellites. Most upgrades will be easy and fit naturally in the product’s lifecycle, but there will be a long tail of difficult and costly upgrades.

Two migrations

To help prioritize, it is important to understand that there is a big difference in the difficulty, impact, and urgency of the post-quantum migration for the different kinds of cryptography required to create secure connections. In fact, for most organizations there will be two post-quantum migrations: key agreement and signatures / certificates.

Already post-quantum secure: symmetric cryptography

Let’s explain this for the case of creating a secure connection when visiting a website in a browser. The workhorse is a symmetric cipher such as AES-GCM. It’s what you would think of when thinking of cryptography: both parties, in this case the browser and server, have a shared key, and they encrypt / decrypt their messages with the same key. Unless you have that key, you can’t read anything, or modify anything.
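To make this concrete, here is a minimal sketch of AES-GCM encryption using Python’s `cryptography` package. The key, nonce, and message are illustrative placeholders, not anything TLS actually sends, but the shape of the operation is the same: one shared key, used by both sides.

```python
from os import urandom
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# Both parties already share this 128-bit key. How they agreed on it is the
# key-agreement problem discussed below.
key = AESGCM.generate_key(bit_length=128)
aead = AESGCM(key)

nonce = urandom(12)  # never reuse a nonce with the same key
ciphertext = aead.encrypt(nonce, b"GET / HTTP/1.1", b"")  # message + optional associated data

# Only someone holding the same key can decrypt, or modify the message undetected.
plaintext = aead.decrypt(nonce, ciphertext, b"")
assert plaintext == b"GET / HTTP/1.1"
```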

The good news is that symmetric ciphers, such as AES-GCM, are already post-quantum secure. There is a common misconception that Grover’s quantum algorithm requires us to double the length of symmetric keys. On closer inspection, though, Grover’s algorithm offers at best a quadratic speedup and parallelizes poorly, so it does not yield a practical attack on 128-bit keys. The way NIST, the US National Institute of Standards and Technology (which has been spearheading the standardization of post-quantum cryptography), defines its post-quantum security levels is very telling. It defines a specific security level by saying the scheme should be at least as hard to crack, using either a classical or quantum computer, as an existing symmetric primitive, as follows:

| Level | Definition: at least as hard to break as … | Example |
|-------|--------------------------------------------|---------|
| 1 | To recover the key of AES-128 by exhaustive search | ML-KEM-512, SLH-DSA-128s |
| 2 | To find a collision in SHA256 by exhaustive search | ML-DSA-44 |
| 3 | To recover the key of AES-192 by exhaustive search | ML-KEM-768 |
| 4 | To find a collision in SHA384 by exhaustive search | |
| 5 | To recover the key of AES-256 by exhaustive search | ML-KEM-1024, SLH-DSA-256s |

NIST PQC security levels, higher is harder to break (“more secure”). The examples ML-DSA, SLH-DSA and ML-KEM are covered below.

There are good intentions behind suggesting doubling the key lengths of symmetric cryptography. In many use cases, the extra cost is not that high, and it mitigates any theoretical risk completely. Scaling symmetric cryptography is cheap: doubling the key length typically costs far less than double the compute. So on the surface, it is simple advice.

But if we insist on AES-256, it seems only logical to insist on NIST PQC level 5 for the public key cryptography as well. The problem is that public key cryptography does not scale very well. Depending on the scheme, going from level 1 to level 5 typically more than doubles data usage and CPU cost. As we’ll see, deploying post-quantum signatures at level 1 is already painful, and deploying them at level 5 is problematic.

A second reason is that upgrading symmetric cryptography isn’t always easy. If it requires replacing hardware, it can be costly indeed. An organization that cannot migrate all its cryptography in time simply can’t afford to waste its time doubling symmetric key lengths.

First migration: key agreement

Symmetric ciphers are not enough on their own: how do I know which key to use when visiting a website for the first time? The browser can’t just send a random key, as everyone listening in would see that key as well. You’d think it’s impossible, but there is some clever math to solve this, so that the browser and server can agree on a shared key. Such a scheme is called a key agreement mechanism, and is performed in the TLS handshake. Today almost all traffic is secured with X25519, a Diffie–Hellman-style key agreement, but its security is completely broken by Shor’s algorithm on a quantum computer. Thus, any communication secured today with Diffie–Hellman, when stored, can be decrypted in the future by a quantum computer.
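To illustrate the shape of such a key agreement, here is a bare-bones X25519 exchange using Python’s `cryptography` package. The actual TLS handshake wraps this in much more machinery, but the core exchange — and precisely the step that Shor’s algorithm breaks — looks like this:

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

# Browser and server each generate an ephemeral keypair ...
browser_priv = X25519PrivateKey.generate()
server_priv = X25519PrivateKey.generate()

# ... and exchange only the public keys (32 bytes each) over the wire.
browser_pub = browser_priv.public_key()
server_pub = server_priv.public_key()

# Both sides arrive at the same shared secret without ever sending it.
shared_browser = browser_priv.exchange(server_pub)
shared_server = server_priv.exchange(browser_pub)
assert shared_browser == shared_server

# TLS then derives the actual session keys from the shared secret.
session_key = HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
                   info=b"illustrative key derivation").derive(shared_browser)
```

An eavesdropper only ever sees the two public keys, which is exactly what a future quantum computer running Shor’s algorithm can exploit to recover the shared secret.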

This makes it urgent to upgrade key agreement today. As we will see, luckily, post-quantum key agreement is relatively straight-forward to deploy.

Second migration: signatures / certificates

The key agreement allows secure agreement on a key, but there is a big gap: we do not know with whom we agreed on the key. If we only do key agreement, an attacker in the middle can do separate key agreements with the browser and server, and re-encrypt any exchanged messages. To prevent this we need one final ingredient: authentication.

This is achieved using signatures. When visiting a website, say cloudflare.com, the web server presents a certificate signed by a certification authority (CA) that vouches that the public key in that certificate is controlled by cloudflare.com. In turn, the web server signs the handshake and shared key using the private key corresponding to the public key in the certificate. This allows the client to be sure that they’ve done a key agreement with cloudflare.com.
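Sketched below with Ed25519, one of the traditional schemes about to be replaced: the server signs the handshake transcript with the private key belonging to its certificate, and the browser verifies that signature with the certificate’s public key. The transcript bytes here are a stand-in for the real handshake data.

```python
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# The private key matching the public key in cloudflare.com's certificate.
server_key = Ed25519PrivateKey.generate()
certified_public_key = server_key.public_key()

# Server: sign the handshake transcript, which covers the agreed key shares.
transcript = b"...handshake transcript including the key agreement..."
signature = server_key.sign(transcript)

# Browser: verify() raises InvalidSignature if someone tampered with the handshake.
try:
    certified_public_key.verify(signature, transcript)
except InvalidSignature:
    raise SystemExit("handshake not signed by the certified key - abort")
```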

RSA and ECDSA are commonly used traditional signature schemes. Again, Shor’s algorithm makes short work of them, allowing a quantum attacker to forge any signature. That means that a MitM (man-in-the-middle) can break into any connection that uses a signature scheme that is not post-quantum secure. This is of course an active attack: if the attacker isn’t in the middle as the handshake happens, the connection is not affected.

This makes upgrading signature schemes for TLS on the face of it less urgent, as we only need to have everyone migrated by the time the cryptographically-relevant quantum computer arrives. Unfortunately, we will see that migration to post-quantum signatures is much more difficult, and will require more time.

Timeline

Before we dive into the technical challenges of migrating the Internet to post-quantum cryptography, let’s have a look at how we got here, and what to expect in the coming years. Let’s start with how post-quantum cryptography came to be.

Origin of post-quantum cryptography

Feynman and Manin independently proposed quantum computers around 1980. It took another 14 years before Shor published his algorithm attacking public key cryptography. Most post-quantum cryptography predates Shor’s famous algorithm.

There are various branches of post-quantum cryptography, of which the most prominent are lattice-based, hash-based, multivariate, code-based, and isogeny-based. Except for isogeny-based cryptography, none of these were initially conceived as post-quantum cryptography. In fact, early code-based and hash-based schemes are contemporaries of RSA, being proposed in the 1970s, and comfortably predate the publication of Shor’s algorithm in 1994. Also, the first multivariate scheme from 1988 is comfortably older than Shor’s algorithm. It is a nice coincidence that the most successful branch, lattice-based cryptography, is Shor’s closest contemporary, being proposed in 1996. For comparison, elliptic curve cryptography, which is widely used today, was first proposed in 1985.

In the years after the publication of Shor’s algorithm, cryptographers took measure of the existing cryptography: what’s clearly broken, and what could be post-quantum secure? In 2006, the first annual International Workshop on Post-Quantum Cryptography took place. From that conference, an introductory text was prepared, which holds up rather well as an introduction to the field. A notable caveat is the demise of the Rainbow signature scheme. In that same year, the elliptic-curve key-agreement X25519 was proposed, which now secures the vast majority of all Internet connections.

NIST PQC competition

Ten years later, in 2016, NIST, the US National Institute of Standards and Technology, launched a public competition to standardize post-quantum cryptography. It uses a similar open format to the ones used to standardize AES in 2001 and SHA3 in 2012. Anyone can participate by submitting schemes and evaluating the proposals. Cryptographers from all over the world submitted algorithms. To focus attention, the list of submissions was whittled down over three rounds. From the original 82, based on public feedback, eight made it into the final round. From those eight, in 2022, NIST chose four to standardize first: one KEM (for key agreement) and three signature schemes.

| Old name | New name | Branch |
|----------|----------|--------|
| Kyber | ML-KEM (FIPS 203), Module-lattice based Key-Encapsulation Mechanism Standard | Lattice-based |
| Dilithium | ML-DSA (FIPS 204), Module-lattice based Digital Signature Standard | Lattice-based |
| SPHINCS+ | SLH-DSA (FIPS 205), Stateless Hash-Based Digital Signature Standard | Hash-based |
| Falcon | FN-DSA, FFT over NTRU lattices Digital Signature Standard | Lattice-based |

First four selected post-quantum algorithms from NIST competition.

ML-KEM is the only post-quantum key agreement close to standardization now, and despite some occasional difficulty with its larger key sizes, in many cases it allows for a drop-in upgrade.

The situation is rather different with the signatures: it’s quite telling that NIST chose to standardize three already. And there are even more signatures set to be standardized in the future. The reason is that none of the proposed signatures are close to ideal. In short, they all have much larger keys and signatures than we’re used to. From a security standpoint SLH-DSA is the most conservative choice, but also the worst performer. For public key and signature sizes, FN-DSA is the best of the worst, but is difficult to implement safely because of floating-point arithmetic. This leaves ML-DSA as the default pick. More in depth comparisons are included below.

Name changes

Undoubtedly Kyber is the most familiar name, as it’s a preliminary version of Kyber that has already been deployed by Chrome and Cloudflare, among others, to counter store-now/decrypt-later. We will have to adjust, though. Just like Rijndael is best known as AES, and Keccak is SHA3 to most, ML-KEM is set to become the catchy new moniker for Kyber going forward.

Final standards

Although we know NIST will standardize these four, we’re not quite there yet. In August 2023, NIST released three draft standards for the first three with minor changes, and solicited public feedback. FN-DSA is delayed for now, as it’s more difficult to standardize and deploy securely.

For timely adopters, it’s important to be aware that based on the feedback on the first three drafts, there might be a few small tweaks before the final standards are released. These changes will be minor, but the final versions could well be incompatible on the wire with the current draft standards. These changes are mostly immaterial, only requiring a small update, and do not meaningfully affect the brunt of work required for the migration, including organizational engagement, inventory, and testing. Before shipping, there can be good reasons to wait for the final standards: support for preliminary versions is not widespread, and it might be costly to support both the draft and final standards. Still, many organizations have not started work on the post-quantum migration at all, citing the lack of standards — a situation that has been called crypto procrastination.

So, when can we expect the final standards? There is no set timeline, but we expect the first three standards to be out around mid-2024.

Predicting protocol and software support

Having NIST’s final standards is not enough. The next step is to standardize the way the new algorithms are used in higher level protocols. In many cases, such as key agreement in TLS, this is as simple as assigning an identifier to the new algorithms. In other cases, such as DNSSEC, it requires a bit more thought. Many working groups at the IETF have been preparing for years for the arrival of NIST’s final standards, and I expect that many protocol integrations will be available before the end of 2024. For the moment, let’s focus on TLS.

The next step is software support. Not all ecosystems can move at the same speed, but we have seen a lot of preparation already. We expect several major open ecosystems to have post-quantum cryptography and TLS support available early 2025, if not earlier.

Again, for TLS there is a big difference between key agreement and signatures. For key agreement, the server and client can add and enable support for post-quantum key agreement independently. Once enabled on both sides, TLS negotiation will use post-quantum key agreement. We go into detail on TLS negotiation in this blog post. If your product just uses TLS, your store-now/decrypt-later problem could be solved by a simple software update of the TLS library.

Post-quantum TLS certificates are more of a hassle. Unless you control both ends, you’ll need to install two certificates: one post-quantum certificate for the new clients, and a traditional one for the old clients. If you aren’t using automated issuance of certificates yet, this might be a good reason to check that out. TLS allows the client to signal which signature schemes it supports so that the server can choose to serve a post-quantum certificate only to those clients that support it. Unfortunately, although almost all TLS libraries support setting up multiple certificates, not all servers expose that configuration. If they do, it will still require a configuration change in most cases. (Although undoubtedly caddy will do it for you.)
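As a rough sketch of that server-side selection logic (hypothetical: the ML-DSA-44 codepoint below is a placeholder, since no official TLS value is implied here, and the chain objects are stand-ins for whatever your server or library uses):

```python
# Pick a certificate chain based on the client's advertised signature_algorithms.
ML_DSA_44_PLACEHOLDER = 0x0905   # placeholder value, not an assigned codepoint
ECDSA_SECP256R1_SHA256 = 0x0403  # real TLS 1.3 codepoint (RFC 8446)

def select_certificate_chain(client_signature_algorithms, pq_chain, classical_chain):
    """Serve the post-quantum chain only to clients that say they can verify it."""
    if pq_chain is not None and ML_DSA_44_PLACEHOLDER in client_signature_algorithms:
        return pq_chain
    return classical_chain
```

The key property is the fallback: clients that never advertise a post-quantum scheme keep getting the small traditional chain, so nothing breaks for them.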

Talking about post-quantum certificates: it will take some time before Certification Authorities (CAs) can issue them. Their HSMs will first need (hardware) support, which then will need to be audited. Also, the CA/Browser forum needs to approve the use of the new algorithms. Of these, the audits are likely to be the bottleneck, as there will be a lot of submissions after the publication of the NIST standards. It’s unlikely we will see a post-quantum certificate issued by a CA before 2026.

This means it is quite possible that come 2026, we will be in an interesting in-between period, where almost all Internet traffic is protected by post-quantum key agreement, but not a single public post-quantum certificate is in use.

More post-quantum standards

NIST is not quite done standardizing post-quantum cryptography. There are two more post-quantum competitions running: round 4 and the signatures onramp.

Round 4

From the post-quantum competition, NIST is still considering standardizing one or more of the code-based key agreements BIKE, HQC, and Classic McEliece in a fourth round. The performance of BIKE and HQC, both in key sizes and computational efficiency, is much worse than that of ML-KEM. NIST is considering standardizing one as a backup KEM, in case there is a cryptanalytic breakthrough against lattice-based cryptography, such as ML-KEM.

Classic McEliece does not compete with ML-KEM directly as a general purpose KEM. Instead, it’s a specialist: Classic McEliece public keys are very large (268kB), but it has (for a post-quantum KEM) very small ciphertexts (128 bytes). This makes Classic McEliece very attractive for use cases where the public key can be distributed in advance, such as to secure a software update mechanism.

Signatures onramp

In late 2022, after announcing the first four picks, NIST also called a new competition, dubbed the signatures onramp, to find additional signature schemes. The competition has two goals. The first is hedging against cryptanalytic breakthroughs against lattice-based cryptography. NIST would like to standardize a signature that performs better than SLH-DSA, but is not based on lattices. Secondly, they’re looking for a signature scheme that might do well in use cases where the current roster doesn’t do well: we will discuss those at length later on in this post.

In July 2023, NIST posted the 40 submissions it received for a first round of public review. The cryptographic community got to work and, as is quite normal for a first round, at the time of writing (February 2024) has managed to break 10 submissions completely, and drastically weaken a couple of others. Thom Wiggers maintains a useful website comparing the submissions.

There are some very promising submissions. We will touch briefly upon them later on. It is worth mentioning that just like the main post-quantum competition, the selection process will take many years. It is unlikely that any of these onramp signature schemes will be standardized before 2027 — if they’re not broken in the first place.

Before we dive into the nitty-gritty of migrating the Internet to post-quantum cryptography, it’s instructive to look back at some past migrations.

Looking back: migrating to TLS 1.3

One of the big recent migrations on the Internet was the switch from TLS 1.2 to TLS 1.3. Work on the new protocol started around 2014. The goal was ambitious: to start anew, cut a lot of cruft, and have a performant clean transport protocol of the future. After a few years of hard work, the protocol was ready for field tests. In good spirits, in September 2016, we announced that we support TLS 1.3.

The followup blog in December 2017 had a rather different tone: “Why TLS 1.3 isn’t in browsers yet”.

Adoption of TLS 1.3 in December 2017: less than 0.06%.

It turned out that revision 11 of TLS 1.3 was completely undeployable in practice, breaking a few percent of all users. The reason? Protocol ossification. TLS was designed with flexibility in mind: the client sends a list of TLS versions it supports, so that the connection can be smoothly upgraded to the newest crypto. That’s the theory, but if you never move the joint, it rusts: for one, it turned out that a lot of server software and middleware simply crashed on just seeing an unknown version. Others would ignore the version number completely, and try to parse the messages as if it was TLS 1.2 anyway. In practice, the version negotiation turned out to be completely broken. So how was this fixed?

In revision 22 of the TLS 1.3 draft, changes were made to make TLS 1.3 look like TLS 1.2 on the wire: in particular TLS 1.3 advertises itself as TLS 1.2 with the normal version negotiation. Also, a lot of unnecessary fields are included in the TLS 1.3 ClientHello just to appease any broken middleboxes that might be peeking in.  A server that doesn’t understand TLS 1.3 wouldn’t even see that an attempt was made to negotiate TLS 1.3. Using a sneaky new extension, a second version negotiation mechanism was added. For the details, check out the December 2017 blog post linked above.

Today TLS 1.3 is a huge success, and is used by more than 93% of the connections.

TLS 1.3 adoption in February 2024. QUIC uses TLS 1.3 under the hood.

To help prevent ossification in the future, new protocols such as TLS 1.3 and QUIC use GREASE, where clients send unknown identifiers on purpose, including cryptographic algorithm identifiers, to help catch similar bugs, and keep the flexibility.
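As an illustration, the GREASE codepoints reserved in RFC 8701 for, e.g., named groups all follow the same two-byte pattern; a toy generator is sketched below (how a real TLS stack injects the value into the ClientHello is not shown, and is an implementation detail of the library).

```python
import random

def grease_named_group():
    """Return one of the 16 reserved GREASE codepoints: 0x0A0A, 0x1A1A, ..., 0xFAFA."""
    hi = random.randrange(16)
    return (hi << 12) | 0x0A00 | (hi << 4) | 0x0A

# A client adds this bogus group to its supported_groups list. A correct server
# must simply ignore it; a broken one misbehaves, surfacing the ossification bug early.
print(hex(grease_named_group()))
```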

Migrating the Internet to post-quantum key agreement

Now that we understand what we’re dealing with on a high level, let’s dive into upgrading key agreement on the Internet. First, let’s have a closer look at NIST’s first and so far only post-quantum key agreement: ML-KEM.

ML-KEM was submitted under the name CRYSTALS-Kyber. Even though it will be a US standard, its designers work in industry and academia across France, Switzerland, the Netherlands, Belgium, Germany, Canada, and the United States. Let’s have a look at its performance.

ML-KEM versus X25519

Today the vast majority of clients use the traditional key agreement X25519. Let’s compare that to ML-KEM.

| Algorithm | PQ | Client keyshare (bytes) | Server keyshare (bytes) | Client ops/sec (higher is better) | Server ops/sec (higher is better) |
|-----------|----|-------------------------|-------------------------|-----------------------------------|-----------------------------------|
| ML-KEM-512 | ✅ | 800 | 768 | 45,000 | 70,000 |
| ML-KEM-768 | ✅ | 1,184 | 1,088 | 29,000 | 45,000 |
| ML-KEM-1024 | ✅ | 1,568 | 1,568 | 20,000 | 30,000 |
| X25519 | ❌ | 32 | 32 | 19,000 | 19,000 |

Size and CPU compared between X25519 and ML-KEM. Performance varies considerably by hardware platform and implementation constraints, and should be taken as a rough indication only.

ML-KEM-512, -768 and -1024 aim to be as resistant to (quantum) attack as AES-128, -192 and -256 respectively. Even at the AES-128 level, ML-KEM is much bigger than X25519, requiring 1,568 bytes over the wire in total (client and server keyshares combined), whereas X25519 requires a mere 64 bytes.

On the other hand, even ML-KEM-1024 is typically significantly faster than X25519, although this can vary quite a bit depending on your platform.

ML-KEM-768 and X25519

At Cloudflare, we are not taking advantage of that speed boost just yet. Like many other early adopters, we like to play it safe and deploy a hybrid key-agreement combining X25519 and (a preliminary version of) ML-KEM-768. This combination might surprise you for two reasons.

  1. Why combine X25519 (“128 bits of security”) with ML-KEM-768 (“192 bits of security”)?
  2. Why bother with the non post-quantum X25519?

The apparent security level mismatch is a hedge against improvements in cryptanalysis in lattice-based cryptography. There is a lot of trust in the (non post-quantum) security of X25519: matching AES-128 is more than enough. Although we are comfortable in the security of ML-KEM-512 today, over the coming decades cryptanalysis could improve. Thus, we’d like to keep a margin for now.

The inclusion of X25519 has two reasons. First, there is always a remote chance that a breakthrough renders all variants of ML-KEM insecure. In that case, X25519 still provides non post-quantum security, and our post-quantum migration didn’t make things worse.

More important is that we do not only worry about attacks on the algorithm, but also on the implementation. A noteworthy example where we dodged a bullet is that of KyberSlash, a timing attack that affected many implementations of Kyber (an earlier version of ML-KEM), including our own. Luckily KyberSlash does not affect Kyber as it is used in TLS. A similar implementation mistake that would actually affect TLS is likely to require an active attacker. In that case, the likely aim of the attacker wouldn’t be to decrypt data decades down the line, but to steal a cookie or other token, or to inject a payload. Including X25519 prevents such an attack.
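A minimal sketch of that hybrid construction, assuming a hypothetical `mlkem768` module with a keygen/encapsulate/decapsulate interface (Python’s `cryptography` package provides the X25519 half; the way the two secrets are combined below is illustrative, not the exact construction used in TLS):

```python
import hashlib
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey

import mlkem768  # hypothetical module exposing keygen(), encapsulate(), decapsulate()

# Client: generate both keyshares and send both public parts in the ClientHello.
x_priv = X25519PrivateKey.generate()
kem_pub, kem_priv = mlkem768.keygen()

# Server: complete both key agreements, reply with its X25519 share and the KEM ciphertext.
server_x_priv = X25519PrivateKey.generate()
kem_ciphertext, kem_shared_server = mlkem768.encapsulate(kem_pub)
x_shared_server = server_x_priv.exchange(x_priv.public_key())

# Client: recover both shared secrets from the server's reply.
x_shared_client = x_priv.exchange(server_x_priv.public_key())
kem_shared_client = mlkem768.decapsulate(kem_priv, kem_ciphertext)

# Both secrets feed into the key schedule: an attacker must break *both* X25519 and ML-KEM.
def combine(x_shared, kem_shared):
    return hashlib.sha256(x_shared + kem_shared).digest()

assert combine(x_shared_client, kem_shared_client) == combine(x_shared_server, kem_shared_server)
```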

So how well do ML-KEM-768 and X25519 together perform in practice?

Performance and protocol ossification

Browser experiments

Being well aware of potential compatibility and performance issues, Google started a first experiment with post-quantum cryptography back in 2016, the same year NIST started their competition. This was followed up by a second larger joint experiment by Cloudflare and Google in 2018. We tested two different hybrid post-quantum key agreements: CECPQ2, which is a combination of the lattice-based NTRU-HRSS and X25519, and CECPQ2b, a combination of the isogeny-based SIKE and again X25519. NTRU-HRSS is very similar to ML-KEM in size, but is computationally somewhat more taxing on the client-side. SIKE on the other hand, has very small keys, is computationally very expensive, and was completely broken in 2022. With respect to TLS handshake times, X25519+NTRU-HRSS performed very well, being hard to distinguish by eye from the control connections.

Handshake times compared between X25519 (blue), X25519+SIKE (green) and X25519+NTRU-HRSS (orange). 

Unfortunately, a small but significant fraction of clients experienced broken connections with NTRU-HRSS. The reason: the size of the NTRU-HRSS keyshares. In the past, when creating a TLS connection, the first message sent by the client, the so-called ClientHello, almost always fit within a single network packet. The TLS specification allows for a larger ClientHello, however no one really made use of that. Thus, protocol ossification strikes again as there are some middleboxes, load-balancers, and other software that tacitly assume the ClientHello always fits in a single packet.

Over the subsequent years, Chrome kept running their PQ experiment at a very low rate, and did a great job reaching out to vendors whose products were incompatible. If it were not for these compatibility issues, we would’ve likely seen Chrome ramp up post-quantum key agreement five years earlier.

Today the situation looks better. At the time of writing, Chrome has enabled post-quantum key-agreement for 10% of all users. That accounts for about 1.8% of all our TLS 1.3 connections, as shown in the figure below. That’s a lot, but we’re not out of the woods yet. There could well be performance and compatibility issues that prevent a further rollout.

Fraction of TLS 1.3 connections established with Cloudflare that are secured with post-quantum cryptography. At the moment, it’s more than 99% from Chrome. 

Nonetheless, we feel it’s more probable than not that we will see Chrome enable post-quantum key agreement for more users this year.

Other browsers

In January 2024, Firefox landed the code to support post-quantum key agreement in nightly, and it’s likely it will land in Firefox proper later in 2024. For Chrome-derived browsers, such as Edge and Brave, it’s easy to piggyback on the work of Chrome, and we could well see them follow suit when Chrome turns on post-quantum key-agreement by default.

However, browser to server connections aren’t the only connections important to the Internet.

Testing connections to customer origins

In September 2023, we added support for our customers to enable post-quantum key agreement on connections from Cloudflare to their origins. That’s connection (3) in the following diagram. This can be done in two ways: the fast way, and the slow but safer way. In both cases, if the origin does not support it, we fall back to traditional key-agreement. We explain the details of these in the blog post, but in short, in the fast way we send the post-quantum keyshare immediately, and in the slow but safe way we let the origin ask for post-quantum using a HelloRetryRequest message. Chrome, by the way, is deploying post-quantum key agreement the fast way.

Typical connection flow when a visitor requests an uncached page.

At the same time, we started regularly testing our customer origins to see if they would support us offering post-quantum key agreement. We found all origins supported the safe but slow method. The fast method didn’t fare as well, as we found that 0.34% of connections would break. That’s higher than the failure rates seen by browsers.

Unsurprisingly, many failures seem to be caused by the large ClientHello. Interestingly, the majority are caused by servers not correctly implementing HelloRetryRequest. We have reached out to customers to ascertain the cause. We’re very grateful to those that have responded, and we’re currently working through the data.

Outlook

As we’ve seen, post-quantum key agreement, despite protocol ossification, is relatively straightforward to deploy. We’re also on a great trajectory, as we might well see double-digit client support for post-quantum key agreement later this year.

Let’s turn to the second, more difficult migration.

Migrating the Internet to post-quantum signatures

Now, we’ll turn our attention to upgrading the signatures used on the Internet.

The zoo of post-quantum signatures

Let’s start by sizing up the post-quantum signatures we have available today at the AES-128 security level: ML-DSA-44, FN-DSA-512, and the two variants of SLH-DSA. As a comparison, we also include the venerable Ed25519 and RSA-2048 in wide use today, as well as a sample of five promising signature schemes from the signatures onramp.

| | Algorithm | PQ | Public key (bytes) | Signature (bytes) | Signing time (lower is better) | Verification time (lower is better) |
|---|---|---|---|---|---|---|
| Standardized | Ed25519 | ❌ | 32 | 64 | 1 (baseline) | 1 (baseline) |
| | RSA-2048 | ❌ | 256 | 256 | 70 | 0.3 |
| NIST drafts | ML-DSA-44 | ✅ | 1,312 | 2,420 | 4.8 | 0.5 |
| | FN-DSA-512 | ✅ | 897 | 666 | 8 ⚠️ | 0.5 |
| | SLH-DSA-128s | ✅ | 32 | 7,856 | 8,000 | 2.8 |
| | SLH-DSA-128f | ✅ | 32 | 17,088 | 550 | 7 |
| Sample from signatures onramp | MAYOone | ✅ | 1,168 | 321 | 4.7 | 0.3 |
| | MAYOtwo | ✅ | 5,488 | 180 | 5 | 0.2 |
| | SQISign I | ✅ | 64 | 177 | 60,000 | 500 |
| | UOV Is-pkc | ✅ | 66,576 | 96 | 2.5 | 2 |
| | HAWK512 | ✅ | 1,024 | 555 | 2 | 1 |

Comparison of various signature schemes at the security level of AES-128. CPU times vary significantly by platform and implementation constraints and should be taken as a rough indication only. ⚠️FN-DSA signing time when using fast but dangerous floating-point arithmetic — see warning below.

It is immediately clear that none of the post-quantum signature schemes comes even close to being a drop-in replacement for Ed25519 (which is comparable to ECDSA P-256) as most of the signatures are simply much bigger. The exceptions are SQISign, MAYO, and UOV from the onramp, but they’re far from ideal. MAYO and UOV have large public keys, and SQISign requires an immense amount of computation.

When to use SLH-DSA

As mentioned before, today we only have drafts for SLH-DSA and ML-DSA. In every relevant performance metric, ML-DSA beats SLH-DSA handily. (Even the small public keys of SLH-DSA are not any advantage. If you include the ML-DSA public key with its signature, it’s still smaller than an SLH-DSA signature, and in that case you can use the short hash of the ML-DSA public key as a short public key.)

The advantage of SLH-DSA is that there is a lot of trust in its security. To forge an SLH-DSA signature you need to break the underlying hash function quite badly. It is not enough to break the collision resistance of the hash, as has been done with SHA-1 and MD5. In fact, as of February 2024, an SHA-1 based SLH-DSA would still be considered secure. Of course, SLH-DSA does not use SHA-1, and instead uses SHA2 and SHA3, against which not a single practical attack is known.

If you can shoulder the cost, SLH-DSA has the best security guarantee, which might be crucial when dealing with long-lasting signatures, or deployments where upgrades are impossible.

Be careful with FN-DSA

Looking ahead a bit: the best of the worst seems to be FN-DSA-512. FN-DSA-512’s signatures and public key together are only 1,563 bytes, with somewhat reasonable signing time. FN-DSA has an Achilles heel though — for acceptable signing performance, it requires fast floating-point arithmetic. Without it, signing is about 20 times slower. But speed is not enough, as the floating-point arithmetic has to run in constant time — without that, the FN-DSA private key can be recovered by timing signature creation. Writing safe FN-DSA implementations has turned out to be quite challenging, which makes FN-DSA dangerous when signatures are generated on the fly, such as in a TLS handshake. It is good to stress that this only affects signing: FN-DSA verification does not require floating-point arithmetic (and during verification there wouldn’t be a private key to leak anyway).

There are many signatures on the web

The biggest pain point of migrating the Internet to post-quantum signatures is that there are a lot of signatures, even in a single connection. When you visit this very website for the first time, we send six signatures and two public keys.

The majority of these are for the certificate chain: the CA signs the intermediate certificate, which signs the leaf certificate, which in turn signs the TLS transcript to prove the authenticity of the server. If you’re keeping count: we’re still three signatures short.

Two of these are for SCTs required for certificate transparency. Certificate transparency is a key, but lesser known, part of the Web PKI, the ecosystem that secures browser connections. Its goal is to publicly log every certificate issued, so that misissuances can be detected after the fact. It works by having independent parties run CT logs. Before issuing a certificate, a CA must first submit it to at least two different CT logs. An SCT is a signature of a CT log that acts as a proof, a receipt, that the certificate has been logged.

The final signature is an OCSP staple, which proves that the leaf certificate hasn’t been revoked in the last few days.

Tailoring signature schemes

There are two aspects of how a signature can be used that are worthwhile to highlight: whether the public key is included with the signature, and whether the signature is online or offline.

For the SCTs and the signature of the root on the intermediate, the public key is not transmitted during the handshake. Thus, for those, a signature scheme with smaller signatures but larger public keys, such as MAYO or UOV, would be particularly well-suited. For the other signatures, the public key is included, and it’s more important to minimize the sizes of the combined public key and signature.

The handshake signature is the only signature that is created online — all the other signatures are created ahead of time.  The handshake signature is created and verified only once, whereas the other signatures are typically verified many times by different clients. This means that for the handshake signature, it’s advantageous to balance signing and verification time which are both in the hot path, whereas for the other signatures having better verification time at the cost of slower signing is worthwhile. This is one of the advantages RSA still enjoys over elliptic curve signatures today.

Putting together different signature schemes is a fun puzzle, but it also comes with some drawbacks. Using multiple different schemes increases the attack surface because an algorithmic or implementation vulnerability in one compromises the whole. Also, the whole ecosystem needs to implement and optimize multiple algorithms, which is a significant burden.

Putting it together

So, what are some reasonable combinations to try?

With NIST’s current picks

With the draft standards available today, we do not have a lot of options.

If we simply switch to ML-DSA-44 for all signatures, we’re adding 17kB of data that needs to be transmitted from the server to the client during the TLS handshake. Is that a lot? Probably. We will address that later on.
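As a rough back-of-the-envelope, using the ML-DSA-44 sizes from the comparison table and the six signatures and two transmitted public keys counted earlier (the few hundred bytes of today’s signatures and keys that would be replaced are ignored):

```python
ML_DSA_44_SIG = 2420   # bytes, from the comparison table above
ML_DSA_44_PUB = 1312

signatures = 6   # 2 SCTs, OCSP staple, root->intermediate, intermediate->leaf, handshake
public_keys = 2  # the intermediate and leaf certificate keys sent over the wire

extra = signatures * ML_DSA_44_SIG + public_keys * ML_DSA_44_PUB
print(extra)  # 17144 bytes: roughly the 17kB quoted above
```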

If we wait a bit and replace all but the handshake signature with FN-DSA-512, we’re looking at adding only 8kB. That’s much better, but I have to repeat that it’s difficult to implement FN-DSA-512 signing safely without timing side channels, and there is a good chance we’ll shoot ourselves in the foot if we’re not careful.

Another way to shoot ourselves in the foot today is with stateful hash-based signatures.

Stateful hash-based signatures

Apart from symmetric cryptography, there are already post-quantum signature schemes standardized today: LMS / HSS and XMSS(MT). Just like SLH-DSA, these are hash-based signature schemes, and thus algorithmically they’re very conservative.

But they come with a major drawback: you need to remember the state. What is this state? When generating a keypair, you prepare a fixed number of one-time-use slots, and you need to remember which one you’ve used. If you use the same prepared slot twice, then anyone can create a forgery with those two. Managing this state is not impossible, but quite tricky. What if the server was restored from a backup? The state can be distributed over multiple servers, but that changes the usual signature flow quite a bit, and it’s unclear whether regulators will allow this approach, as the state is typically considered part of the private key.
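The sketch below illustrates the bookkeeping, with the actual one-time signing left as a hypothetical callback; the point is that the slot index is effectively part of the private key, and it must be persisted before every signature is released.

```python
class StatefulSigner:
    """Hypothetical wrapper around a stateful hash-based key (e.g. LMS or XMSS).

    The index of the next unused one-time slot is part of the secret state:
    losing track of it (say, by restoring yesterday's backup) leads to slot
    reuse, which lets anyone forge signatures.
    """

    def __init__(self, private_key, num_slots):
        self.private_key = private_key
        self.num_slots = num_slots
        self.next_slot = 0  # must be stored durably, atomically with every signature

    def sign(self, message, sign_one_time):
        if self.next_slot >= self.num_slots:
            raise RuntimeError("key exhausted: generate a new keypair")
        slot = self.next_slot
        self.next_slot += 1  # persist this *before* releasing the signature
        return sign_one_time(self.private_key, slot, message)
```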

So, how do they perform? It’s hard to give a definite answer. These hash-based signature schemes have a lot of knobs to turn and can be fine-tuned to their use case. You can see for yourself, and play around with the parameters on this website. With standardized variants (with security parameter n=24) for the offline signatures, we can beat ML-DSA-44 in data on the wire, but can’t outperform FN-DSA-512. With security parameter n=16, which has not been standardized, stateful hash-based signatures are competitive with FN-DSA-512, and can even beat it on size. However, n=16 comes with yet another footgun: it allows the signer to create a single signature that validates two different messages — there is no non-repudiation.

All in all, FN-DSA-512 and stateful hash-based signatures tempt us with a similar and clear performance benefit over ML-DSA-44, but are difficult to use safely.

Signatures on the horizon

There are some very promising new signature schemes submitted to the NIST onramp.

UOV (unbalanced oil and vinegar) is an old multivariate scheme with a large public key (66.5kB), but small signatures (96 bytes). If we combine UOV for the root and SCTs with ML-DSA-44 for the others, we’re looking at only 10kB — close to FN-DSA-512.

Over the decades, there have been many attempts to add some structure to UOV public keys, to get a better balance between public key and signature size. Many of these so-called structured multivariate schemes, which include Rainbow and GeMSS, have unfortunately been broken.

MAYO is the latest proposal for a structured multivariate scheme, designed by the cryptographer that broke Rainbow. As a structured multivariate scheme, its security requires careful scrutiny, but its utility (given it is not broken) is very appealing.

MAYO allows for a fine-grained tradeoff between signature and public key size. For the submission, to keep things simple, the authors proposed two concrete variants: MAYOone with balanced signature (321 bytes) and public key (1.1kB) sizes, and MAYOtwo that has signatures of 180 bytes, while keeping the public key manageable at 5.4kB. Verification times are excellent, while signing times are somewhat slower than ECDSA, but far better than RSA. Combining both variants in the obvious way, we’re only looking at 3.3kB.

Purely looking at sizes, SQISign I is the clear winner, even beating RSA-2048. Unfortunately, the computation required for signing, and crucially verification, are way too high. For niche applications, SQISign might be useful, but for general adoption verification times need to improve significantly, even if that requires a larger signature.

Finally, I would like to mention HAWK512. HAWK is a lattice-based scheme similar to FN-DSA-512, but it does not require floating-point arithmetic. This makes HAWK an appealing alternative to FN-DSA. NIST has repeatedly stated that the main purpose of the onramp is to standardize a signature scheme that is not based on lattices — a description HAWK does not fit. We might see some innovations of HAWK be included in the final version of FN-DSA, but it is unclear whether that would solve all of FN-DSA’s implementation concerns.

There are more promising submissions in the onramp, but those discussed are a fairly representative sample of those interesting to TLS. For instance, SNOVA is similar to MAYO, and TUOV is similar to UOV. Explore the submissions for yourself on Thom’s webpage.

Do we really care about the extra bytes?

It will take 17kB extra to swap in ML-DSA-44. That’s a lot compared to the typical handshake today, but it’s not a lot compared to the JavaScript and images served on many web pages. The key point is that the change we must make here affects every single TLS connection, whether it’s used for a bloated website, or a time-critical API call. Also, it’s not just about waiting a bit longer. If you have spotty cellular reception, that extra data can make the difference between being able to load a page, and having the connection time out. (As an aside, talking about bloat: many apps perform a surprisingly high number of TLS handshakes.)

Just like with key agreement, performance isn’t our only concern: we also want the connection to succeed in the first place. Back in 2021, we ran an experiment artificially enlarging the certificate chain to simulate larger post-quantum certificates. We give a short summary of the key result below, but for the details, check out the full blog post.

Initially, we wanted to run the experiment on a small sample of regular traffic, in order to get unbiased data. Unfortunately, we found that large certificate chains broke some connections. Thus, to avoid breaking customer connections, we set up the experiment to use background connections launched from our challenge pages. For each participant, we launched two background connections: one with a larger certificate chain (live) and one with a normal chain (control). The graph below shows the number of control connections that are missing a corresponding live connection. There are jumps around 10kB and 30kB, suggesting that there are clients or middleboxes that break when certificate chains grow by more than 10kB or 30kB.

Missing requests when artificially inflating certificate chain size to simulate post-quantum certificates.

This does not mean that the ML-DSA-44-only route is necessarily unviable. Just like with key agreement, browsers can slowly turn on support for post-quantum certificates. As we hit issues with middleboxes, we can work with vendors to fix what is broken. It is crucial here that servers are configured to be able to serve either a small traditional chain, or a larger post-quantum chain.

These issues are problematic for a single-certificate migration strategy. In this approach, the server installs a single traditional certificate that contains a separate post-quantum certificate in a so-called non-critical extension. A client that does not support post-quantum certificates will ignore the extension. In this approach, installing the single certificate will immediately break all clients with compatibility issues, making it a non-starter.

What about performance? We saw the following impact on TLS handshake time.

Performance when artificially inflating certificate chain size to simulate post-quantum certificates.

The jump at around 40kB is caused by an extra round-trip due to a full congestion window. In the 2021 blog post we go into detail on what that is all about. There is an important caveat: at Cloudflare, because we’re close to the client, we use a larger congestion window. With a typical congestion window, the jump would move to around 10kB. Also, the jump would be larger as typical round-trip times are higher.

Thus, when adding 9kB, we’re looking at a slowdown of about 15%. Crossing the 10kB boundary, we are likely to incur an extra round-trip, which could well lead to a slowdown of more than 60%. That completely negates the much-touted performance benefit that TLS 1.3 has over TLS 1.2, and it’s too high to be enabled by default.

Is 9kB too much? Enabling post-quantum key agreement wasn’t free either, but it was cheaper and actually gets us a tangible security benefit today. However, this thinking is dangerous. If we wait too long before enabling post-quantum certificates by default, we might find ourselves out of time when the quantum computer arrives.

Way forward

Over the coming years, we’ll be working with browsers to test the viability and performance impact of post-quantum authentication in TLS. We expect to add support for post-quantum certificates as soon as they arrive (probably around 2026), but not enable them by default.

At the same time, we’re exploring various ideas to reduce the number of signatures.

Reducing number of signatures

Over the last few years, there have been several proposals to reduce the number of signatures used.

Leaving out intermediate certificates

CAs report the intermediate certificates they use in the CCADB. Most browsers ship with the list of intermediates (of CAs they trust). Using that list, a browser is able to establish a connection with a server that forgot to install the intermediate. If a server can leave out the intermediate, then why bother with it?

There are three competing proposals to leave out the intermediate certificate. The original 2019 proposal is by Martin Thomson, who suggests simply having the browser send a single bit to indicate that it has an up-to-date list of all intermediates. In that case, the server will leave out the intermediates. This will work well in the majority of cases, but could lead to some hard-to-debug issues in corner cases. For one, not all intermediates are listed in the CCADB, and the missing ones are not limited to custom CAs. Another reason is that the browser could be mistaken about whether it’s up to date. A more esoteric issue is that the browser could reconstruct a different chain of certificates than the server had in mind.

To address these issues, in 2023, Dennis Jackson put forward a more robust proposal. In this proposal, every year a fixed list of intermediates is compiled from the CCADB. Instead of a single flag, the browser will send the named lists of intermediates it has. The server will not simply leave out matching intermediates, but rather replace them by the sequence number at which they appear in the list. He also did a survey of the most popular websites, and found that just by leaving out the intermediates today, we can save more than 2kB compared to certificate compression for half of them. That’s with today’s certificates: yes, X509 certificates are somewhat bloated.

Finally, there is the more general TLS trust expressions proposal, which allows a browser to signal in a more fine-grained manner which CAs and intermediates it trusts.

It’s likely some form of intermediate suppression will be adopted in the coming years. This will push the cost of an ML-DSA-44-only deployment down to less than 13kB.

KEMTLS

Another approach is to change TLS more rigorously by replacing the signature algorithm in the leaf certificate by a KEM. This is called KEMTLS (or AuthKEM at the IETF). The server proves it controls the leaf certificate, by being able to decrypt a challenge sent by the client. This is not an outlandishly new idea, as older versions of TLS would encrypt a shared key to an RSA certificate.
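Sketched below with a hypothetical `mlkem512` module exposing a keygen/encapsulate/decapsulate interface: instead of signing the handshake, the server proves possession of the certified private key by decapsulating the client’s ciphertext and using the resulting secret. The transcript and confirmation step are illustrative, not the exact KEMTLS key schedule.

```python
import hashlib
import hmac

import mlkem512  # hypothetical module exposing keygen(), encapsulate(), decapsulate()

# The KEM public key is what sits in the server's leaf certificate.
cert_pub, cert_priv = mlkem512.keygen()

# Client: encapsulate to the certified key and send only the ciphertext.
ciphertext, shared_client = mlkem512.encapsulate(cert_pub)

# Server: only the holder of the certified private key recovers the same secret ...
shared_server = mlkem512.decapsulate(cert_priv, ciphertext)

# ... and proves it implicitly by keying the rest of the handshake with it.
transcript = b"...handshake transcript..."
confirmation = hmac.new(shared_server, transcript, hashlib.sha256).digest()
assert hmac.new(shared_client, transcript, hashlib.sha256).digest() == confirmation
```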

KEMTLS does add quite a bit of complexity to TLS 1.3, which was purposely designed to simplify TLS 1.2. Adding complexity adds security concerns, but we soften that by extending the machine-checked security proof of TLS 1.3 to KEMTLS. Nonetheless, adopting KEMTLS would be a significant engineering effort, and its gains should be worthwhile.

If we replace an ML-DSA-44 handshake signature of 2,420 bytes by KEMTLS using ML-KEM-512, we save 852 bytes in the total bytes transmitted by client and server. Looking just at the server, we save 1,620 bytes. That’s 1.6kB saved on 17kB — not very impressive. Also, KEMTLS is of little benefit if small post-quantum signatures such as MAYOone are available for the handshake.

KEMTLS shines in the case that 1.6kB savings pushes the server within the congestion window, such as when UOV is used for all but the handshake and leaf signature. Another advantage of KEMTLS, especially for embedded devices, is that it could reduce the number of algorithms that need to be implemented: you need a KEM for the key agreement anyway, and that could replace the signature scheme you would’ve only used for the handshake signature.

At the moment, deploying KEMTLS isn’t the lowest hanging fruit, but it could well come into its own, depending on which signature schemes are standardized, and which other protocol changes are made.

Merkle tree certificates

An even more ambitious and involved proposal is Merkle tree certificates (MTC). In this proposal, all signatures except the handshake signature are replaced by a short <800 byte Merkle tree certificate. This sounds too good to be true, and there is indeed a catch. MTC doesn’t work in all situations, and for those you will need to fall back to old-fashioned X509 certificates and certificate transparency. So, what’s assumed?

  • No direct certificate issuance. You can’t get a Merkle tree certificate immediately: you will have to ask for one, and then wait for at least a day before you can use it.
  • Clients (in MTC parlance, relying parties) can only check a Merkle tree certificate if they stay up to date with a transparency service. Browsers have an update mechanism that can be used for this, but a browser that hasn’t been used in a while might be stale.

MTC should be seen as an optimisation for the vast majority of cases.

Summary

So, how does it actually work? I’ll try to give a short summary — for a longer introduction check out David Benjamin’s IETF presentation, or get your hands dirty by setting up your own MTC CA.

An overview of a Merkle Tree certificate deployment

In MTC, CAs issue assertions in batches at a fixed rhythm, say once every hour. An example of an assertion is “you can trust P-256 public key ab….23 when connecting to example.com”. Basically, an assertion is a certificate without the signature. If a subscriber wants to get a certificate, it sends the assertion to the CA, which vets it, and then queues it for issuance.

On this batch of assertions, the CA computes a Merkle tree. We have an explainer of Merkle trees in our blog post introducing certificate transparency. The short of it is that you can summarize a batch into a single hash by hashing pairwise up a tree: the root is the summary. The nice thing about Merkle trees is that you can prove to someone who only has the root that something was in the batch, by revealing just a few hashes up the tree — this proof is what makes up a Merkle tree certificate.
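Here is a toy version of that idea in Python using SHA-256 (a real MTC deployment domain-separates leaf and inner-node hashes and encodes assertions carefully, which this sketch skips):

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    """Hash pairwise up the tree until a single root remains."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:                 # duplicate the last node if the level is odd
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def inclusion_proof(leaves, index):
    """The few sibling hashes needed to recompute the root from one leaf."""
    proof, level = [], [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        proof.append((level[index ^ 1], index % 2))   # (sibling hash, are we the right child?)
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof

def verify(leaf, proof, root):
    node = h(leaf)
    for sibling, we_are_right_child in proof:
        node = h(sibling + node) if we_are_right_child else h(node + sibling)
    return node == root

batch = [b"assertion for example.com", b"assertion for example.org", b"assertion for blog.example"]
root = merkle_root(batch)
assert verify(batch[1], inclusion_proof(batch, 1), root)
```

Note how short the proof is: for a batch of a million assertions it is only about 20 hashes, which is why Merkle tree certificates can stay under 800 bytes.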

Each assertion is valid for a fixed number of batches — say 336 batches for a validity of two weeks. This is called the validity window. When issuing a batch, the CA not only publishes the assertions, but also a signature on the roots of all batches that are currently valid, called the signed validity window.

After the MTC CA has issued the new batch, the subscriber that asked for the certificate to be issued can pull the Merkle tree certificate from the CA. The subscriber can then install it, next to its X509 certificate, but will have to wait a bit before it’s useful.

Every hour, the transparency services, including those run by browser vendors, pull the new assertions and signed validity window from the CAs they trust. They check whether everything is consistent, including whether the new signed validity window matches with the old one. When satisfied, they republish the batches and signed validity window themselves.

Every hour, browsers download the latest roots from their trusted transparency service. Now, when connecting to a server, the client will essentially advertise which CAs it trusts, and the sequence number of the latest batch for which it has the roots. The server can then send either a new MTC, an older MTC (if the client is a bit stale), or fall back to a X509 certificate.

Outlook

The path for migrating the Internet to post-quantum authentication is much less clear than with key agreement. In the short term, we expect early adoption of post-quantum authentication across the Internet around 2026, but few will turn it on by default. Unless we can get performance much closer to today’s authentication, we expect the vast majority to keep post-quantum authentication disabled, unless motivated by regulation.

Not just TLS, authentication, and key agreement

Despite its length, in this blog post, we have only really touched upon migrating TLS. And even TLS we did not cover completely, as we have not discussed Encrypted ClientHello (we didn’t forget about it). Although important, TLS is not the only protocol key to the security of the Internet. We want to briefly mention a few other challenges, but cannot go into detail. One particular challenge is DNSSEC, which is responsible for securing the resolution of domain names.

Although key agreement and signatures are the most widely used cryptographic primitives, over the last few years we have seen the adoption of more esoteric cryptography to serve more advanced use cases, such as unlinkable tokens with Privacy Pass / PAT, anonymous credentials, and attribute based encryption to name a few. For most of these advanced cryptographic schemes, there is no known practical post-quantum alternative yet.

What you can do today

To finish, let’s review what you can do today. For most organizations the brunt of the work is in the preparation. Where is cryptography used in the first place? What software libraries / what hardware? What are the timelines of your vendors? Do you need to hire expertise? What’s at risk, and how should it be prioritized? Even before you can answer all those, create engagement within the organization. All this work can be started before NIST finishes their standards or software starts shipping with post-quantum cryptography.

You can also start testing right now since the performance characteristics of the final standards will not be meaningfully different from the preliminary ones available today. If it works with the preliminary ones today in your test environment, the final standards will most likely work just fine in production. We’ve collected a list of software and forks that already support preliminary post-quantum key agreement here.

Also on that page, we collected instructions on how to turn on post-quantum key agreement in your browser today. (For Chrome it’s enable-tls13-kyber in chrome://flags.)

If you’re a Cloudflare customer, you can check out how to enable post-quantum key agreement for connections to your origin, and our products that are secured against store-now/decrypt-later today.

Good luck with your migration, and if you hit any issues, do reach out: [email protected]

Cloudflare now uses post-quantum cryptography to talk to your origin server

Post Syndicated from Suleman Ahmad original http://blog.cloudflare.com/post-quantum-to-origins/

Quantum computers pose a serious threat to security and privacy of the Internet: encrypted communication intercepted today can be decrypted in the future by a sufficiently advanced quantum computer. To counter this store-now/decrypt-later threat, cryptographers have been hard at work over the last decades proposing and vetting post-quantum cryptography (PQC), cryptography that’s designed to withstand attacks of quantum computers. After a six-year public competition, in July 2022, the US National Institute of Standards and Technology (NIST), known for standardizing AES and SHA, announced Kyber as their pick for post-quantum key agreement. Now the baton has been handed to Industry to deploy post-quantum key agreement to protect today’s communications from the threat of future decryption by a quantum computer.

Cloudflare operates as a reverse proxy between clients (“visitors”) and customers’ web servers (“origins”), so that we can protect origin sites from attacks and improve site performance. In this post we explain how we secure the connection from Cloudflare to origin servers. To put that in context, let’s have a look at the connection involved when visiting an uncached page on a website served through Cloudflare.

[Diagram: the three connections involved: (1) from the visitor’s browser to Cloudflare, (2) within Cloudflare’s network, and (3) from Cloudflare to the origin server.]

The first connection is from the visitor’s browser to Cloudflare. In October 2022, we enabled X25519+Kyber as a beta for all websites and APIs served through Cloudflare. However, it takes two to tango: the connection is only secured if the browser also supports post-quantum cryptography. As of August 2023, Chrome is slowly enabling X25519+Kyber by default.

The visitor’s request is routed through Cloudflare’s network (2). We have upgraded many of these internal connections to use post-quantum cryptography, and expect to be done upgrading all of our internal connections by the end of 2024. That leaves the final link: the connection (3) between us and the origin server.

We are happy to announce that we are rolling out support for X25519+Kyber for most outbound connections, including origin servers and Cloudflare Workers fetch() calls.

Plan | Support for post-quantum outbound connections
Free | Started roll-out. Aiming for 100% by the end of October.
Pro and Business | Started roll-out. Aiming for 100% by the end of the year.
Enterprise | Roll-out starts February 2024. 100% by March 2024.

You can skip the roll-out and opt in your zone today, or opt out ahead of time, using the API described below. Before rolling out this support for enterprise customers in February 2024, we will add a toggle on the dashboard to opt out.

In this post we will dive into the nitty-gritty of what we enabled; how we have to be a bit subtle to prevent breaking connections to origins that are not ready yet, and how you can add support to your (origin) server.

But before we dive in, for the impatient:

Quick start

To enable a post-quantum connection between Cloudflare and your origin server today, opt-in your zone to skip the gradual roll-out:

curl --request PUT \
  --url https://api.cloudflare.com/client/v4/zones/(zone_id)/cache/origin_post_quantum_encryption \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer (API token)' \
  --data '{"value": "preferred"}'

Replace (zone_id) and (API token) appropriately. Then, make sure your server supports TLS 1.3; enable and prefer the key agreement X25519Kyber768Draft00; and ensure it’s configured with server cipher preference. For example, to configure nginx (compiled with a recent BoringSSL) like this, use

ssl_ecdh_curve X25519Kyber768Draft00:X25519;
ssl_prefer_server_ciphers on;
ssl_protocols TLSv1.3;

To check your server is properly configured, you can use the bssl tool of BoringSSL:

$ bssl client -connect (your server):443 -curves X25519:X25519Kyber768Draft00
[...]
  ECDHE curve: X25519Kyber768Draft00
[...]

We’re looking for X25519Kyber768Draft00 for a post-quantum connection as shown above instead of merely X25519.
For more client and server support, check out pq.cloudflareresearch.com. Now, let’s dive in.

Overview of a TLS 1.3 handshake

To understand how a smooth upgrade is possible, and where it might go wrong, we need to understand a few basics of the TLS 1.3 protocol, which is used to protect traffic on the Internet. A TLS connection starts with a handshake which is used to authenticate the server and derive a shared key. The browser (client) starts by sending a ClientHello message that contains, among other things, the hostname (SNI) and the list of key agreement methods it supports.

To remove a round trip, the client is allowed to make a guess of what the server supports and start the key agreement by sending one or more client keyshares. That guess might be correct (on the left in the diagram below) or the client has to retry (on the right). By the way, this guessing of keyshares is a new feature of TLS 1.3, and it is the main reason why it’s faster than TLS 1.2.

Protocol flow for server-authenticated TLS 1.3 with a supported client keyshare on the left and a HelloRetryRequest on the right.

In both cases the client sends a client keyshare to the server. From this client keyshare the server generates the shared key. The server then returns a server keyshare with which the client can also compute the shared key. This shared key is used to protect the rest of the connection using symmetric cryptography, such as AES.

Today X25519 is used as the key agreement in the vast majority of connections. To secure the connection against store-now/decrypt-later in the post-quantum world, a client can simply send an X25519+Kyber keyshare.

Hello! Retry Request? (HRR)

What we just described is the happy flow, where the client guessed correctly which key agreement the server supports. If the server does not support the keyshare that the client sent, then the server picks one of the supported key agreements that the client advertised, and asks for it in a HelloRetryRequest message.

This is not the only case where a server can use a HelloRetryRequest: even if the client sent keyshares that the server supports, the server is allowed to prefer a different key agreement the client advertised, and ask for it with a HelloRetryRequest. This will turn out to be very useful.

HelloRetryRequests are mostly undesirable: they add an extra round trip, and bring us back to the performance of TLS 1.2. We already had a transition of key agreement methods: back in the day P-256 was the de facto standard. When browsers couldn’t assume support for the newer X25519, some would send two keyshares, both X25519 and P-256 to prevent a HelloRetryRequest.

Also today, when enabling Kyber in Chrome, Chrome will send two keyshares: X25519 and X25519+Kyber to prevent a HelloRetryRequest. Sending two keyshares is not ideal: it requires the client to compute more, and it takes more space on the wire. This becomes more problematic when we want to send two post-quantum keyshares, as post-quantum keyshares are much larger. Talking about post-quantum keyshares, let’s have a look at X25519+Kyber.

The nitty-gritty of X25519+Kyber

The full name of the post-quantum key agreement we have enabled is X25519Kyber768Draft00, which has become the industry standard for early deployment. It is the combination (a so-called hybrid, more about that later) of two key agreements: X25519 and a preliminary version of NIST’s pick Kyber. Preliminary, because standardization of Kyber is not complete: NIST has released a draft standard for which it has requested public input. The final standard might change a little, but we do not expect any radical changes in security or performance. One notable change is the name: the NIST standard is set to be called ML-KEM. Once ML-KEM is released in 2024, we will promptly adopt support for the corresponding hybrid, and deprecate support for X25519Kyber768Draft00. We will announce deprecation on this blog and pq.cloudflareresearch.com.

Picking security level: 512 vs 768

Back in 2022, for incoming connections, we enabled hybrids with both Kyber512 and Kyber768. The difference is target security level: Kyber512 aims for the same security as AES-128, whereas Kyber768 matches up with AES-192. Contrary to popular belief, AES-128 is not broken in practice by quantum computers.

So why go with Kyber768? After years of analysis, there is no indication that Kyber512 fails to live up to its target security level. The designers of Kyber feel more comfortable, though, with the wider security margin of Kyber768, and we follow their advice.

Hybrid

It is not inconceivable, though, that an unexpected improvement in cryptanalysis will completely break Kyber768. Notably Rainbow, GeMSS and SIDH survived several rounds of public review before being broken. We do have to add nuance here. For a big break you need some mathematical trick, and compared to other schemes, SIDH had a lot of mathematical attack surface. Secondly, the fact that a scheme participated in many rounds of review doesn’t mean it saw a lot of attention. Because of their performance characteristics, these three schemes have more niche applications, and therefore received much less scrutiny from cryptanalysts. In contrast, Kyber is the big prize: breaking it would ensure fame.

Notwithstanding, for the moment, we feel it’s wiser to stick with hybrid key agreement. We combine Kyber together with X25519, which is currently the de facto standard key agreement, so that if Kyber turns out to be broken, we retain the non-post-quantum security of X25519.

Performance

Kyber is fast. Very fast. It easily beats X25519, which is already known for its speed:

Algorithm | PQ | Keyshare size, client (bytes) | Keyshare size, server (bytes) | Ops/sec, client | Ops/sec, server
X25519 | no | 32 | 32 | 17,000 | 17,000
Kyber768 | yes | 1,184 | 1,088 | 31,000 | 70,000
X25519Kyber768Draft00 | yes (hybrid) | 1,216 | 1,120 | 11,000 | 14,000

(Ops/sec: higher is better.)

Combined X25519Kyber768Draft00 is slower than X25519, but not by much. The big difference is its size: when connecting the client has to send 1,184 extra bytes for Kyber in the first message. That brings us to the next topic.

When things break, and how to move forward

Split ClientHello

As we saw, the ClientHello is the first message that is sent by the client when setting up a TLS connection. With X25519, the ClientHello almost always fits within one network packet. With Kyber, the ClientHello doesn’t fit anymore with typical packet sizes and needs to be split over two network packets.

The TLS standard allows for the ClientHello to be split in this way. However, it used to be so exceedingly rare to see a split ClientHello that there is plenty of software and hardware out there that falsely assumes it never happens.

This so-called protocol ossification is the major challenge rolling out post-quantum key agreement. Back in 2019, during earlier post-quantum experiments, middleboxes of a particular vendor dropped connections with a split ClientHello. Chrome is currently slowly ramping up the number of post-quantum connections to catch these issues early. Several reports are listed here, and luckily most vendors seem to fix issues promptly.

Over time, with the slow ramp up of browsers, many of these implementation bugs will be found and corrected. However, we cannot completely rely on this for our outbound connections since in many cases Cloudflare is the sole client to connect directly to the origin server. Thus, we must exercise caution when deploying post-quantum cryptography to ensure that we are still able to reach origin servers even in the presence of buggy implementations.

HelloRetryRequest to the rescue

To enable support for post-quantum key agreement on all outbound connections, without risking issues with split ClientHello for those servers that are not ready yet, we make clever use of HelloRetryRequest. Instead of sending an X25519+Kyber keyshare, we will only advertise support for it, and send a non-post-quantum-secure X25519 keyshare in the first ClientHello.

If the origin does not support X25519+Kyber, then nothing changes. One might wonder: could merely advertising support for it trip up any origins? This used to be a real concern in the past, but luckily browsers have adopted a clever mechanism called GREASE: they routinely send dummy codepoints drawn from reserved ranges, so that software which trips over unknown codepoints gets flushed out and fixed early.

If the origin does support X25519+Kyber, then it can use the HelloRetryRequest to request a post-quantum key agreement from us.

Things might still break then: for instance a malfunctioning middlebox, load-balancer, or the server software itself might still trip over the large ClientHello with X25519+Kyber sent in response to the HelloRetryRequest.

If we’re frank, the HRR trick kicks the can down the road: we as an industry will need to fix broken hardware and software before we can enable post-quantum on every last connection. The important thing though is that those past mistakes will not hold us back from securing the majority of connections. Luckily, from our experience, breakage will not be common.

So, when you have flipped the switch on your origin server, and things do break against expectation, what could be the root cause?

Debugging and examples

It’s impossible to exhaustively list all bugs that could interfere with the post-quantum connection, but we like to share a few we’ve seen.

The first step is to figure out what pieces of hardware and software are involved in the connection. Rarely it’s just the server: there could be a load-balancer, and even a humble router could be at fault.

One straightforward mistake is to conveniently assume the ClientHello is small by reserving only a small, say 1000 byte, buffer.

A variation of this is where a server uses a single call to recv() to read the ClientHello from the TCP connection. This works perfectly fine if it fits within one packet, but when split over multiple, it might require more calls.
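
To illustrate the difference, here is a rough sketch in Go (with hypothetical function names) of the fragile pattern next to a more careful one, which first reads the 5-byte TLS record header and then reads exactly the advertised number of bytes, however they are spread over TCP segments. A fully general implementation also has to handle a ClientHello split across multiple records, which we gloss over here.

package clienthello

import (
	"encoding/binary"
	"io"
	"net"
)

// fragileReadHello assumes the whole ClientHello arrives at once and fits in
// a small buffer. A large post-quantum ClientHello split over two packets
// will come up short.
func fragileReadHello(conn net.Conn) ([]byte, error) {
	buf := make([]byte, 1000) // too small, and read only once
	n, err := conn.Read(buf)
	return buf[:n], err
}

// robustReadHello reads the TLS record header, then keeps reading until the
// full record body has arrived.
func robustReadHello(conn net.Conn) ([]byte, error) {
	header := make([]byte, 5) // content type (1) + version (2) + length (2)
	if _, err := io.ReadFull(conn, header); err != nil {
		return nil, err
	}
	length := binary.BigEndian.Uint16(header[3:5])
	body := make([]byte, length)
	if _, err := io.ReadFull(conn, body); err != nil {
		return nil, err
	}
	return append(header, body...), nil
}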

Not all issues that we encountered relate directly to split ClientHello. For instance, servers using the Rust TLS library rustls did not implement HelloRetryRequest correctly before 0.21.7.

If you turned on post-quantum support for your origin, and hit issues, please do reach out: email [email protected].

Opting in and opting out

Now that you know what might lie in wait for you, let’s cover how to configure the outbound connections of your zone. There are three settings. The setting affects all outbound connections for your zone: to the origin server, but also for fetch() requests made by Workers on your zone.

Setting | Meaning
supported | Advertise support for post-quantum key agreement, but send a classical keyshare in the first ClientHello. When the origin supports and prefers X25519+Kyber, a post-quantum connection will be established, but it incurs an extra roundtrip. This is the most compatible way to enable post-quantum.
preferred | Send a post-quantum keyshare in the first ClientHello. When the origin supports X25519+Kyber, a post-quantum connection will be established without an extra roundtrip. This is the most performant way to enable post-quantum.
off | Do not send or advertise support for post-quantum key agreement to the origin.
(default) | Allow us to determine the best behavior for your zone. (More about that later.)

The setting can be adjusted using the following API call:

curl --request PUT \
  --url https://api.cloudflare.com/client/v4/zones/(zone_id)/cache/origin_post_quantum_encryption \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer (API token)' \
  --data '{"value": "(setting)"}'

Here, the parameters are as follows.

Parameter | Value
setting | supported, preferred, or off, with meaning as described above
zone_id | Identifier of the zone to control. You can look up the zone_id in the dashboard.
API token | Token used to authenticate you. You can create one in the dashboard. Use create custom token and under permissions select zone → zone settings → edit.

Testing whether your origin server is configured correctly

If you set your zone to preferred mode, you only need to check support for the proper post-quantum key agreement with your origin server. This can be done with the bssl tool of BoringSSL:

$ bssl client -connect (your server):443 -curves X25519:X25519Kyber768Draft00
[...]
  ECDHE curve: X25519Kyber768Draft00
[...]

If you set your zone to supported mode, or if you wait for the gradual roll-out, you will need to make sure that your origin server prefers post-quantum key agreement even if we sent a classical keyshare in the initial ClientHello. This can be done with our fork of BoringSSL:

$ git clone https://github.com/cloudflare/boringssl-pq
[...]
$ cd boringssl-pq && cmake -B build && make -C build
$ build/bssl client -connect (your server):443 -curves X25519:X25519Kyber768Draft00 -disable-second-keyshare
[...]
  ECDHE curve: X25519Kyber768Draft00
[...]

Scanning ahead to remove the extra roundtrip

With the HelloRetryRequest trick today, we can safely advertise support for post-quantum key agreement to all origins. The downside is that for those origins that do support post-quantum key agreement, we’re incurring an extra roundtrip for the HelloRetryRequest, which hurts performance.

You can remove the roundtrip by configuring your zone as preferred, but we can do better: the best setting is the one you shouldn’t have to touch.

We have started scanning all active origins for support of post-quantum key agreement. Roughly every 24 hours, we will attempt a series of about ten TLS connections to your origin, to test support and preferences for the various key agreements.

Our preliminary results show that 0.5% of origins support a post-quantum connection. As expected, we found that a small fraction (<0.34%) of all origins do not properly establish a connection when we send a post-quantum keyshare in the first ClientHello, which corresponds to the preferred setting. Unexpectedly, the vast majority of these origins do return a HelloRetryRequest, but fail after receiving the second ClientHello with a classical keyshare. We are investigating the exact causes of these failures, and will reach out to vendors to help resolve them.

Later this year, we will start using these scan results to determine the best setting for zones that haven’t been configured yet. That means that for those zones whose origins support it reliably, we will send a post-quantum keyshare directly without extra roundtrip.

Also speeding up non post-quantum origins

The scanner pipeline we built will not just benefit post-quantum origins. By default we send X25519, but not every origin supports or prefers X25519. We find that 4% of origin servers will send us a HelloRetryRequest for other key agreements such as P-384.

Key agreement | Fraction supported | Fraction preferred
X25519 | 96% | 96%
P-256 | 97% | 0.6%
P-384 | 89% | 2.3%
P-521 | 82% | 0.1%
X25519Kyber768Draft00 | 0.5% | 0.5%

Also, later this year, we will use these scan results to directly send the most preferred keyshare to your origin, removing the need for the extra roundtrip caused by HRR.

Wrapping up

To mitigate the store-now/decrypt-later threat, and ensure the Internet stays encrypted, the IT industry needs to work together to roll out post-quantum cryptography. We’re excited that today we’re rolling out support for post-quantum secure outbound connections: connections between Cloudflare and the origins.

We would love it if you would try and enable post-quantum key agreement on your origin. Please, do share your experiences, or reach out for any questions: [email protected].

To follow the latest developments of our deployment of post-quantum cryptography, and client/server support, check out pq.cloudflareresearch.com and keep an eye on this blog.

Cloudflare’s commitment to the 2023 Summit for Democracy

Post Syndicated from Patrick Day original https://blog.cloudflare.com/cloudflare-commitment-to-the-2023-summit-for-democracy/

On Tuesday, March 28, 2023, the US Government will launch the Summit for Democracy 2023, following up on the inaugural Summit for Democracy 2021. The Summit is co-hosted by the United States, Costa Rica, Zambia, the Netherlands, and South Korea. Cloudflare is proud to participate in and contribute commitments to the Summit because we believe that everyone should have access to an Internet that is faster, more reliable, more private, and more secure.  We work to ensure that the responsibility to respect human rights is embedded throughout our business functions. Cloudflare’s mission — to help build a better Internet — reflects a long-standing belief that we can help make the Internet better for everyone.

Our mission and core values dovetail with the Summit’s goals of strengthening democratic governance, respect for human rights and human rights defenders, and working in partnership to strengthen respect for these values. As we have written about before, access to the Internet allows activists and human rights defenders to expose abuses across the globe, allows collective causes to grow into global movements, and provides the foundation for large-scale organizing for political and social change in ways that have never been possible before.

What is the Summit for Democracy?

In December 2021, in an effort to respond to challenges to democracy worldwide, the United States held the first ever global Summit for Democracy. The Summit provided an opportunity to strengthen collaboration between democracies around the world and address common challenges from authoritarian threats.  The United States invited over 100 countries plus the President of the European Commission and the United Nations Secretary-General. The Summit focused on three key themes: (1) defending against authoritarianism; (2) addressing and fighting corruption; and (3) promoting respect for human rights, and gave participants an opportunity to announce commitments, reforms, and initiatives to defend democracy and human rights. The Summit was followed by a Year of Action, during which governments implemented their commitments to the Summit.

The 2023 Summit will focus more directly on partnering with the private sector to promote an affirmative vision for technology by countering the misuse of technology and shaping emerging technologies so that they strengthen democracy and human rights, which Cloudflare supports in theory and in practice.

The three-day Summit will highlight the importance of the private sector’s role in responding to challenges to democracy. The first day of the Summit is the Thematic Day, where Cabinet-level officials, the private sector and civil society organizations will spotlight key Summit themes. On the second day of the Summit, the Plenary Day, the five co-hosts will each host a high-level plenary session. On the final day of the Summit, Co-Host Event Day, each of the co-hosts will lead high-level regional conversations with partners from government, civil society, and the private sector.

Cloudflare will be participating in the Thematic Day and the Co-Host Event Day in Washington, DC, in addition to other related events.

Cloudflare commitments

In advance of the 2023 Summit, the United States issued a Call to Action to the private sector to consider commitments that advance an affirmative agenda for democratic renewal. The United States encouraged the private sector to make commitments that align with the Presidential Initiative on Democratic Renewal, the Declaration on the Future of the Internet, and the Summit’s four objectives:

  • Countering the misuse of technology
  • Fighting corruption
  • Protecting civic space
  • Advancing labor rights

Cloudflare answered the United States’s call to action and made commitments to (1) help democratize post-quantum cryptography; (2) work with researchers to share data on Internet censorship and shutdowns; and (3) engage with civil society on Internet protocols and the application of privacy-enhancing technologies.

Democratizing post-quantum cryptography by including it for free, by default

At Cloudflare, we believe to enhance privacy as a human right the most advanced cryptography needs to be available to everyone, free of charge, forever. Cloudflare has committed to including post-quantum cryptography for free by default to all customers – including individual web developers, small businesses, non-profits, and governments. In particular, this will benefit at-risk groups using Cloudflare services like humanitarian organizations, human rights defenders, and journalists through Project Galileo, as well as state and local government election websites through the Athenian Project, to help secure their websites, APIs, cloud tools and remote employees against future threats.

We believe everyone should have access to the next era of cybersecurity standards, instantly and for free. To that end, Cloudflare will also publish vendor-neutral roadmaps based on NIST standards to help businesses secure any connections that are not protected by Cloudflare. We hope that others will follow us in making their implementations of post-quantum cryptography free so that we can create a secure and private Internet without a “quantum” up-charge. More details about our commitment are here and here.

Working with researchers to better document Internet censorship and shutdowns

Cloudflare commits to working with researchers to share data about Internet shutdowns and selective Internet traffic interference and to make the results of the analysis of this data public and accessible. The Cloudflare Network includes 285 locations in over 100 countries, interconnects with over 11,500 networks globally, and serves a significant portion of global Internet traffic. Cloudflare shares aggregated data on the Internet’s patterns, insights, threats and trends with the public through Cloudflare Radar, including providing alerts and data to help organizations like Access Now’s KeepItOn coalition, the Freedom Online Coalition, the Internet Society, and Open Observatory of Network Interference (OONI) monitor Internet censorship and shutdowns around the world. Cloudflare commits to working with research partners to identify signatures associated with connection tampering and failures, which are believed to be caused primarily by active censorship and blocking. Cloudflare is well-positioned to observe and report on these signatures from a global perspective, and will provide access to its findings to support additional tampering detection efforts.

Engaging with civil society on Internet protocols and the development and application of privacy-enhancing technologies

Cloudflare believes that meaningful consultation with civil society is a fundamental part of building an Internet that advances human rights. As Cloudflare works with Internet standards bodies and other Internet providers on the next-generation of privacy-enhancing technologies and protocols, like protocols to encrypt Domain Name Service records and Encrypted Client Hello (ECH) and privacy enhancing technologies like OHTTP, we commit to direct engagement with civil society and human rights experts on standards and technologies that might have implications for human rights.

Cloudflare has long worked with industry partners, stakeholders, and international standards organizations to build a more private, secure, and resilient Internet for everyone. For example, Cloudflare has built privacy technologies into its network infrastructure, helped develop and deploy TLS 1.3, helped lead QUIC and other Internet protocols, worked to improve transparency around routing and public key infrastructure (PKI), and operates a public DNS resolver that supports encryption protocols. Ensuring civil society and human rights experts are able to contribute and provide feedback as part of those efforts will make certain that future development and application of privacy-enhancing technologies and protocols are consistent with human rights principles and account for human rights impacts.

Our commitments to democratizing post-quantum cryptography, working with researchers on Internet censorship and shutdowns, and engaging with civil society on Internet protocols and the development and application of privacy-preserving technologies will help to secure access to a free, open, and interconnected Internet.

Partnering to make the Summit a success

In the lead-up to the Summit, Cloudflare has been working in partnership with the US Department of State, the National Security Council, the US Agency for International Development (USAID), and various private sector and civil society partners to prepare for the Summit. As part of our involvement, we have also contributed to roundtables and discussions with the Center for Strategic and International Studies, GNI, the Design 4 Democracy Coalition, and the Freedom Online Coalition. Cloudflare is also participating in official meetings and side events including at the Carnegie Endowment for International Peace and the Council on Foreign Relations.

In addition to the official Summit events, there are a wide range of events organized by civil society which the Accountability Lab has created a website to highlight. Separately, on Monday, March 27 the Global Democracy Coalition convened a Partners Day to organize civil society and other non-governmental events. Many of these events are being held by some of our Galileo partners like the National Democratic Institute, the International Republican Institute, Freedom House, and the Council of Europe.

Cloudflare is grateful for all of the hard work that our partners in government, civil society, and the private sector have done over the past few months to make this Summit a success. At a time where we are seeing increasing challenges to democracy and the struggle for human rights around the world, maintaining a secure, open, Internet is critical. Cloudflare is proud of our participation in the Summit and in the commitments we are making to help advance human rights. We look forward to continuing our engagement in the Summit partnership to fulfill our mission to help build a better Internet.

Post-quantum crypto should be free, so we’re including it for free, forever

Post Syndicated from Wesley Evans original https://blog.cloudflare.com/post-quantum-crypto-should-be-free/

At Cloudflare, helping to build a better Internet is not just a catchy saying. We are committed to the long-term process of standards development. We love the work of pushing the fundamental technology of the Internet forward in ways that are accessible to everyone. Today we are adding even more substance to that commitment. One of our core beliefs is that privacy is a human right. We believe that to achieve that right the most advanced cryptography needs to be available to everyone, free of charge, forever. Today, we are announcing that our implementations of post-quantum cryptography will meet that standard: available to everyone, and included free of charge, forever.

We have a proud history of taking paid encryption products and launching them to the Internet at scale for free, even at the cost of short- and long-term revenue, because it’s the right thing to do. In 2014, we made SSL free for every Cloudflare customer with Universal SSL. As we make our implementations of post-quantum cryptography free forever today, we do it in the spirit of that first major announcement:

“Having cutting-edge encryption may not seem important to a small blog, but it is critical to advancing the encrypted-by-default future of the Internet. Every byte, however seemingly mundane, that flows encrypted across the Internet makes it more difficult for those who wish to intercept, throttle, or censor the web. In other words, ensuring your personal blog is available over HTTPS makes it more likely that a human rights organization or social media service or independent journalist will be accessible around the world. Together we can do great things.”

We hope that others will follow us in making their implementations of PQC free as well so that we can create a secure and private Internet without a “quantum” up-charge.

The Internet has matured since the 1990s and the launch of SSL. What was once an experimental frontier has turned into the underlying fabric of modern society. It runs in our most critical infrastructure like power systems, hospitals, airports, and banks. We trust it with our most precious memories. We trust it with our secrets. That’s why the Internet needs to be private by default. It needs to be secure by default. It’s why we’re committed to ensuring that anyone and everyone can achieve post-quantum security for free, as well as start deploying it at scale today.

Our work on post-quantum crypto is driven by the thesis that quantum computers that can break conventional cryptography create a similar problem to the Year 2000 bug. We know there is going to be a problem in the future that could have catastrophic consequences for users, businesses, and even nation states. The difference this time is we don’t know the date and time that this break in the paradigm of how computers operate will occur. We need to prepare today to be ready for this threat.

To that end we have been preparing for this transition since 2018. At that time we were concerned about the implementation problems other large protocol transitions, like the move to TLS 1.3, had caused our customers and wanted to get ahead of it. Cloudflare Research over the last few years has become a leader and champion of the idea that PQC security wasn’t an afterthought for tomorrow but a real problem that needed to be solved today. We have collaborated with industry partners like Google and Mozilla, contributed to development through participation in the IETF, and even launched an open source experimental cryptography suite to help move the needle. We have tried hard to work with everyone that wanted to be a part of the process and show our work along the way.

As we have worked with our partners in both industry and academia to help prepare us and the Internet for a post-quantum future, we have become dismayed by an emerging trend. There are a growing number of vendors out there that want to cash in on the legitimate fear that nervous executives, privacy advocates, and government leaders have about quantum computing breaking traditional encryption. These vendors offer vague solutions based on unproven technologies like “Quantum Key Distribution” or “Post Quantum Security” libraries that package non-standard algorithms that haven’t been through public review with exorbitant price tags like RSA did in the 1990s. They often love to throw around phrases like “AI” and “Post Quantum” without really showing their work on how any of their systems actually function. Security and privacy are table stakes in the modern Internet, and no one should be charged just to get the baseline protection needed in our contemporary world.

Launch your PQC transition today

Testing and adopting post-quantum cryptography in modern networks doesn’t have to be hard! In fact, Cloudflare customers can test PQC in their systems today, as we describe later in this post.

Currently, we support Kyber for key agreement on any traffic that uses TLS 1.3 including HTTP/3. (If you want a deep dive on our implementation check out our blog from last fall announcing the beta.) To help you test your traffic to Cloudflare domains with these new key agreement methods we have open-sourced forks for BoringSSL, Go and quic-go. For BoringSSL and Go, check out the sample code here.

If you use Tunnels with cloudflared then upgrading to PQC is super simple. Make sure you’re on at least version 2022.9.1 and simply run cloudflared --post-quantum.

After testing out how Cloudflare can help you implement PQC in your networks, it’s time to start to prepare yourself for the transition to PQC in all of your systems. This first step of inventorying and identifying is critical to a smooth rollout. We know first hand since we have undertaken an extensive evaluation of all of our systems to earn our FedRAMP Authorization certifications, and we are doing a similar evaluation again to transition all of our internal systems to PQC.

How we are setting ourselves up for the future of quantum computing

Here’s a sneak preview of the path that we are developing right now to fully secure Cloudflare itself against the cryptographic threat of quantum computers. We can break that path down into three parts: internal systems, zero trust, and open source contributions.

The first part of our path to full PQC adoption at Cloudflare is around all of our connections. The connection between yourself and Cloudflare is just one part of the larger path of the connection. Inside our internal systems we are implementing two significant upgrades in 2023 to ensure that they are PQC secure as well.

The first is that we use BoringSSL for a substantial number of connections. We currently use our fork and we are excited that upstream support for Kyber is underway. Any additional internal connections that use a different cryptographic system are being upgraded as well. The second major upgrade we are making is to shift the remaining internal connections that use TLS 1.2 to TLS 1.3. This combination of Kyber and TLS 1.3 will make our internal connections faster and more secure, even though we use a hybrid of classical and post-quantum secure cryptography. It’s a speed and security win-win. And we proved this powerhouse combination would provide that speed and security over three and a half years ago thanks to the groundbreaking work of Cloudflare Research and Google.

The next part of that path is all about using PQC and zero trust as allies together. As we think about the security posture of tomorrow being based around post-quantum cryptography, we have to look at the other critical security paradigm being implemented today: zero trust. Today, the zero trust vendor landscape is littered with products that fail to support common protocols like IPv6 and TLS 1.2, let alone the next generation of protocols like TLS 1.3 and QUIC that enable PQC. So many middleboxes struggle under the load of today’s modern protocols. They artificially downgrade connections and break end user security all in the name of inspecting traffic because they don’t have a better solution. Organizations big and small struggled to support customers that wanted the highest possible performance and security, while also keeping their businesses safe, because of the resistance of these vendors to adapt to modern standards. We do not want to repeat the mistakes of the past. We are planning and evaluating the needed upgrades to all of our zero trust products to support PQC out of the box. We believe that zero trust and post-quantum cryptography are not at odds with one another, but rather together are the future standard of security.

Finally, it’s not enough for us to do this for ourselves and for our customers. The Internet is only as strong as its weakest links in the connection chains that network us all together. Every connection on the Internet needs the strongest possible encryption so that businesses can be secure, and everyday users can be assured of their privacy. We believe that this core technology should be vendor-agnostic and open to everyone. To help make that happen, our final part of the path is all about contributing to open source projects. We have already been focused on releases to CIRCL. CIRCL (Cloudflare Interoperable, Reusable Cryptographic Library) is a collection of cryptographic primitives written in Go. The goal of this library is to be used as a tool for experimental deployment of cryptographic algorithms targeting post-quantum cryptography.

Later on this year we will be publishing as open source a set of easy to adopt, vendor-neutral roadmaps to help you upgrade your own systems to be secure against the future today. We want the security and privacy created by post quantum crypto to be accessible and free for everyone. We will also keep writing extensively about our post quantum journey. To learn more about how you can turn on PQC today, and how we have been building post-quantum cryptography at Cloudflare, please check out these resources:

No, AI did not break post-quantum cryptography

Post Syndicated from Lejla Batina (Guest author) original https://blog.cloudflare.com/kyber-isnt-broken/

News coverage of a recent paper caused a bit of a stir with this headline: “AI Helps Crack NIST-Recommended Post-Quantum Encryption Algorithm”. The news article claimed that Kyber, the encryption algorithm in question, which we have deployed world-wide, had been “broken.” Even more dramatically, the news article claimed that “the revolutionary aspect of the research was to apply deep learning analysis to side-channel differential analysis”, which seems aimed to scare the reader into wondering what will Artificial Intelligence (AI) break next?

Reporting on the paper has been wildly inaccurate: Kyber is not broken and AI has been used for more than a decade now to aid side-channel attacks. To be crystal clear: our concern is with the news reporting around the paper, not the quality of the paper itself. In this blog post, we will explain how AI is actually helpful in cryptanalysis and dive into the paper by Dubrova, Ngo, and Gärtner (DNG), that has been misrepresented by the news coverage. We’re honored to have Prof. Dr. Lejla Batina and Dr. Stjepan Picek, world-renowned experts in the field of applying AI to side-channel attacks, join us on this blog.

We start with some background, first on side-channel attacks and then on Kyber, before we dive into the paper.

Breaking cryptography

When one thinks of breaking cryptography, one imagines a room full of mathematicians puzzling over minute patterns in intercepted messages, aided by giant computers, until they figure out the key. Famously in World War II, the code of the Nazis’ Enigma cipher machine was completely broken in this way, allowing the Allied forces to read along with their communications.

It’s exceedingly rare for modern established cryptography to get broken head-on in this way. The last catastrophically broken cipher was RC4, designed in 1987, while AES, designed in 1998, stands proud with barely a scratch. The last big break of a cryptographic hash was on SHA-1, designed in 1995, while SHA-2, published in 2001, remains untouched in practice.

So what to do if you can’t break the cryptography head-on? Well, you get clever.

Side-channel attacks

Can you guess the pin code for this gate?

[Photo: a gate keypad on which some keys are visibly more worn than others.]

You can clearly see that some of the keys are more worn than the others, suggesting heavy use. This observation gives us some insight into the correct pin, namely the digits. But the correct order is not immediately clear. It might be 1580, 8510, or even 115085, but it’s a lot easier than trying every possible pin code. This is an example of a side-channel attack. Using the security feature (entering the PIN) had some unintended consequences (abrading the paint), which leaks information.

There are many different types of side channels, and which one you should worry about depends on the context. For instance, the sounds your keyboard makes as you type leaks what you write, but you should not worry about that if no one is listening in.

Remote timing side channel

When writing cryptography in software, one of the best known side channels is the time it takes for an algorithm to run. For example, let’s take the classic example of creating an RSA signature. Grossly simplified, to sign a message m with private key d, we compute the signature s as m^d (mod n). Computing the exponent of a big number is hard, but luckily, because we’re doing modular arithmetic, there is the square-and-multiply trick. Here is a naive implementation in pseudocode:

[The original post shows the pseudocode as an image; it is not reproduced here.]
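
Since that figure is not reproduced here, below is our own rough sketch in Go of the kind of naive square-and-multiply loop being described; the variable names are ours.

package naivesign

import "math/big"

// naiveModExp computes m^d mod n with the square-and-multiply trick,
// branching on each bit of the secret exponent d. This is deliberately
// NOT constant time: the multiply step only happens when a key bit is 1.
func naiveModExp(m, d, n *big.Int) *big.Int {
	s := big.NewInt(1)
	powerOfM := new(big.Int).Set(m)
	for i := 0; i < d.BitLen(); i++ {
		if d.Bit(i) == 1 { // runtime and power depend on the secret bit
			s.Mul(s, powerOfM)
			s.Mod(s, n)
		}
		powerOfM.Mul(powerOfM, powerOfM)
		powerOfM.Mod(powerOfM, n)
	}
	return s
}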

The algorithm loops over the bits of the secret key, and does a multiply step if the current bit is a 1. Clearly, the runtime depends on the secret key. Not great, but if the attacker can only time the full run, then they only learn the number of 1s in the secret key. The typical catastrophic timing attack against RSA instead is hidden behind the “mod n”. In a naive implementation this modular reduction is slower if the number being reduced is larger than or equal to n. This allows an attacker to send specially crafted messages to tease out the secret key bit-by-bit, and similar attacks are surprisingly practical.

Because of this, the mantra is: cryptography should run in “constant time”. This means that the runtime does not depend on any secret information. In our example, to remove the first timing issue, one would replace the if-statement with something equivalent to:

s = ((s * powerOfM) mod n) * bit(d, i) + s * (1 - bit(d, i))

This ensures that the multiplication is always done. Similar countermeasures prevent practically all remote timing attacks.
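
As a small, generic illustration of the same idea in code (not RSA itself), Go's crypto/subtle package provides branchless selection; the surrounding function names here are our own.

package ctexample

import "crypto/subtle"

// selectBranchy leaks the secret bit through its control flow: the two
// paths take different amounts of time and power.
func selectBranchy(bit, multiplied, unchanged int) int {
	if bit == 1 {
		return multiplied
	}
	return unchanged
}

// selectBranchless computes the same result without branching on the bit,
// mirroring the arithmetic trick above: it returns multiplied if bit == 1
// and unchanged if bit == 0.
func selectBranchless(bit, multiplied, unchanged int) int {
	return subtle.ConstantTimeSelect(bit, multiplied, unchanged)
}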

Power side-channel

The story is quite different for power side-channel attacks. Again, the classic example is RSA signatures. If we hook up an oscilloscope to a smartcard that uses the naive algorithm from before, and measure the power usage while it signs, we can read off the private key by eye:

[Figure: a power trace of the naive signing loop, from which the key bits can be read off.]

Even if we use a constant-time implementation, there are still minute changes in power usage that can be detected. The underlying issue is that hardware gates that switch use more power than those that don’t. For instance, computing 127 + 64 takes more energy than 64 + 64.

127+64 and 64+64 in binary. There are more switched bits in the first.

Masking

A common countermeasure against power side-channel leakage is masking. This means that before using the secret information, it is split randomly into shares. Then, the brunt of the computation is done on the shares, which are finally recombined.

In the case of RSA, before creating a new signature, one can generate a random r and compute m^(d+r) (mod n) and m^r (mod n) separately. From these, the final signature m^d (mod n) can be computed with some extra care.

Masking is not a perfect defense. The parts where shares are created or recombined into the final value are especially vulnerable. It does make it harder for the attacker: they will need to collect more power traces to cut through the noise. In our example we used two shares, but we could bump that up even higher. There is a trade-off between power side-channel resistance and implementation cost.
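
As a toy illustration of splitting and recombining (not a hardened masked implementation), here is a short Go sketch that splits a secret value into additive shares modulo a small prime and adds them back together; the modulus 3329 is borrowed from Kyber purely as an example.

package masking

import (
	"crypto/rand"
	"math/big"
)

const q = 3329

// split returns n shares that sum to secret mod q. All but the last share
// are uniformly random, so any n-1 of them reveal nothing about the secret.
func split(secret, n int) ([]int, error) {
	shares := make([]int, n)
	acc := 0
	for i := 0; i < n-1; i++ {
		r, err := rand.Int(rand.Reader, big.NewInt(q))
		if err != nil {
			return nil, err
		}
		shares[i] = int(r.Int64())
		acc = (acc + shares[i]) % q
	}
	shares[n-1] = ((secret-acc)%q + q) % q
	return shares, nil
}

// recombine adds all shares back together mod q to recover the secret.
func recombine(shares []int) int {
	acc := 0
	for _, s := range shares {
		acc = (acc + s) % q
	}
	return acc
}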

One of the challenging parts in the field is to estimate how much secret information is actually leaked through the traces, and how to extract it. Here machine learning enters the picture.

Machine learning: extracting the key from the traces

Machine learning, of which deep learning is a part, represents the capability of a system to acquire its knowledge by extracting patterns from data —  in this case, the secrets from the power traces. Machine learning algorithms can be divided into several categories based on their learning style. The most popular machine learning algorithms in side-channel attacks follow the supervised learning approach. In supervised learning, there are two phases: 1) training, where a machine learning model is trained based on known labeled examples (e.g., side-channel measurements where we know the key) and 2) testing, where, based on the trained model and additional side-channel measurements (now, with an unknown key), the attacker guesses the secret key. A common depiction of such attacks is given in the figure below.

[Figure: the two phases of a profiling attack: training a model on a device under the attacker’s control, then using it to recover the key from traces of the target device.]

While the threat model may sound counterintuitive, it is actually not difficult to imagine that the attacker will have access (and control) of a device similar to the one being attacked.

In side-channel analysis, the attacks following those two phases (training and testing) are called profiling attacks.

Profiling attacks are not new. The first such attack, called the template attack, appeared in 2002. Diverse machine learning techniques have been used since around 2010, all reporting good results and the ability to break various targets. The big breakthrough came in 2016, when the side-channel community started using deep learning. It greatly increased the effectiveness of power side-channel attacks both against symmetric-key and public-key cryptography, even if the targets were protected with, for instance, masking or some other countermeasures. To be clear: it doesn’t magically figure out the key, but it gets much better at extracting the leaked bits from a smaller number of power traces.

While machine learning-based side-channel attacks are powerful, they have limitations. Carefully implemented countermeasures make the attacks more difficult to conduct. Finding a good machine learning model that can break a target can be far from trivial: this phase, commonly called tuning, can last weeks on powerful clusters.

What will the future bring for machine learning/AI in side-channel analysis? Counterintuitively, we would like to see more powerful and easy-to-use attacks. You’d think that would make us worse off, but to the contrary, it will allow us to better estimate how much actual information is leaked by a device. We also hope that we will be able to better understand why certain attacks work (or not), so that more cost-effective countermeasures can be developed. As such, the future for AI in side-channel analysis is bright, especially for security evaluators, but we are still far from being able to break most of the targets in real-world applications.

Kyber

Kyber is a post-quantum (PQ) key encapsulation method (KEM). After a six-year worldwide competition, the National Institute of Standards and Technology (NIST) selected Kyber as the post-quantum key agreement they will standardize. The goal of a key agreement is for two parties that haven’t talked to each other before to agree securely on a shared key they can use for symmetric encryption (such as Chacha20Poly1305). As a KEM, it works slightly differently, and with different terminology, than a traditional Diffie–Hellman key agreement (such as X25519):

[Diagram: the KEM flow between client and server: keypair generation, encapsulation, and decapsulation.]

When connecting to a website, the client first generates a new ephemeral keypair that consists of a private and public key. It sends the public key to the server. The server then encapsulates a shared key with that public key, which gives it a random shared key, which it keeps, and a ciphertext (in which the shared key is hidden), which the server returns to the client. The client can then use its private key to decapsulate the shared key from the ciphertext. Now the server and client can communicate with each other using the shared key.
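
As a sketch of that flow in code, the example below uses the Kyber768 KEM from Cloudflare's CIRCL library. It assumes CIRCL's generic kem.Scheme interface (GenerateKeyPair, Encapsulate, Decapsulate); check the package documentation for the exact signatures before relying on this.

package main

import (
	"bytes"
	"fmt"

	"github.com/cloudflare/circl/kem/kyber/kyber768"
)

func main() {
	scheme := kyber768.Scheme()

	// Client: generate an ephemeral keypair and send the public key.
	pk, sk, err := scheme.GenerateKeyPair()
	if err != nil {
		panic(err)
	}

	// Server: encapsulate against the client's public key, keep the shared
	// key, and send the ciphertext back.
	ciphertext, serverShared, err := scheme.Encapsulate(pk)
	if err != nil {
		panic(err)
	}

	// Client: decapsulate the ciphertext with its private key to recover
	// the same shared key.
	clientShared, err := scheme.Decapsulate(sk, ciphertext)
	if err != nil {
		panic(err)
	}

	fmt.Println("shared keys match:", bytes.Equal(serverShared, clientShared))
}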

Key agreement is particularly important to make secure against attacks of quantum computers. The reason is that an attacker can store traffic today, and crack the key agreement in the future, revealing the shared key and all communication encrypted with it afterwards. That is why we have already deployed support for Kyber across our network.

The DNG paper

With all the background under our belt, we’re ready to take a look at the DNG paper. The authors perform a power side-channel attack on their own masked implementation of Kyber with six shares.

Point of attack

They attack the decapsulation step. In the decapsulation step, after the shared key is extracted, it’s encapsulated again, and compared against the original ciphertext to detect tampering. For this re-encryption step, the precursor of the shared key—let’s call it the secret—is encoded bit-by-bit into a polynomial. To be precise, the 256-bit secret needs to be converted to a polynomial with 256 coefficients modulo q=3329, where the ith coefficient is (q+1)/2 if the ith bit is 1 and zero otherwise.

This function sounds simple enough, but creating a masked version is tricky. The rub is that the natural way to create shares of the secret is to have shares that xor together to be the secret, and that the natural way to share polynomials is to have shares that add together to get to the intended polynomial.

This is the two-shares implementation of the conversion that the DNG paper attacks:

[The original post shows the two-share conversion code as an image; it is not reproduced here.]

The code loops over the bits of the two shares. For each bit, it creates a mask that’s 0xffff if the bit was 1 and 0 otherwise. Then this mask is used to add (q+1)/2 to the polynomial share if appropriate. Processing a 1 will use a bit more power. It doesn’t take an AI to figure out that this will be a leaky function. In fact, this pattern was pointed out to be weak back in 2016, and explicitly mentioned to be a risk for masked Kyber in 2020. Apropos, one way to mitigate this is to process multiple bits at once — for the state of the art, tune into April 2023’s NIST PQC seminar. For the moment, let’s allow the paper its weak target.
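
For illustration, here is our own simplified Go sketch of that bit-to-coefficient pattern for a single share; it is not the paper's masked C code, just the shape of the leak under discussion.

package leaky

const q = 3329

// shareToPoly converts a 256-bit share (packed into 32 bytes) into 256
// polynomial coefficients: each bit is stretched into an all-ones or
// all-zeros mask that gates adding (q+1)/2. Handling a 1 bit flips far more
// gate outputs than handling a 0 bit, which is what the power trace reveals.
func shareToPoly(share [32]byte) [256]uint16 {
	var coeffs [256]uint16
	for i := 0; i < 256; i++ {
		bit := (share[i/8] >> (i % 8)) & 1
		mask := -uint16(bit) // 0xffff if the bit is 1, 0x0000 otherwise
		coeffs[i] = mask & ((q + 1) / 2)
	}
	return coeffs
}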

The authors do not claim any fundamentally new attack here. Instead, they improve the effectiveness of the attack in two ways: the way they train the neural network, and how to use multiple traces more effectively by changing the ciphertext sent. So, what did they achieve?

Effectiveness

To test the attack, they use a ChipWhisperer-Lite board, which has a Cortex M4 CPU, which they downclock to 24 MHz. Power usage is sampled at 24 MHz, with high 10-bit precision.

To train the neural networks, 150,000 power traces are collected for decapsulation of different ciphertexts (with known shared key) for the same KEM keypair. This is already a somewhat unusual situation for a real-world attack: for key agreement, KEM keypairs are ephemeral, generated and used only once. Still, there are certainly legitimate use cases for long-term KEM keypairs, such as for authentication, HPKE, and in particular ECH.

The training is a key step: different devices even from the same manufacturer can have wildly different power traces running the same code. Even if two devices are of the same model, their power traces might still differ significantly.

The main contribution highlighted by the authors is that they train their neural networks to attack an implementation with 6 shares, by starting with a neural network trained to attack an implementation with 5 shares. That one can be trained from a model to attack 4 shares, and so on. Thus to apply their method, of these 150,000 power traces, one-fifth must be from an implementation with 6 shares, another one-fifth from one with 5 shares, et cetera. It seems unlikely that anyone will deploy a device where an attacker can switch between the number of shares used in the masking on demand.

Given these affordances, the attack proper can commence. The authors report that, from a single power trace of a two-share decapsulation, they could recover the shared key under these ideal circumstances with probability… 0.12%. They do not report the numbers for single trace attacks on more than two shares.

When we’re allowed multiple traces of the same decapsulation, side-channel attacks become much more effective. The second trick is a clever twist on this: instead of creating a trace of decapsulation of exactly the same message, the authors rotate the ciphertext to move bits of the shared key in more favorable positions. With 4 traces that are rotations of the same message, the success probability against the two-shares implementation goes up to 78%. The six-share implementation stands firm at 0.5%. When allowing 20 traces from the six-share implementation, the shared key can be recovered with an 87% chance.

In practice

The hardware used in the demonstration might be somewhat comparable to a smart card, but it is very different from high-end devices such as smartphones, desktop computers and servers. Simple power analysis side-channel attacks on even just embedded 1GHz processors are much more challenging, requiring tens of thousands of traces using a high-end oscilloscope connected close to the processor. There are much better avenues for attack with this kind of physical access to a server: just connect the oscilloscope to the memory bus.

Except for especially vulnerable applications, such as smart cards and HSMs, power side-channel attacks are widely considered infeasible. Although sometimes, when the planets align, an especially potent power side-channel attack can be turned into a remote timing attack due to throttling, as demonstrated by Hertzbleed. To be clear: the present attack does not even come close.

And even for these vulnerable applications, such as smart cards, this attack is not particularly potent or surprising. In the field, it is not a question of whether a masked implementation leaks its secrets, because it always does. It’s a question of how hard it is to actually pull off. Papers such as the DNG paper contribute by helping manufacturers estimate how many countermeasures to put in place, to make attacks too costly. It is not the first paper studying power side-channel attacks on Kyber and it will not be the last.

Wrapping up

AI did not completely undermine a new wave of cryptography, but instead is a helpful tool to deal with noisy data and discover the vulnerabilities within it. There is a big difference between a direct break of cryptography and a power side-channel attack. Kyber is not broken, and the presented power side-channel attack is not cause for alarm.

Defending against future threats: Cloudflare goes post-quantum

Post Syndicated from Bas Westerbaan original https://blog.cloudflare.com/post-quantum-for-all/

There is an expiration date on the cryptography we use every day. It's not easy to read, but somewhere between 15 and 40 years from now, a sufficiently powerful quantum computer is expected to be built that will be able to decrypt essentially any encrypted data on the Internet today.

Luckily, there is a solution: post-quantum (PQ) cryptography has been designed to be secure against the threat of quantum computers. Just three months ago, in July 2022, after a six-year worldwide competition, the US National Institute of Standards and Technology (NIST), known for AES and SHA2, announced which post-quantum cryptography they will standardize. NIST plans to publish the final standards in 2024, but we want to help drive early adoption of post-quantum cryptography.

Starting today, as a beta service, all websites and APIs served through Cloudflare support post-quantum hybrid key agreement. This is on by default1; no need for an opt-in. This means that if your browser/app supports it, the connection to our network is also secure against any future quantum computer.

We offer this post-quantum cryptography free of charge: we believe that post-quantum security should be the new baseline for the Internet.

Deploying post-quantum cryptography seems like a no-brainer with quantum computers on the horizon, but it’s not without risks. To start, this is new cryptography: even with years of scrutiny, it is not inconceivable that a catastrophic attack might still be discovered. That is why we are deploying hybrids: a combination of a tried and tested key agreement together with a new one that adds post-quantum security.

We are primarily worried about what might seem mere practicalities. Even though the protocols used to secure the Internet are designed to allow smooth transitions like this, in reality there is a lot of buggy code out there: trying to create a post-quantum secure connection might fail for many reasons — for example, a middlebox being confused by the larger post-quantum keys, or other reasons we have yet to observe because these post-quantum key agreements are brand new. It's because of these issues that we feel it is important to deploy post-quantum cryptography early, so that together with browsers and other clients we can find and work around these issues.

In this blog post we will explain how TLS, the protocol used to secure the Internet, is designed to allow a smooth and secure migration of the cryptography it uses. Then we will discuss the technical details of the post-quantum cryptography we have deployed, and how, in practice, this migration might not be that smooth at all. We finish this blog post by explaining how you can build a better, post-quantum secure, Internet by helping us test this new generation of cryptography.

TLS: Transport Layer Security

When you’re browsing a website using a secure connection, whether that’s using HTTP/1.1 or QUIC, you are using the Transport Layer Security (TLS) protocol under the hood. There are two major versions of TLS in common use today: the new TLS 1.3 (~90%) and the older TLS 1.2 (~10%), which is on the decline.

TLS 1.3 is a huge improvement over TLS 1.2: it’s faster, more secure, simpler and more flexible in just the right places. This makes it easier to add post-quantum security to TLS 1.3 compared to 1.2. For the moment, we will leave it at that: we’ve only added post-quantum support to TLS 1.3.

So, what is TLS all about? The goal is to set up a connection between a browser and website such that

  • Confidentiality and integrity: no one can read along or tamper with the data undetected.
  • Authenticity: you know you're connected to the right website, not an imposter.

Building blocks: AEAD, key agreement and signatures

Three different types of cryptography are used in TLS to reach this goal.

  • Symmetric encryption, or more precisely Authenticated Encryption With Associated Data (AEAD), is the workhorse of cryptography: it’s used to ensure confidentiality and integrity. This is a straight-forward kind of encryption: there is a single key that is used to encrypt and decrypt the data. Without the right key you cannot decrypt the data and any tampering with the encrypted data results in an error while decrypting.

In TLS 1.3, ChaCha20-Poly1305 and AES128-GCM are in common use today.
What about quantum attacks? At first glance, it looks like we need to switch to 256-bit symmetric keys to defend against Grover’s algorithm. In practice, however, Grover’s algorithm doesn’t parallelize well, so the currently deployed AEADs will serve just fine.
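
To make the AEAD building block concrete, here is a minimal sketch using Go's standard library with AES-128-GCM (illustration only: in TLS the key and nonces come out of the handshake's key schedule rather than from a random generator like this):

package main

import (
  "crypto/aes"
  "crypto/cipher"
  "crypto/rand"
  "fmt"
)

func main() {
  // In TLS this key would be derived during the handshake.
  key := make([]byte, 16)
  if _, err := rand.Read(key); err != nil {
    panic(err)
  }

  block, err := aes.NewCipher(key)
  if err != nil {
    panic(err)
  }
  aead, err := cipher.NewGCM(block)
  if err != nil {
    panic(err)
  }

  nonce := make([]byte, aead.NonceSize())
  if _, err := rand.Read(nonce); err != nil {
    panic(err)
  }

  // Seal encrypts and authenticates; Open decrypts and checks integrity.
  ciphertext := aead.Seal(nil, nonce, []byte("hello, world"), nil)
  plaintext, err := aead.Open(nil, nonce, ciphertext, nil)
  if err != nil {
    panic(err)
  }
  fmt.Printf("%s\n", plaintext)

  // Flip a single bit and decryption fails: that is the integrity part.
  ciphertext[0] ^= 1
  if _, err := aead.Open(nil, nonce, ciphertext, nil); err != nil {
    fmt.Println("tampering detected:", err)
  }
}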

So if we can agree on a shared key to use with symmetric encryption, we’re golden. But how to get to a shared key? You can’t just pick a key and send it to the server: anyone listening in would know the key as well. One might think it’s an impossible task, but this is where the magic of asymmetric cryptography helps out:

  • A key agreement, also called key exchange or key distribution, is a cryptographic protocol with which two parties can agree on a shared key without an eavesdropper being able to learn anything. Today the X25519 Elliptic Curve Diffie–Hellman protocol (ECDH) is the de facto standard key agreement used in TLS 1.3. The security of X25519 is based on the discrete logarithm problem for elliptic curves, which is vulnerable to quantum attacks, as it is easily solved by a cryptographically relevant quantum computer using Shor’s algorithm. The solution is to use a post-quantum key agreement, such as Kyber.

A key agreement only protects against a passive attacker. An active attacker, that can intercept and modify messages (MitM), can establish separate shared keys with both the server and the browser, re-encrypting all data passing through. To solve this problem, we need the final piece of cryptography.

  • With a digital signature algorithm, such as RSA or ECDSA, there are two keys: a public and a private key. Only with the private key, one can create a signature for a message. Anyone with the corresponding public key can check whether a signature is indeed valid for a given message. These digital signatures are at the heart of TLS certificates that are used to authenticate websites.
    Both RSA and ECDSA are vulnerable to quantum attacks. We haven’t replaced those with post-quantum signatures, yet. The reason is that authentication is less urgent: we only need to have them replaced by the time a sufficiently large quantum computer is built, whereas any data secured by a vulnerable key agreement today can be stored and decrypted in the future. Even though we have more time, deploying post-quantum authentication will be quite challenging.

So, how do these building blocks come together to create TLS?

High-level overview of TLS 1.3

A TLS connection starts with a handshake which is used to authenticate the server and derive a shared key. The browser (client) starts by sending a ClientHello message that contains a list of the AEADs, signature algorithms, and key agreement methods it supports. To remove a roundtrip, the client is allowed to make a guess of what the server supports and start the key agreement by sending one or more client keyshares. That guess might be correct (on the left in the diagram below) or the client has to retry (on the right).

Protocol flow for server-authenticated TLS 1.3 with a supported client keyshare on the left and a HelloRetryRequest on the right.

Key agreement

Before we explain the rest of this interaction, let's dig into the key agreement: what is a keyshare? The key agreements for Kyber and X25519 work differently: the first is a Key Encapsulation Mechanism (KEM), while the latter is a Diffie–Hellman (DH) style agreement. The latter is more flexible, but for TLS it doesn't make a difference.

The shape of a KEM and a Diffie–Hellman key agreement in a TLS-compatible handshake is the same.

In both cases the client sends a client keyshare to the server. From this client keyshare the server generates the shared key. The server then returns a server keyshare with which the client can also compute the shared key.

Going back to the TLS 1.3 flow: when the server receives the ClientHello message it picks an AEAD (cipher), signature algorithm and client keyshare that it supports. It replies with a ServerHello message that contains the chosen AEAD and the server keyshare for the selected key agreement. With the AEAD and shared key locked in, the server starts encrypting data (shown with blue boxes).

Authentication

Together with the AEAD and server keyshare, the server sends a signature, the handshake signature, on the transcript of the communication so far together with a certificate (chain) for the public key that it used to create the signature. This allows the client to authenticate the server: it checks whether it trusts the certificate authority (e.g. Let’s Encrypt) that certified the public key and whether the signature verifies for the messages it sent and received so far. This not only authenticates the server, but it also protects against downgrade attacks.

Downgrade protection

We cannot upgrade all clients and servers to post-quantum cryptography at once. Instead, there will be a transition period where only some clients and some servers support post-quantum cryptography. The key agreement negotiation in TLS 1.3 allows this: during the transition, servers and clients will still support non-post-quantum key agreements, and can fall back to them if necessary.

This flexibility is great, but also scary: if both client and server support post-quantum key agreement, we want to be sure that they also negotiate the post-quantum key agreement. This is the case in TLS 1.3, but it is not obvious: the keyshares, the chosen keyshare and the list of supported key agreements are all sent in plain text. Isn’t it possible for an attacker in the middle to remove the post-quantum key agreements? This is called a downgrade attack.

This is where the transcript comes in: the handshake signature is taken over all messages received and sent by the server so far. This includes the supported key agreements and the key agreement that was picked. If an attacker changes the list of supported key agreements that the client sends, then the server will not notice. However, the client checks the server’s handshake signature against the list of supported key agreements it has actually sent and thus will detect the mischief.

The downgrade attack problems are much more complicated for TLS 1.2, which is one of the reasons we’re hesitant to retrofit post-quantum security in TLS 1.2.

Wrapping up the handshake

The last part of the server’s response is “server finished”, a message authentication code (MAC) on the whole transcript so far. Most of the work has been done by the handshake signature, but in other operating modes of TLS without handshake signature, such as session resumption, it’s important.

With the chosen AEAD and server keyshare, the client can compute the shared key and decrypt and verify the certificate chain, handshake signature and handshake MAC. We did not mention it before, but the shared key is not used directly for encryption. Instead, for good measure, it's mixed together with the communication transcript to derive several specific keys for use during the handshake and the main connection afterwards.
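
As a rough illustration of that mixing (this is not TLS 1.3's actual key schedule, just the general shape, assuming the golang.org/x/crypto/hkdf package):

package main

import (
  "crypto/sha256"
  "fmt"
  "io"

  "golang.org/x/crypto/hkdf"
)

// deriveKeys mixes the raw shared key with a hash of the handshake
// transcript and expands the result into separate keys per direction.
func deriveKeys(sharedSecret, transcriptHash []byte) (clientKey, serverKey []byte) {
  r := hkdf.New(sha256.New, sharedSecret, transcriptHash, []byte("illustrative key schedule"))
  clientKey = make([]byte, 16)
  serverKey = make([]byte, 16)
  if _, err := io.ReadFull(r, clientKey); err != nil {
    panic(err)
  }
  if _, err := io.ReadFull(r, serverKey); err != nil {
    panic(err)
  }
  return
}

func main() {
  ck, sk := deriveKeys([]byte("shared key from the key agreement"), []byte("hash of the transcript"))
  fmt.Printf("client key: %x\nserver key: %x\n", ck, sk)
}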

To wrap up the handshake, the client sends its own handshake MAC, and can then proceed to send application-specific data encrypted with the keys derived during the handshake.

Hello! Retry Request?

What we just sketched is the desirable flow where the client sends a keyshare that is supported by the server. That might not be the case. If the server doesn’t accept any key agreements advertised by the client, then it will tell the client and abort the connection.

If there is a key agreement that both support, but for which the client did not send a keyshare, then the server will respond with a HelloRetryRequest (HRR) message requesting a keyshare of a specific key agreement that the client supports as shown on the diagram on the right. In turn, the client responds with a new ClientHello with the selected keyshare.

This is not the whole story: a server is also allowed to send a HelloRetryRequest to request a different key agreement that it prefers over those for which the client sent shares. For instance, a server can send a HelloRetryRequest to a post-quantum key agreement if the client supports it, but didn’t send a keyshare for it.

HelloRetryRequests are rare today. Almost every server supports the X25519 key agreement and almost every client (98% today) sends an X25519 keyshare. Earlier, P-256 was the de facto standard and for a long time many browsers would send both a P-256 and an X25519 keyshare to prevent a HelloRetryRequest. As we will discuss later, we might not have the luxury to send two post-quantum keyshares.

That’s the theory

TLS 1.3 is designed to be flexible in the cryptography it uses without sacrificing security or performance, which is convenient for our migration to post-quantum cryptography. That is the theory, but there are some serious issues in practice — we’ll go into detail later on. But first, let’s check out the post-quantum key agreements we’ve deployed.

What we deployed

Today we have enabled support for the X25519Kyber512Draft00 and X25519Kyber768Draft00 key agreements using TLS identifiers 0xfe30 and 0xfe31 respectively. These are exactly the same key agreements we enabled on a limited number of zones this July.

These two key agreements are a combination, a hybrid, of the classical X25519 and the new post-quantum Kyber512 and Kyber768 respectively, and in that order. That means that even if Kyber turns out to be insecure, the connection remains as secure as X25519.
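
The rough idea of why a hybrid helps can be sketched in a few lines (illustration only: the actual draft pins down exactly how the two shared secrets are fed into the TLS key schedule, which is not this ad hoc hash):

package main

import (
  "crypto/sha256"
  "fmt"
)

// combineShared derives the final secret from both the X25519 and the Kyber
// shared secrets, so an attacker must break both to learn it.
func combineShared(x25519Secret, kyberSecret []byte) []byte {
  h := sha256.New()
  h.Write(x25519Secret)
  h.Write(kyberSecret)
  return h.Sum(nil)
}

func main() {
  fmt.Printf("%x\n", combineShared([]byte("x25519 shared secret"), []byte("kyber shared secret")))
}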

Kyber, for now, is the only key agreement that NIST has selected for standardization. Kyber is very light on the CPU: it is faster than X25519 which is already known for its speed. On the other hand, its keyshares are much bigger:

Algorithm | PQ | Keyshare size client (bytes) | Keyshare size server (bytes) | Ops/sec client | Ops/sec server
Kyber512  | ✅ | 800   | 768   | 50,000 | 100,000
Kyber768  | ✅ | 1,184 | 1,088 | 31,000 | 70,000
X25519    | ❌ | 32    | 32    | 17,000 | 17,000

Size and CPU performance compared between X25519 and Kyber. Performance varies considerably by hardware platform and implementation constraints and should be taken as a rough indication only.

Kyber is expected to change in minor, but backwards incompatible ways, before final standardization by NIST in 2024. Also, the integration with TLS, including the choice and details of the hybrid key agreement, are not yet finalized by the TLS working group. Once they are, we will adopt them promptly.

Because of this, we will not support the preliminary key agreements announced today for the long term; they’re provided as a beta service. We will post updates on our deployment on pq.cloudflareresearch.com and announce it on the IETF PQC mailing list.

Now that we know how TLS negotiation works in theory, and which key agreements we’re adding, how could it fail?

Where things might break in practice

Protocol ossification

Protocols are often designed with flexibility in mind, but if that flexibility is not exercised in practice, it's often lost. This is called protocol ossification. The roll-out of TLS 1.3 was difficult because of several instances of ossification. One poignant example is TLS' version negotiation: there is a version field in the ClientHello message that indicates the latest version supported by the client. A new version was assigned to TLS 1.3, but in testing it turned out that many servers would not fall back properly to TLS 1.2, but would instead crash the connection. How do we deal with ossification?

Workaround

Today, TLS 1.3 masquerades as TLS 1.2, down to including many legacy fields in the ClientHello. The actual version negotiation is moved into a new extension to the message. A TLS 1.2 server will ignore the new extension and ignorantly continue with TLS 1.2, while a TLS 1.3 server picks up on the extension and continues with TLS 1.3 proper.

Protocol grease

How do we prevent ossification? Having learnt from this experience, browsers will regularly advertise dummy versions in this new version field, so that misbehaving servers are caught early on. This is not only done for the new version field, but in many other places in the TLS handshake, and presciently also for the key agreement identifiers. Today, 40% of browsers send two client keyshares: one X25519 and a bogus 1-byte keyshare to keep key agreement flexibility.

This behavior is standardized in RFC 8701: Generate Random Extensions And Sustain Extensibility (GREASE) and we call it protocol greasing, as in “greasing the joints” from Adam Langley’s metaphor of protocols having rusty joints in need of oil.

This keyshare grease helps, but it is not perfect, because it is the size of the keyshare that in this case causes the most concern.

Fragmented ClientHello

Post-quantum keyshares are big. The two Kyber hybrids are 832 and 1,216 bytes. Compared to that, X25519 is tiny with only 32 bytes. It is not unlikely that some implementations will fail when seeing such large keyshares.
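
(These sizes are simply the two components side by side: 832 = 32 + 800 bytes and 1,216 = 32 + 1,184 bytes, the X25519 keyshare followed by the Kyber512 or Kyber768 one.)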

Our biggest concern is with the larger Kyber768 based keyshare. A ClientHello with the smaller 832 byte Kyber512-based keyshare will just barely fit in a typical network packet. On the other hand, the larger 1,216 byte Kyber768-keyshare will typically fragment the ClientHello into two packets.

Assembling packets together isn't free: it requires you to keep track of the partial messages around. Usually this is done transparently by the operating system's TCP stack, but optimized middleboxes and load balancers that look at each packet separately have to (and might not) keep track of the connections themselves.

QUIC

The situation for HTTP/3, which is built on QUIC, is particularly interesting. Instead of a simple port number chosen by the client (as in TCP), a QUIC packet from the client contains a connection ID that is chosen by the server. Think of it as "your reference" and "our reference" in snail mail. This allows a QUIC load balancer to encode the particular machine handling the connection into the connection ID.

When opening a connection, the QUIC client doesn't know which connection ID the server would like and sends a random one instead. If the client needs multiple initial packets, such as with a big ClientHello, then the client will use the same random connection ID. Even though multiple initial packets are allowed by the QUIC standard, a QUIC load balancer might not expect this, and, unlike with TCP, there is no underlying connection it can fall back on to route those packets together.

Performance

Aside from these hard failures, soft failures, such as performance degradation, are also a concern: if it's too slow to load, a website might as well have been broken to begin with.

Back in 2019, in a joint experiment with Google, we deployed two post-quantum key agreements: CECPQ2, based on NTRU-HRSS, and CECPQ2b, based on SIKE. NTRU-HRSS is very similar to Kyber: it's a bit larger and slower. The results from 2019 were very promising: X25519+NTRU-HRSS (orange line) is hard to distinguish from X25519 on its own (blue line).

[Graph: TLS handshake times in the 2019 experiment; X25519+NTRU-HRSS (orange) is nearly indistinguishable from X25519 alone (blue).]

We will continue to keep a close eye on performance, especially on the tail performance: we want a smooth transition for everyone, from the fastest to the slowest clients on the Internet.

How to help out

The Internet is a very heterogeneous system. To find all issues, we need sufficient numbers of diverse testers. We are working with browsers to add support for these key agreements, but there may not be one of these browsers in every network.

So, to help the Internet out, try and switch a small part of your traffic to Cloudflare domains to use these new key agreement methods. We have open-sourced forks for BoringSSL, Go and quic-go. For BoringSSL and Go, check out the sample code here. If you have any issues, please let us know at [email protected]. We will be discussing any issues and workarounds at the IETF TLS working group.

Outlook

The transition to a post-quantum secure Internet is urgent, but not without challenges. Today we have deployed a preliminary post-quantum key agreement on all our servers — a sizable portion of the Internet — so that we can all start testing the big migration today. We hope that come 2024, when NIST puts a bow on Kyber, we will all have laid the groundwork for a smooth transition to a Post-Quantum Internet.

…..
1We only support these post-quantum key agreements in protocols based on TLS 1.3 including HTTP/3. There is one exception: for the moment we disable these hybrid key exchanges for websites in FIPS-mode.

Introducing post-quantum Cloudflare Tunnel

Post Syndicated from Bas Westerbaan original https://blog.cloudflare.com/post-quantum-tunnel/

Undoubtedly, one of the big themes in IT for the next decade will be the migration to post-quantum cryptography. From tech giants to small businesses: we will all have to make sure our hardware and software is updated so that our data is protected against the arrival of quantum computers. It seems far away, but it’s not a problem for later: any encrypted data captured today (not protected by post-quantum cryptography) can be broken by a sufficiently powerful quantum computer in the future.

Luckily we’re almost there: after a tremendous worldwide effort by the cryptographic community, we know what will be the gold standard of post-quantum cryptography for the next decades. Release date: somewhere in 2024. Hopefully, for most, the transition will be a simple software update then, but it will not be that simple for everyone: not all software is maintained, and it could well be that hardware needs an upgrade as well. Taking a step back, many companies don’t even have a full list of all software running on their network.

For Cloudflare Tunnel customers, this migration will be much simpler: introducing Post-Quantum Cloudflare Tunnel. In this blog post, first we give an overview of how Cloudflare Tunnel works and explain how it can help you with your post-quantum migration. Then we’ll explain how to get started and finish with the nitty-gritty technical details.

Cloudflare Tunnel

With Cloudflare Tunnel you can securely expose a server sitting within an internal network to the Internet by running the cloudflared service next to it. For instance, after having installed cloudflared on your internal network, you can expose your on-prem webapp on the Internet under, say, example.com, so that remote workers can access it from anywhere.

Life of a Cloudflare Tunnel request.

How does it work? cloudflared creates long-running connections to two nearby Cloudflare data centers, for instance San Francisco (connection 3) and one other. When your employee visits your domain, they connect (1) to a Cloudflare server close to them, say in Frankfurt. That server knows that this is a Cloudflare Tunnel and that your cloudflared has a connection to a server in San Francisco, and thus it relays (2) the request to it. In turn, via the reverse connection, the request ends up at cloudflared, which passes it (4) to the webapp via your internal network.

In essence, Cloudflare Tunnel is a simple but convenient tool, but the magic is in what you can do on top of it: you get Cloudflare's DDoS protection for free, fine-grained access control with Cloudflare Access (even if the application doesn't support it), and request logs, just to name a few. And let's not forget the matter at hand:

Post-quantum tunnels

Our goal is to make it easy for everyone to have a fully post-quantum secure connection from users to origin. For this, Post-Quantum Cloudflare Tunnel is a powerful tool, because with it, your users can benefit from a post-quantum secure connection without upgrading your application (connection 4 in the diagram).

Today, we make two important steps towards this goal: cloudflared 2022.9.1 adds the --post-quantum flag that, when given, makes the connection from cloudflared to our network (connection 3) post-quantum secure.

Also today, we have announced support for post-quantum browser connections (connection 1).

We aren’t there yet: browsers (and other HTTP clients) do not support the post-quantum security offered by our network, yet, and we still have to make the connections between our data centers (connection 2) post-quantum secure.

An attacker only needs to have access to one vulnerable connection, but attackers don’t have access everywhere: with every connection we make post-quantum secure, we remove one opportunity for compromise.

We are eager to make post-quantum tunnels the default, but for now it is a beta feature. The reason is that the cryptography used and its integration into the network protocol are not yet final. Making post-quantum the default now would require users to update cloudflared more often than we can reasonably expect them to.

Getting started

Are frequent updates to cloudflared not a problem for you? Then please do give post-quantum Cloudflare Tunnel a try. Make sure you’re on at least 2022.9.1 and simply run cloudflared with the --post-quantum flag:

$ cloudflared tunnel run --post-quantum tunnel-name
2022-09-23T11:44:42Z INF Starting tunnel tunnelID=[...]
2022-09-23T11:44:42Z INF Version 2022.9.1
2022-09-23T11:44:42Z INF GOOS: darwin, GOVersion: go1.19.1, GoArch: amd64
2022-09-23T11:44:42Z INF Settings: map[post-quantum:true pq:true]
2022-09-23T11:44:42Z INF Generated Connector ID: [...]
2022-09-23T11:44:42Z INF cloudflared will not automatically update if installed by a package manager.
2022-09-23T11:44:42Z INF Initial protocol quic
2022-09-23T11:44:42Z INF Using experimental hybrid post-quantum key agreement X25519Kyber768Draft00
2022-09-23T11:44:42Z INF Starting metrics server on 127.0.0.1:53533/metrics
2022-09-23T11:44:42Z INF Connection [...] registered connIndex=0 ip=[...] location=AMS
2022-09-23T11:44:43Z INF Connection [...] registered connIndex=1 ip=[...] location=AMS
2022-09-23T11:44:44Z INF Connection [...] registered connIndex=2 ip=[...] location=AMS
2022-09-23T11:44:45Z INF Connection [...] registered connIndex=3 ip=[...] location=AMS

If you run cloudflared as a service, you can turn on post-quantum by adding post-quantum: true to the tunnel configuration file. Conveniently, the cloudflared service will automatically update itself if not installed by a package manager.

If, for some reason, creating a post-quantum tunnel fails, you’ll see an error message like

2022-09-22T17:30:39Z INF Starting tunnel tunnelID=[...]
2022-09-22T17:30:39Z INF Version 2022.9.1
2022-09-22T17:30:39Z INF GOOS: darwin, GOVersion: go1.19.1, GoArch: amd64
2022-09-22T17:30:39Z INF Settings: map[post-quantum:true pq:true]
2022-09-22T17:30:39Z INF Generated Connector ID: [...]
2022-09-22T17:30:39Z INF cloudflared will not automatically update if installed by a package manager.
2022-09-22T17:30:39Z INF Initial protocol quic
2022-09-22T17:30:39Z INF Using experimental hybrid post-quantum key agreement X25519Kyber512Draft00
2022-09-22T17:30:39Z INF Starting metrics server on 127.0.0.1:55889/metrics
2022-09-22T17:30:39Z INF 

===================================================================================
You are hitting an error while using the experimental post-quantum tunnels feature.

Please check:

   https://pqtunnels.cloudflareresearch.com

for known problems.
===================================================================================


2022-09-22T17:30:39Z ERR Failed to create new quic connection error="failed to dial to edge with quic: CRYPTO_ERROR (0x128): tls: handshake failure" connIndex=0 ip=[...]

When the post-quantum flag is given, cloudflared will not fall back to a non post-quantum connection.

What to look for

The setup phase is the crucial part: once established, the tunnel is the same as a normal tunnel. That means that performance and reliability should be identical once the tunnel is established.

The post-quantum cryptography we use is very fast, but requires roughly a kilobyte of extra data to be exchanged during the handshake. The difference will be hard to notice in practice.

Our biggest concern is that some network equipment/middleboxes might be confused by the bigger handshake. If the post-quantum Cloudflare Tunnel isn’t working for you, we’d love to hear about it. Contact us at [email protected] and tell us which middleboxes or ISP you’re using.

Under the hood

When the --post-quantum flag is given, cloudflared restricts itself to the QUIC transport for the tunnel connection to our network and will only allow the post-quantum hybrid key exchanges X25519Kyber512Draft00 and X25519Kyber768Draft00 with TLS identifiers 0xfe30 and 0xfe31 respectively. These are hybrid key exchanges between the classical X25519 and the post-quantum secure Kyber. Thus, on the off-chance that Kyber turns out to be insecure, we can still rely on the non-post-quantum security of X25519. These are the same key exchanges supported on our network.

cloudflared randomly picks one of these two key exchanges. The reason is that the latter usually requires two initial packets for the TLS ClientHello whereas the former only requires one. That allows us to test whether a fragmented ClientHello causes trouble.

When cloudflared fails to set up the post-quantum connection, it will report the attempted key exchange, cloudflared version and error to pqtunnels.cloudflareresearch.com so that we have visibility into network issues. Have a look at that page for updates on our post-quantum tunnel deployment.

The control connection and authentication of the tunnel between cloudflared and our network are not post-quantum secure yet. This is less urgent than the store-now-decrypt-later issue of the data on the tunnel itself.

We have open-sourced support for these post-quantum QUIC key exchanges in Go.

Outlook

In the coming decade the industry will roll out post-quantum data protection. Some cases will be as simple as a software update and others will be much more difficult. Post-Quantum Cloudflare Tunnel will secure the connection between Cloudflare’s network and your origin in a simple and user-friendly way — an important step towards the Post-Quantum Internet, so that everyone may continue to enjoy a private and secure Internet.

Experiment with post-quantum cryptography today

Post Syndicated from Bas Westerbaan original https://blog.cloudflare.com/experiment-with-pq/

Practically all data sent over the Internet today is at risk in the future if a sufficiently large and stable quantum computer is created. Anyone who captures data now could decrypt it.

Luckily, there is a solution: we can switch to so-called post-quantum (PQ) cryptography, which is designed to be secure against attacks of quantum computers. After a six-year worldwide selection process, in July 2022, NIST announced they will standardize Kyber, a post-quantum key agreement scheme. The standard will be ready in 2024, but we want to help drive the adoption of post-quantum cryptography.

Today we have added support for the X25519Kyber512Draft00 and X25519Kyber768Draft00 hybrid post-quantum key agreements to a number of test domains, including pq.cloudflareresearch.com.

Do you want to experiment with post-quantum on your test website for free? Mail [email protected] to enroll your test website, but read the fine-print below.

What does it mean to enable post-quantum on your website?

If you enroll your website to the post-quantum beta, we will add support for these two extra key agreements alongside the existing classical encryption schemes such as X25519. If your browser doesn’t support these post-quantum key agreements (and none at the time of writing do), then your browser will continue working with a classically secure, but not quantum-resistant, connection.

Then how to test it?

We have open-sourced a fork of BoringSSL and Go that has support for these post-quantum key agreements. With those and an enrolled test domain, you can check how your application performs with post-quantum key exchanges. We are working on support for more libraries and languages.

What to look for?

Kyber and classical key agreements such as X25519 have different performance characteristics: Kyber requires less computation, but has bigger keys and requires a bit more RAM to compute. It could very well make the connection faster if used on its own.

We are not using Kyber on its own though, but are using hybrids. That means we are doing both an X25519 and Kyber key agreement such that the connection is still classically secure if either is broken. That also means that connections will be a bit slower. In our experiments, the difference is very small, but it’s best to check for yourself.

The fine-print

Cloudflare’s post-quantum cryptography support is a beta service for experimental use only. Enabling post-quantum on your website will subject the website to Cloudflare’s Beta Services terms and will impact other Cloudflare services on the website as described below.

No stability or support guarantees

Over the coming months, both Kyber and the way it’s integrated into TLS will change for several reasons, including:

  1. Kyber will see small, but backward-incompatible changes in the coming months.
  2. We want to be compatible with other early adopters and will change our integration accordingly.
  3. As, together with the cryptography community, we find issues, we will add workarounds in our integration.

We will update our forks accordingly, but cannot guarantee any long-term stability or continued support. PQ support may become unavailable at any moment. We will post updates on pq.cloudflareresearch.com.

Features in enrolled domains

For the moment, we are running enrolled zones on a slightly different infrastructure for which not all features, notably QUIC, are available.

With that out of the way, it’s…

Demo time!

BoringSSL

The following commands build our fork of BoringSSL and create a TLS connection with pq.cloudflareresearch.com using the compiled bssl tool. Note that we do not enable the post-quantum key agreements by default, so you have to pass the -curves flag.

$ git clone https://github.com/cloudflare/boringssl-pq
[snip]
$ cd boringssl-pq && mkdir build && cd build && cmake .. -Gninja && ninja 
[snip]
$ ./tool/bssl client -connect pq.cloudflareresearch.com -server-name pq.cloudflareresearch.com -curves Xyber512D00
	Connecting to [2606:4700:7::a29f:8a55]:443
Connected.
  Version: TLSv1.3
  Resumed session: no
  Cipher: TLS_AES_128_GCM_SHA256
  ECDHE curve: X25519Kyber512Draft00
  Signature algorithm: ecdsa_secp256r1_sha256
  Secure renegotiation: yes
  Extended master secret: yes
  Next protocol negotiated: 
  ALPN protocol: 
  OCSP staple: no
  SCT list: no
  Early data: no
  Encrypted ClientHello: no
  Cert subject: CN = *.pq.cloudflareresearch.com
  Cert issuer: C = US, O = Let's Encrypt, CN = E1

Go

Our Go fork doesn’t enable the post-quantum key agreement by default. The following simple Go program enables PQ by default for the http package and GETs pq.cloudflareresearch.com.

package main

import (
  "crypto/tls"
  "fmt"
  "net/http"
)

func main() {
  http.DefaultTransport.(*http.Transport).TLSClientConfig = &tls.Config{
    CurvePreferences: []tls.CurveID{tls.X25519Kyber512Draft00, tls.X25519},
    CFEventHandler: func(ev tls.CFEvent) {
      switch e := ev.(type) {
      case tls.CFEventTLS13HRR:
        fmt.Printf("HelloRetryRequest\n")
      case tls.CFEventTLS13NegotiatedKEX:
        switch e.KEX {
        case tls.X25519Kyber512Draft00:
          fmt.Printf("Used X25519Kyber512Draft00\n")
        default:
          fmt.Printf("Used %d\n", e.KEX)
        }
      }
    },
  }

  if _, err := http.Get("https://pq.cloudflareresearch.com"); err != nil {
    fmt.Println(err)
  }
}

To run we need to compile our Go fork:

$ git clone https://github.com/cloudflare/go
[snip]
$ cd go/src && ./all.bash
[snip]
$ ../bin/go run path/to/example.go
Used X25519Kyber512Draft00

On the wire

So what does this look like on the wire? With Wireshark we can capture the packet flow. First a non-post quantum HTTP/2 connection with X25519:

[Wireshark capture: TLS 1.3 handshake using X25519.]

This is a normal TLS 1.3 handshake: the client sends a ClientHello with an X25519 keyshare, which fits in a single packet. In return, the server sends its own 32 byte X25519 keyshare. It also sends various other messages, such as the certificate chain, which requires two packets in total.

Let’s check out Kyber:

[Wireshark capture: TLS 1.3 handshake using X25519Kyber512Draft00.]

As you can see the ClientHello is a bit bigger, but still fits within a single packet. The response takes three packets now, instead of two, because of the larger server keyshare.

Under the hood

Want to add client support yourself? We are using a hybrid of X25519 and Kyber version 3.02. We are writing out the details of the latter in version 00 of this CFRG IETF draft, hence the name. We are using TLS group identifiers 0xfe30 and 0xfe31 for X25519Kyber512Draft00 and X25519Kyber768Draft00 respectively.

There are some differences between our Go and BoringSSL forks that are interesting to compare.

  • Our Go fork uses our fast AVX2-optimized implementation of Kyber from CIRCL. In contrast, our BoringSSL fork uses the simpler portable reference implementation. Without the AVX2 optimizations it's easier to evaluate. The downside is that it's slower. Don't be mistaken: it is still very fast, but you can check for yourself.
  • Our Go fork only sends one keyshare. If the server doesn't support it, it will respond with a HelloRetryRequest message and the client will fall back to one the server does support. This adds a roundtrip.
    Our BoringSSL fork, on the other hand, will send two keyshares: the post-quantum hybrid and a classical one (if a classical key agreement is still enabled). If the server doesn't recognize the first, it will be able to use the second. In this way we avoid a roundtrip if the server does not support the post-quantum key agreement.

Looking ahead

The quantum future is here. In the coming years the Internet will move to post-quantum cryptography. Today we are offering our customers the tools to get a headstart and test post-quantum key agreements. We love to hear your feedback: e-mail it to [email protected].

This is just a small, but important first step. We will continue our efforts to move towards a secure and private quantum-secure Internet. Much more to come — watch this space.

NIST’s pleasant post-quantum surprise

Post Syndicated from Bas Westerbaan original https://blog.cloudflare.com/nist-post-quantum-surprise/

On Tuesday, the US National Institute of Standards and Technology (NIST) announced which post-quantum cryptography they will standardize. We were already drafting this post with an educated guess on the choice NIST would make. We almost got it right, except for a single choice we didn’t expect—and which changes everything.

At Cloudflare, post-quantum cryptography is a topic close to our heart, as the future of a secure and private Internet is on the line. We have been working towards this day for many years, by implementing post-quantum cryptography, contributing to standards, and testing post-quantum cryptography in practice, and we are excited to share our perspective.

In this long blog post, we explain how we got here, what NIST chose to standardize, what it will mean for the Internet, and what you need to know to get started with your own post-quantum preparations.

How we got here

Shor’s algorithm

Our story starts in 1994, when mathematician Peter Shor discovered a marvelous algorithm that efficiently factors numbers and computes discrete logarithms. With it, you can break nearly all public-key cryptography deployed today, including RSA and elliptic curve cryptography. Luckily, Shor’s algorithm doesn’t run on just any computer: it needs a quantum computer. Back in 1994, quantum computers existed only on paper.

But in the years since, physicists started building actual quantum computers. Initially, these machines were (and still are) too small and too error-prone to be threatening to the integrity of public-key cryptography, but there is a clear and impending danger: it only seems a matter of time now before a quantum computer is built that has the capability to break public-key cryptography. So what can we do?

Encryption, key agreement and signatures

To understand the risk, we need to distinguish between the three cryptographic primitives that are used to protect your connection when browsing on the Internet:

Symmetric encryption. With a symmetric cipher there is one key to encrypt and decrypt a message. They're the workhorse of cryptography: they're fast, well understood and luckily, as far as we know, secure against quantum attacks. (We'll touch on this later when we get to security levels.) Examples are AES and ChaCha20.

Symmetric encryption alone is not enough: which key do we use when visiting a website for the first time? We can’t just pick a random key and send it along in the clear, as then anyone surveilling that session would know that key as well. You’d think it’s impossible to communicate securely without ever having met, but there is some clever math to solve this.

Key agreement, also called a key exchange, allows two parties that never met to agree on a shared key. Even if someone is snooping, they are not able to figure out the agreed key. Examples include Diffie–Hellman over elliptic curves, such as X25519.

The key agreement prevents a passive observer from reading the contents of a session, but it doesn’t help defend against an attacker who sits in the middle and does two separate key agreements: one with you and one with the website you want to visit. To solve this, we need the final piece of cryptography:

Digital signatures, such as RSA, allow you to check that you’re actually talking to the right website with a chain of certificates going up to a certificate authority.

Shor’s algorithm breaks all widely deployed key agreement and digital signature schemes, which are both critical to the security of the Internet. However, the urgency and mitigation challenges between them are quite different.

Impact

Most signatures on the Internet have a relatively short lifespan. If we replace them before quantum computers can crack them, we’re golden. We shouldn’t be too complacent here: signatures aren’t that easy to replace as we will see later on.

More urgently, though, an attacker can store traffic today and decrypt later by breaking the key agreement using a quantum computer. Everything that’s sent on the Internet today (personal information, credit card numbers, keys, messages) is at risk.

NIST Competition

Luckily cryptographers took note of Shor's work early on and started working on post-quantum cryptography: cryptography not broken by quantum algorithms. In 2016, NIST, known for standardizing AES and SHA, opened a public competition to select which post-quantum algorithms they will standardize. Cryptographers from all over the world submitted algorithms and publicly scrutinized each other's submissions. To focus attention, the list of potential candidates was whittled down over three rounds. From the original 82 submissions, eight made it into the final third round. From those eight, NIST chose one key agreement scheme and three signature schemes. Let's have a look at the key agreement first.

What NIST announced

Key agreement

For key agreement, NIST picked only Kyber, which is a Key Encapsulation Mechanism (KEM). Let’s compare it side-by-side to an RSA-based KEM and the X25519 Diffie–Hellman key agreement:

Performance characteristics of Kyber and RSA. We compare instances of security level 1, see below. Timings vary considerably by platform and implementation constraints and should be taken as a rough indication only.
Performance characteristics of the X25519 Diffie–Hellman key agreement commonly used in TLS 1.3.
KEM versus Diffie–Hellman

To properly compare these numbers, we have to explain how KEM and Diffie–Hellman key agreements are different.

Protocol flow of KEM and Diffie-Hellman key agreement.

Let’s start with the KEM. A KEM is essentially a Public-Key Encryption (PKE) scheme tailored to encrypt shared secrets. To agree on a key, the initiator, typically the client, generates a fresh keypair and sends the public key over. The receiver, typically the server, generates a shared secret and encrypts (“encapsulates”) it for the initiator’s public key. It returns the ciphertext to the initiator, who finally decrypts (“decapsulates”) the shared secret with its private key.

With Diffie–Hellman, both parties generate a keypair. Because of the magic of Diffie–Hellman, there is a unique shared secret between every combination of a public and private key. Again, the initiator sends its public key. The receiver combines the received public key with its own private key to create the shared secret and returns its public key with which the initiator can also compute the shared secret.
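
For contrast, this is what the Diffie–Hellman flow looks like with Go's standard crypto/ecdh package (a sketch of the roles only, using X25519; no TLS message encoding):

package main

import (
  "bytes"
  "crypto/ecdh"
  "crypto/rand"
  "fmt"
)

func main() {
  curve := ecdh.X25519()

  // Initiator: generate a keypair and send the public key.
  clientPriv, err := curve.GenerateKey(rand.Reader)
  if err != nil {
    panic(err)
  }

  // Receiver: generate its own keypair, combine its private key with the
  // initiator's public key, and return its own public key.
  serverPriv, err := curve.GenerateKey(rand.Reader)
  if err != nil {
    panic(err)
  }
  serverShared, err := serverPriv.ECDH(clientPriv.PublicKey())
  if err != nil {
    panic(err)
  }

  // Initiator: combine its private key with the receiver's public key.
  clientShared, err := clientPriv.ECDH(serverPriv.PublicKey())
  if err != nil {
    panic(err)
  }

  fmt.Println("shared secrets match:", bytes.Equal(clientShared, serverShared))
}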

Interactive versus non-interactive key agreement

As an aside, in this simple key agreement (such as in TLS), there is not a big difference between using a KEM or Diffie–Hellman: the number of round-trips is exactly the same. In fact, we're using Diffie–Hellman essentially as a KEM. This, however, is not the case for all protocols: for instance, the X3DH handshake of Signal can't be done with plain KEMs and requires the full flexibility of Diffie–Hellman.

Now that we know how to compare KEMs and Diffie–Hellman, how does Kyber measure up?

Kyber

Kyber is a balanced post-quantum KEM. It is very fast: much faster than X25519, which is already known for its speed. Its main drawback, common to many post-quantum KEMs, is that Kyber has relatively large ciphertext and key sizes: compared to X25519 it adds 1,504 bytes. Is this problematic?

We have some indirect data. Back in 2019 together with Google we tested two post-quantum KEMs, NTRU-HRSS and SIKE in Chrome. SIKE has very small keys, but is computationally very expensive. NTRU-HRSS, on the other hand, has similar performance characteristics to Kyber, but is slightly bigger and slower. This is what we found:

Handshake times for TLS with X25519 (control), NTRU-HRSS (CECPQ2) and SIKE (CECPQ2b). Both post-quantum KEMs were combined with a X25519 key agreement.

In this experiment we used a combination (a hybrid) of the post-quantum KEM and X25519. Thus NTRU-HRSS couldn’t benefit from its speed compared to X25519. Even with this disadvantage, the difference in performance is very small. Thus we expect that switching to a hybrid of Kyber and X25519 will have little performance impact.

So can we switch to post-quantum TLS today? We would love to. However, we have to be a bit careful: some TLS implementations are brittle and crash on the larger KeyShare message that contains the bigger post-quantum keys. We will work hard to find ways to mitigate these issues, as was done to deploy TLS 1.3. Stay tuned!

The other finalists

It’s interesting to have a look at the KEMs that didn’t make the cut. NIST intends to standardize some of these in a fourth round. One reason is to increase the diversity in security assumptions in case there is a breakthrough in attacks on structured lattices on which Kyber is based. Another reason is that some of these schemes have specialized, but very useful applications. Finally, some of these schemes might be standardized outside of NIST.

Structured lattices | Backup    | Specialists
NTRU                | BIKE 4️⃣   | Classic McEliece 4️⃣
NTRU Prime          | HQC 4️⃣    | SIKE 4️⃣
SABER               | FrodoKEM  |

The finalists and candidates of the third round of the competition. The ones marked with 4️⃣ are proceeding to a fourth round and might yet be standardized.

The structured lattice generalists

Just like Kyber, the KEMs SABER, NTRU and NTRU Prime are all structured lattice schemes that are very similar in performance to Kyber. There are some finer differences, but any one of these KEMs would’ve been a great pick. And they still are: OpenSSH 9.0 chose to implement NTRU Prime.

The backup generalists

BIKE, HQC and FrodoKEM are also balanced KEMs, but they’re based on three different underlying hard problems. Unfortunately they’re noticeably less efficient, both in key sizes and computation. A breakthrough in the cryptanalysis of structured lattices is possible, though, and in that case it’s nice to have backups. Thus NIST is advancing BIKE and HQC to a fourth round.

While NIST chose not to advance FrodoKEM, which is based on unstructured lattices, Germany’s BSI prefers it.

The specialists

The last group of post-quantum cryptographic algorithms under NIST’s consideration are the specialists. We’re happy that both are advancing to the fourth round as they can be of great value in just the right application.

First up is Classic McEliece: it has rather unbalanced performance characteristics with its large public key (261kB) and small ciphertexts (128 bytes). This makes McEliece unsuitable for the ephemeral key exchange of TLS, where we need to transmit the public key. On the other hand, McEliece is ideal when the public key is distributed out-of-band anyway, as is often the case in applications and mobile apps that pin certificates. To use McEliece in this way, we need to change TLS a bit. Normally the server authenticates itself by sending a signature on the handshake. Instead, the client can encrypt a challenge to the KEM public key of the server. Being able to decrypt it is an implicit authentication. This variation of TLS is known as KEMTLS and also works great with Kyber when the public key isn’t known beforehand.

Finally, there is SIKE, which is based on supersingular isogenies. It has very small key and ciphertext sizes. Unfortunately, it is computationally more expensive than the other contenders.

Digital signatures

As we just saw, the situation for post-quantum key agreement isn’t too bad: Kyber, the chosen scheme is somewhat larger, but it offers computational efficiency in return. The situation for post-quantum signatures is worse: none of the schemes fit the bill on their own for different reasons. We discussed these issues at length for ten of them in a deep-dive last year. Let’s restrict ourselves for the moment to the schemes that were most likely to be standardized and compare them against Ed25519 and RSA-2048, the schemes that are in common use today.

Performance characteristics of NIST’s chosen signature schemes compared to Ed25519 and RSA-2048. We compare instances of security level 1, see below. Timings vary considerably by platform and implementation constraints and should be taken as a rough indication only. SPHINCS+ was timed with simple haraka as the underlying hash function. (*) Falcon requires a suitable double-precision floating-point unit for fast signing.

Floating points: Falcon's Achilles heel

All of these schemes have much larger signatures than those commonly used today. Looking at just these numbers, Falcon is the best of the worst. It, however, has a weakness that this table doesn’t show: it requires fast constant-time double-precision floating-point arithmetic to have acceptable signing performance.

Let’s break that down. Constant time means that the time the operation takes does not depend on the data processed. If the time to create a signature depends on the private key, then the private key can often be recovered by measuring how long it takes to create a signature. Writing constant-time code is hard, but over the years cryptographers have got it figured out for integer arithmetic.
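
A tiny, generic illustration of the difference (using Go's crypto/subtle; this has nothing to do with Falcon's floating-point code, it just shows the pattern):

package main

import (
  "crypto/subtle"
  "fmt"
)

// selectBranching takes a different code path depending on the secret bit,
// so its timing (and power use) can depend on the secret.
func selectBranching(secretBit, a, b int) int {
  if secretBit == 1 {
    return a
  }
  return b
}

// selectConstantTime executes the same instructions whatever the secret bit is.
func selectConstantTime(secretBit, a, b int) int {
  return subtle.ConstantTimeSelect(secretBit, a, b)
}

func main() {
  fmt.Println(selectBranching(1, 42, 7), selectConstantTime(1, 42, 7))
  fmt.Println(selectBranching(0, 42, 7), selectConstantTime(0, 42, 7))
}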

Falcon, crucially, is the first big cryptographic algorithm to use double-precision floating-point arithmetic. Initially it wasn’t clear at all whether Falcon could be implemented in constant-time, but impressively, Falcon was implemented in constant-time for several different CPUs, which required several clever workarounds for certain CPU instructions.

Despite this achievement, Falcon’s constant-timeness is built on shaky grounds. The next generation of Intel CPUs might add an optimization that breaks Falcon’s constant-timeness. Also, many CPUs today do not even have fast constant-time double-precision operations. And then still, there might be an obscure bug that has been overlooked.

In time, we may figure out how to do constant-time floating-point arithmetic robustly, but we feel it's too early to deploy Falcon where the timing of signature minting can be measured. Notwithstanding, Falcon is a great choice for offline signatures, such as those in certificates.

Dilithium’s size

This brings us to Dilithium. Compared to Falcon, it's easy to implement safely and has better signing performance to boot. Its signatures and public keys are much larger though, which is problematic. For example, to each browser visiting this very page, we sent six signatures and two public keys. If we replaced them all with Dilithium2, we would be looking at 17kB of additional data. Last year, we ran an experiment to see the impact of additional data in the TLS handshake:

NIST’s pleasant post-quantum surprise
Impact of larger signatures on TLS handshake time. For the details, see this blog.

There are some caveats to point out: first, we used a big 30-segment initial congestion window (icwnd). With a normal icwnd, the bump at 40KB moves to 10KB. Secondly, the height of this bump is the round-trip time (RTT), which, due to our broadly distributed network, is very low for us. Thus, switching to Dilithium alone might well double your TLS handshake times. More disturbingly, we saw that some connections stopped working when we added too much data:

NIST’s pleasant post-quantum surprise
Amount of failed TLS handshakes by size of added signatures. For the details, see this blog.

We expect this was caused by misbehaving middleboxes. Taken together, we concluded that early adoption of post-quantum signatures on the Internet would likely be more successful if those six signatures and two public keys fit within 9KB. This can be achieved by using Dilithium for the handshake signature and Falcon for the other (offline) signatures: roughly, one Dilithium2 signature and public key (2,420 + 1,312 bytes) plus five Falcon-512 signatures and one Falcon-512 public key (5 × 690 + 897 bytes) add up to a bit over 8KB.

At most one of Dilithium or Falcon

Unfortunately, NIST had stated on several occasions that it planned to standardize at most one of Dilithium and Falcon:

NIST’s pleasant post-quantum surprise
Slides of NIST’s status update after the conclusion of round 2

The reason given is that both Dilithium and Falcon are based on structured lattices and thus do not add more security diversity. Because of the difficulty of implementing Falcon correctly, we expected NIST to standardize Dilithium and, as a backup, SPHINCS+. With that guess, we saw a big challenge ahead: to keep the Internet fast, we would need some difficult and rigorous changes to the protocols.

The twist

However, to everyone’s surprise, NIST picked both! NIST chose to standardize Dilithium, Falcon and SPHINCS+. This is a very pleasant surprise for the Internet: it means that post-quantum authentication will be much simpler to adopt.

SPHINCS+, the conservative choice

In the excitement of the fight between Dilithium and Falcon, we could almost forget about SPHINCS+, a stateless hash-based signature. Its big advantage is that its security is based on the second-preimage resistance of the underlying hash-function, which is well understood. It is not a stretch to say that SPHINCS+ is the most conservative choice for a signature scheme, post-quantum or otherwise. But even as a co-submitter of SPHINCS+, I have to admit that its performance isn’t that great.

There is a lot of flexibility in the parameter choices for SPHINCS+: there are tradeoffs between signature size, signing time, verification time and the maximum number of signatures that can be minted. Of the current parameter sets, the "s" are optimized for size and "f" for signing speed; both chosen to allow 2⁶⁴ signatures. NIST has hinted at reducing the signature limit, which would improve performance. A custom choice of parameters for a particular application would improve it even more, but would still trail Dilithium.

Having discussed NIST choices, let’s have a look at those that were left out.

The other candidates

Three other schemes were still in contention at the end of the third round: the finalist Rainbow and the alternates GeMSS and Picnic. None of these are progressing to a fourth round.

Picnic is a conservative choice similar to SPHINCS+. Its construction is interesting: it is based on the secure multiparty computation of a block cipher. To be efficient, a non-standard block cipher is chosen. This makes Picnic’s assumptions a bit less conservative, which is why NIST preferred SPHINCS+.

GeMSS and Rainbow are specialists: they have large public key sizes (hundreds of kilobytes), but very small signatures (33–66 bytes). They would be great for applications where the public key can be distributed out of band, such as for the Signed Certificate Timestamps included in certificates for Certificate Transparency. Unfortunately, both turned out to be broken.

Signature schemes on the horizon

Although we expect Falcon and Dilithium to be practical for the Internet, there is ample room for improvement. Many new signature schemes have been proposed after the start of the competition, which could help out a lot. NIST recognizes this and is opening a new competition for post-quantum signature schemes.

A few schemes that have caught our eye already are UOV, which has similar performance trade-offs to those for GeMSS and Rainbow; SQISign, which has small signatures, but is computationally expensive; and MAYO, which looks like it might be a great general-purpose signature scheme.

Stateful hash-based signatures

Finally, we’d be remiss not to mention the post-quantum signature scheme that already has been standardized by NIST: the stateful hash-based signature schemes LMS and XMSS. They share the same conservative security as their sibling SPHINCS+, but have much better performance. The rub is that for each keypair there are a finite number of signature slots and each signature slot can only be used once. If it’s used twice, it is insecure. This is why they are called stateful; as the signer must remember the state of all slots that have been used in the past, and any mistake is fatal. Keeping the state perfectly can be very challenging.

What’s next?

NIST will draft standards for the selected schemes and request public feedback on them. There might be changes to the algorithms, but we do not expect anything major. The standards are expected to be finalized in 2024.

In the coming months, many languages, libraries and protocols will already add preliminary support for the current version of Kyber and the other post-quantum algorithms. We’re helping out to make post-quantum available to the Internet as soon as possible: we’re working within the IETF to add Kyber to TLS and will contribute upstream support to popular open-source libraries.

Start experimenting with Kyber today

Now is a good time for you to try out Kyber in your software stacks. We were lucky to correctly guess Kyber would be picked and have experience running it internally. Our tests so far show it performs great. Your requirements might differ, so try it out yourself.

The reference implementation in C is excellent. The Open Quantum Safe project integrates it with various TLS libraries, but beware: the algorithm identifiers and scheme might still change, so be ready to migrate.

Our CIRCL library has a fast independent implementation of Kyber in Go. We implemented Kyber ourselves so that we could help tease out any implementation bugs or subtle underspecification.
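If you want to kick the tires, here is a minimal round trip with CIRCL's Kyber-512; the package path reflects CIRCL's layout at the time of writing and, as noted above, both the identifiers and the scheme itself may still change.

package main

import (
  "bytes"
  "fmt"

  "github.com/cloudflare/circl/kem/kyber/kyber512"
)

func main() {
  scheme := kyber512.Scheme()

  // Generate a keypair, encapsulate to the public key, then decapsulate.
  pk, sk, err := scheme.GenerateKeyPair()
  if err != nil {
    panic(err)
  }
  ct, sharedSent, err := scheme.Encapsulate(pk)
  if err != nil {
    panic(err)
  }
  sharedReceived, err := scheme.Decapsulate(sk, ct)
  if err != nil {
    panic(err)
  }

  fmt.Println("shared secrets match:", bytes.Equal(sharedSent, sharedReceived))
  fmt.Printf("public key: %d bytes, ciphertext: %d bytes, shared secret: %d bytes\n",
    scheme.PublicKeySize(), scheme.CiphertextSize(), scheme.SharedKeySize())
}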

Experimenting with post-quantum signatures

Post-quantum signatures are not as urgent, but might require more engineering to get right. First off, which signature scheme to pick?

  • Are large signatures and slow operations acceptable? Go for SPHINCS+.
  • Do you need more performance?
    • Can your signature generation be timed, for instance when generated on-the-fly? Then go for (a hybrid, see below, with) Dilithium.
    • For offline signatures, go for (a hybrid with) Falcon.
  • If you can keep a state perfectly, check out XMSS/LMS.

Open Quantum Safe can be used to test these out. Our CIRCL library also has a fast independent implementation of Dilithium in Go. We’ll add Falcon and SPHINCS+ soon.

Hybrids

A hybrid is a combination of a classical and a post-quantum scheme. For instance, we can combine Kyber512 with X25519 to create a single Kyber512X key agreement. The advantage of a hybrid is that the data remains secure against non-quantum attackers even if Kyber512 turns out broken. It is important to note that it’s not just about the algorithm, but also the implementation: Kyber512 might be perfectly secure, but an implementation might leak via side-channels. The downside is that two key-exchanges are performed, which takes more CPU cycles and bytes on the wire. For the moment, we prefer sticking with hybrids, but we will revisit this soon.
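The sketch below illustrates the general idea (it is our own simplified example, not the exact construction used in any standard or in our libraries): an X25519 shared secret and a Kyber-512 shared secret are concatenated and fed through HKDF to derive the session key. It needs Go 1.20 or newer for the crypto/ecdh package, and the HKDF label is made up for the example.

package main

import (
  "crypto/ecdh"
  "crypto/rand"
  "crypto/sha256"
  "fmt"
  "io"

  "github.com/cloudflare/circl/kem/kyber/kyber512"
  "golang.org/x/crypto/hkdf"
)

func main() {
  // Classical part: an X25519 key agreement between two parties.
  curve := ecdh.X25519()
  ourKey, err := curve.GenerateKey(rand.Reader)
  if err != nil {
    panic(err)
  }
  peerKey, err := curve.GenerateKey(rand.Reader) // normally received from the peer
  if err != nil {
    panic(err)
  }
  classicalSecret, err := ourKey.ECDH(peerKey.PublicKey())
  if err != nil {
    panic(err)
  }

  // Post-quantum part: encapsulate to the peer's Kyber-512 public key. The
  // peer decapsulates the ciphertext with its private key to recover the
  // same pqSecret and runs the same derivation below.
  scheme := kyber512.Scheme()
  peerPK, _, err := scheme.GenerateKeyPair() // normally the peer's key
  if err != nil {
    panic(err)
  }
  ct, pqSecret, err := scheme.Encapsulate(peerPK)
  if err != nil {
    panic(err)
  }

  // Hybrid: concatenate both shared secrets and derive the session key.
  // The session stays secure as long as at least one of the two holds up.
  combined := append(append([]byte{}, classicalSecret...), pqSecret...)
  sessionKey := make([]byte, 32)
  kdf := hkdf.New(sha256.New, combined, nil, []byte("hybrid example"))
  if _, err := io.ReadFull(kdf, sessionKey); err != nil {
    panic(err)
  }

  fmt.Printf("ciphertext to send: %d bytes, session key: %x\n", len(ct), sessionKey)
}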

Post-quantum security levels

Each algorithm has different parameters targeting various post-quantum security levels. Up till now we've only discussed the performance characteristics of security level 1 (or level 2 in the case of Dilithium, which doesn't have level 1 parameters). The definition of the security levels is rather interesting: they're defined as being as hard to crack by a classical or quantum attacker as specific instances of AES and SHA:

Level | Definition: at least as hard to break as …
--- | ---
1 | To recover the key of AES-128 by exhaustive search
2 | To find a collision in SHA-256 by exhaustive search
3 | To recover the key of AES-192 by exhaustive search
4 | To find a collision in SHA-384 by exhaustive search
5 | To recover the key of AES-256 by exhaustive search

So which security level should we pick? Is level 1 good enough? We’d need to understand how hard it is for a quantum computer to crack AES-128.

Grover’s algorithm

In 1996, two years after Shor's paper, Lov Grover published his quantum search algorithm. With it, you can find the AES-128 key (given known plain and ciphertext) with only 2⁶⁴ executions of the cipher in superposition. That sounds much faster than the 2¹²⁷ tries on average for a classical brute-force attempt. In fact, it sounds like security level 1 isn't that secure at all. Don't be alarmed: level 1 is much more secure than it sounds, but it requires some context.

To start, a classical brute-force attempt can be parallelized: millions of machines can participate, sharing the work. Grover's algorithm, on the other hand, doesn't parallelize well: splitting the search over P machines loses the quadratic speedup on the part that is split, so each machine's work only shrinks by a factor of √P rather than P. To wit, a billion quantum computers would still have to do 2⁴⁹ iterations each to crack AES-128.

Then each iteration requires many gates. It's estimated that these 2⁴⁹ iterations take roughly 2⁶⁴ noiseless quantum gates. If each of our billion quantum computers could execute a billion noiseless quantum gates per second, then it'd still take 500 years: 2⁶⁴ gates at roughly 2³⁰ gates per second works out to about 2³⁴ seconds, which is more than five centuries.

That already sounds more secure, but we’re not done. Quantum computers do not execute noiseless quantum gates: they’re analogue machines. Every operation has a little bit of noise. Does this mean that quantum computing is hopeless? Not at all! There are clever algorithms to turn, say, a million noisy qubits into one less noisy qubit. It doesn’t just add qubits, but also extra gates. How much depends very much on the exact details of the quantum computer.

It is not inconceivable that in the future there will be quantum computers that effectively execute far more than a billion noiseless gates per second, but it will likely be decades after Shor’s algorithm is practical. This all is a long-winded way of saying that security level 1 seems solid for the foreseeable future.

Hedging against attacks

A different reason to pick a higher security level is to hedge against better attacks on the algorithm. This makes a lot of sense, but it is important to note that this isn’t a foolproof strategy:

  • Not all attacks are small improvements. It’s possible that improvements in cryptanalysis break all security levels at once.
  • Higher security levels do not protect against implementation flaws, such as (new) timing vulnerabilities.

A different aspect, arguably more important than picking a high number, is crypto agility: being able to switch to a new algorithm or implementation in case of a break or trouble. Let's hope that we will not need it, but now that we're switching anyway, it's nice to make future transitions easier.

CIRCL is Post-Quantum Enabled

We already mentioned CIRCL a few times: it's our optimized crypto library for Go, whose development we started in 2019. CIRCL already contains support for several post-quantum algorithms, such as the KEMs Kyber, FrodoKEM and SIKE, and the signature scheme Dilithium. The code is up to date and compliant with test vectors from the third round. CIRCL is readily usable in Go programs, either as a library or natively as part of Go using this fork.

NIST’s pleasant post-quantum surprise

One goal of CIRCL is to enable experimentation with post-quantum algorithms in TLS. For instance, we ran a measurement study to evaluate the feasibility of the KEMTLS protocol for which we’ve adapted the TLS package of the Go library.

As an example, this code uses CIRCL to sign a message with eddilithium2, a hybrid signature scheme pairing Ed25519 with Dilithium mode 2.

package main

import (
  "crypto"
  "crypto/rand"
  "fmt"

  "github.com/cloudflare/circl/sign/eddilithium2"
)

func main() {
  // Generating random keypair.
  pk, sk, err := eddilithium2.GenerateKey(rand.Reader)
  if err != nil {
    panic(err)
  }

  // Signing a message.
  msg := []byte("Signed with CIRCL using " + eddilithium2.Scheme().Name())
  signature, err := sk.Sign(rand.Reader, msg, crypto.Hash(0))

  // Verifying signature.
  valid := eddilithium2.Verify(pk, msg, signature[:])

  fmt.Printf("Message: %v\n", string(msg))
  fmt.Printf("Signature (%v bytes): %x...\n", len(signature), signature[:4])
  fmt.Printf("Signature Valid: %v\n", valid)
  fmt.Printf("Errors: %v\n", err)
}
Message: Signed with CIRCL using Ed25519-Dilithium2
Signature (2484 bytes): 84d6882a...
Signature Valid: true
Errors: <nil>

As can be seen, the application programming interface is the same as the crypto.Signer interface from the standard library. Try it out; we're happy to hear your feedback.

Conclusion

This is a big moment for the Internet. From a set of excellent options for post-quantum key agreement, NIST chose Kyber. With it, we can secure the data on the Internet today against quantum adversaries of the future, without compromising on performance.

On the authentication side, NIST pleasantly surprised us by choosing both Falcon and Dilithium, contrary to its earlier statements. This was a great choice, as it will make post-quantum authentication more practical than we expected it to be.

Together with the cryptography community, we have our work cut out for us: we aim to make the Internet post-quantum secure as fast as possible.

Want to follow along? Keep an eye on this blog or have a look at research.cloudflare.com.

Want to help out? We’re hiring and open to research visits.

Future-proofing SaltStack

Post Syndicated from Lenka Mareková original https://blog.cloudflare.com/future-proofing-saltstack/

At Cloudflare, we are preparing the Internet and our infrastructure for the arrival of quantum computers. A sufficiently large and stable quantum computer will easily break commonly deployed cryptography such as RSA. Luckily there is a solution: we can swap out the vulnerable algorithms with so-called post-quantum algorithms that are believed to be secure even against quantum computers. For a particular system, this means that we first need to figure out which cryptography is used, for what purpose, and under which (performance) constraints. Most systems use the TLS protocol in a standard way, and there a post-quantum upgrade is routine. However, some systems such as SaltStack, the focus of this blog post, are more interesting. This blog post chronicles our path of making SaltStack quantum-secure, so welcome to this adventure: this secret extra post-quantum blog post!

SaltStack, or simply Salt, is an open-source infrastructure management tool used by many organizations. At Cloudflare, we rely on Salt for provisioning and automation, and it has allowed us to grow our infrastructure quickly.

Salt uses a bespoke cryptographic protocol to secure its communication. Thus, the first step to a post-quantum Salt was to examine what the protocol was actually doing. In the process we discovered a number of security vulnerabilities (CVE-2022-22934, CVE-2022-22935, CVE-2022-22936). This blog post chronicles the investigation, our findings, and how we are helping secure Salt now and in the quantum future.

Cryptography in Salt

Let’s start with a high-level overview of Salt.

The main agents in a Salt system are servers and clients (referred to as masters and minions in the Salt documentation). A server is the central control point for a number of clients, which can be in the tens of thousands: it can issue a command to the entire fleet, provision client machines with different characteristics, collect reports on jobs running on each machine, and much more. Depending on the architecture, there can be multiple servers managing the same fleet of clients. This is what makes Salt great, as it helps the management of complex infrastructure.

By default, the communication between a server and a client happens over ZeroMQ on top of TCP, though there is an experimental option to switch to a custom transport directly on TCP. The cryptographic protocol is largely the same for both transports. The experimental TCP transport has an option to enable TLS, which does not replace the custom protocol but wraps it in server-authenticated TLS. More about that later on.

The custom protocol relies on a setup phase in which each server and each client has its own long-term RSA-2048 keypair. On the surface similar to TLS, Salt defines a handshake, or key exchange protocol, that generates a shared secret, and a record protocol which uses this secret with symmetric encryption (the symmetric channel).

Key exchange protocol

In its basic form, the key exchange (or handshake) protocol is an RSA key exchange in which the server chooses the secret and encrypts it to the client’s public key. The exact form of the protocol then depends on whether either party already knows the other party’s long-term public key, since certificates (like in TLS) are not used. By default, clients trust the server’s public key on first use, and servers only trust the client’s public key after it has been accepted by an out-of-band mechanism. The shared secret is shared among the entire fleet of clients, so it is not specific to a particular server and client pair. This allows the server to encrypt a broadcast message only once. We will come back to this performance trade-off later on.

Salt key exchange (as of version 3004) under default settings, showing the first connection between the given server and client.
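The sketch below shows the shape of that exchange in Go; it is a simplification for illustration only and does not reproduce Salt's actual message framing or padding choices. The server picks a random symmetric key for the fleet and encrypts it to the client's RSA-2048 public key.

package main

import (
  "bytes"
  "crypto/rand"
  "crypto/rsa"
  "crypto/sha256"
  "fmt"
)

func main() {
  // Setup phase: the client has a long-term RSA-2048 keypair; the server has
  // learned (and accepted) the public half out of band.
  clientKey, err := rsa.GenerateKey(rand.Reader, 2048)
  if err != nil {
    panic(err)
  }

  // Server: choose the shared secret for the fleet (24 bytes, enough for
  // AES-192) and encrypt it to the client's public key.
  sharedSecret := make([]byte, 24)
  if _, err := rand.Read(sharedSecret); err != nil {
    panic(err)
  }
  ciphertext, err := rsa.EncryptOAEP(sha256.New(), rand.Reader, &clientKey.PublicKey, sharedSecret, nil)
  if err != nil {
    panic(err)
  }

  // Client: decrypt with its private key; both sides now hold the secret
  // that keys the symmetric channel.
  recovered, err := rsa.DecryptOAEP(sha256.New(), rand.Reader, clientKey, ciphertext, nil)
  if err != nil {
    panic(err)
  }
  fmt.Println("client recovered the fleet secret:", bytes.Equal(recovered, sharedSecret))
}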

Symmetric channel

The shared secret is used as the key for encryption. Most of the messages between a server and a client are secured in an Encrypt-then-MAC fashion, with AES-192 in CBC mode with SHA-256 for HMAC. For certain more sensitive messages, variations on this protocol are used to add more security. For example, commands are signed using the server’s long-term secret key, and “pillar data” (deemed more sensitive) is encrypted only to a particular client using a freshly generated secret key.

Symmetric channel in Salt (as of version 3004).
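For illustration, here is a generic Encrypt-then-MAC construction in Go built from the same primitives, AES-192 in CBC mode and HMAC-SHA256. It is our own simplified sketch, with independent encryption and MAC keys, and is not Salt's actual message format.

package main

import (
  "crypto/aes"
  "crypto/cipher"
  "crypto/hmac"
  "crypto/rand"
  "crypto/sha256"
  "fmt"
)

// encryptThenMAC pads and encrypts the message with AES-CBC under encKey, then
// appends an HMAC-SHA256 tag (computed over IV and ciphertext) under macKey.
func encryptThenMAC(encKey, macKey, plaintext []byte) ([]byte, error) {
  block, err := aes.NewCipher(encKey) // a 24-byte key selects AES-192
  if err != nil {
    return nil, err
  }

  // PKCS#7 padding up to a multiple of the block size.
  pad := aes.BlockSize - len(plaintext)%aes.BlockSize
  padded := append(append([]byte{}, plaintext...), make([]byte, pad)...)
  for i := len(padded) - pad; i < len(padded); i++ {
    padded[i] = byte(pad)
  }

  out := make([]byte, aes.BlockSize+len(padded))
  iv := out[:aes.BlockSize]
  if _, err := rand.Read(iv); err != nil {
    return nil, err
  }
  cipher.NewCBCEncrypter(block, iv).CryptBlocks(out[aes.BlockSize:], padded)

  mac := hmac.New(sha256.New, macKey)
  mac.Write(out)
  return mac.Sum(out), nil // IV || ciphertext || tag
}

func main() {
  encKey := make([]byte, 24) // AES-192
  macKey := make([]byte, 32)
  if _, err := rand.Read(encKey); err != nil {
    panic(err)
  }
  if _, err := rand.Read(macKey); err != nil {
    panic(err)
  }

  msg, err := encryptThenMAC(encKey, macKey, []byte("run highstate on minion-42"))
  if err != nil {
    panic(err)
  }
  fmt.Printf("protected message: %d bytes\n", len(msg))
}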

Security vulnerabilities

We found that the protocol variation used for pillar messages contains a flaw. As shown in the diagram below, a monster-in-the-middle attacker (MitM) that sits between a server and a client can substitute arbitrary “pillar data” to the client. It needs to know the client’s public key, but that is easy to find since clients broadcast it as part of their key exchange request. The initial key exchange can be observed, or one could be triggered on-demand using a specially crafted message. This MitM was possible because neither the newly generated key nor the actual payload were authenticated as coming from the server. This matters because “pillar data” can include anything from specifying packages to be installed to credentials and cryptographic keys. Thus, it is possible that this flaw could allow an attacker to gain access to the vulnerable client machine.

Illustration of the monster-in-the-middle attack, CVE-2022-22934.

We reported the issue to Salt on November 5, 2021, and it was assigned CVE-2022-22934. Earlier this week, on March 28, 2022, Salt released a patch that adds a signature of the server on the pillar message to prevent the attack.

There were several other smaller issues we found. Messages could be replayed to the same or a different client. This could allow a file intended for one client to be served to a different one, perhaps aiding lateral movement. This is CVE-2022-22936 and has been patched by adding the name of the addressed client, a nonce and a signature to messages.

Finally, there were some messages which could be manipulated to cause the client to crash. This is CVE-2022-22935 and was patched similarly by adding the addressee, nonce and signature.

If you are running Salt, please update as soon as possible to either 3002.8, 3003.4 or 3004.1.

Moving forward

These patches add signatures to almost every single message. The decision to have a single shared secret was a performance trade-off: only a single encryption is required to broadcast a message. As signatures are computationally much more expensive than symmetric encryption, this trade-off didn’t work out that well. It’s better to switch to a separate shared key per client, so that we don’t need to add a signature on every separate message. In effect, we are creating a single long-lived mutually-authenticated channel per client. But then we are getting very close to what mutually authenticated TLS (mTLS) can provide. What would that look like? Hold that thought for a moment: we will return to it below.

We got sidetracked from our original question: what does this all mean for making Salt post-quantum secure? One thing to take away about post-quantum cryptography today is that signatures are much larger: Dilithium2, for instance, weighs in at 2.4 kB compared to 256 bytes for an RSA-2048 signature. So ironically, patching these vulnerabilities has made our job more difficult as there are many more signatures. Thus, also for our post-quantum goal, mTLS seems very attractive. Not the least because there are post-quantum TLS stacks ready to go.

Finally, as the security properties of mTLS are well understood, it will be much easier to add new messages and functionality to Salt's protocol. With the current complex protocol, any change is much harder to judge with confidence.

An mTLS-based transport

So what would such an mTLS-based protocol for Salt look like? Clients would pin the server certificate and the server would pin the client certificates. Thus, we wouldn’t have any traditional public key infrastructure (PKI). This matches up nicely with how Salt deals with keys currently. This allows clients to establish long-lived connections with their server and be sure that all data exchanged between them is confidential, mutually authenticated and has forward secrecy. This mitigates replays, swapping of messages, reflection attacks or subtle denial of service pathways for free.

We tested this idea by implementing a third transport using WebSocket over mTLS (WSS). As mentioned before, Salt already offers an option to use TLS with the TCP transport, but it doesn’t authenticate clients and creates a new TCP connection for every client request which leads to a multitude of unnecessary TLS handshakes. Internally, Salt has been architected to work with new connections for each request, so our proof of concept required some laborious retooling.

Our findings show promise that there would be no significant losses and potentially some improvements when it comes to performance at scale. In our preliminary experiments with a single server handling a thousand clients, there was no difference in several metrics compared to the default ZeroMQ transport. Resource-intensive operations such as the fetching of pillar and state data resulted, in our experiment, in lower CPU usage in the mTLS transport. Enabling long-lived connections reduced the amount of data transmitted between the clients and the server, in some cases, significantly so.

We have shared our preliminary results with Salt, and we are working together to add an mTLS-based transport upstream. Stay tuned!

Conclusion

We had a look at how to make Salt post-quantum secure. While doing so, we found and helped fix several issues. We see a clear path forward to a future post-quantum Salt based on mTLS. Salt is but one system: we will continue our work, checking system-by-system, collaborating with vendors to bring post-quantum into the present.

With thanks to Bas and Sofía for their help on the project.

The post-quantum future: challenges and opportunities

Post Syndicated from Sofía Celi original https://blog.cloudflare.com/post-quantum-future/

“People ask me to predict the future, when all I want to do is prevent it. Better yet, build it. Predicting the future is much too easy, anyway. You look at the people around you, the street you stand on, the visible air you breathe, and predict more of the same. To hell with more. I want better.”
Ray Bradbury, from Beyond 1984: The People Machines

The story and the path are clear: quantum computers are coming that will have the ability to break the cryptographic mechanisms we rely on to secure modern communications, but there is hope! The cryptographic community has designed new mechanisms to safeguard against this disruption. There are challenges: will the new safeguards be practical? How will the fast-evolving Internet migrate to this new reality? In other blog posts in this series, we have outlined some potential solutions to these questions: there are new algorithms for maintaining confidentiality and authentication (in a “post-quantum” manner) in the protocols we use. But will they be fast enough to deploy at scale? Will they provide the required properties and work in all protocols? Are they easy to use?

Adding post-quantum cryptography into architectures and networks is not only about being novel and looking at interesting research problems or exciting engineering challenges. It is primarily about protecting people's communications and data because a quantum adversary can not only decrypt future traffic but, if they want to, past traffic. Quantum adversaries could also be capable of other attacks (by using quantum algorithms, for example) that we may be unaware of now, so protecting against them is, in a way, the challenge of facing the unknown. We can't fully predict everything that will happen with the advent of quantum computers¹, but we can prepare and build greater protections than the ones that currently exist. We do not see the future as apocalyptic, but as an opportunity to reflect, discover and build better.

What are the challenges, then? And related to this question: what have we learned from the past that enables us to build better in a post-quantum world?

Beyond a post-quantum TLS

As we have shown in other blog posts, the most important security and privacy properties to protect in the face of a quantum computer are confidentiality and authentication. The threat model of confidentiality is clear: quantum computers will not only be able to decrypt on-going traffic, but also any traffic that was recorded and stored prior to their arrival. The threat model for authentication is a little more complex: a quantum computer could be used to impersonate a party (by successfully mounting a monster-in-the-middle attack, for example) in a connection or conversation, and it could also be used to retroactively modify elements of the past message, like the identity of a sender (by, for example, changing the authorship of a past message to a different party). Both threat models are important to consider and pose a problem not only for future traffic but also for any traffic sent now.

In the case of using a quantum computer to impersonate a party: how can this be done? Suppose an attacker is able to use a quantum computer to compromise a user’s TLS certificate private key (which has been used to sign lots of past connections). The attacker can then forge connections and pretend that they come from the honest user (by signing with the user’s key) to another user, let’s say Bob. Bob will think the connections are coming from the honest user (as they all did in the past), when, in reality, they are now coming from the attacker.

We have algorithms that protect confidentiality and authentication in the face of quantum threats. We know how to integrate them into TLS, as we have seen in this blog post, so is that it? Will our connections then be safe? We argue that we will not yet be done, and these are the future challenges we see:

  • Changing the key exchange of the TLS handshake is simple; changing the authentication of TLS, in practice, is hard.
  • Middleboxes and middleware in the network, such as antivirus software and corporate proxies, can be slow to upgrade, hindering the rollout of new protocols.
  • TLS is not the only foundational protocol of the Internet, there are other protocols to take into account: some of them are very similar to TLS and they are easy to fix; others such as DNSSEC or QUIC are more challenging.

TLS (we will be focusing on its current version, which is 1.3) is a protocol that aims to achieve three primary security properties: confidentiality, integrity, and authentication.

The first two properties are easy to maintain in a quantum-computer world: confidentiality is maintained by swapping the existing non-quantum-safe algorithm for a post-quantum one; integrity is maintained because the underlying symmetric algorithms remain intractable to attack even with a quantum computer. What about the last property, authentication? There are three ways to achieve authentication in a TLS handshake, depending on whether server-only or mutual authentication is required:

  1. By using a ‘pre-shared’ key (PSK) generated from a previous run of the TLS connection that can be used to establish a new connection (this is often called “session resumption” or “resuming” with a PSK),
  2. By using a Password-Authenticated Key Exchange (PAKE) for handshake authentication or post-handshake authentication (with the usage of exported authenticators, for example). This can be done by using the OPAQUE or the SPAKE protocols,
  3. By using public certificates that advertise parameters that are employed to assure (i.e., create a proof of) the identity of the party you are talking to (this is by far the most common method).

Securing the first authentication mechanism is easily achieved in the post-quantum sphere, as a unique key is derived from the initial quantum-protected handshake. The third mechanism, certificate-based authentication, does not pose a theoretical challenge (public and private parameters are simply replaced with post-quantum counterparts) but rather faces practical limitations: it involves multiple actors, and it is difficult to properly synchronize this change with all of them, as we will see next (we come back to the second, password-based mechanism further below). It is not only one public parameter and one public certificate: certificate-based authentication is achieved through the use of a certificate chain with multiple entities.

A certificate chain is a chain of trust by which different entities attest to the validity of public parameters to provide verifiability and confidence. Typically, for one party to authenticate another (for example, for a client to authenticate a server), a chain of certificates starting from a root’s Certificate Authority (CA) certificate is used, followed by at least one intermediate CA certificate, and finally by the leaf (or end-entity) certificate of the actual party. This is what you usually find in real-world connections. It is worth noting that the order of this chain (for TLS 1.3) does not require each certificate to certify the one immediately preceding it. Servers sometimes send both a current and deprecated intermediate certificate for transitional purposes, or are configured incorrectly.

What are these certificates? Why do multiple actors need to validate them? A (digital) certificate certifies the ownership of a public key by the named party of the certificate: it attests that the party owns the private counterpart of the public parameter through the use of digital signatures. A CA is the entity that issues these certificates. Browsers, operating systems or mobile devices operate CA "membership" programs where a CA must meet certain criteria to be incorporated into the trusted set. Devices accept their CA root certificates as they come "pre-installed" in a root store. Root certificates, in turn, are used to generate a number of intermediate certificates which will be, in turn, used to generate leaf certificates. This certificate chain process of generation, validation and revocation is not only a procedure that happens at a software level but rather an amalgamation of policies, rules, roles, and hardware² and software needs. This is what is often called the Public Key Infrastructure (PKI).

All of the above goes to show that while we can change all of these parameters to post-quantum ones, it is not as simple as just modifying the TLS handshake. Certificate-based authentication involves many actors and processes, and it does not only involve one algorithm (as it happens in the key exchange phase) but typically at least six signatures: one handshake signature; two in the certificate chain; one OCSP staple and two SCTs. The last five signatures together are used to prove that the server you’re talking to is the right server for the website you’re visiting, for example. Of these five, the last three are essentially patches: the OCSP staple is used to deal with revoked certificates and the SCTs are to detect rogue CA’s. Starting with a clean slate, could we improve on the status quo with an efficient solution?
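You can count some of these artifacts yourself on a live connection with nothing but Go's standard library; the hostname below is just an example, and note that SCTs are often embedded in the leaf certificate rather than sent in the handshake, so the TLS-level field may well be empty.

package main

import (
  "crypto/tls"
  "fmt"
)

func main() {
  conn, err := tls.Dial("tcp", "cloudflare.com:443", &tls.Config{})
  if err != nil {
    panic(err)
  }
  defer conn.Close()

  state := conn.ConnectionState()
  // One signature covers the handshake itself (not visible here); on top of
  // that there is one per certificate in the chain, one per SCT, and one on
  // the stapled OCSP response, if present.
  fmt.Println("certificates in chain:", len(state.PeerCertificates))
  fmt.Println("SCTs sent in the handshake:", len(state.SignedCertificateTimestamps))
  fmt.Println("stapled OCSP response bytes:", len(state.OCSPResponse))

  for _, cert := range state.PeerCertificates {
    fmt.Printf("  %s signed with %s\n", cert.Subject.CommonName, cert.SignatureAlgorithm)
  }
}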

More pointedly, we can ask if indeed we still need to use this system of public attestation. The migration to post-quantum cryptography is also an opportunity to modify this system. The PKI as it exists is difficult to maintain, update, revoke, model, or compose. We have an opportunity, perhaps, to rethink this system.

Even without considering making fundamental changes to public attestation, updating the existing complex system presents both technical and management/coordination challenges:

  • On the technical side: are the post-quantum signatures, which have larger sizes and bigger computation times, usable in our handshakes? We explore this idea in this experiment, but we need more information. One potential solution is to cache intermediate certificates or to use other forms of authentication beyond digital signatures (like KEMTLS).
  • On the management/coordination side: how are we going to coordinate the migration of this complex system? Will there be some kind of ceremony to update algorithms? How will we deal with the situation where some systems have updated but others have not? How will we revoke past certificates?

This challenge brings into light that the migration to post-quantum cryptography is not only about the technical changes but is dependent on how the Internet works as the interconnected community that it is. Changing systems involves coordination and the collective willingness to do so.

On the other hand, post-quantum password-based authentication for TLS is still an open discussion. Most PAKE systems nowadays use Diffie-Hellman assumptions, which can be broken by a quantum computer. There are some ideas on how to transition their underlying algorithms to the post-quantum world, but these seem to be so inefficient as to render their deployment infeasible. It seems, though, that password authentication has an interesting property called "quantum annoyance". A quantum computer can compromise the algorithm, but only one instance of the problem at a time for each guess of a password: "Essentially, the adversary must guess a password, solve a discrete logarithm based on their guess, and then check to see if they were correct", as stated in the paper. Early quantum computers might take quite a long time to solve each guess, which means that a quantum-annoying PAKE combined with a large password space could delay quantum adversaries considerably in their goal of recovering a large number of passwords. Password-based authentication for TLS, therefore, could be safe for a longer time. This does not mean, however, that it is not threatened by quantum adversaries.

The world of security protocols, though, does not end with TLS. There are many other security protocols (such as DNSSEC, WireGuard, SSH, QUIC, and more) that will need to transition to post-quantum cryptography. For DNSSEC, the challenge is complicated by the protocol not seeming able to deal with large signatures or high verification costs. According to research from SIDN Labs, it seems like only Falcon-512 and Rainbow-I-CZ can be used in DNSSEC (note, though, that there is a recent attack on Rainbow).

Scheme | Public key size (bytes) | Signature size (bytes) | Speed of operations
--- | --- | --- | ---
Finalists | | |
Dilithium2 | 1,312 | 2,420 | Very fast
Falcon-512 | 897 | 690 | Fast, if you have the right hardware
Rainbow-I-CZ | 103,648 | 66 | Fast
Alternate candidates | | |
SPHINCS+-128f | 32 | 17,088 | Slow
SPHINCS+-128s | 32 | 7,856 | Very slow
GeMSS-128 | 352,188 | 33 | Very slow
Picnic3 | 35 | 14,612 | Very slow

Table 1: Post-quantum signature algorithms. Falcon-512 and Rainbow-I-CZ are the ones suitable for DNSSEC.

What are the alternatives for a post-quantum DNSSEC? Perhaps the isogeny-based signature scheme SQISign might be a solution, if its verification time can be improved: the original paper reports 42 ms for verification (on a 3.40 GHz Intel Core i7-6700 CPU, averaged over 250 runs), still slower than P-384, and it has recently been improved to 25 ms. Another option might be MAYO, for which, on an Intel i5-8400H CPU at 2.5 GHz, a signing operation takes about 2.50 million cycles and a verification about 1.3 million cycles. There is a lot of research that needs to be done to make isogeny-based cryptography faster so it will fit the protocol's needs (research in this area is ongoing; see, for example, the Isogeny School) and to provide assurance of its security properties. Another alternative could be using other forms of authentication for this protocol, like hash-based signatures.

DNSSEC is just one example of a protocol where post-quantum cryptography has a long road ahead, as we need hands-on experimentation to go along with technical updates. For the other protocols, there is timely research: there are, for example, proposals for a post-quantum WireGuard and for a post-quantum SSH. More research, though, needs to be done on the practical implications of these changes over real connections.

One important thing to note here is that there will likely be an intermediate period in which security protocols provide a “hybrid” set of algorithms for transitional purposes, compliance and security. “Hybrid” means that both a pre-quantum (or classical) algorithm and a post-quantum one are used to generate the secret used to encrypt or provide authentication. The security reason for using this hybrid mode is due to safeguarding in case post-quantum algorithms are broken. There are still many unknowns here (single code point, multiple codepoints, contingency plans) that we need to consider.

The failures of cryptography in practice

The Achilles heel for cryptography is often introducing it into the real world. Designing, implementing, and deploying cryptography is notoriously hard to get right due to flaws in security proofs, implementation bugs, and vulnerabilities³ of software and hardware. We often deploy cryptography that is found to be flawed, and the cost of fixing it is immense (it involves resources, coordination, and more). We have some previous lessons to draw from as a community, though. For TLS 1.3, we pushed for verifying implementations of the standard, and for using tools to analyze the symbolic and computational models, as seen in other blog posts. Every time we design a new algorithm, we should aim for this same level of confidence, especially for the big migration to post-quantum cryptography.

In other blog posts, we have discussed our formal verification efforts, so we will not repeat these here. Rather let’s focus on what remains to be done on the formal verification front. Verification, analysis and implementation are not yet complete and we still need to:

  • Create easy-to-understand guides to what formal analysis is and how it can be used (as formal languages are unfamiliar to developers).
  • Develop user-tested APIs.
  • Flawlessly integrate post-quantum algorithms' APIs into protocols' APIs.
  • Test and analyze the boundaries between verified and unverified code.
  • Verify specifications at the standards level by, for example, integrating hacspec into IETF drafts.

Only in doing so can we prevent some of the security issues we have had in the past. Post-quantum cryptography will be a big migration and we can, if we are not careful, repeat the same issues of the past. We want a future that is better. We want to mitigate bugs and provide high assurance of the security of connections users have.

A post-quantum tunnel is born

We’ve described the challenges of post-quantum cryptography from a theoretical and practical perspective. These are problems we are working on and issues that we are analyzing. They will take time to solve. But what can you expect in the short-term? What new ideas do we have? Let’s look at where else we can put post-quantum cryptography.

Cloudflare Tunnel is a service that creates a secure, outbound-only connection between your services and Cloudflare by deploying a lightweight connector in your environment. This is the end at the server side. At the other end, at the client side, we have WARP, a ‘VPN’ for client devices that can secure and accelerate all HTTPS traffic. So, what if we add post-quantum cryptography to all our internal infrastructure, and also add it to this server and the client endpoints? We would then have a post-quantum server to client connection where any request from the WARP client to a private network (one that uses Tunnel) is secure against a quantum adversary (how to do it will be similar to what is detailed here). Why would we want to do this? First, because it is great to have a connection that is fully protected against quantum computers. Second, because we can better measure the impacts of post-quantum cryptography in this environment (and even measure them in a mobile environment). This will also mean that we can provide guidelines to clients and servers on how to migrate to post-quantum cryptography. It would also be the first available service to do so at this scale. How will all of us experience this transition? Only time will tell, but we are excited to work towards this vision.

And, furthermore, as Tunnel uses the QUIC protocol in some cases and WARP uses the WireGuard protocol, this means that we can experiment with post-quantum cryptography in protocols that are novel and have not seen much experimentation in the past.

So, what is the future in the post-quantum cryptography era? The future is better and not just the same. The future is fast and more secure. Deploying cryptography can be challenging and we have had problems with it in the past; but post-quantum cryptography is the opportunity to dream of better security, it is the path to explore, and it is the reality to make it happen.

Thank you for reading our post-quantum blog post series and expect more post-quantum content and updates from us!


If you are a student enrolled in a PhD or equivalent research program and looking for an internship for 2022, see open opportunities.

If you’re interested in contributing to projects helping Cloudflare, our engineering teams are hiring.

You can reach us with questions, comments, and research ideas at [email protected].

…..

¹ And when we do predict, we often predict more of the same attacks we are accustomed to: adversaries breaking into connections, security being tampered with. Is this all that they will be capable of?
² The private part of a public key advertised in a certificate is often the target of attacks. An attacker who steals a certificate authority's private keys is able to forge certificates, for example. Private keys are almost always stored on a hardware security module (HSM), which prevents key extraction. This is a small example of how hardware is involved in the process.
³ Like constant-time failures, side-channel, and timing attacks.

Post-quantumify internal services: Logfwrdr, Tunnel, and gokeyless

Post Syndicated from Sofía Celi original https://blog.cloudflare.com/post-quantumify-cloudflare/

Theoretically, there is no impediment to adding post-quantum cryptography to any system. But the reality is harder. In the middle of last year, we posed ourselves a big challenge: to change all internal connections at Cloudflare to use post-quantum cryptography. We call this, in a cheeky way, “post-quantum-ifying” our services. Theoretically, this should be simple: swap algorithms for post-quantum ones and move along. But with dozens of different services in various programming languages (as we have at Cloudflare), it is not so simple. The challenge is big but we are here and up for the task! In this blog post, we will look at what our plan was, where we are now, and what we have learned so far. Welcome to the first announcement of a post-quantum future at Cloudflare: our connections are going to be quantum-secure!

What are we doing?

The life of most requests at Cloudflare begins and ends at the edge of our global network. Not all requests are equal and on their path they are transmitted by several protocols. Some of those protocols provide security properties whilst others do not. For the protocols that do, for context, Cloudflare uses: TLS, QUIC, WireGuard, DNSSEC, IPsec, Privacy Pass, and more. Migrating all of these protocols and connections to use post-quantum cryptography is a formidable task. It is also a task that we do not treat lightly because:

  • We have to be assured that the security properties provided by the protocols are not diminished.
  • We have to be assured that performance is not negatively affected.
  • We have to be wary of other requirements of our ever-changing ecosystem (like, for example, keeping in mind our FedRAMP certification efforts).

Given these requirements, we had to decide on the following:

  • How are we going to introduce post-quantum cryptography into the protocols?
  • Which protocols will we be migrating to post-quantum cryptography?
  • Which Cloudflare services will be targeted for this migration?

Let’s explore now what we chose: welcome to our path!

TLS and post-quantum in the real world

One of the most used security protocols is Transport Layer Security (TLS). It is the vital protocol that protects most of the data that flows over the Internet today. Many of Cloudflare’s internal services also rely on TLS for security. It seemed natural that, for our migration to post-quantum cryptography, we would start with this protocol.

The protocol provides three security properties: integrity, authentication, and confidentiality. The algorithms used to provide the first property, integrity, seem to not be quantum-threatened (there is some research on the matter). The second property, authentication, is under quantum threat, but we will not focus on it for reasons detailed later. The third property, confidentiality, is the one that we are interested in protecting as it is urgent to do this now.

Confidentiality assures that no one other than the intended receiver and sender of a message can read the transmitted message. Confidentiality is especially threatened by quantum computers as an attacker can record traffic now and decrypt it in the future (when they get access to a quantum computer): this means that all past and current traffic, not just future traffic, is vulnerable to be read by anyone who obtains a quantum computer (and has stored the encrypted traffic captured today).

At Cloudflare, to protect many of our connections, we use TLS. We mainly use the latest version of the protocol, TLS 1.3, but we do sometimes still use TLS 1.2 (as seen in the image, though, it only shows the connections between websites to our network).  As we are a company that pushes for innovation, this means that we are intent on using this time of migration to post-quantum cryptography as an opportunity to also update TLS handshakes to 1.3 and be assured that we are using TLS in the right way (by, for example, ensuring that we are not using deprecated features of TLS).

Cloudflare’s TLS and QUIC usage taken on the 17/02/2022 and showing the last 7 days.

Changing TLS 1.3 to provide quantum-resistant security for its confidentiality means changing the ‘key exchange’ phase of the TLS handshake. Let’s briefly look at how the TLS 1.3 handshake works.

The TLS 1.3 handshake

In TLS 1.3, there will always be two parties: a client and a server. A client is a party that wants the server to "serve" them something, which could be a website, emails, chat messages, voice messages, and more. The handshake is the process by which the server and client attempt to agree on a shared secret, which will be used to encrypt the subsequent exchange of data (this shared secret is called the "master secret"). The client selects their favorite key exchange algorithms and submits one or more "key shares" to the server (they send both the name of the key share and its public key parameter). The server picks one of the key exchange algorithms (assuming that it supports one of them), and replies back with its own key share. Both the server and the client then combine the key shares to compute a shared secret (the "master secret"), which is used to protect the remainder of the connection. If the client only chooses algorithms that the server does not support, the server instead replies with the algorithms that it does support and asks the client to try again. During this initial conversation, the client and server also agree on authentication methods and the parameters for encryption, but we can leave that aside for today in this blog post. This description is also simplified to focus only on the "key exchange" phase.

There is a mechanism to add post-quantum cryptography to this procedure: you advertise post-quantum algorithms in the list of key shares, so the final derived shared key (the “master secret”) is quantum secure. But there are requirements we had to take into account when doing this with our connections: the security of much post-quantum cryptography is still under debate and we need to respect our compliance efforts. The solution to these requirements is to use a “hybrid” mechanism.

A “hybrid” mechanism means to use both a “pre-quantum” or “classical” algorithm and a “post-quantum” algorithm, and mixing both generated shared secrets into the derivation of the “master secret”. The combination of both shared secrets is of the form \Z′ = Z || T\ (for TLS and with the fixed size shared secrets, simple concatenation is secure. In other cases, you have to be a bit more careful). This procedure is a concatenation consisting of:

The usage of a "hybrid" approach allows us to safeguard our connections in case the security of the post-quantum algorithm fails. It also results in a suitable-for-FIPS secret, as this construction is approved in the "Recommendation for Key-Derivation Methods in Key-Establishment Schemes" (SP 800-56C Rev. 2), which lists it in Annex D as an approved key-establishment technique for FIPS 140-2.

At Cloudflare, we are using different TLS libraries. We decided to add post-quantum cryptography to those, specifically to the BoringCrypto library and the compiled version of Golang with BoringCrypto. We added our implementation of the Kyber-512 algorithm to those libraries (this algorithm can eventually be swapped for another one; we are not committing to it here, but are using it for our testing phase) and implemented the "hybrid" mechanism as part of the TLS handshake. For the "classical" algorithm, we used curve P-256. We then compiled certain services with these new TLS libraries.

Name of algorithm | Number of times loop executed | Average runtime per operation | Number of bytes required per operation | Number of allocations
--- | --- | --- | --- | ---
Curve P-256 | 23,056 | 52,204 ns/op | 256 B/op | 5 allocs/op
Kyber-512 | 100,977 | 11,793 ns/op | 832 B/op | 3 allocs/op

Table 1: Benchmarks of the “key share” operation of Curve P-256 and Kyber-512: Scalar Multiplication and Encapsulation respectively. Benchmarks ran on Darwin, amd64, Intel(R) Core(TM) i7-9750H CPU @ 2.60 GHz.

Note that, as TLS supports the described "negotiation" mechanism for the key exchange, the client and server have a way of mutually deciding which algorithms they want to use. This means that a client and a server are not required to support, or even prefer, exactly the same algorithms: they just need to share support for a single algorithm for a handshake to succeed. As a result, even if we advertise post-quantum cryptography and a server or client does not support it, the handshake will not fail; the two sides simply agree on some other algorithm they share.

A note on a matter we left on hold above: why are we not migrating the authentication phase of TLS to post-quantum? Certificate-based authentication in TLS, which is the one we commonly use at Cloudflare, also depends on systems on the wider Internet. Thus, changes to authentication require a coordinated and much wider effort to change. Certificates are attested as proofs of identity by outside parties: migrating authentication means coordinating a ceremony of migration with these outside parties. Note though that at Cloudflare we use a PKI with internally-hosted Certificate Authorities (CAs), which means that we can more easily change our algorithms. This will still need careful planning. We will not do this today, but we will in the near future.

Cloudflare services

The first step of our post-quantum migration is done. We have TLS libraries with post-quantum cryptography using a hybrid mechanism. The second step is to test this new mechanism in specific Cloudflare connections and services. We will look at three systems from Cloudflare that we have started migrating to post-quantum cryptography. The services in question are: Logfwrdr, Cloudflare Tunnel, and GoKeyless.

A post-quantum Logfwdr

Logfwdr is an internal service, written in Golang, that handles structured logs and sends them from our servers for processing to a subservice called 'Logreceiver', which writes them to Kafka. The connection between Logfwdr and Logreceiver is protected by TLS. The same goes for the connection between Logreceiver and Kafka in core. Logfwdr pushes its logs through "streams" for processing.

This service seemed an ideal candidate for migrating to post-quantum cryptography as its architecture is simple, it has long-lived connections, and it handles a lot of traffic. In order to first test the viability of using post-quantum cryptography, we created our own instance of Logreceiver and deployed it. We also created our own stream (the “pq-stream”), which is basically a copy of a HTTP stream (which was remarkably easy to add). We then compiled these services with the modified TLS library and we got a post-quantum protected Logfwdr.

Figure 1: TLS latency of Logfwdr for selected metals. Notice how the post-quantum handshakes (labeled "PQ") are faster than the non-post-quantum ones.

What we found is that using post-quantum cryptography is faster than using “classical” cryptography! This was expected, though, as we are using a lattice-based post-quantum algorithm (Kyber-512). The TLS latency of both post-quantum and “classical” handshakes can be seen in Figure 1. The figure shows more handshakes than usual because these servers are frequently restarted.

Note, though, that we are not using “only” post-quantum cryptography but rather the “hybrid” mechanism described above. This could have increased handshake times; in this case, the increase was minimal, and post-quantum handshakes remained faster than classical ones. Part of what makes the TLS handshakes faster in the post-quantum case is the use of TLS 1.3, as the “classical” Logfwdr uses TLS 1.2. Logfwdr, though, is a service with long-lived connections, so in aggregate TLS 1.2 is not “slower”: it just has a slower start.

As shown in Figure 2, the average batch duration of the post-quantum stream is lower than that of the non-post-quantum stream. This may be in part because we are not sending the quantum-protected data all the way to Kafka (as the non-post-quantum stream does): we have not yet made the connection to Kafka post-quantum.

Figure 2: Average batch send duration: post-quantum (orange) and non-post-quantum streams (green).

We didn’t encounter any failures during this testing, which ran for several weeks. This gave us good evidence that putting post-quantum cryptography into our internal network, carrying actual data, is possible. It also gave us confidence to begin migrating codebases to the modified TLS libraries, which we will maintain.

What are the next steps for Logfwdr? Now that we have confirmed it is possible, we will migrate stream by stream to the hybrid mechanism until we reach full post-quantum migration.

A post-quantum gokeyless

gokeyless is our way of separating servers from long-term TLS private keys. With it, private keys are kept on a specialized key server operated by customers on their own infrastructure or, if using Geo Key Manager, in selected Cloudflare locations. We also use it for Cloudflare-held private keys, with a service creatively known as gokeyless-internal. The final piece of this architecture is another service called Keynotto, a service written in Rust that only mints RSA and ECDSA signatures (executed with the stored private key).

How does the overall architecture of gokeyless work? Let’s start with a request. The request arrives at the Cloudflare network, and we perform TLS termination. Any signing request is forwarded to Keynotto. A small portion of requests (specifically, those from GeoKDL or external gokeyless) cannot be handled by Keynotto directly and are instead forwarded to gokeyless-internal. gokeyless-internal also acts as a key server proxy, as it redirects connections to the customer’s key servers (external gokeyless). External gokeyless is both the server that a customer runs and the client used to contact it. The architecture can be seen in Figure 3.

Figure 3: The life of a gokeyless request.

Migrating the transport that this architecture uses to post-quantum cryptography is a bigger challenge, as it involves migrating a service that lives on the customer side. So, for our testing phase, we decided to take the simpler path that we can change ourselves: the TLS handshake between Keynotto and gokeyless-internal. This small test bed meant two things: first, that we needed to change another TLS library (as Keynotto is written in Rust) and, second, that we needed to change gokeyless-internal so that it used post-quantum cryptography only for the handshakes with Keynotto and for nothing else. Note that we did not migrate the signing operations that gokeyless or Keynotto execute with the stored private key; we only migrated the transport connections.

Adding post-quantum cryptography to the rustls codebase was a straightforward exercise, and we exposed an easy-to-use API call to signal the use of post-quantum cryptography (as seen in Figure 4 and Figure 5). One thing we noted when reviewing TLS usage in several Cloudflare services is that giving users the option to choose the algorithms for the ciphersuite, key share, and authentication in the TLS handshake confuses them. It seemed more straightforward to define the algorithm at the library level and have a boolean or API call signal the need for post-quantum cryptography.

Figure 4: post-Quantum API for rustls.
Figure 5: usage of the post-quantum API for rustls.

We ran a small test between Keynotto and gokeyless-internal with much success. Our next steps are to integrate this test into the real connection between Keynotto and gokeyless-internal, and to devise a plan for a customer post-quantum protected gokeyless external. This is the first instance in which our migration to post-quantum will not be ending at our edge but rather at the customer’s connection point.

A post-quantum Cloudflare Tunnel

Cloudflare Tunnel is a reverse proxy that allows customers to quickly connect their private services and networks to the Cloudflare network without having to expose their public IPs or ports through their firewall. It is mainly managed at the customer level through cloudflared, a lightweight server-side daemon running in their infrastructure. cloudflared opens several long-lived TCP connections (although cloudflared is increasingly using the QUIC protocol) to servers on Cloudflare’s global network. When a request for a hostname arrives, it is proxied through these connections to the origin service behind cloudflared.

The easiest part of the service to make post-quantum secure appeared to be the connection between our network (where a part of Tunnel called origintunneld runs) and cloudflared, and we have started migrating it. While exploring this path and looking at the whole life of a Tunnel connection, though, we found something more interesting. When Tunnel connections eventually reach core, they end up at a service called Tunnelstore. Tunnelstore runs as a stateless application in a Kubernetes deployment, and to provide TLS termination (alongside load balancing and more) it uses a Kubernetes ingress.

The Kubernetes ingress we use at Cloudflare is made up of Envoy and Contour; the latter configures the former based on Kubernetes resources. Envoy uses the BoringSSL library for TLS. Switching TLS libraries in Envoy seemed difficult: there are thoughts on how to integrate OpenSSL into it (and even some thoughts on adding post-quantum cryptography) and on ways to switch TLS libraries. Adding post-quantum cryptography to a modified version of BoringSSL, and then specifying that dependency in Envoy’s Bazel file, seems to be the way to go, as our internal test has confirmed (see Figure 6). As for Contour: for many years, Cloudflare has run its own patched version of it, so we will have to patch this version again with our Golang library to provide post-quantum cryptography. We will make these libraries (and the TLS ones) available for use.

Figure 6: Option to allow post-quantum cryptography in Envoy.

Changing the Kubernetes ingress at Cloudflare not only makes Tunnel quantum-safe all the way through (beyond just the connection between our global network and cloudflared), but it also protects any other service that uses the ingress. Our first tests migrating Envoy and Contour to TLS libraries with post-quantum protections have been successful, and now we have to test how they behave in the whole ingress ecosystem.

What is next?

The main tests are now done. We now have TLS libraries (in Go, Rust, and C) that give us post-quantum cryptography. We have two systems ready to deploy post-quantum cryptography, and a shared service (Kubernetes ingress) that we can change. At the beginning of the blog post, we said that “the life of most requests at Cloudflare begins and ends at the edge of our global network”: our aim is that post-quantum cryptography does not end there, but rather reaches all the way to where customers connect as well. Let’s explore the future challenges and this customer post-quantum path in this other blog post!

Using EasyCrypt and Jasmin for post-quantum verification

Post Syndicated from Sofía Celi original https://blog.cloudflare.com/post-quantum-easycrypt-jasmin/


Cryptographic code is everywhere: it gets run when we connect to the bank, when we send messages to our friends, or when we watch cat videos. But, it is not at all easy to take a cryptographic specification written in a natural language and produce running code from it, and it is even harder to validate both the theoretical assumptions and the correctness of the implementation itself. Mathematical proofs, as we talked about in our previous blog post, and code inspection are simply not enough. Testing and fuzzing can catch common or well-known bugs or mistakes, but might miss rare ones that can, nevertheless, be triggered by an attacker. Static analysis can detect mistakes in the code, but cannot check whether the code behaves as described by the specification in natural-language (for functional correctness). This gap between implementation and validation can have grave consequences in terms of security in the real world, and we need to bridge this chasm.

In this blog post, we will be talking about ways to make this gap smaller by making the code we deploy better through analyzing its security properties and its implementation. This blog post continues our work on high assurance cryptography, for example, on using Tamarin to analyze entire protocol specifications. In this one, we want to look more on the side of verifying implementations. Our desire for high assurance cryptography isn’t specific to post-quantum cryptography, but because quantum-safe algorithms and protocols are so new, we want extra reassurance that we’re doing the best we can. The post-quantum era also gives us a great opportunity to try and apply all the lessons we’ve learned while deploying classical cryptography, which will hopefully prevent us from making the same mistakes all over again.

This blog post will discuss formal verification. Formal verification is a technique we can use to prove that a piece of code correctly implements a specification. Formal verification, and formal methods in general, have been around for a long time, appearing as early as the 1950s. Today, they are being applied in a variety of ways: from automating the checking of security proofs to automating checks for functional correctness and the absence of side-channel attacks. Code verified using such formal verification has been deployed in popular products like Mozilla Firefox and Google Chrome.

Formal verification, as opposed to formal analysis, the topic of other blog posts, deals with verifying code and checking that it correctly implements a specification. Formal analysis, on the other hand, deals with establishing that a specification has the desired properties, for example, having a specific security guarantee.

Let’s explore what it means for an algorithm to have a proof that it achieves a certain security goal and what it means to have an implementation we can prove correctly implements that algorithm.

Goals of a formal analysis and verification process

Our goal, given a description of an algorithm in a natural language, is to produce two proofs: first, one that shows that the algorithm has the security properties we want and, second, that we have a correct implementation of it. We can go about this in four steps:

  1. Turn the algorithm and its security goals into a formal specification. This is us defining the problem.
  2. Use formal analysis to prove, in our case using a computer-aided proof tooling, that the algorithm attains the specified properties.
  3. Use formal verification to prove that the implementation correctly implements the algorithm.
  4. Use formal verification to prove that our implementation has additional properties, like memory safety, running in constant time, efficiency, etc.

Interestingly, we can do step 2 in parallel with steps 3 and 4, because the two proofs are independent. As long as both build from the same specification established in step 1, the properties we establish in the formal analysis flow down to the implementation.

Suppose, more concretely, we’re looking at an implementation and specification of a Key Encapsulation Mechanism (a KEM, such as FrodoKEM). FrodoKEM is designed to achieve IND-CCA security, so we want to prove that it does, and that we have an efficient, side-channel resistant and correct implementation of it.

As you might imagine, achieving even one of these goals is no small feat. Achieving all, especially given the way they conflict (efficiency clashes with side-channel resistance, for example), is a Herculean task. Decades of research have gone into this space, and it is huge; so let’s carve out and examine a small subsection to look at: we’ll look at two tools, EasyCrypt and Jasmin.

Before we jump into the tools, let’s take a brief aside to discuss why we’re not using Tamarin, which we’ve talked about in our other blog posts. Like EasyCrypt, Tamarin is also a tool used for formal analysis, but beyond that, the two tools are quite different. Formal analysis broadly splits into two camps, symbolic analysis and computational analysis. Tamarin, as we saw, uses symbolic analysis, which treats all functions effectively as black boxes, whereas EasyCrypt uses computational analysis. Computational analysis is much closer to how we program, and functions are given specific implementations. This gives computational analysis a much higher “resolution”: we can study properties in much greater detail and, perhaps, with greater ease. This detail, of course, comes at a cost. As functions grow into full protocols, with multiple modes, branching paths, and in the case of the Transport Layer Security (TLS), sometimes even resumption, computational models become unwieldy and difficult to work with, even with computer-assisted tooling. We therefore have to pick the correct tool for the job. When we need maximum assurance, sometimes both computational and symbolic proofs are constructed, with each playing to its strengths and compensating for the other’s drawbacks.

EasyCrypt

EasyCrypt is a proof assistant for cryptographic algorithms and imperative programs. A proof is basically a formal demonstration that some statement is true. EasyCrypt is called a proof assistant because it “assists” you with creating a proof; it does not create a proof for you, but rather, helps you come to it and gives you the power to have a machine check that each step logically follows from the last. It provides a language to write definitions, programs, and theorems along with an environment to develop machine-checked proofs.

A proof starts from a set of assumptions, and by taking a series of logical steps demonstrates that some statement is true. Let’s imagine for a moment that we are the hero Perseus on a quest to kill a mythological being, the terrifying Medusa. How can we prove to everyone that we’ve succeeded? No one is going to want to enter Medusa’s cave to check that she is dead because they’ll be turned to stone. And we cannot just state, “I killed the Medusa.” Who will believe us without proof? After all, is this not a leap of faith?

What we can do is bring the head of the Medusa as proof. Providing the head as a demonstration is our proof because no mortal Gorgon can live without a head. Legend has it that Perseus completed the proof by demonstrating that the head was indeed that of the Medusa: Perseus used the head’s powers to turn Polydectes to stone (the latter was about to force Perseus’ mother to marry him, so let’s just say it wasn’t totally unprovoked). One can say that this proof was done “by hand” in that it was done without any mechanical help. For computer security proofs, sometimes the statements we want to prove are so cumbersome and so big that we need a machine to help us.

How does EasyCrypt achieve this? How does it help you? As we are dealing with cryptography here, first, let’s start by defining how one can reason about cryptography, the security it provides, and the proofs one uses to corroborate them.

When we encrypt something, we do this to hide whatever we want to send. In a perfect world, it would be indistinguishable from noise. Unfortunately, only the one-time-pad truly offers this property, so most of the time we make do with “close enough”: it should be infeasible to differentiate a true encrypted value from a random one.

When we want to show that a certain cryptographic protocol or algorithm has this property, we write it down as an “indistinguishability game.” The idea of the game is as follows:

Imagine a gnome sitting in a box. The gnome takes a message as input to the box and produces a ciphertext, recording each message and ciphertext they see generated. A troll outside the box chooses two messages (m1 and m2) of the same length and sends them to the box. The gnome records the box operations and flips a coin. If the coin lands heads, they send the ciphertext (c1) corresponding to m1; otherwise, they send c2 corresponding to m2. In order to win, the troll, knowing the messages and the ciphertext, has to guess which message was encrypted.

In this example, we can see two things: first, choices are random as the ciphertext sent is chosen by flipping a coin; second, the goal of the adversary is to win a game.
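To make the game concrete, here is a toy Go harness for it. The encrypt function is only a stand-in (it masks the message with a fresh random pad); the point is the shape of the game: with any IND-CPA-secure scheme, the troll should win only about half the time.

```go
// Toy harness for the indistinguishability game described above.
package main

import (
	"crypto/rand"
	"fmt"
	"math/big"
)

// encrypt is the gnome's box: here it simply hides a message under a fresh
// random pad, so the ciphertext carries no information about the message.
func encrypt(message []byte) []byte {
	pad := make([]byte, len(message))
	rand.Read(pad)
	ciphertext := make([]byte, len(message))
	for i := range message {
		ciphertext[i] = message[i] ^ pad[i]
	}
	return ciphertext
}

// playOnce runs one round: the gnome flips a coin, encrypts one of the two
// messages, and the troll guesses which one it was.
func playOnce(m1, m2 []byte, guess func(ct []byte) int) bool {
	coin, _ := rand.Int(rand.Reader, big.NewInt(2))
	b := int(coin.Int64())
	var ct []byte
	if b == 0 {
		ct = encrypt(m1)
	} else {
		ct = encrypt(m2)
	}
	return guess(ct) == b
}

func main() {
	m1, m2 := []byte("ATTACK AT DAWN!"), []byte("RETREAT AT ONCE") // same length
	wins := 0
	for i := 0; i < 10000; i++ {
		// A troll with no better strategy than always guessing the first message.
		if playOnce(m1, m2, func(ct []byte) int { return 0 }) {
			wins++
		}
	}
	fmt.Printf("troll wins %d of 10000 rounds (close to 5000 means it learned nothing)\n", wins)
}
```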

EasyCrypt takes this approach. Security goals are modeled as probabilistic programs (basically, as games) played by an adversary. Tools from program verification and programming language theory are used to justify the cryptographic reasoning. EasyCrypt relies on a “goal directed proof” approach, in which two important mechanisms occur: lemmas and tactics. Let’s see how this approach works (following this amazing paper):

  1. The prover (in this case, you) enters a small statement to prove. For this, one uses the command lemma (meaning this is a minor statement needed to be proven).
  2. EasyCrypt will display the formula as a statement to be proved (i.e., the goal) and will also display all the known hypotheses (unproven lemmas) at any given point.
  3. The prover enters a command (a tactic) to either decompose the statement into simpler ones, apply a hypothesis, or make progress in the proof in some other way.
  4. EasyCrypt displays a new set of hypotheses and the parts that still need to be proved.
  5. Back to step 3.

Let’s say you want to prove something small, like the statement “if p in conjunction with q, then q in conjunction with p.” In predicate logic terms, this is written as (p ∧ q) → (q ∧ p). If we translate this into English statements, as Alice might say in Alice in Wonderland, it could be:

p: I have a world of my own.
q: Everything is nonsense.
p∧q: I have a world of my own and everything is nonsense.
(p ∧ q) → (q ∧ p): If I have a world of my own and everything is nonsense, then, everything is nonsense, and I have a world of my own.

We will walk through such a statement and its proof in EasyCrypt. For more of these examples, see these given by the marvelous Alley Stoughton.

Our lemma and proof in EasyCrypt.

lemma implies_and () :

This line introduces the stated lemma and creatively calls it “implies_and”. It takes no parameters.

forall (P Q : bool), P /\ Q => Q /\ P.

This is the statement we want to prove. We use the variables P and Q of type bool (booleans), and we state that if P and Q, then Q and P.

Up until now we have just declared our statement to prove to EasyCrypt. Let’s see how we write the proof:

proof.
This line demarcates the start of the proof for EasyCrypt.


move => p q H.
We introduce the universally quantified variables and the hypothesis into the “context”: P and Q are both booleans, and H is the hypothesis P /\ Q.


elim H.

We eliminate H (the conjunctive hypothesis) and we get the components: “p => q => q /\ p”.

trivial.

The proof is now trivial.

qed.

Quod erat demonstrandum (QED) denotes the end of the proof (if both are true, then the conjunction holds). Whew! For such a simple statement, this took quite a bit of work, because EasyCrypt leaves no stone unturned. If you get to this point, you can be sure your proof is absolutely correct, unless there is a bug in EasyCrypt itself (or unless we are proving something that we weren’t supposed to).

As you see, EasyCrypt helped us by guiding us in decomposing the statement into simpler terms, and stating what still needed to be proven. And by strictly following logical principles, we managed to realize a proof. If we are doing something wrong, and our proof is incorrect, EasyCrypt will let us know, saying something like:

Screenshot of EasyCrypt showing us that we did something wrong.

What we have achieved is a computer-checked proof of the statement, giving us far greater confidence in the proof than if we had to scan over one written with pen and paper. But what makes EasyCrypt particularly attractive in addition to this is its tight integration with the Jasmin programming language as we will see later.

EasyCrypt will also interactively guide us to the proof, as it easily works with ProofGeneral in Emacs. In the image below we see, for example, that EasyCrypt is guiding us by showing the variables we have declared (p, q, and H) and what is missing to be proven (after the dashed line).

EasyCrypt interactively showing us where we are in the proof: the cyan section marks how far we have gotten.

If one is more comfortable with the Coq proof assistant (there are very good tutorials for it), a similar proof can be given:

Our lemma and proof in Coq.

EasyCrypt allows us to prove statements faster and with more assurance than doing proofs by hand. Proving the truth of the statement we just showed would be easy with a truth table, for example. But truth tables and hand proofs are only easy when the statement is small; for a complex cryptographic algorithm or protocol, the situation is much harder.

Jasmin

Jasmin is an assembly-like programming language with some high-level syntactic conveniences, such as loops and procedure calls, while using assembly-level instructions. It also supports function calls and functional arrays. The Jasmin compiler predictably transforms source code into assembly code for a chosen platform (currently only x64 is supported). This transformation is verified: the correctness of some compiler passes (like function inlining or loop unrolling) is proven and verified in the Coq proof assistant; other passes are programmed in a conventional programming language and their results are validated in Coq. The compiler also comes with a built-in checker for memory safety and constant-time safety.

This assembly-like syntax, combined with the stated assurances of the compiler, means that we have deep control over the output, and we can optimize it however we like without compromising safety. Because low-level cryptographic code tends to be concise and non-branching, Jasmin doesn’t need full support for general purpose language features or to provide lots of libraries. It only needs to support a set of basic features to give us everything we need.

One reason Jasmin is so powerful is that it provides a way to formally verify low-level code. The other reason is that Jasmin code can be automatically converted by the compiler into equivalent EasyCrypt code, which lets us reason about its security. In general terms, whatever guarantees apply to the EasyCrypt code also flow into the Jasmin code, and subsequently into the assembly code.

Let’s use the example of a very simple Jasmin function that performs multiplication to see Jasmin in action:

A multiplication function written in Jasmin.

What the function (“fn”) “mul” does, in this case, is multiply by whatever number is provided as an argument to the function (the variable a). The syntax of this small function should feel very familiar to anyone who has worked with the C family of programming languages. The only big difference is the use of the words reg and u64: they state that the variable a, for example, is allocated in registers (reg defines the storage of the variable) and that it is a 64-bit machine word (hence u64). We can now convert this to “pure” x64 assembly:

A multiplication function written in Jasmin and transformed to Assembly.

The first lines of the assembly code are just setup. They are followed by the “imulq” instruction, which multiplies the variable by the constant (labeled “param” in this case). While this small function might not show the full power of safely translating to assembly, that power becomes apparent with more complex functions: functions that use while loops, arrays, and calls to other functions are accepted by Jasmin and will be safely translated to assembly.

Assembly language has a little bit of a bad reputation because it is thought to be hard to learn, hard to read, and hard to maintain. Having a tool that helps you with translation is very useful to a programmer, and it is also useful as you can manually or automatically check what the assembly code looks like.

We can further check the code for its safety:

A multiplication function written and checked in Jasmin.

In this check, there are many things to understand. First, it checks that the inputs are allocated in a memory region of at least 0 bytes. Second, the “Rel” entry checks the allocated memory region safety pre-condition: for example, n must point to an allocated memory region of sufficient length.

You can then extract this functionality to EasyCrypt (and even configure EasyCrypt to verify Jasmin programs). Here is the corresponding EasyCrypt code, automatically produced by the Jasmin compiler:

A multiplication function written in Jasmin and extracted to EasyCrypt.

Here’s a slightly more involved example, that of a FrodoKEM utility function written in Jasmin.

A utility function for addition for FrodoKEM.

With a C-like syntax, this function adds two arrays (a and b) and returns the result (in out). The value NBAR is just a parameter you can specify in a C-like manner. You can then take this function and compile it to assembly. You can also use the Jasmin compiler to analyze the safety of the code (for example, that array accesses are in bounds, that memory accesses are valid, and that arithmetic operations are applied to valid arguments) and to verify that the code runs in constant time.
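To give a feel for what this function computes, here is a rough Go rendering of the same idea: element-wise addition of two matrices modulo q. We assume q is a power of two (as in FrodoKEM’s parameter sets), so the reduction is a simple mask; the constants here are illustrative, not taken from the Jasmin source.

```go
// Go rendering of the FrodoKEM-style addition utility: out[i] = (a[i] + b[i]) mod q.
package main

import "fmt"

const (
	nbar = 8               // illustrative dimension
	q    = uint16(1) << 15 // illustrative modulus, a power of two
)

// add computes out[i] = (a[i] + b[i]) mod q for every entry. Because q is a
// power of two, "mod q" is a constant-time mask, which keeps the code free of
// secret-dependent branches.
func add(out, a, b []uint16) {
	for i := range out {
		out[i] = (a[i] + b[i]) & (q - 1)
	}
}

func main() {
	a := make([]uint16, nbar*nbar)
	b := make([]uint16, nbar*nbar)
	out := make([]uint16, nbar*nbar)
	for i := range a {
		a[i] = uint16(i)
		b[i] = uint16(2 * i)
	}
	add(out, a, b)
	fmt.Println(out[:8])
}
```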

The addition function as used by FrodoKEM can also be extracted to EasyCrypt:

The addition function as extracted to EasyCrypt.

A theorem expressing correctness (that the addition is computed correctly) is written in EasyCrypt as follows:

The theorem of addition function as extracted to EasyCrypt.

Note that EasyCrypt uses a while-language and Hoare logic. Here is the corresponding proof that addition is correct:

The proof of the addition function as extracted to EasyCrypt.

Why formal verification for post-quantum cryptography?

As we have previously stated, cryptographic implementations are very hard to get right, and even if they are right, the security properties they claim to provide are sometimes wrong for their intended application. The reason this matters so much is that post-quantum cryptography is the cryptography we will be using in the future, due to the arrival of quantum computers. Deploying post-quantum cryptographic algorithms with bugs or flaws in their security properties would be a disaster, because connections and the data that travel over them could be decrypted or attacked. We are trying to prevent that.

Cryptography is difficult to get right, not only for people new to it but for anyone, even the experts. The designs and code we write are error-prone because we, as humans, are prone to errors. Here are some examples of designs that got it wrong (luckily, these examples were not deployed, so they did not have the usual disastrous consequences):

  • Falcon (a post-quantum algorithm currently part of the NIST procedure), produced valid signatures “but leaked information on the private key,” according to an official comment posted to the NIST post-quantum process on the algorithm. The comment also noted that “the fact that these bugs existed in the first place shows that the traditional development methodology (i.e. “being super careful”) has failed.“
  • “The De Feo–Jao–Plût identification scheme (the basis for SIDH signatures) contains an invalid assumption and provide[s] a counterexample for this assumption: thus showing the proof of soundness is invalid,” according to a finding that one proof of a post-quantum algorithm was not valid. This is an example of an incorrect proof, whose flaws were discovered and eliminated prior to any deployment.

Perhaps these two examples might convince the reader that formal analysis and formal verification of implementations are needed. While they help us avoid some human errors, they are not perfect. As for us, we are convinced of these methods. We are working towards a formally verified implementation of FrodoKEM (we have a first implementation of it in our cryptographic library, CIRCL), and we are collaborating to create a formally verified and implemented library we can run in real-world connections. If you are interested in learning more about EasyCrypt and Jasmin, visit the resources we have put together, try to install it following our guidelines, or follow some tutorials.

See you on other adventures in post-quantum (and some cat videos for you)!

References:

  • “SoK: Computer-Aided Cryptography” by Manuel Barbosa, Gilles Barthe, Karthik Bhargavan, Bruno Blanchet, Cas Cremers, Kevin Liao and Bryan Parno: https://eprint.iacr.org/2019/1393.pdf
  • “EasyPQC: Verifying Post-Quantum Cryptography” by Manuel Barbosa, Gilles Barthe, Xiong Fan, Benjamin Grégoire, Shih-Han Hung, Jonathan Katz, Pierre-Yves Strub, Xiaodi Wu and Li Zhou: https://eprint.iacr.org/2021/1253
  • “Jasmin: High-Assurance and High-Speed Cryptography” by José Bacelar Almeida, Manuel Barbosa, Gilles Barthe, Arthur Blot, Benjamin Grégoire, Vincent Laporte, Tiago Oliveira, Hugo Pacheco, Benedikt Schmidt and Pierre-Yves Strub: https://dl.acm.org/doi/pdf/10.1145/3133956.3134078
  • “The Last Mile: High-Assurance and High-Speed Cryptographic Implementations” by José Bacelar Almeida, Manuel Barbosa, Gilles Barthe, Benjamin Grégoire, Adrien Koutsos, Vincent Laporte, Tiago Oliveira and Pierre-Yves Strub: https://arxiv.org/pdf/1904.04606.pdf

Making protocols post-quantum

Post Syndicated from Thom Wiggers original https://blog.cloudflare.com/making-protocols-post-quantum/


Ever since the (public) invention of cryptography based on mathematical trap-doors by Whitfield Diffie, Martin Hellman, and Ralph Merkle, the world has had key agreement and signature schemes based on discrete logarithms. Rivest, Shamir, and Adleman invented integer factorization-based signature and encryption schemes a few years later. The core idea, that has perhaps changed the world in ways that are hard to comprehend, is that of public key cryptography. We can give you a piece of information that is completely public (the public key), known to all our adversaries, and yet we can still securely communicate as long as we do not reveal our piece of extra information (the private key). With the private key, we can then efficiently solve mathematical problems that, without the secret information, would be practically unsolvable.

In later decades, there were advancements in our understanding of integer factorization that required us to bump up the key sizes for finite-field based schemes. The cryptographic community largely solved that problem by figuring out how to base the same schemes on elliptic curves. The world has since then grown accustomed to having algorithms where public keys, secret keys, and signatures are just a handful of bytes and the speed is measured in the tens of microseconds. This allowed cryptography to be bolted onto previously insecure protocols such as HTTP or DNS without much overhead in either time or the data transmitted. We previously wrote about how TLS loves small signatures; similar things can probably be said for a lot of present-day protocols.

But this blog has “post-quantum” in the title; quantum computers are likely to make our cryptographic lives significantly harder by undermining many of the assurances we previously took for granted. The old schemes are no longer secure because new algorithms can efficiently solve their particular mathematical trapdoors. We, together with everyone on the Internet, will need to swap them out. There are whole suites of quantum-resistant replacement algorithms; however, right now it seems that we need to choose between “fast” and “small”. The new alternatives also do not always have the same properties that we have based some protocols on.

Fast or small: Cloudflare previously experimented with NTRU-HRSS (a fast key exchange scheme with large public keys and ciphertexts) and SIKE (a scheme with very small public keys and ciphertexts, but much slower algorithm operations).

In this blog post, we will discuss how one might upgrade cryptographic protocols to make them secure against quantum computers. We will focus on the cryptography that they use and see what the challenges are in making them secure against quantum computers. We will show how trade-offs might motivate completely new protocols for some applications. We will use TLS here as a stand-in for other protocols, as it is one of the most deployed protocols.

Making TLS post-quantum

TLS, from SSL and HTTPS fame, gets discussed a lot. We keep it brief here. TLS 1.3 consists of an Ephemeral Elliptic curve Diffie-Hellman (ECDH) key exchange which is authenticated by a digital signature that’s verified by using a public key that’s provided by the server in a certificate. We know that this public key is the right one because the certificate contains another signature by the issuer of the certificate and our client has a repository of valid issuer (“certificate authority”) public keys that it can use to verify the authenticity of the server’s certificate.

In principle, TLS can become post-quantum straightforwardly: we just write “PQ” in front of the algorithms. We replace ECDH key exchange by post-quantum (PQ) key exchange provided by a post-quantum Key Encapsulation Mechanism (KEM). For the signatures on the handshake, we just use a post-quantum signature scheme instead of an elliptic curve or RSA signature scheme. No big changes to the actual “arrows” of the protocol necessary, which is super convenient because we don’t need to revisit our security proofs. Mission accomplished, cake for everyone, right?

Upgrading the cryptography in TLS seems as easy as scribbling in “PQ-”.

Key exchange

Of course, it’s not so simple. There are nine different suites of post-quantum key exchange algorithms still in the running in round three of the NIST Post-Quantum standardization project: Kyber, SABER, NTRU, and Classic McEliece (the “finalists”); and SIKE, BIKE, FrodoKEM, HQC, and NTRU Prime (“alternates”). These schemes have wildly different characteristics. This means that for step one, replacing the key exchange by post-quantum key exchange, we need to understand the differences between these schemes and decide which one fits best in TLS. Because we’re doing ephemeral key exchange, we consider the size of the public key and the ciphertext since they need to be transmitted for every handshake. We also consider the “speed” of the key generation, encapsulation, and decapsulation operations, because these will affect how many servers we will need to handle these connections.

Table 1: Post-Quantum KEMs at security level comparable with AES128. Sizes in bytes.

Scheme               | Transmission size (pk+ct) | Speed of operations
Finalists
Kyber512             | 1,632                     | Very fast
NTRU-HPS-2048-509    | 1,398                     | Very fast
SABER (LightSABER)   | 1,408                     | Very fast
Classic McEliece     | 261,248                   | Very slow
Alternate candidates
SIKEp434             | 676                       | Slow
NTRU Prime (ntrulpr) | 1,922                     | Very fast
NTRU Prime (sntru)   | 1,891                     | Fast
BIKE                 | 5,892                     | Slow
HQC                  | 6,730                     | Reasonable
FrodoKEM             | 19,336                    | Reasonable

Fortunately, once we make this table, the landscape of KEMs suitable for use in TLS quickly becomes clear. We will have to sacrifice an additional 1,400 – 2,000 bytes, assuming SIKE’s slow runtime is a bigger penalty to connection establishment (see our previous work here). So we can choose one of the lattice-based finalists (Kyber, SABER or NTRU) and call it a day.1

Signature schemes

For our post-quantum signature scheme, we can draw a similar table. In TLS, we generally care about the sizes of public keys and signatures. In terms of runtime, we care about signing and verification times, as key generation is only done once for each certificate, offline. The round three candidates for signature schemes are: Dilithium, Falcon, Rainbow (the three finalists), and SPHINCS+, Picnic, and GeMSS.

Table 2: Post-Quantum signature schemes at security level comparable with AES128 (or smallest parameter set). Sizes in bytes.

Scheme        | Public key size | Signature size | Speed of operations
Finalists
Dilithium2    | 1,312           | 2,420          | Very fast
Falcon-512    | 897             | 690            | Fast, if you have the right hardware
Rainbow-I-CZ  | 103,648         | 66             | Fast
Alternate candidates
SPHINCS+-128f | 32              | 17,088         | Slow
SPHINCS+-128s | 32              | 7,856          | Very slow
GeMSS-128     | 352,188         | 33             | Very slow
Picnic3       | 35              | 14,612         | Very slow

There are many signatures in a TLS handshake. Aside from the handshake signature that the server creates to authenticate the handshake (with public key in the server certificate), there are signatures on the certificate chain (with public keys for intermediate certificates), as well as OCSP Stapling (1) and Certificate Transparency (2) signatures (without public keys).

This means that if we used Dilithium for all of these, we would require about 17KB of public keys and signatures. Falcon is looking very attractive here, only requiring 6KB, but it might not run fast enough on embedded devices that don’t have special hardware to accelerate 64-bit floating point computations in constant time. SPHINCS+, GeMSS, or Rainbow each have significant deployment challenges, so it seems that there is no one-scheme-fits-all solution.
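As a rough accounting, assuming six transmitted signatures (the handshake signature, two in the certificate chain, one OCSP staple, and two SCTs) and two transmitted public keys (leaf and intermediate), the sizes from Table 2 give:

$$\text{Dilithium2: } 6 \times 2{,}420 + 2 \times 1{,}312 = 17{,}144 \text{ bytes} \approx 17\,\text{KB}$$

$$\text{Falcon-512: } 6 \times 690 + 2 \times 897 = 5{,}934 \text{ bytes} \approx 6\,\text{KB}$$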

Picking and choosing specific algorithms for particular use cases, such as using a scheme with short signatures for root certificates, OCSP Stapling, and CT might help to alleviate the problems a bit. We might use Rainbow for the CA root, OCSP staples, and CT logs, which would mean we only need 66 bytes for each signature. It is very nice that Rainbow signatures are only very slightly larger than 64-byte ed25519 elliptic curve signatures, and they are significantly smaller than 256-byte RSA-2048 signatures. This gives us a lot of space to absorb the impact of the larger handshake signatures required. For intermediate certificates, where both the public key and the signature are transmitted, we might use Falcon because it’s nice and small, and the client only needs to do signature verification.

Using KEMs for authentication

In the pre-quantum world, key exchange and signature schemes used to be roughly equivalent in terms of work required or bytes transmitted. As we saw in the previous section, this doesn’t hold up in the post-quantum world. This means that this might be a good opportunity to also investigate alternatives to the classic “signed key exchange” model. Deploying significant changes to an existing protocol might be harder than just swapping out primitives, but we might gain better characteristics. We will look at such a proposed redesign for TLS here.

The idea is to use key exchange not just for confidentiality, but also for authentication. This uses the following idea: what a signature in a protocol like TLS is actually proving is that the person signing has possession of the secret key that corresponds to the public key that’s in the certificate. But we can also do this with a key exchange key by showing you can derive the same shared secret (if you want to prove this explicitly, you might do so by computing a Message Authentication Code using the established shared secret).
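Here is a minimal Go sketch of that idea: the client encapsulates to the KEM public key in the server’s certificate, and only a server holding the matching private key can decapsulate and produce the same MAC over the transcript. The KEM below is a do-nothing placeholder, and the message flow is heavily simplified compared to the real AuthKEM/KEMTLS proposals.

```go
// Sketch of authentication via KEM: proving possession of a private key by
// decapsulating and MACing the transcript under the derived secret.
package main

import (
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

// Placeholder KEM: Encapsulate "sends" the shared secret in the clear inside
// the ciphertext so the sketch is self-contained. A real KEM hides it.
func encapsulate(serverPublicKey []byte) (ciphertext, sharedSecret []byte) {
	sharedSecret = make([]byte, 32)
	rand.Read(sharedSecret)
	return sharedSecret, sharedSecret
}

func decapsulate(serverPrivateKey, ciphertext []byte) []byte {
	return ciphertext
}

// confirm is the explicit proof of possession: a MAC over the transcript,
// keyed with the (de)capsulated secret.
func confirm(secret, transcript []byte) []byte {
	mac := hmac.New(sha256.New, secret)
	mac.Write(transcript)
	return mac.Sum(nil)
}

func main() {
	transcript := []byte("ClientHello || ServerHello || Certificate")

	// Client side: encapsulate to the KEM public key found in the certificate.
	ct, clientSecret := encapsulate(nil)

	// Server side: only the real key holder can recover the same secret.
	serverSecret := decapsulate(nil, ct)

	fmt.Println("server authenticated:",
		hmac.Equal(confirm(clientSecret, transcript), confirm(serverSecret, transcript)))
}
```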

This isn’t new; many modern cryptographic protocols, such as the Signal messaging protocol, have used such mechanisms. They offer privacy benefits like (offline) deniability. But now we might also use this to obtain a faster or “smaller” protocol.

However, this does not come for free. Because authentication via key exchange (via KEM, at least) inherently requires two participants to exchange keys, we need to send more messages. In TLS, this means that the server that wants to authenticate first needs to give the client its public key. The client obviously cannot encapsulate a shared secret to a key it does not yet know.

We also still need to verify signatures on the certificate chain and the signatures for OCSP stapling and Certificate Transparency are still necessary. Because we need to do “offline” verification for those elements of the handshake, it is hard to get rid of those signatures. So we will still need to carefully look at those signatures and pick an algorithm that fits there.

   Client                                  Server
 ClientHello         -------->
                     <--------         ServerHello
                                             <...>
                     <--------       <Certificate>  ^
 <KEMEncapsulation>                                 | Auth
 {Finished}          -------->                      |
 [Application Data]  -------->                      |
                     <--------          {Finished}  v
 [Application Data]  <------->  [Application Data]

<msg>: encrypted w/ keys derived from ephemeral KEX (HS)
{msg}: encrypted w/ keys derived from HS+KEM (AHS)
[msg]: encrypted w/ traffic keys derived from AHS (MS)

Authentication via KEM in TLS from the AuthKEM proposal

If we put the necessary arrows to authenticate via KEM into TLS it looks something like Figure 2. This is actually a fully-fledged proposal for an alternative to the usual TLS handshake. The academic proposal KEMTLS was published at the ACM CCS conference in 2020; a proposal to integrate this into TLS 1.3 is described in the draft-celi-wiggers-tls-authkem draft RFC.

What this proposal illustrates is that the transition to post-quantum cryptography might motivate, or even require, us to have a brand-new look at what the desired characteristics of our protocol are and what properties we need, like what budget we have for round-trips versus the budget for data transmitted. We might even pick up some properties, like deniability, along the way. For some protocols this is somewhat easy, like TLS; in other protocols there isn’t even a clear idea of where to start (DNSSEC has very tight limits).

Conclusions

We should not wait until NIST has finished standardizing post-quantum key exchange and signature schemes before thinking about whether our protocols are ready for the post-quantum world. For our current protocols, we should investigate how the proposed post-quantum key exchange and signature schemes can be fitted in. At the same time, we might use this opportunity for careful protocol redesigns, especially if the constraints are so tight that it is not easy to fit in post-quantum cryptography. Cloudflare is participating in the IETF and working with partners in both academia and the industry to investigate the impact of post-quantum cryptography and make the transition as easy as possible.

If you want to be a part of the future of cryptography on the Internet, either as an academic or an engineer, be sure to check out our academic outreach or jobs pages.

Reference


1. Of course, it’s not so simple. The performance measurements were done on a beefy MacBook, using AVX2 intrinsics. For stuff like IoT (yes, your smart washing machine will also need to go post-quantum) or a smart card, you probably want to add another few columns to this table before making a choice, such as code size, side-channel considerations, power consumption, and execution time on your platform.

Deep Dive Into a Post-Quantum Key Encapsulation Algorithm

Post Syndicated from Goutam Tamvada original https://blog.cloudflare.com/post-quantum-key-encapsulation/


The Internet is accustomed to the fact that any two parties can exchange information securely without ever having to meet in advance. This magic is made possible by key exchange algorithms, which are core to certain protocols, such as the Transport Layer Security (TLS) protocol, that are used widely across the Internet.

Key exchange algorithms are an elegant solution to a vexing, seemingly impossible problem. Imagine a scenario where keys are transmitted in person: if Persephone wishes to send her mother Demeter a secret message, she can first generate a key, write it on a piece of paper and hand that paper to her mother, Demeter. Later, she can scramble the message with the key, and send the scrambled result to her mother, knowing that her mother will be able to unscramble the message since she is also in possession of the same key.

But what if Persephone is kidnapped (as the story goes) and cannot deliver this key in person? What if she can no longer write it on a piece of paper because someone (by chance Hades, the kidnapper) might read that paper and use the key to decrypt any messages between them? Key exchange algorithms come to the rescue: Persephone can run a key exchange algorithm with Demeter, giving both Persephone and Demeter a secret value that is known only to them (no one else knows it) even if Hades is eavesdropping. This secret value can be used to encrypt messages that Hades cannot read.

The most widely used key exchange algorithms today are based on hard mathematical problems, such as integer factorization and the discrete logarithm problem. But these problems can be efficiently solved by a quantum computer, as we have previously learned, breaking the secrecy of the communication.

There are other mathematical problems that are hard even for quantum computers to solve, such as those based on lattices or isogenies. These problems can be used to build key exchange algorithms that are secure even in the face of quantum computers. Before we dive into this matter, we first have to look at one mechanism that can be used for key exchange: Key Encapsulation Mechanisms (KEMs).

Two people could agree on a secret value if one of them could send the secret in an encrypted form to the other one, such that only the other one could decrypt and use it. This is what a KEM makes possible, through a collection of three algorithms:

  • A key generation algorithm, Generate, which generates a public key and a private key (a keypair).
  • An encapsulation algorithm, Encapsulate, which takes as input a public key, and outputs a shared secret value and an “encapsulation” (a ciphertext) of this secret value.
  • A decapsulation algorithm, Decapsulate, which takes as input the encapsulation and the private key, and outputs the shared secret value.

A KEM can be seen as similar to a Public Key Encryption (PKE) scheme, since both use a combination of public and private keys. In a PKE, one encrypts a message using the public key and decrypts using the private key. In a KEM, one uses the public key to create an “encapsulation” — giving a randomly chosen shared key — and one decrypts this “encapsulation” with the private key. The reason why KEMs exist is that PKE schemes are usually less efficient than symmetric encryption schemes; one can use a KEM to only transmit the shared/symmetric key, and later use it in a symmetric algorithm to efficiently encrypt data.
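A minimal Go sketch of these three operations, and of how the encapsulated secret then keys a symmetric cipher (the “KEM/DEM” pattern), might look like this. The toy KEM below offers no security at all; it only exists so the example runs end to end.

```go
// Minimal sketch of the three KEM operations and the KEM/DEM pattern.
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
)

// KEM captures the three operations described above.
type KEM interface {
	Generate() (publicKey, privateKey []byte)
	Encapsulate(publicKey []byte) (ciphertext, sharedSecret []byte)
	Decapsulate(privateKey, ciphertext []byte) (sharedSecret []byte)
}

// toyKEM has no security: the "ciphertext" is the secret XORed with the key.
// A real post-quantum scheme (e.g. a lattice-based one) would go here.
type toyKEM struct{}

func (toyKEM) Generate() (publicKey, privateKey []byte) {
	key := make([]byte, 32)
	rand.Read(key)
	return key, key
}

func (toyKEM) Encapsulate(publicKey []byte) (ciphertext, sharedSecret []byte) {
	sharedSecret = make([]byte, 32)
	rand.Read(sharedSecret)
	ciphertext = make([]byte, 32)
	for i := range ciphertext {
		ciphertext[i] = sharedSecret[i] ^ publicKey[i]
	}
	return ciphertext, sharedSecret
}

func (toyKEM) Decapsulate(privateKey, ciphertext []byte) []byte {
	sharedSecret := make([]byte, 32)
	for i := range sharedSecret {
		sharedSecret[i] = ciphertext[i] ^ privateKey[i]
	}
	return sharedSecret
}

func main() {
	var kem KEM = toyKEM{}

	// Persephone generates a key pair and publishes the public key.
	pub, priv := kem.Generate()

	// Demeter encapsulates: she keeps the secret and sends the ciphertext.
	ct, demeterSecret := kem.Encapsulate(pub)

	// Persephone decapsulates and arrives at the same secret.
	persephoneSecret := kem.Decapsulate(priv, ct)

	// The shared secret then keys an efficient symmetric cipher (the DEM).
	block, _ := aes.NewCipher(persephoneSecret)
	aead, _ := cipher.NewGCM(block)
	nonce := make([]byte, aead.NonceSize())
	sealed := aead.Seal(nil, nonce, []byte("come back"), nil)

	fmt.Printf("secrets match: %v, ciphertext: %x\n",
		bytes.Equal(demeterSecret, persephoneSecret), sealed)
}
```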

Nowadays, in most of our connections, we do not use KEMs or PKEs per se. We either use Key Exchanges (KEXs) or Authenticated Key Exchanges (AKE). The reason for this is that a KEX allows us to use public keys (solving the key exchange problem of how to securely transmit keys) in order to generate a shared/symmetric key which, in turn, will be used in a symmetric encryption algorithm to encrypt data efficiently. A famous KEX algorithm is Diffie-Hellman, but classical Diffie-Hellman based mechanisms do not provide security against a quantum adversary; post-quantum KEMs do.

When using a KEM, Persephone would run Generate and publish the public key. Demeter takes this public key, runs Encapsulate, keeps the generated secret to herself, and sends the encapsulation (the ciphertext) to Persephone. Persephone then runs Decapsulate on this encapsulation and, with it, arrives at the same shared secret that Demeter holds. Hades will not be able to guess even a bit of this secret value even if he sees the ciphertext.


In this post, we go over the construction of one particular post-quantum KEM, called FrodoKEM. Its design is simple, which makes it a good choice to illustrate how a KEM can be constructed. We will look at it from two perspectives:

  • The underlying mathematics: a cryptographic algorithm is built like a Matryoshka doll. The first doll is, most of the time, the mathematical base, whose hardness must be strong enough that security is maintained. In the post-quantum world, this is usually the hardness of some lattice problems (more on this in the next section).
  • The algorithmic construction: these are all the subsequent dolls that take the mathematical base and construct an algorithm out of it. In the case of a KEM, first you construct a Public Key Encryption (PKE) scheme and then transform it (putting another doll on top) to make a KEM, so that better security properties are attained, as we will see.

The core of FrodoKEM is a public-key encryption scheme called FrodoPKE, whose security is based on the hardness of the “Learning with Errors” (LWE) problem over lattices. Let us look now at the first doll of a KEM.

Note to the reader: Some mathematics is coming in the next sections, but do not worry, we will guide you through it.

The Learning With Errors Problem

The security (and mathematical foundation) of FrodoKEM relies on the hardness of the Learning With Errors (LWE) problem, a generalization of the classic Learning Parity with Noise problem, first defined by Regev.

In cryptography, specifically in the mathematics underlying it, we often use sets to define our operations. A set is a collection of any element, in this case, we will refer to collections of numbers. In cryptography textbooks and articles, one can often read:

Let $Z_q$ denote the set of integers $\{0, \ldots, q-1\}$, where $q > 2$,

which means that we have a collection of integers from 0 up to $q-1$, where $q$ is bigger than 2 (in a cryptographic application, $q$ is usually assumed to be prime; in the main theorem, it is an arbitrary integer).

Let $Z_q^n$ denote a vector $(v_1, v_2, \ldots, v_n)$ of $n$ elements, each of which belongs to $Z_q$.

The LWE problem asks to recover a secret vector $s = (s_1, s_2, \ldots, s_n)$ in $Z_q^n$ given a sequence of random, “approximate” linear equations on $s$. For instance, if $q = 23$ the equations might be:

$$s_1 + s_2 + s_3 + s_4 \approx 30 \pmod{23}$$

$$2s_1 + s_3 + s_5 + \ldots + s_n \approx 40 \pmod{23}$$

$$10s_2 + 13s_3 + s_4 \approx 50 \pmod{23}$$

$$\ldots$$

We see that the left-hand sides of the equations above are not exactly equal to the right-hand sides (hence the “≈” sign, approximately equal to, instead of an equality sign); they are off by a slight introduced “error” (which we will define as the variable $e$; in the equations above, the error might be, for example, 10). If the error were a known, public value, recovering $s$ (the hidden variable) would be easy: after about $n$ equations, we could recover $s$ in a reasonable time using Gaussian elimination. Introducing this unknown error makes the problem difficult to solve (it is difficult to find $s$ accurately), even for quantum computers.

An equivalent formulation of the LWE problem is:

  1. There exists a vector $s$ in $Z_q^n$, called the secret (the hidden variable).
  2. There exist vectors $a$ in $Z_q^n$, chosen uniformly at random.
  3. χ is an error distribution; $e$ is an integer error drawn from χ.
  4. You are given samples of the form (a, ⟨a, s⟩ + e), where ⟨a, s⟩ is the inner product of $a$ and $s$ modulo $q$.
  5. Writing b ≈ ⟨a, s⟩ + e, the input to the problem is $a$ and $b$, and the goal is to output a guess for $s$, which is very hard to do accurately.

Blum, Kalai and Wasserman provided the first subexponential algorithm for solving this problem. It requires $2^{O(n/\log n)}$ equations and time.

There are two main kinds of computational LWE problems that are difficult to solve for quantum computers (given certain choices of both q and χ):

  1. Search, which is to recover the secret/hidden variable $s$ by only being given a certain number of samples of the form (a, ⟨a, s⟩ + e).
  2. Decision, which is to distinguish a certain number of samples drawn from the distribution (a, ⟨a, s⟩ + e) from uniformly random samples.
The LWE problem: search and decision.

LWE is just noisy linear algebra, and yet it seems to be a very hard problem to solve. In fact, there are many reasons to believe the LWE problem is hard: the best algorithms for solving it run in exponential time. It is also closely related to the Learning Parity with Noise (LPN) problem, which is extensively studied in learning theory and believed to be hard to solve (any progress in breaking LPN would potentially lead to a breakthrough in coding theory). How does it relate to building cryptography? LWE can be used to build public-key cryptography: the secret value $s$ becomes the private key, and the vectors $a_i$ together with the values $b_i$ form the public key.

So, why is this problem related to lattices? In other blog posts, we have seen that certain post-quantum algorithms are based on lattices. So, how does LWE relate to them? One can view LWE as the problem of decoding random linear codes, or reduce it to lattice problems such as the Shortest Vector Problem (SVP) or the Shortest Independent Vectors Problem (SIVP): an efficient solution to LWE would imply an efficient quantum algorithm for SVP and SIVP. In other blog posts, we talk about SVP, so, in this one, we will focus on the random bounded distance decoding problem on lattices.


Lattices (as seen in the image) are regular, periodic arrangements of points in space, and they have emerged as a foundation of cryptography in the face of quantum adversaries. One modern problem on which they rely is the Bounded Distance Decoding (BDD) problem. In the BDD problem, you are given a lattice with an arbitrary basis (a basis is a list of vectors that generate all the other points of the lattice; in the image, it is the pair of vectors b1 and b2) and a lattice point b3. The point b3 is then perturbed by adding some noise (or error), giving x. Given x, the goal is to find the nearest lattice point (in this case b3), as seen in the image. LWE is an average-case form of BDD (Regev also gave a worst-case to average-case reduction from BDD to LWE: the security of a cryptographic system is related to the worst-case complexity of BDD).

The first doll is built. Now, how do we build encryption from this mathematical base? From LWE, we can build a public key encryption algorithm (PKE), as we will see next with FrodoPKE as an example.

Public Key Encryption: FrodoPKE

The second doll of the Matryoshka takes the mathematical base and builds a Public Key Encryption algorithm from it. Let's look at FrodoPKE, a public-key encryption scheme which is the building block for FrodoKEM. It is made up of three components: key generation, encryption, and decryption. Let's say again that Persephone wants to communicate with Demeter. They will run the following operations:

  1. Generation: Persephone generates a key pair by taking an LWE sample (like (A, B = As + e mod q)). The public key is (A, B) and the private key is s. Persephone sends this public key to Demeter.
  2. Encryption: Demeter receives this public key and wants to send a private message with it, something like “come back”. She generates a secret vector s1 and two error terms e1 and e2. She then:
  • makes the sample b1 = As1 + e1 mod q,
  • makes the sample v1 = Bs1 + e2 mod q,
  • encodes the message m into the most significant bits of v1,
  • sends b1 and v1 to Persephone (this is the ciphertext).
  3. Decryption: Persephone receives the ciphertext, calculates v1 − b1 * s, and rounds the result to recover the message m; she then proceeds to leave to meet her mother.

Notice that computing v = v1 − b1 * s gives us m plus a small error term (coming from the error values sampled during key generation and encryption). The decryption process performs rounding, which will output the original message m if the errors are chosen carefully; if not, there is the potential for a decryption failure.
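
To see the whole scheme in action, here is a minimal, insecure sketch of this idea for a single bit (our own illustration in the spirit of the description above, not the FrodoPKE specification: it assumes NumPy, works with vectors instead of matrices, and uses toy parameters):

```python
# A toy, insecure sketch of the LWE-based PKE idea for one bit.
# Names (A, B, s, e, s1, e1, e2) loosely follow the text above.
import numpy as np

rng = np.random.default_rng(1)
q, n = 3329, 8                                  # toy parameters, far too small to be secure

def small(size):                                # narrow "error" distribution chi
    return rng.integers(-2, 3, size=size)

# Generation (Persephone): public key (A, B), private key s
A = rng.integers(0, q, size=(n, n))
s, e = small(n), small(n)
B = (A @ s + e) % q

# Encryption (Demeter): encode the bit mu into the "high bits" of v1
mu = 1
s1, e1, e2 = small(n), small(n), int(small(1)[0])
b1 = (A.T @ s1 + e1) % q                        # b1 = A^T s1 + e1
v1 = (int(B @ s1) + e2 + mu * (q // 2)) % q     # v1 = <B, s1> + e2 + encode(mu)

# Decryption (Persephone): v1 - <b1, s> equals encode(mu) plus a small error; round it
d = (v1 - int(b1 @ s)) % q
recovered = 1 if q // 4 < d < 3 * q // 4 else 0
assert recovered == mu
```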

What kind of security does this algorithm give? In cryptography, we design algorithms with specific security notions in mind that they must attain. FrodoPKE (like many basic PKEs) satisfies only IND-CPA (indistinguishability under chosen-plaintext attack) security. Intuitively, this notion means that a passive eavesdropper listening in can get no information about a message from a ciphertext. Even if the eavesdropper knows that a ciphertext is an encryption of one of two messages of their choice, looking at the ciphertext should not tell them which one was encrypted. We can also think of it as a game:

Imagine a gnome sitting inside a box. This box takes a message and produces a ciphertext, and the gnome records each message and the ciphertext generated for it. An adversary outside the box, a troll, wants to beat this game: to figure out which message a given ciphertext corresponds to. The troll chooses two messages (m1 and m2) of the same length and sends them to the box. The gnome flips a coin: if it lands on heads, they send back the ciphertext (c1) corresponding to m1; otherwise, they send back c2 corresponding to m2. The troll, knowing both messages and seeing only one ciphertext, has to guess which message was encrypted.
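
The game can be written down almost directly as code. The sketch below is ours, with hypothetical keygen, encrypt, and adversary interfaces standing in for any candidate PKE and any attacker strategy:

```python
# A sketch of the IND-CPA game: the scheme is IND-CPA secure if no efficient
# adversary can win this game with probability noticeably better than 1/2.
import secrets

def ind_cpa_game(keygen, encrypt, adversary):
    pk, _sk = keygen()
    m1, m2 = adversary.choose_messages(pk)          # the troll picks two equal-length messages
    coin = secrets.randbits(1)                      # the gnome flips a coin
    challenge = encrypt(pk, m1 if coin == 0 else m2)
    guess = adversary.guess(pk, challenge)          # the troll guesses which message was encrypted
    return guess == coin
```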

IND-CPA security is not enough for all secure communication on the Internet. Adversaries can not only passively eavesdrop, but also mount chosen-ciphertext attacks (CCA): they can actively modify messages in transit and trick the communicating parties into decrypting these modified messages, thereby obtaining a decryption oracle. They can use this decryption oracle to gain information about a target ciphertext, and so compromise confidentiality. Such attacks are practical: all an attacker has to do is, for example, send several million test ciphertexts to a decryption oracle (see Bleichenbacher's attack and the ROBOT attack).

In the case of Demeter and Persephone, the lack of CCA security means that Hades can generate and send several million test ciphertexts to the decryption oracle and eventually reveal the content of a valid ciphertext that he did not generate himself. Demeter and Persephone, then, might not want to use this scheme as-is.

Key Encapsulation Mechanisms: FrodoKEM

The last doll of the Matryoshka takes a scheme that is secure against CPA and makes it secure against CCA. A CCA-secure scheme must not leak information about its private key, even when decrypting arbitrarily chosen ciphertexts. It must also be the case that an adversary cannot craft valid ciphertexts without knowing the plaintext message: suppose, again, that the adversary knows the encrypted message could only be either m1 or m2. If the attacker can craft another valid ciphertext, for example, by flipping a bit of the ciphertext in transit, they can send this modified ciphertext and see whether something close to m1 or m2 is returned.

To make a CPA-secure scheme secure against CCA, one can use the Hofheinz, Hövelmanns, and Kiltz (HHK) transformation (see this thesis for more information). The HHK transformation constructs an IND-CCA-secure KEM from an IND-CPA-secure PKE and three hash functions. The algorithm we are exploring, FrodoKEM, uses a slightly tweaked version of the HHK transform. It has, again, three functions (some parts of this description are simplified):

Generation:

  1. We need a hash function G1.
  2. We need a PKE scheme, such as FrodoPKE.
  3. We call the Generation function of FrodoPKE, which returns a public (pk) and private key (sk).
  4. We hash the public key pkh ← G1(pk).
  5. We choose a value s at random.
  6. The public key is pk and the private key sk1 is (sk, s, pk, pkh).

Encapsulate:

  1. We need two hash functions: G2 and F.
  2. We generate a random message u.
  3. We hash the received public key pkh with the random message (r, k) ← G2(pkh || u).
  4. We call the Encryption function of FrodoPKE: ciphertext ← Encrypt(u, pk, r).
  5. We hash: shared secret ← F(ciphertext || k).
  6. We send the ciphertext to the other party and keep the shared secret.

Decapsulate:

  1. We need two hash functions (G2 and F), and we have the private key (sk, s, pk, pkh).
  2. We receive the ciphertext.
  3. We call the decryption function of FrodoPKE: message ← Decrypt(sk, ciphertext).
  4. We hash: (r, k) ← G2(pkh || message).
  5. We call the encryption function of FrodoPKE: ciphertext1 ← Encrypt(message, pk, r).
  6. If ciphertext1 == ciphertext, we keep k; otherwise, we set k = s.
  7. We hash: ss ← F(ciphertext || k).
  8. We return the shared secret ss.

What this algorithm achieves is the generation of a shared secret and a ciphertext which can be used to establish a secure channel. It also means that no matter how many ciphertexts Hades sends to the decryption oracle, they will never reveal the content of a valid ciphertext that Hades himself did not generate. This is guaranteed by re-running the encryption process during Decapsulate to check that the ciphertext was computed correctly, which means an adversary cannot craft valid ciphertexts simply by modifying ones they observe.
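
The sketch below shows the shape of this transform in code. It is a simplified illustration, not the FrodoKEM specification: the PKE interface (keygen, encrypt, decrypt, all over byte strings) is an assumed abstraction, and the hash functions G1, G2 and F are instantiated with SHAKE-256 for concreteness:

```python
# A simplified FO/HHK-style transform over an abstract IND-CPA PKE.
import hashlib
import secrets

def H(label, *parts, out=32):
    h = hashlib.shake_256()
    h.update(label)
    for p in parts:                                   # all inputs are byte strings
        h.update(p)
    return h.digest(out)

def kem_keygen(pke):
    pk, sk = pke.keygen()
    pkh = H(b"G1", pk)                                # hash of the public key
    s = secrets.token_bytes(32)                       # random value used on decapsulation failure
    return pk, (sk, s, pk, pkh)

def kem_encapsulate(pke, pk):
    u = secrets.token_bytes(32)                       # random message
    rk = H(b"G2", H(b"G1", pk), u, out=64)
    r, k = rk[:32], rk[32:]
    ct = pke.encrypt(pk, u, r)                        # encryption made deterministic by the coins r
    ss = H(b"F", ct, k)                               # shared secret
    return ct, ss                                     # send ct; keep ss locally

def kem_decapsulate(pke, sk1, ct):
    sk, s, pk, pkh = sk1
    u = pke.decrypt(sk, ct)
    rk = H(b"G2", pkh, u, out=64)
    r, k = rk[:32], rk[32:]
    if pke.encrypt(pk, u, r) == ct:                   # re-encrypt and compare
        return H(b"F", ct, k)
    return H(b"F", ct, s)                             # implicit rejection on mismatch
```

Note how decapsulation re-encrypts the decrypted message with the recovered coins r and compares the result with the received ciphertext; on a mismatch it returns a value derived from the secret s instead of an error, so a crafted ciphertext yields nothing useful.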

With this last doll, the algorithm has been created, and it is safe in the face of a quantum adversary.

Other KEMs beyond Frodo

While the ring bearer, Frodo, wanders around and transforms, he is not alone in his journey. FrodoKEM is currently designated as an alternative candidate for standardization as part of the post-quantum NIST process. But, there are others:

  • Kyber, NTRU, and Saber, which are based on structured lattice problems (variants of LWE and related assumptions, such as LWR and the NTRU problem), and
  • Classic McEliece: which is based on error correcting codes.

The lattice-based variants have the advantage of being fast, while producing relatively small keys and ciphertexts; there are, however, concerns about their security that still need to be properly examined. More confidence is found in the security of the Classic McEliece scheme, as its underlying problem has been studied for longer (it is only one year younger than RSA!). It has a disadvantage: it produces extremely large public keys. Classic-McEliece-348864, for example, produces public keys of 261,120 bytes, whereas Kyber512, which claims comparable security, produces public keys of 800 bytes.

They are all Matryoshka dolls (sometimes including non-post-quantum pieces): algorithms placed one inside the other. They all start with a small but powerful idea: a mathematical problem whose solution is hard to find efficiently. They then build an algorithm on top of it that achieves one cryptographic security notion. And, with the help of hash functions and careful transformations, they achieve a stronger one. This just goes to show that cryptographic algorithms are not perfect in themselves; they stack on top of each other to get the best of each one. Facing quantum adversaries with them is the same: not a process of isolation, but rather a process of stacking and creating the big picture from the smallest one.


Deep Dive Into a Post-Quantum Signature Scheme

Post Syndicated from Goutam Tamvada original https://blog.cloudflare.com/post-quantum-signatures/

To provide authentication is no more than to assert, and provide proof of, an identity. We can claim to be who we say we are, but if there is no proof of it (recognition of our face, voice, or mannerisms), there is no assurance of that. In fact, we can claim to be someone we are not. We can even claim to be someone that does not exist, as clever Odysseus once did.

The story goes that there was a man named Odysseus who angered the gods and was punished with perpetual wandering. He traveled and traveled the seas, meeting people and suffering calamities. On one of his trips, he came across the Cyclops Polyphemus who, in short, wanted to eat him. Clever Odysseus got away (as he usually did) by wounding the cyclops' eye. The wounded cyclops asked for Odysseus' name, to which the latter replied:

“Cyclops, you asked for my glorious name, and I will tell it; but do give the stranger’s gift, just as you promised. Nobody I am called. Nobody they called me: by mother, father, and by all my comrades”

(As seen in The Odyssey, book 9. Translation by the authors of the blogpost).

The cyclops believed that was Odysseus’ name (Nobody) and proceeded to tell everyone, which resulted in no one believing him. “How can nobody have wounded you?” they questioned the cyclops. It was a trick, a play of words by Odysseus. Because to give an identity, to tell the world who you are (or who you are pretending to be) is easy. To provide proof of it is very difficult. The cyclops could have asked Odysseus to prove who he was, and the story would have been different. And Odysseus wouldn’t have left the cyclops laughing.

In the digital world, proving your identity is more complex. In face-to-face conversations, we can often attest to the identity of someone by knowing and verifying their face, their voice, or by someone else introducing them to us. From computer to computer, the scenario is a little different. But there are ways. When a user connects to their banking provider on the Internet, they need assurance not only that the information they send is secured, but also that they are sending it to their bank, and not to a malicious website masquerading as their provider. The Transport Layer Security (TLS) protocol provides this through digitally signed statements of identity (certificates). Digital signature schemes also play a central role in DNSSEC, an extension to the Domain Name System (DNS) that protects applications from accepting forged or manipulated DNS data, which is what happens during DNS cache poisoning, for example.

A digital signature is a demonstration of authorship of a document, conversation, or message sent using digital means. As with “regular” signatures, they can be publicly verified by anyone who knows the signer's public verification key.

A digital signature scheme is a collection of three algorithms:

  1. A key generation algorithm, Generate, which generates a public verification key and a private signing key (a keypair).
  2. A signing algorithm, Sign, which takes the private signing key, a message, and outputs a signature of the message.
  3. A verification algorithm, Verify, which takes the public verification key, the signature and the message, and outputs a value stating whether the signature is valid or not.

In the case of the Odysseus’ story, what the cyclops could have done to verify his identity (to verify that he indeed was Nobody) was to ask for a proof of identity: for example, for other people to vouch that he is who he claims to be. Or he could have asked for a digital signature (attested by several people or registered as his own) attesting he was Nobody. Nothing like that happened, so the cyclops was fooled.

In the Transport Layer Security protocol, TLS, authentication needs to be executed at the time a connection or conversation is established (as data sent after this point will be authenticated until that is explicitly disabled), rather than for the full lifetime of the data (as with confidentiality). Because of that, the need to transition to post-quantum signatures is not as urgent as it is for post-quantum key exchange schemes, and we do not believe there are sufficiently powerful quantum computers at the moment that can be used to listen in on connections and forge signatures. At some point, that will no longer be true, and the transition will have to be made.

There are various candidates for authentication schemes (including digital signatures) that are quantum secure: some use cryptographic hash functions, some use problems over lattices, while others use techniques from the field of multi-party computation. It is also possible to use Key Encapsulation Mechanisms (or KEMs) to achieve authentication in cryptographic protocols.

In this post, much like in the one about Key Encapsulation Mechanisms, we will give a bird's-eye view of the construction of one particular post-quantum signature algorithm: CRYSTALS-Dilithium, as an example of how a signature scheme can be constructed. Dilithium is a finalist candidate in the NIST post-quantum cryptography standardization process, and we chose to explain it here because its design illustrates a standard technique for building digital signature schemes and is relatively straightforward to explain.

We will again build the algorithm up layer-by-layer. We will look at:

  • Its mathematical underpinnings: as we see in other blog posts, a cryptographic algorithm can be built like a Matryoshka doll or a Chinese box. Let us use the Chinese box analogy here. The first box, in this case, is the mathematical base, whose hardness underpins the security of the whole construction. In the post-quantum world, this is usually the hardness of some lattice or isogeny problem.
  • Its algorithmic construction: these are all the subsequent boxes that take the mathematical base and construct an algorithm out of it. In the case of a signature, one first constructs an identification scheme, which we will define in the next sections, and then transforms it into a signature scheme using the Fiat-Shamir transformation.

The mathematical core of Dilithium is, as with FrodoKEM, based on the hardness of a variant of the Learning with Errors (LWE) problem and the Short Integer Solution (SIS) problem. As we have already talked about LWE, let’s now briefly go over SIS.

Note to the reader: Some mathematics is coming in the next sections; but don’t worry, we will guide you through it.

The Short Integer Solution Problem

In order to properly explain what the SIS problem is, we need to first start by understanding what a lattice is. A lattice is a regular repeated arrangement of objects or elements over a space. In geometry, these objects can be points; in physics, these objects can be atoms. For our purposes, we can think of a lattice as a set of points in n-dimensional space with a periodic (repeated) structure, as we see in the image. It is important to understand the meaning of n-dimensional space here: a two-dimensional space is, for example, the one that we often see represented on planes: a projection of the physical universe into a plane with two dimensions which are length and width. Historically, lattices have been investigated since the late 18th century for various reasons. For a more comprehensive introduction to lattices, you can read this great paper.

Picture of a lattice. They are found in the wild in Portugal.

What does SIS pertain to? You are given a positive integer q and a matrix (a rectangular array of numbers) A of dimensions n x m (n rows and m columns), whose entries are integers between 0 and q. You are then asked to find a non-zero vector r (smaller than a certain amount, called the “norm bound”) such that Ar = 0 mod q. The conjecture is that, for sufficiently large parameters, finding such a solution is hard even for quantum computers. This problem is “dual” to the LWE problem that we explored in another blog post.
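
For very small parameters we can even find such a short vector by brute force, which makes the statement of the problem concrete (our own toy sketch, assuming NumPy; for these sizes a pigeonhole argument guarantees that a solution with entries in {-1, 0, 1} exists):

```python
# A toy brute-force illustration of SIS: find a short nonzero r with A r = 0 (mod q).
# Only feasible because the parameters are tiny; real instances are far larger.
import itertools
import numpy as np

rng = np.random.default_rng(2)
q, n, m = 17, 2, 10
A = rng.integers(0, q, size=(n, m))

for candidate in itertools.product([-1, 0, 1], repeat=m):   # candidate short vectors
    r = np.array(candidate)
    if r.any() and not ((A @ r) % q).any():                  # nonzero and A r = 0 mod q
        print("short solution r =", r)
        break
```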

We can define this same problem over a lattice. Consider the lattice L(A) formed by all the m-dimensional vectors y that satisfy Ay = 0 (mod q), for some q. The goal is to find non-zero vectors in this lattice whose size is less than a certain specified amount. This can be seen as finding the “short” vectors in the lattice, which makes the problem an average-case version of the Shortest Vector Problem (SVP). Finding such a solution is simple in two dimensions (as seen in the diagram), but hard in higher dimensions.

The SIS problem as the SVP. The goal is to find the “short” vectors in the radius.

The SIS problem is often used in cryptographic constructions such as one-way functions, collision resistant hash functions, digital signature schemes, and identification schemes.

We have now built the first Chinese box: the mathematical base. Let’s take this base now and create schemes from it.

Identification Schemes

From the mathematical base of our Chinese box, we build the first computational algorithm: an identification scheme. An identification scheme consists of a key generation algorithm, which outputs a public and private key, and an interactive protocol between a prover P and a verifier V. The prover has access to the public key and private key, and the verifier only has access to the public key. A series of messages are then exchanged such that the prover can demonstrate to the verifier that they know the private key, without leaking any other information about the private key.

More specifically, a three-move (three rounds of interaction) identification scheme is a collection of algorithms. Let’s think about it in the terms of Odysseus trying to prove to the cyclops that he is Nobody:

  • Odysseus (the prover) runs a key generation algorithm, Generate, that outputs a public and private keypair.
  • Odysseus then runs a commitment algorithm, Commit, that uses the private key, and outputs a commitment Y. The commitment is nothing more than a statement that this specific private key is the one that will be used. He sends this to the cyclops.
  • The cyclops (the verifier) takes the commitment and runs a challenge algorithm, Challenge, and outputs a challenge c. This challenge is a question that asks: are you really the owner of the private key?
  • Odysseus receives the challenge and runs a response algorithm, Response. This outputs a response z to the challenge. He sends this value to the cyclops.
  • The cyclops runs the verification algorithm, Verify, which outputs accept (1) if the answer is correct, or reject (0) otherwise.

If Odysseus were really the owner of the private key for Nobody, he would have been able to answer the challenge correctly (so that verification outputs a 1). But, as he is not, he runs away (and this is the last time we see him in this blogpost).

The Dilithium Identification Scheme

The basic building blocks of Dilithium are polynomials and rings. This is the second-to-last Chinese box, and we will explore it now.

A polynomial ring, R, is the set of all polynomials with coefficients in a given ring (a ring is a set equipped with two operations, addition and multiplication; a polynomial is an expression built from variables and coefficients). The “size” of these polynomials, defined as the size of the largest coefficient, plays a crucial role for these kinds of algorithms.
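
As a concrete illustration, here is toy arithmetic in a ring of the form Z_q[x]/(x^n + 1), the kind of polynomial ring that schemes like Dilithium build on (our own sketch, assuming NumPy; real parameters use n = 256 and a much larger q, and fast multiplication is done with the Number Theoretic Transform):

```python
# Toy arithmetic in R_q = Z_q[x]/(x^n + 1): polynomials are length-n coefficient
# vectors, and multiplication wraps around with a sign flip because x^n = -1.
import numpy as np

q, n = 17, 4                                     # toy parameters

def ring_mul(a, b):
    res = np.zeros(n, dtype=np.int64)
    for i in range(n):
        for j in range(n):
            if i + j < n:
                res[i + j] += a[i] * b[j]
            else:
                res[i + j - n] -= a[i] * b[j]    # x^(i+j) = -x^(i+j-n)
    return res % q

a = np.array([1, 2, 0, 3])                       # 1 + 2x + 3x^3
b = np.array([5, 0, 1, 0])                       # 5 + x^2
print(ring_mul(a, b))                            # coefficients of a*b in R_q
```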

In the case of Dilithium, the Generation algorithm creates a k x l matrix A. Each entry of this matrix is a polynomial in the defined ring. The generation algorithm also creates random private vectors s1 and s2, whose components are elements of the ring R. The public key is the matrix A and t = As1 + s2. It is infeasible, even for a quantum computer, to recover the secret values given just t and A. This problem is called the Module Learning With Errors (MLWE) problem, and it is a variant of LWE as seen in this blog post.

Armed with the public and private keys, the Dilithium identification scheme proceeds as follows (some details are left out for simplicity, like the rejection sampling):

  1. The prover wants to prove they know the private key. They generate a random secret nonce y whose coefficients are less than a security parameter. They then compute Ay and set a commitment w1 to be the “high-order”1 bits of the coefficients in this vector.
  2. The verifier accepts the commitment and creates a challenge c.
  3. The prover creates the potential signature z = y + cs1 (notice the usage of the random secret nonce and of the private key) and performs checks on the sizes of several values, which keeps the signature secure (this is the rejection sampling mentioned above). This is the answer to the challenge.
  4. The verifier receives the response and computes the “high-order” bits of Az − ct (notice the usage of the public key). They accept this answer if all the coefficients of z are less than the security parameter, and if these high-order bits are equal to the commitment w1 (a toy sketch of this algebra follows below).
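
To see why verification works, here is a toy sketch of the underlying algebra with the errors, the high/low-order decomposition and the rejection sampling all omitted, so that the check is exact (our own illustration, assuming NumPy; it is neither the real scheme nor secure):

```python
# Toy version of the identification algebra: with no error term (t = A s1 exactly),
# the verifier's recomputation A z - c t equals the commitment w exactly.
import numpy as np

rng = np.random.default_rng(3)
q, n = 8380417, 6                                 # q is Dilithium's modulus; n is a toy dimension

# Key generation: public (A, t), private s1 (the real scheme also has s2, with t = A s1 + s2)
A = rng.integers(0, q, size=(n, n))
s1 = rng.integers(-2, 3, size=n)
t = (A @ s1) % q

# Prover: commit to w using a fresh secret nonce y
y = rng.integers(0, q, size=n)
w = (A @ y) % q

# Verifier: send a small random challenge c
c = int(rng.integers(1, 100))

# Prover: respond with z = y + c*s1 (the real scheme rejects z values that would leak s1)
z = (y + c * s1) % q

# Verifier: recompute A z - c t and compare with the commitment
assert np.array_equal((A @ z - c * t) % q, w)
```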

The identification scheme previously mentioned is an interactive protocol that requires participation from both parties. How do we turn this into a non-interactive signature scheme, where one party issues signatures and other parties can verify them (the reason for this conversion is that anyone should be able to verify signatures publicly)? Here, we place the last Chinese box.

A three-move identification scheme can be turned into a signature scheme using the Fiat–Shamir transformation: instead of the verifier accepting the commitment and sending a challenge c, the prover computes the challenge as a hash H(M || w1)  of the message M and of the value w1 (computed in step 1 of the previous scheme). This is an approach in which the signer has created an instance of a lattice problem, which only the signer knows the solution to.

This in turn means that if a message was signed with a key, it could have only been signed by the person with access to the private key, and it can be verified by anyone with access to the public key.
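
Continuing the toy example above, the sketch below applies the Fiat-Shamir idea: the challenge is computed as a hash of the message and the commitment, so no interaction with a verifier is needed (again our own simplified, insecure illustration; it reuses the A, t, s1, q and rng values from the previous sketch and instantiates the hash with SHAKE-256):

```python
# Toy Fiat-Shamir signature built from the toy identification scheme above.
import hashlib
import numpy as np

def challenge(message, w):
    digest = hashlib.shake_256(message + w.tobytes()).digest(4)
    return int.from_bytes(digest, "big") % 99 + 1             # a small nonzero challenge

def sign(message, A, s1, q, rng):
    y = rng.integers(0, q, size=len(s1))
    w = (A @ y) % q                                            # commitment
    c = challenge(message, w)                                  # c = H(M || w), no verifier needed
    z = (y + c * s1) % q                                       # response (no rejection sampling here)
    return c, z

def verify(message, sig, A, t, q):
    c, z = sig
    w = (A @ z - c * t) % q                                    # recompute the commitment
    return challenge(message, w) == c
```

With those values in scope, sign(b"come back", A, s1, q, rng) produces a pair (c, z) that verify(b"come back", (c, z), A, t, q) accepts, while changing the message or the signature almost always makes the recomputed challenge, and thus verification, fail (the challenge space in this toy is tiny).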

How does this procedure relate to the lattice problems we have seen? They are used to prove the security of the scheme: specifically, the module versions of the SIS problem (M-SIS) and of the decisional LWE problem (MLWE).

The Chinese box is now constructed, and we have a digital signature scheme that can be used safely in the face of quantum computers.

Other Digital Signatures beyond Dilithium

In Star Trek, Dilithium is a rare material that cannot be replicated. Similarly, signatures cannot be replicated or forged: each one is unique. But this does not mean that there are no other algorithms we can use to generate post-quantum signatures. Dilithium is currently designated as a finalist candidate for standardization as part of the post-quantum NIST process. But, there are others:

  • Falcon, another lattice-based candidate, based on NTRU lattices.
  • Rainbow, a scheme based on multivariate polynomials.

We have seen examples of KEMs in other blog posts and signatures that are resistant to attacks by quantum computers. Now is the time to step back and take a look at the bigger picture. We have the building blocks, but the problem of actually building post-quantum secure cryptographic protocols with them remains, as well as making existing protocols such as TLS post-quantum secure. This problem is not entirely straightforward, owing to the trade-offs that post-quantum algorithms present. As we have carefully stitched together mathematical problems and cryptographic tools to get algorithms with the properties we desire, so do we have to carefully compose these algorithms to get the secure protocols that we need.


1This “high-order” and “low-order” procedure decomposes a vector, and there is a specific procedure for this for Dilithium. It aims to reduce the size of the public key.

The Post-Quantum State: a taxonomy of challenges

Post Syndicated from Sofía Celi original https://blog.cloudflare.com/post-quantum-taxonomy/

At Cloudflare, we help to build a better Internet. In the face of quantum computers and their threat to cryptography, we want to provide protections for this future challenge. The only way that we can change the future is by analyzing and learning from the past. Only in the present, with the past in mind and the future in sight, can we categorize and understand events. Predicting, understanding and anticipating quantum computers (with the opportunities and challenges they bring) is a daunting task. We can, though, create a taxonomy of these challenges, so the future can be better navigated.

This is the first blog post in a post-quantum series, where we talk about our past, present and future “adventures in the Post-Quantum land”. We have written about previous post-quantum efforts at Cloudflare, but we think that here first we need to understand and categorize the problem by looking at what we have done and what lies ahead. So, welcome to our adventures!

A taxonomy of the challenges that quantum computers and their threat to cryptography bring (for more information, read our other blog posts) could be a good way to approach this problem. This taxonomy should not focus only on the technical level, though. Quantum computers fundamentally change the way certain protocols, properties, storage and retrieval systems, and infrastructure need to work. Following the biological tradition, we can see this taxonomy as grouping problems together and arranging them into a classification that helps us understand what and whom the problems impact. Following the same tradition, we can use the idea of kingdoms to classify those challenges. The kingdoms are (in no particular order):

  1. Challenges at the Protocols level, Protocolla
  2. Challenges at the implementation level, Implementa
  3. Challenges at the standardization level, Regulae
  4. Challenges at the community level, Communitates
  5. Challenges at the research level, Investigationes

Let’s explore them one by one.

The taxonomy tree of the post-quantum challenges.

Challenges at the Protocols level, Protocolla

One conceptual model of the Internet is that it is composed of layers that stack on top of each other, as defined by the Open Systems Interconnection (OSI) model. Communication protocols in each layer enable parties to interact with a corresponding one at the same layer. While quantum computers are a threat to the security and privacy of digital connections, they are not a direct threat to communication itself (we will see, though, that one of the consequences of the existence of quantum computers is how new algorithms that are safe against their attacks can impact the communication itself). But, if the protocol used in a layer aims to provide certain security or privacy properties, those properties can be endangered by a quantum computer. The properties that these protocols often aim to provide are confidentiality (no one can read the content of communications), authentication (communication parties are assured who they are talking to) and integrity (the content of the communication is assured to have not been changed in transit).

Well-known examples of protocols that aim to provide security or privacy properties to protect the different layers are:

We know how to provide those properties (and protect the protocols) against the threat of quantum computers: the solution is to use post-quantum cryptography, and, for this reason, the National Institute of Standards and Technology (NIST) has been running an ongoing process to choose the most suitable algorithms. At this protocol level, then, the problem doesn't seem to be adding these new algorithms, as nothing impedes it from a theoretical perspective. But protocols and our connections often have other requirements or constraints. One such constraint, for example, is that the data to be exchanged fits into a specific packet or segment size. In the case of the Transmission Control Protocol (TCP), which ensures that data packets are delivered and received in order, the Maximum Segment Size (MSS) sets the largest segment size that a network-connected device can receive and, if a packet exceeds the MSS, it is dropped (certain middleboxes or software expect all data to fit into a single segment). IPv4 hosts are required to handle an MSS of 536 bytes and IPv6 hosts are required to handle an MSS of 1220 bytes. In the case of DNS, it has primarily answered queries over the User Datagram Protocol (UDP), which limits the answer size to 512 bytes unless the EDNS extension is in use. Larger packets are subject to fragmentation or to the request that TCP should be used: this results in added round-trips, which should be avoided.

Another important requirement is that the operations that algorithms execute (such as multiplication or addition) or the resources they consume (which can be time but also space/memory) are fast enough that connection times are not impacted. This is even more pressing when fast, reliable and cheap Internet access is not available, as access to this “type” of Internet is not globally available.

TLS is a protocol that needs to handle heavy HTTPS traffic load and one that gets impacted by the additional cost that cryptographic operations add. Since 2010, TLS has been deemed “not computationally expensive any more” due to the usage of fast cryptography implementations and abbreviated handshakes (by using resumption, for example). But what if we add post-quantum algorithms into the TLS handshake? Some post-quantum algorithms can be suitable, while others might not be.

Many of the schemes that are part of the third round of the NIST post-quantum process, and that can be used for confidentiality, seem to have encryption and decryption performance comparable to (or faster than) fast elliptic-curve cryptography. This in turn means that, from this point of view, they can be practically used. But, what about storage or transmission of the public/private keys that they produce (an attack can be mounted, for example, to force a server into constantly storing big public keys, which can lead to a denial-of-service attack or variants of the SYN flood attack)? What about their impact on bandwidth and latency? Does having larger keys that span multiple packets (or congestion windows) affect performance, especially in scenarios with degraded networks or packet loss?

In 2019, we ran a wide-scale post-quantum experiment with Google’s Chrome Browser team to test these ideas. The goal of that experiment was to assess, in a controlled environment, the impact of adding post-quantum algorithms to the Key Exchange phase of TLS handshakes. This investigation gave us some good insight into what kind of post-quantum algorithms can be used in the Key Agreement part of a TLS handshake (mainly lattice-based schemes, or so it seems) and allowed us to test them in a real-world setting. It is worth noting that these algorithms were tested in a “hybrid mode”: a combination of classical cryptographic algorithms with post-quantum ones.

The key exchange of TLS ensures confidentiality: it is the most pressing task to update it, as non-post-quantum traffic captured today can be decrypted by a quantum computer in the future. Luckily, many of the post-quantum Key Encapsulation Mechanisms (KEMs) that are under consideration by NIST seem to be well suited, with minimal performance impact. Unfortunately, a problem that has arisen before when considering protocol changes is the presence and behavior of old servers and clients, and of “middleboxes” (devices that inspect and manipulate traffic for purposes other than continuing the communication process). For instance, some middleboxes assume that the first message of a handshake from a browser fits in a single network packet, which is not the case when adding the majority of post-quantum KEMs. Such false assumptions (“protocol ossification”) are not a problem unique to post-quantum cryptography: the TLS 1.3 standard is carefully crafted to work around quirks of older clients, servers and middleboxes.

While all the data seems to suggest that replacing classical cryptography with post-quantum cryptography in the key exchange phase of TLS handshakes is a straightforward exercise, the problem seems to be much harder for handshake authentication (or for any protocol that aims to give authentication, such as DNSSEC or IPsec). The majority of TLS handshakes achieve authentication by using digital signatures generated via advertised public keys in public certificates (what is called “certificate-based” authentication). Most of the post-quantum signature algorithms currently being considered for standardization in the NIST post-quantum process have signatures or public keys that are much larger than their classical counterparts. Their operations' computation time, in the majority of cases, is also much larger. It is unclear how this will affect the TLS handshake latency and round-trip times, though we now have better insight with respect to which sizes can be used. We still need to know how much slowdown will be acceptable for early adoption.

There seems to be several ways by which we can add post-quantum cryptography to the authentication phase of TLS. We can:

  • Change the standard to reduce the number of signatures needed.
  • Use different post-quantum signature schemes that fit.
  • Or achieve authentication in a novel way.

On the latter, a novel way to achieve certificate-based TLS authentication is to use KEMs, as their post-quantum versions have smaller sizes than post-quantum signatures. This mechanism is called KEMTLS and we ran a controlled experiment showing that it performs well, even when it adds an extra or full round trip to the handshake (KEMTLS adds half a round trip for server-only authentication and a full round-trip for mutual authentication). It is worth noting that we only experimented with replacing the authentication algorithm in the handshake itself and not all the authentication algorithms needed for the certificate chain. We used a mechanism called “delegated credentials” for this: since we can’t change the whole certificate chain to post-quantum cryptography (as it involves other actors beyond ourselves), we use this short-lived credential that advertises new algorithms. More details around this experiment can be found in our paper.

Lastly, on the TLS front, we wanted to test the notion that having bigger signatures (such as the post-quantum ones) noticeably impacts TLS handshake times. Since it is difficult to deploy post-quantum signatures to real-world connections, we found a way to emulate bigger signatures without having to modify clients. This emulation was done by using dummy data. The result of this experiment showed that even if large signatures fit in the TCP congestion window, there will still be a double-digit percentage slowdown due to the relatively low average Internet speed. This slowdown is a hard sell for browser vendors and for content servers to adopt. The ideal situation for early adoption seems to be that the six signatures and two public keys of the TLS handshake fit together within 9kB (the signatures are: two in the certificate chain, one handshake signature, one OCSP staple and two SCTs for certificate transparency).

After this TLS detour, we can now list the challenges in this kingdom, Protocolla. The challenges (in no particular order, and divided into sections) seem to be:

Storage of cryptographic parameters used during the protocol’s execution:

  • How are we going to properly store post-quantum cryptographic parameters, such as keys or certificates, that are generated for/during protocol execution (their sizes are bigger than what we are accustomed to)?
  • How is post-quantum cryptography going to work with stateless servers, ones that do not store session state and where every client request is treated as a new one, such as NFS, Sun’s Network File System (for an interesting discussion on the matter, see this paper)?

Long-term operations and ephemeral ones:

  • What are the impacts of using post-quantum cryptography for long-term operations or for ephemeral ones: will bigger parameters make ephemeral connections a problem?
  • Are security properties assumed in protocols preserved and could we relax others (such as IND-CCA or IND-CPA. For an interesting discussion on the matter, see this paper)?

Managing bigger keys and signatures:

  • What are the impacts on latency and bandwidth?
  • Does the usage of post-quantum increase the roundtrips at the Network layer, for example? And, if so, are these increases tolerable?
  • Will the increased sizes cause dropped or fragmented packets?
  • Devices can occasionally have settings for packets smaller than expected: a router, for example, along a network path can have a maximum transmission unit, MTU (the MSS plus the TCP and IP headers), value set lower than the typical 1,500 bytes. In these scenarios, will post-quantum cryptography make these settings more difficult (one can apply MSS clamping for some cases)?

Preservation of protocols as we know them:

  • Can we achieve the same security or privacy properties as we use them today?
  • Can protocols change: should we change, for example, the way DNSSEC or the PKI work? Can we consider this radical change?
  • Can we integrate and deploy novel ways to achieve authentication?

Hardware (or novel alternative to hardware) usage during protocol’s execution:

Novel attacks:

  • Will post-quantum cryptography increase the possibility of mounting denial of service attacks?

Challenges at the Implementation level, Implementa

The second kingdom that we are going to look at is the one that deals with the implementation of post-quantum algorithms. The ongoing NIST process is standardizing post-quantum algorithms on two fronts: those that help preserve confidentiality (KEMs) and those that provide authentication (signatures). There are other algorithms not currently part of the process that already can be used in a post-quantum world, such as hash-based signatures (for example, XMSS).

What must happen for algorithms to be widely deployed? What are the steps they need to take in order to be usable by protocols or for data at rest? The usual path that algorithms take is:

  1. Standardization: usually by a standardization body. We will talk about it further in the next section.
  2. Efficiency at an algorithmic level: by finding new ways to speed up operations in an algorithm. In the case of Elliptic Curve Cryptography, for example, it happened with the usage of endomorphisms for faster scalar multiplication.
  3. Efficient software implementations: by identifying the pitfalls that cause increase in time or space consumption (in the case of ECC, this paper can illustrate these efforts), and fixing them. An optimal implementation is always dependent on the target, though: where it will be used.
  4. Hardware efficiency: by using, for example, hardware acceleration.
  5. Avoidance of attacks: by looking at the usual pitfalls of algorithms which, in practice, are side-channel attacks.

Implementation of post-quantum cryptography will follow (and is following) the same path. Lattice-based cryptography for KEMs, for example, has taken many steps in order to be faster than ECC (although, from a protocol-level perspective, it is less efficient, as its parameters are bigger than ECC's and might cause extra round-trips). Isogeny-based cryptography, on the other hand, is still too slow (due to long isogeny evaluations), but it is an active area of research.

The challenges in this kingdom, Implementa, are (in no particular order):

  • Efficiency of algorithms: can we make them faster at the software level, at the hardware level (by using acceleration or FPGA-based research) or at an algorithmic level (with new data structures or parallelization techniques) to meet the requirements of network protocols and ever-faster connections?
  • Can we use new mechanisms to accelerate algorithms (such as, for example, the usage of floating point numbers as in the Falcon signature scheme)? Will this lead to portability issues as it might be dependent on the underlying architecture?
  • What is the asymptotic complexity of post-quantum algorithms (how they impact time and space)?
  • How will post-quantum algorithms work on embedded devices due to their limited capacity (see this paper for more explanations)?
  • How can we avoid attacks, failures in security proofs and misuse of APIs?
  • Can we provide correct testing of these algorithms?
  • Can we ensure constant-time needs for the algorithms?
  • What will happen in a disaster-recovery mode: what happens if an algorithm is found to be weaker than expected or is fully broken? How will we be able to remove or update this algorithm? How can we make sure there are transition paths to recover from a cryptographical weakening?

At Cloudflare, we have also worked on implementation of post-quantum algorithms. We published our own library (CIRCL) that contains high-speed assembly versions of several post-quantum algorithms (like Kyber, Dilithium, SIKE and CSIDH). We believe that providing these implementations for public use will help others with the transition to post-quantum cryptography by giving easy-to-use APIs that developers can integrate into their projects.

Challenges at the Standards level, Regulae

The third kingdom deals with the standardization process as done by different standards bodies (such as NIST or the Internet Engineering Task Force, the IETF). We have talked a little about this in the previous section, as it involves the standardization of both protocols and algorithms. Standardization can be a long process due to the need for careful discussion, and this discussion will be needed for the standardization of post-quantum algorithms. Post-quantum cryptography is based on mathematical constructions that are not widely known by the engineering community, which can make standardization more difficult.

The challenges in this kingdom, Regulae, (in no particular order) are:

  • The mathematical base of post-quantum cryptography is an active area of development and research, and there are some concerns about the security it provides (are there new attacks on the confidentiality or authentication it gives?). How will standardization bodies approach this problem?
  • Post-quantum cryptography introduces new models in which to analyze the security of algorithms (for example, the usage of the Quantum Random Oracle Model). Will this mean that new attacks or adversaries will not be noted at the standards level?
  • What will be the recommendation of migrating to post-quantum cryptography from the standards’ perspective: will we use a hybrid approach?
  • How can we bridge the academic/research community into the standardization community, so analyses of protocols are executed and attacks are found in time (prior to wide deployment)1? How can we make sure that standards bodies are informed enough to make the right practical/theoretical trade-offs?

At Cloudflare, we are closely collaborating with standardization bodies to prepare the path for post-quantum cryptography (see, for example, the AuthKEM draft at IETF).

Challenges at the Community level, Communitates

The Internet is not an isolated system: it is a community of different actors coming together to make protocols and systems work. Migrating to post-quantum cryptography means sitting together as a community to update systems and understand the different needs. This is one of the reasons why, at Cloudflare, we are organizing a second installment of the PQNet workshop (expected to be colocated with Real World Crypto 2022 in April 2022) for experts on post-quantum cryptography to talk about the challenges of putting it into protocols, systems and architectures.

The challenges in this kingdom, Communitates, are:

  • What are the needs of different systems? While we know what the needs of different protocols are, we don’t know exactly how all deployed systems and services work. Are there further restrictions?
  • On certain systems (for example, on the PKI), when will the migration happen, and how will it be coordinated?
  • How will the migration be communicated to the end-user?
  • How will we deprecate pre-quantum cryptography?
  • How will we integrate post-quantum cryptography into systems where algorithms are hardcoded (such as IoT devices)?
  • Who will maintain implementations of post-quantum algorithms and protocols? Is there incentive and funding for a diverse set of interoperable implementations?

Challenges at the Research level, Investigationes

Post-quantum cryptography is an active area of research. This research is not devoted only to how algorithms interact with protocols, systems and architectures (as we have seen), but is also heavily focused on the foundational level. The open challenges on this front are many. We will list four that are of the most interest to us:

  • Are there any efficient and secure post-quantum non-interactive key exchange (NIKE) algorithms?

NIKE is a cryptographic algorithm which enables two participants, who know each others’ public keys, to agree on a shared key, without requiring any interaction. An example of a NIKE is the Diffie-Hellman algorithm. There are no efficient and secure post-quantum NIKEs. A candidate seems to be CSIDH, which is rather slow and whose security is debated.

  • Are there post-quantum alternatives to (V)OPRFs based protocols, such as Privacy Pass or OPAQUE?
  • Are there post-quantum alternatives to other cryptographic schemes such as threshold signature schemes, credential based signatures and more?
  • How can post-quantum algorithms be formally verified with new notions such as the QROM?

Post-Quantum Future

The future of Cloudflare is post-quantum.

What is the post-quantum future at Cloudflare? There are many avenues that we explore in this blog series. While all of these experiments have given us some good and reliable information for the post-quantum migration, we further need tests in different network environments and with broader connections. We also need to test how post-quantum cryptography fits into different architectures and systems. We are preparing a bigger, wide-scale post-quantum effort that will give more insight into what can be done for real-world connections.

In this series of blog posts, we will be looking at:

  • What has been happening in the quantum research world in the last years and how it impacts post-quantum efforts.
  • What is a post-quantum algorithm: explanations of KEMs and signatures, and their security properties.
  • How one integrates post-quantum algorithms into protocols.
  • What is formal verification, analysis and implementation, and how it is needed for the post-quantum transition.
  • What does it mean to implement post-quantum cryptography: we will look at our efforts making Logfwdr, Cloudflare Tunnel and Gokeyless post-quantum.
  • What does the future hold for a post-quantum Cloudflare.

See you in the next blogpost and prepare for a better, safer, faster and quantum-protected Internet!

If you are a student enrolled in a PhD or equivalent research program and looking for an internship for 2022, see open opportunities.

If you’re interested in contributing to projects helping Cloudflare, our engineering teams are hiring.

You can reach us with questions, comments, and research ideas at [email protected].

…….

1A success scenario of this is the standardization of TLS 1.3 (in comparison to TLS 1.0-1.2) as it involved the formal verification community, which helped bridge the academic-standards communities to good effect. Read the analysis of this novel process.

The quantum solace and spectre

Post Syndicated from Sofía Celi original https://blog.cloudflare.com/quantum-solace-and-spectre/

Not only is the universe stranger than we think, but it is stranger than we can think of
Werner Heisenberg

Even for a physicist as renowned as Heisenberg, the universe was strange. And it was strange because several phenomena could only be explained through the lens of quantum mechanics. This field changed the way we understood the world, challenged our imagination, and, since the Fifth Solvay Conference in 1927, has been integrated into every explanation of the physical world (it is, to this day, our best description of the inner workings of nature). Quantum mechanics created a rift: every physical phenomenon (even the most micro and macro ones) stopped being explained only by classical physics and started to be explained by quantum mechanics. There is another world in which quantum mechanics has not yet created this rift: the realm of computers (note, though, that manufacturers have been affected by quantum effects for a long time). That is about to change.

In the 80s, several physicists (including, for example, Richard Feynman and Yuri Manin) asked themselves these questions: are there computers that can, with high accuracy and in a reasonable amount of time, simulate physics? And, specifically, can they simulate quantum mechanics? These two questions started a long field of work that we now call “quantum computing”. Note that you can simulate quantum phenomena with classical computers. But we can simulate in a much faster manner with a quantum computer.

Any computer is already a quantum system, but a quantum computer uses the true computational power that quantum mechanics offers. It solves some problems much faster than a classical computer. Any computational problem solvable by a classical computer is also solvable by a quantum computer, as any physical phenomena (such as operations performed in classical computers) can be described using quantum mechanics. Contrariwise, any problem solvable by a quantum computer is also solvable by a classical computer: quantum computers provide no additional power over classical computers in terms of what they could theoretically compute. Quantum computers cannot solve undecidable problems, and they do not disprove the Church–Turing thesis. But, they are faster for specific problems! Let’s explore now why this is the case.

In a classical computer, data is stored in bits: 0 or 1, current on or current off, true or false. Nature is not that binary: any physical bit (current or not; particle or not; magnetized or not) is in fact a qubit and exists on a spectrum (or rather a sphere) of superpositions between 0 and 1. We have mentioned two terms that might be new to the reader: qubit and superposition. Let’s first define the second term.

A superposition is a normalized linear combination of all the possible configurations of something. What is this something? If we take the Schrödinger equation to explain, “for any isolated region of the universe that you want to consider, that equation describes the evolution in time of the state of that region, which we represent as a normalized linear combination — a superposition — of all the possible configurations of elementary particles in that region”, as Aaronson put it. Something, then, refers to all elementary particles. Does it mean, then, that we all are in a superposition? The answer to that question is open to many interpretations (for a nice introduction on the topic, one can read “Quantum Computing Since Democritus” by Scott Aaronson).

As you can see, we are then no longer concerned with matter or energy or waves (as in classical physics), but rather with information and probabilities. In a quantum system, we first initialize information (like a bit) to some “easy” states, like 0 and 1. This information will be in a superposition of both states at the same time rather than just one or the other. If we examine a classical bit, we will get either the state 1 or the state 0. With a qubit (quantum information), instead, we can only get results with some probability. A qubit can exist in a continuum of states between |0⟩ and |1⟩ (using Dirac's notation), and only when it is observed does it collapse to one of them with some probability.
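
A quick numerical illustration may help (a classical simulation, of course, written with NumPy; the amplitudes below are just an example):

```python
# A qubit state is a normalized pair of complex amplitudes (alpha, beta);
# measuring it yields 0 with probability |alpha|^2 and 1 with probability |beta|^2.
import numpy as np

rng = np.random.default_rng(4)

alpha, beta = 3 / 5, 4j / 5                          # |alpha|^2 + |beta|^2 = 1
probs = np.array([abs(alpha) ** 2, abs(beta) ** 2])  # [0.36, 0.64]
probs /= probs.sum()                                 # guard against floating-point drift

samples = rng.choice([0, 1], size=10_000, p=probs)
print("fraction of 1s observed:", samples.mean())    # close to 0.64
```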

A qubit stores a combination of two or more states.

In day-to-day life, we don’t see this quantum weirdness because we’re looking at huge jittery ensembles of qubits where this weirdness gets lost in noise. However, if we turn the temperature way down and isolate some qubits from noise, we can do some proper quantum computation. For instance, to find a 256-bit decryption key, we can set 256 qubits to a superposition of all possible keys. Then, we can run the decryption algorithm, trying the key in the qubits. We end up with a superposition of many failures and one success. We did 2256 computations in a single go! Unfortunately, we can’t read out this superposition. In fact, if we open up the quantum computer to inspect the qubits inside, the outside noise will collapse the superposition into one outcome, which will most certainly be a failure. The real art of quantum computing is interfering this superposition with itself in a clever way to amplify out the answer we want. For the case we are talking about, Lov Grover figured out how to do it. Unfortunately, it requires about 2128 steps: we have a nice, but not exponential speed-up.

Quantum computers do not provide any additional power over classical computers in terms of computability: they don’t solve unsolvable problems. But they provide efficiency: they find solutions in faster time, even for algorithms that are too “expensive” for a non-quantum (classical) computer. A classical computer can solve any known computational problem if it is given enough time and resources. But, on some occasions, they will need so much time and so many resources that it will be deemed inefficient to try to solve this problem: waiting millions of years with millions of computers working at the same time trying to solve a problem is not something that we consider feasible to do. This idea of efficient and inefficient algorithms was made mathematically precise by the field of computational complexity. Roughly speaking, an efficient algorithm is one which runs in time polynomial in the size of the problem solved.

We can look at this differently. The many quantum states can be potentially observed (even in a classical computer), but states interfere with each other, which prevents the usage of statistical sampling to obtain the state. We, then, have to track every possible configuration that a quantum system can be in. Tracking every possible configuration is theoretically possible by a classical computer, but would exceed the memory capacity of even the most powerful classical computers. So, we need quantum computers to efficiently solve these problems.

Quantum computers are powerful. They can efficiently simulate quantum mechanics experiments, and are able to perform a number of mathematical, optimization, and search problems in a faster manner than classical computers. A sufficiently scaled quantum computer can, for example, factor large integers efficiently, as was shown by Shor. They also have significant advantages when it comes to searching on unstructured databases or solving the discrete logarithm problem. These advances are the solace: they efficiently advance computational problems that seem stalled. But, they are also a spectre, a threat: the problems that they efficiently solve are what is used as the core of much of cryptography as it exists today. They, hence, can break most of the secure connections we use nowadays. For a longer explanation of the matter, see our past blog post.

The realization of large-scale quantum computers will pose a threat to our connections and infrastructure. The threat is urgent even though the machines are not here yet: an attacker can record and store secured traffic now and decrypt it later, once a quantum computer exists. The spectre looms even larger when one recalls that migrating applications, architectures, and protocols is a long procedure: moving these services to quantum-secure cryptography will take many years.

In our other blog posts, we talk at length about what quantum computers are and what threats they pose. In this one, we want to give an overview of the advances in quantum computing and their threats to cryptography, as well as an update to what is considered quantum-secure cryptography.

The new upcoming decade of quantum computing

In our previous blog post, from mid-2019, we mainly talked about improvements that reduce the cost of factoring integers and computing discrete logarithms. But recent years have brought new advances to the field. Google, for example, is aiming to build a “useful, error-corrected quantum computer” by the end of the decade. Their goal is to better simulate nature and, specifically, molecules. This research could then be used, for example, to create better batteries or to design better-targeted medicines.

Why will it take them (and, generally, anyone) so long to create this quantum computer? Classical mechanisms, like Newtonian physics or classical computers, provide great detail and predictive power for events and phenomena at a macroscopic level, which makes them perfect for our everyday computational needs. Quantum mechanics, on the other hand, manifests itself and is best understood at the microscopic level. Quantum phenomena are, furthermore, very fragile: uncontrolled interaction with the environment can make quantum features disappear, a process referred to as decoherence. Because of this, quantum states have to be properly prepared, carefully controlled, and isolated from their environment. Furthermore, measuring a quantum phenomenon disturbs its state, so reliable methods of extracting and computing on quantum information have to be developed.

Creating a computer that achieves computational tasks despite these challenges is a daunting task. In fact, it may ultimately be impossible to completely eliminate errors: some amount of error will always persist. Reliable computation on a quantum computer therefore seems achievable only by employing error correction.

So, when will we know that a quantum device is ready for the “quantum” use cases? One can use the idea of “quantum supremacy”, which is defined as the ability of a quantum device to perform a computational task that would be practically impossible (in terms of efficiency and usage of computational power) for classical computers. Nevertheless, this idea can be criticized: classical algorithms improve over time, and there is no clear definition of how to compare quantum computations with their classical counterparts (one must show that no classical means would allow us to perform the same computational task as fast) before declaring that the quantum computation is indeed faster.

A molecule can indeed be claimed to be “quantum supreme” if researchers find it computationally prohibitive (unrealistically expensive) to solve the Schrödinger equation and calculate its ground state. That molecule constitutes a “useful quantum computer, for the task of simulating itself.” But, for computer science, this individual instance is not enough: “for any one molecule, the difficulty in simulating it classically might reflect genuine asymptotic hardness, but it might also reflect other issues (e.g., a failure to exploit special structure in the molecule, or the same issues of modeling error, constant-factor overheads, and so forth that arise even in simulations of classical physics)”, as stated by Aaronson and Chen.

By the end of 2019, Google argued they had reached quantum supremacy (with the quantum processor named “Sycamore”), but their claim was debated, mainly due to the perceived lack of criteria to properly define when “quantum supremacy” is reached1. Google claimed that “a state-of-the-art supercomputer would require approximately 10,000 years to perform the equivalent task”, which IBM refuted by saying that “an ideal simulation of the same task can be performed on a classical system in 2.5 days and with far greater fidelity”. Regardless of this lack of criteria, the result is a remarkable achievement in itself and shows the great interest in building a quantum computer. Last November, it was claimed that the most challenging sampling task performed on Sycamore can be accomplished within one week on the Sunway supercomputer. A crucial point has to be understood here, though: “more like a week still seems to be needed on the supercomputer”, as noted by Aaronson. Whether quantum supremacy has been reached, then, is still not a matter of irrevocable agreement.

Google’s quantum computer has already been applied to simulating real physical phenomena. Early in 2021, it was used to demonstrate a “time crystal” (a quantum system of particles that moves in a regular, repeating pattern without dissipating energy into the environment). On the other hand, by late 2021, IBM unveiled its 127-qubit chip Eagle, but we don’t yet know how much gate fidelity (how close the implemented operations are to the ideal ones) it really attains, as outlined by Aaronson in his blog post.

What do all of these advances mean for cryptography? Does it mean that soon all secure connections will be broken? If not “soon”, when? The answers to these questions are complex and unclear. The Quantum Threat Timeline Report of 2020 suggests that the majority of researchers on the subject think that a “cryptography-breaking” quantum computer will arrive in the next 15 to 40 years2. It is worth noting that these predictions are more optimistic than in the 2019 edition of the same report, which suggests that, given the advances in the quantum computing field, researchers feel more assured of its imminent arrival.

The upcoming decades of breaking cryptography

The quantum solace and spectre

Quantum computing will shine a light on new areas of development and advances, but it will also expose what was hidden: faster mechanisms to solve mathematical problems and break cryptography. The metaphor of dawn can be used to show how close we are to quantum computers breaking some types of cryptography (and, equivalently, how close classical cryptography is to its twilight), which would be catastrophic to the needs of privacy, communication and security we have.

It is worth noting again, though, that the advent of quantum computing is not a bad scenario in itself. Quantum computing will bring much progress in the form of a better understanding, and better mimicking, of the fundamental way in which nature works. Its drawback is that it breaks most of the cryptographic algorithms we use nowadays. Quantum computers also have their limits: they cannot solve every problem, and in cryptography there are algorithms that will not be broken by them.

Let’s start by defining what dawn is. Dawn is the arrival of stable quantum computers that break cryptography in a reasonable amount of time and space (explicitly: breaking the mathematical foundation of the P-256 elliptic curve; although some research defines this as breaking RSA-2048). It is also the time at which a new day arrives: a day that starts with new advances in medicine, for example, by better mimicking the nature of molecules with a computer.

How close to dawn are we? To answer this question, we have to look at the recent advances in the aim of breaking cryptography with a quantum computer. One of the most notable milestones is that the number 15 was factored by a quantum computer in 2001 (already 20 years ago), and the number 21 was factored in 2012. While this does not sound so exciting (we can factor these numbers in our heads), it is remarkable that a quantum computer was stable enough to do so.

Why do we care about computers factoring integers at all? The reason is that the hardness of factoring is the mathematical base of certain public-key cryptography algorithms (like RSA): once the underlying integers can be factored, the algorithm is broken.

Taking a step further, quantum computers also weaken encryption algorithms themselves, such as AES, because quantum computers give us more efficient search algorithms. The relevant search algorithm, Grover’s algorithm, solves the task of function inversion, which speeds up collision and preimage attacks (attacks that you don’t want to be sped up against your encryption or hash algorithms). These attacks are related to hash functions: a collision attack is one that tries to find two messages that produce the same hash; a preimage attack is one that tries to find a message given its hash value. A smart evolution of this algorithm, using hidden variables as part of the computational model, would theoretically make searching even more efficient.
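
In practice, the standard answer to Grover’s algorithm is simply to double symmetric key lengths: AES-256 keeps roughly a 128-bit security margin even against a quadratic quantum speed-up, where AES-128 would not. As a minimal sketch (using only Go’s standard library, with a freshly generated random key standing in for a real key-management flow), selecting a 32-byte key is all it takes to use AES-256 in GCM mode:

package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
)

func main() {
	// A 32-byte key selects AES-256. Against Grover's quadratic speed-up this
	// still leaves roughly a 128-bit work factor, whereas AES-128 would drop
	// to roughly 64 bits.
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}

	block, err := aes.NewCipher(key)
	if err != nil {
		panic(err)
	}
	aead, err := cipher.NewGCM(block)
	if err != nil {
		panic(err)
	}

	nonce := make([]byte, aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		panic(err)
	}
	ciphertext := aead.Seal(nil, nonce, []byte("hello, post-quantum world"), nil)
	fmt.Printf("ciphertext: %x\n", ciphertext)
}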

So what are the tasks needed for a quantum computer to break cryptography? It should:

  • Be able to factor big integers, or,
  • Be able to efficiently search for collisions or preimages, or,
  • Be able to solve the discrete logarithm problem.

Why do we care about quantum computers breaking cryptographic algorithms that we rarely hear of? What does this have to do with the connections we make? Cryptographic algorithms are the base of the secure protocols that we use today. Every time we visit a website, we are using a broad suite of cryptographic algorithms that preserve the security, privacy and integrity of the connection. Without them, we lose the properties we are used to when communicating over digital means.

What can we do? Do we now stop using digital means altogether? There are solutions. Of course, an easy answer would be, for example, to increase the size of the integers (and keys) used, so that factoring them becomes infeasible even for a quantum computer. But is the result still a usable algorithm? How slow will our connections become if parameters have to grow to such enormous sizes?

As we get closer to twilight, we also get closer to the dawn of new beginnings, the rise of post-quantum cryptography.

The rise and state of post-quantum cryptography

As dawn approaches, researchers everywhere have been busy looking for solutions. If the mathematical underpinnings of an algorithm are what is broken by a quantum computer, can we just change them? Are there mathematical problems that are hard even for quantum computers? The answer is yes and, even more interestingly, we already know how to create algorithms out of them.

The mathematical problems that we use in this new upcoming era, called post-quantum, are of many kinds:

  • Lattices, by using the learning with errors (LWE) or ring learning with errors (RLWE) problems (a toy LWE example is sketched right after this list).
  • Isogenies, which rely on properties of supersingular elliptic curves and supersingular isogeny graphs.
  • Multivariate cryptography, which relies on the difficulty of solving systems of multivariate equations.
  • Code-based cryptography, which relies on the theory of error-correcting codes.
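
To give a flavor of the first family, here is a toy LWE instance written as a short Go sketch. The parameters (n, m, q and the noise range) are invented, insecurely small values chosen only to show the shape of the problem: the pair (A, b = A·s + e mod q) is public, while recovering the small secret s from it is believed to be hard, even for quantum computers, once the parameters are of cryptographic size.

package main

import (
	"fmt"
	"math/rand"
)

const (
	n = 4  // secret dimension (toy value; real schemes use hundreds)
	m = 8  // number of samples
	q = 97 // modulus (toy value)
)

func main() {
	rng := rand.New(rand.NewSource(1))

	// Secret vector s and small error vector e.
	s := make([]int, n)
	for i := range s {
		s[i] = rng.Intn(q)
	}
	e := make([]int, m)
	for i := range e {
		e[i] = rng.Intn(3) - 1 // "small" noise in {-1, 0, 1}
	}

	// Uniformly random public matrix A, and b = A*s + e (mod q).
	A := make([][]int, m)
	b := make([]int, m)
	for i := 0; i < m; i++ {
		A[i] = make([]int, n)
		sum := 0
		for j := 0; j < n; j++ {
			A[i][j] = rng.Intn(q)
			sum += A[i][j] * s[j]
		}
		b[i] = ((sum+e[i])%q + q) % q
	}

	// (A, b) is the public LWE instance; recovering s from it is the hard problem.
	fmt.Println("A =", A)
	fmt.Println("b =", b)
}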

With these mathematical constructions, we can build algorithms that can be used to protect our connections and data. How do we know, nevertheless, that these algorithms are safe? Even if the mathematical construction an algorithm uses is safe from the quantum spectre, is its interaction with the other parts of the system, and with real-world adversaries, safe? Who decides this?

Because it is difficult to answer these questions in isolation, the National Institute of Standards and Technology (NIST) of the United States has been running, since 2016, a post-quantum standardization process that will define how these algorithms work, how they are designed, what properties they provide, and how they will be used. We now have a set of finalists that are close to being standardized. The process standardizes algorithms on two tracks: key encapsulation mechanisms and public-key encryption (KEMs), and algorithms for authentication (mainly, digital signatures). The current finalists are:

For the public-key encryption and key-establishment algorithms:

  • Classic McEliece, a code-based cryptography scheme.
  • CRYSTALS-KYBER, a lattice based scheme.
  • NTRU, a lattice based scheme.
  • SABER, a lattice based scheme.

For the digital signature algorithms:

  • CRYSTALS-DILITHIUM, a lattice based scheme.
  • FALCON, a lattice based scheme.
  • Rainbow, a multivariate cryptography scheme.

What do these finalists show us? It seems that we can rely on the security given by lattice-based schemes (which is no slight to the security of any of the other schemes). It also shows us that, in the majority of cases, we prioritize schemes that can be used in the same way cryptography is used nowadays: with fast operations and small parameter sizes. The migration to post-quantum cryptography is not only about using safe mathematical constructions and secure algorithms, but also about how they integrate into the systems, architectures, protocols, and the speed and space budgets we currently depend on. More research seems to be needed on the signatures/authentication front, including new innovative signature schemes (like SQISign or MAYO), and that is why NIST will be calling for a fourth round of its competition.

It is worth noting that there exists another flavor of upcoming cryptography called “quantum cryptography”, but its efficient dawn seems to be far away (though some Quantum Key Distribution — QKD — protocols are being deployed now). Quantum cryptography uses quantum mechanical properties to perform cryptographic tasks. It has mainly focused on QKD, which seems to be too inefficient to be used in practice outside very limited settings. And even though, in theory, these schemes should be perfectly secure against any eavesdropper, in practice the physical machines may still fall victim to side-channel attacks.

Is dawn, then, the collapse of cryptography? On one hand, it can be, as it is the collapse of most classical cryptography. But, on the other hand, it is also the dawn of new, post-quantum cryptography and the start of quantum computing for the benefit of the world. Will the process of reaching dawn be simple? Certainly not. Migrating systems, architectures and protocols to new algorithms is an intimidating task. But we at Cloudflare, and as a community, are taking the steps, as you will read in other blog posts throughout this week.

References:

1In the original paper, the estimate of 10,000 years is based on the observation that there is not enough Random Access Memory (RAM) to store the full state vector in a Schrödinger-type simulation, so a hybrid Schrödinger–Feynman algorithm is needed, which is memory-efficient but exponentially more computationally expensive. In contrast, IBM used a Schrödinger-style classical simulation approach that uses both RAM and hard drive space.
2These figures refer to a 50% or greater likelihood of “cryptography-breaking” quantum computers arriving. Slightly more than half of the researchers answered that they will arrive within the next 15 years. About 86% of the researchers answered that they will arrive within the next 20 years. All but one of the researchers answered that they will arrive within the next 30 years.
3Although, in practice, this seems difficult to attain.

KEMTLS: Post-quantum TLS without signatures

Post Syndicated from Sofía Celi original https://blog.cloudflare.com/kemtls-post-quantum-tls-without-signatures/

The Transport Layer Security protocol (TLS), which secures most Internet connections, has mainly been a protocol consisting of a key exchange, authenticated by digital signatures, used to encrypt data in transit[1]. Even though it has undergone major changes since 1994, when SSL 1.0 was introduced by Netscape, its main mechanism has remained the same. The key exchange was first based on RSA, and later on traditional Diffie-Hellman (DH) and Elliptic-curve Diffie-Hellman (ECDH). The signatures used for authentication have almost always been RSA-based, though in recent years other kinds of signatures have been adopted, mainly ECDSA and Ed25519. This recent shift to elliptic curve cryptography, at both the key exchange and the signature level, has brought considerable speed and bandwidth benefits compared to traditional Diffie-Hellman and RSA.

TLS is the main protocol that protects the connections we use everyday. It’s everywhere: we use it when we buy products online, when we register for a newsletter — when we access any kind of website, IoT device, API for mobile apps and more, really. But with the imminent threat of the arrival of quantum computers (a threat that seems to be getting closer and closer), we need to reconsider the future of TLS once again. A wide-scale post-quantum experiment was carried out by Cloudflare and Google: two post-quantum key exchanges were integrated into our TLS stack and deployed at our edge servers as well as in Chrome Canary clients. The goal of that experiment was to evaluate the performance and feasibility of deployment of two post-quantum key exchanges in TLS.

Similar experiments have been proposed for introducing post-quantum algorithms into the TLS handshake itself. Unfortunately, it seems infeasible to replace both the key exchange and signature with post-quantum primitives, because post-quantum cryptographic primitives are bigger, or slower (or both), than their predecessors. The proposed algorithms under consideration in the NIST post-quantum standardization process use mathematical objects that are larger than the ones used for elliptic curves, traditional Diffie-Hellman, or RSA. As a result, the overall size of public keys, signatures and key exchange material is much bigger than those from elliptic curves, Diffie-Hellman, or RSA.

How can we solve this problem? How can we use post-quantum algorithms in the TLS handshake without making the exchanged material too big to be transmitted? In this blogpost, we will introduce a new mechanism for making this happen. We’ll explain how it can be integrated into the handshake and we’ll cover implementation details. The key observation behind this mechanism is that, while post-quantum algorithms have a bigger communication footprint than their predecessors, post-quantum key exchanges are somewhat smaller than post-quantum signatures, so we can try to replace signatures with key exchanges in some places to save space. We will only focus on the TLS 1.3 handshake, as it is the TLS version that should currently be used.

Past experiments: making the TLS 1.3 handshake post-quantum

TLS 1.3 was introduced in August 2018, and it brought many security and performance improvements (notably, having only one round-trip to complete the handshake). But TLS 1.3 is designed for a world with classical computers, and some of its functionality will be broken by quantum computers when they do arrive.

The primary goal of TLS 1.3 is to provide authentication (the server side of the channel is always authenticated, the client side is optionally authenticated), confidentiality, and integrity by using a handshake protocol and a record protocol. The handshake protocol, the one of interest for us today, establishes the cryptographic parameters for securing and authenticating a connection. It can be thought of as having three main phases, as defined in RFC8446:

–  The Parameter Negotiation phase (referred to as ‘Server Parameters’ in RFC8446), which establishes other handshake parameters (whether the client is authenticated, application-layer protocol support, etc).

–  The Key Exchange phase, which establishes shared keying material and selects the cryptographic parameters to be used. Everything after this phase will be encrypted.

–  The Authentication phase, which authenticates the server (and, optionally, the client) and provides key confirmation and handshake integrity.

The main idea of past experiments that introduced post-quantum algorithms into the handshake of TLS 1.3 was to use them in place of classical algorithms by advertising them as part of the supported groups[2] and key share[3] extensions, and, therefore, establishing with them the negotiated connection parameters. Key encapsulation mechanisms (KEMs) are an abstraction of the basic key exchange primitive, and were used to generate the shared secrets. When using a pre-shared key, its symmetric algorithms can be easily replaced by post-quantum KEMs as well; and, in the case of password-authenticated TLS, some ideas have been proposed on how to use post-quantum algorithms with them.
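
For readers less familiar with the KEM abstraction mentioned above, it boils down to three operations. The interface below is our own illustrative Go sketch, not the API of any particular library (real implementations, such as the ones in Cloudflare’s CIRCL library, differ in their details):

package kemsketch

// KEM is an illustrative interface for a key encapsulation mechanism.
// GenerateKeyPair creates a public/private key pair; Encapsulate uses the
// public key to produce a ciphertext together with a shared secret; and
// Decapsulate recovers that same shared secret from the ciphertext using
// the private key.
type KEM interface {
	GenerateKeyPair() (publicKey, privateKey []byte, err error)
	Encapsulate(publicKey []byte) (ciphertext, sharedSecret []byte, err error)
	Decapsulate(privateKey, ciphertext []byte) (sharedSecret []byte, err error)
}

In the experiments mentioned above, the client’s key share carries a freshly generated KEM public key and the server’s key share carries the corresponding ciphertext; both sides then derive the same shared secret.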

Most of the above ideas only provide what is often called ‘transitional security’: their main focus is quantum-resistant confidentiality, not quantum-resistant authentication. It is possible to use post-quantum signatures for TLS authentication, but post-quantum signatures are larger than traditional ones. Furthermore, it is worth noting that using post-quantum signatures is much more expensive than using post-quantum KEMs.

We can estimate the impact of such a replacement on network traffic by simply looking at the sum of the cryptographic objects that are transmitted during the handshake. A typical TLS 1.3 handshake using elliptic curve X25519 and RSA-2048 would transmit 1,376 bytes, corresponding to the public keys for key exchange, the certificate, the signature of the handshake, and the certificate chain. If we were to replace X25519 with the post-quantum KEM Kyber512 and RSA with the post-quantum signature Dilithium II, two of the more efficient proposals, the size of the transmitted data would increase to 10,036 bytes[4]. The increase is mostly due to the size of the post-quantum signatures.

The question then is: how can we achieve full post-quantum security and still have a handshake that is efficient enough to use?

A more efficient proposal: KEMTLS

There is a long history of other mechanisms, besides signatures, being used for authentication. Modern protocols, such as the Signal protocol, the Noise framework, or WireGuard, rely on key exchange mechanisms for authentication; but they are unsuitable for the TLS 1.3 case as they expect the long-term key material to be known in advance by the interested parties.

The OPTLS proposal by Krawczyk and Wee authenticates the TLS handshake without signatures by using a non-interactive key exchange (NIKE). However, the only somewhat efficient construction for a post-quantum NIKE is CSIDH, the security of which is the subject of an ongoing debate. But we can build on this idea, and use KEMs for authentication. KEMTLS, the current proposed experiment, replaces the handshake signature by a post-quantum KEM key exchange. It was designed and introduced by Peter Schwabe, Douglas Stebila and Thom Wiggers in the publication ‘Post-Quantum TLS Without Handshake Signatures’.

KEMTLS, therefore, achieves the same goals as TLS 1.3 (authentication, confidentiality and integrity) in the face of quantum computers. But there’s one small difference compared to the TLS 1.3 handshake. KEMTLS allows the client to send encrypted application data in the second client-to-server TLS message flow when client authentication is not required, and in the third client-to-server TLS message flow when mutual authentication is required. Note that with TLS 1.3, the server is able to send encrypted and authenticated application data in its first response message (although, in most uses of TLS 1.3, this feature is not actually used). With KEMTLS, when client authentication is not required, the client is able to send its first encrypted application data after the same number of handshake round trips as in TLS 1.3.

Intuitively, the handshake signature in TLS 1.3 proves possession of the private key corresponding to the public key certified in the TLS 1.3 server certificate. For these signature schemes, this is the straightforward way to prove possession; another way to prove possession is through key exchanges. By carefully considering the key derivation sequence, a server can decrypt any messages sent by the client only if it holds the private key corresponding to the certified public key. Therefore, implicit authentication is fulfilled. It is worth noting that KEMTLS still relies on signatures by certificate authorities to authenticate the long-term KEM keys.
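
To make the implicit authentication idea a bit more concrete, here is a rough Go sketch. The decapsulation function type, the HKDF label, and the single-step key derivation are simplifications we invented for illustration; the real KEMTLS key schedule mixes in more secrets and labels, as specified in the paper:

// Package kemtlssketch illustrates implicit authentication via a KEM.
package kemtlssketch

import (
	"crypto/sha256"
	"io"

	"golang.org/x/crypto/hkdf"
)

// decapsulate stands in for KEM decapsulation with the private key matching
// the KEM public key certified in the server's (delegated) credential. The
// concrete scheme (e.g. Kyber) is abstracted away here.
type decapsulate func(privateKey, ciphertext []byte) ([]byte, error)

// serverHandshakeSecret derives the secret that protects the rest of the
// handshake. The client encapsulated to the certified public key, so only a
// server holding the matching private key obtains the same shared secret and
// can read the client's subsequent messages: possession of the key is proven
// implicitly, without a handshake signature.
func serverHandshakeSecret(dec decapsulate, privateKey, clientCiphertext, transcriptHash []byte) ([]byte, error) {
	sharedSecret, err := dec(privateKey, clientCiphertext)
	if err != nil {
		return nil, err
	}
	// Simplified key derivation: bind the shared secret to the handshake
	// transcript so far (the real schedule derives several keys with distinct labels).
	out := make([]byte, 32)
	kdf := hkdf.New(sha256.New, sharedSecret, transcriptHash, []byte("sketch: kemtls handshake secret"))
	if _, err := io.ReadFull(kdf, out); err != nil {
		return nil, err
	}
	return out, nil
}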

With KEMTLS, application data transmitted during the handshake is implicitly authenticated rather than explicitly (as in TLS 1.3), and has slightly weaker downgrade resilience and forward secrecy; but full downgrade resilience and forward secrecy are achieved once the KEMTLS handshake completes.

By replacing the handshake signature by a KEM key exchange, we reduce the size of the data transmitted in the example handshake to 8,344 bytes, using Kyber512 and Dilithium II — a significant reduction. We can reduce the handshake size even for algorithms such as the NTRU-assumption based KEM NTRU and signature algorithm Falcon, which have a less-pronounced size gap. Typically, KEM operations are computationally much lighter than signing operations, which makes the reduction even more significant.

KEMTLS was presented at ACM CCS 2020. You can read more about its details in the paper. It was initially implemented in the rustls library by Thom Wiggers, using optimized C/assembly implementations of the post-quantum algorithms provided by the PQClean and Open Quantum Safe projects.

Cloudflare and KEMTLS: the implementation

As part of our effort to show that TLS can be completely post-quantum safe, we implemented the full KEMTLS handshake in Golang’s TLS 1.3 suite. The implementation was done in several steps:

  1. We first needed to clone our own version of Golang, so we could add different post-quantum algorithms to it. You can find our own version here. This code gets constantly updated with every release of Golang, following these steps.
  2. We needed to implement post-quantum algorithms in Golang, which we did on our own cryptographic library, CIRCL.
  3. As we cannot force certificate authorities to use certificates with long-term post-quantum KEM keys, we decided to use Delegated Credentials. A delegated credential is a short-lasting key that the certificate’s owner has delegated for use in TLS. Therefore, they can be used for post-quantum KEM keys. See its implementation in our Golang code here.
  4. We implemented mutual auth (client and server authentication) KEMTLS by using Delegated Credentials for the authentication process. See its implementation in our Golang code here. You can also check its test for an overview of how it works.

Implementing KEMTLS was a straightforward process, although it did require changes to the way Golang handles a TLS 1.3 handshake and how the key schedule works.

A “regular” TLS 1.3 handshake in Golang (from the server perspective) looks like this:

func (hs *serverHandshakeStateTLS13) handshake() error {
    c := hs.c

    // For an overview of the TLS 1.3 handshake, see RFC 8446, Section 2.
    if err := hs.processClientHello(); err != nil {
        return err
    }
    if err := hs.checkForResumption(); err != nil {
        return err
    }
    if err := hs.pickCertificate(); err != nil {
        return err
    }
    c.buffering = true
    if err := hs.sendServerParameters(); err != nil {
        return err
    }
    if err := hs.sendServerCertificate(); err != nil {
        return err
    }
    if err := hs.sendServerFinished(); err != nil {
        return err
    }
    // Note that at this point we could start sending application data without
    // waiting for the client's second flight, but the application might not
    // expect the lack of replay protection of the ClientHello parameters.
    if _, err := c.flush(); err != nil {
        return err
    }
    if err := hs.readClientCertificate(); err != nil {
        return err
    }
    if err := hs.readClientFinished(); err != nil {
        return err
    }

    atomic.StoreUint32(&c.handshakeStatus, 1)

    return nil
}

We had to interrupt the process when the server sends the Certificate (sendServerCertificate()) in order to send the KEMTLS-specific messages. In the same way, we had to add the appropriate KEMTLS messages to the client’s handshake. And, as we didn’t want to change the way Golang handles TLS 1.3 too much, we only added one new constant to the configuration that a server can use to ask for the client’s certificate (the constant is serverConfig.ClientAuth = RequestClientKEMCert).
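
For intuition only, a much-simplified, hypothetical version of the resulting server-side flow could look like the sketch below. The names handshakeKEMTLS and readClientKEMCiphertext are invented for this illustration and do not match the identifiers in the actual code linked above:

func (hs *serverHandshakeStateTLS13) handshakeKEMTLS() error {
    c := hs.c

    if err := hs.processClientHello(); err != nil {
        return err
    }
    if err := hs.pickCertificate(); err != nil { // the certificate carries a KEM public key
        return err
    }
    c.buffering = true
    if err := hs.sendServerParameters(); err != nil {
        return err
    }
    if err := hs.sendServerCertificate(); err != nil {
        return err
    }
    // KEMTLS-specific: instead of signing the transcript, flush the first
    // flight and wait for the client's ciphertext, encapsulated to the
    // certificate's KEM public key; decapsulating it mixes an implicitly
    // authenticated secret into the key schedule.
    if _, err := c.flush(); err != nil {
        return err
    }
    if err := hs.readClientKEMCiphertext(); err != nil {
        return err
    }
    if err := hs.readClientFinished(); err != nil {
        return err
    }
    if err := hs.sendServerFinished(); err != nil {
        return err
    }

    atomic.StoreUint32(&c.handshakeStatus, 1)

    return nil
}

Note that, unlike in TLS 1.3, the client’s Finished message arrives before the server’s: the server can only complete its side of the authentication after decapsulating the client’s ciphertext.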

The implementation is easy to work with: if a delegated credential or a certificate has a public key of a supported post-quantum KEM algorithm, the handshake will proceed with KEMTLS. If the server requests a Client KEMTLS Certificate, the handshake will use client KEMTLS authentication.

Running the Experiment

So, what’s next? We’ll take the code we have produced and run it on actual Cloudflare infrastructure to measure how efficiently it works.

Thanks

Many thanks to everyone involved in the project: Chris Wood, Armando Faz-Hernández, Thom Wiggers, Bas Westerbaan, Peter Wu, Peter Schwabe, Goutam Tamvada, Douglas Stebila, Thibault Meunier, and the whole Cloudflare Research team.

1It is worth noting that the RSA key transport in TLS ≤1.2 has the server only authenticated by RSA public key encryption, although the server’s RSA public key is certified using RSA signatures by Certificate Authorities.
2An extension used by the client to indicate which named groups (Elliptic Curve Groups, Finite Field Groups) it supports for key exchange.
3An extension which contains the endpoint’s cryptographic parameters.
4These numbers, as it is noted in the paper, are based on the round-2 submissions.

Round 2 post-quantum TLS is now supported in AWS KMS

Post Syndicated from Alex Weibel original https://aws.amazon.com/blogs/security/round-2-post-quantum-tls-is-now-supported-in-aws-kms/

AWS Key Management Service (AWS KMS) now supports three new hybrid post-quantum key exchange algorithms for the Transport Layer Security (TLS) 1.2 encryption protocol that’s used when connecting to AWS KMS API endpoints. These new hybrid post-quantum algorithms combine the proven security of a classical key exchange with the potential quantum-safe properties of new post-quantum key exchanges undergoing evaluation for standardization. The fastest of these algorithms adds approximately 0.3 milliseconds of overhead compared to a classical TLS handshake. The new post-quantum key exchange algorithms added are Round 2 versions of Kyber, Bit Flipping Key Encapsulation (BIKE), and Supersingular Isogeny Key Encapsulation (SIKE). Each organization has submitted its algorithms to the National Institute of Standards and Technology (NIST) as part of NIST’s post-quantum cryptography standardization process. This process spans several rounds of evaluation over multiple years, and is likely to continue beyond 2021.

In our previous hybrid post-quantum TLS blog post, we announced that AWS KMS had launched hybrid post-quantum TLS 1.2 with Round 1 versions of BIKE and SIKE. The Round 1 post-quantum algorithms are still supported by AWS KMS, but at a lower priority than the Round 2 algorithms. You can choose to upgrade your client to enable negotiation of Round 2 algorithms.

Why post-quantum TLS is important

A large-scale quantum computer would be able to break the current public-key cryptography that’s used for key exchange in classical TLS connections. While a large-scale quantum computer isn’t available today, it’s still important to think about and plan for your long-term security needs. TLS traffic using classical algorithms recorded today could be decrypted by a large-scale quantum computer in the future. If you’re developing applications that rely on the long-term confidentiality of data passed over a TLS connection, you should consider a plan to migrate to post-quantum cryptography before the lifespan of the sensitivity of your data would be susceptible to an unauthorized user with a large-scale quantum computer. As an example, this means that if you believe that a large-scale quantum computer is 25 years away, and your data must be secure for 20 years, you should migrate to post-quantum schemes within the next 5 years. AWS is working to prepare for this future, and we want you to be prepared too.

We’re offering this feature now, instead of waiting for standardization efforts to be complete, so you have a way to measure the potential performance impact to your applications. Offering this feature now also gives you the protection afforded by the proposed post-quantum schemes today. While we believe that the use of this feature raises the already high security bar for connecting to AWS KMS endpoints, these new cipher suites will impact bandwidth utilization and latency. Using these new algorithms could also create connection failures for intermediate systems that proxy TLS connections. We’d like to get feedback from you on the effectiveness of our implementation or any issues found so we can improve it over time.

Hybrid post-quantum TLS 1.2

Hybrid post-quantum TLS is a feature that provides the security protections of both the classical and post-quantum key exchange algorithms in a single TLS handshake. Figure 1 shows the differences in the connection secret derivation process between classical and hybrid post-quantum TLS 1.2. Hybrid post-quantum TLS 1.2 has three major differences from classical TLS 1.2:

  • The negotiated post-quantum key is appended to the ECDHE key before being used as the hash-based message authentication code (HMAC) key.
  • The text hybrid in its ASCII representation is prepended to the beginning of the HMAC message.
  • The entire client key exchange message from the TLS handshake is appended to the end of the HMAC message.
Figure 1: Differences in the connection secret derivation process between classical and hybrid post-quantum TLS 1.2
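
As a schematic Go rendering of those three differences (and nothing more: this is not the actual s2n implementation, and the classical PRF inputs are collapsed into a single placeholder seed), the combination step could be sketched as:

package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"fmt"
)

// hybridSecret sketches how the classical and post-quantum secrets are
// combined: the post-quantum shared secret is appended to the ECDHE secret to
// form the HMAC key, the ASCII string "hybrid" is prepended to the HMAC
// message, and the raw client key exchange message is appended to its end.
func hybridSecret(ecdheSecret, pqSecret, classicalSeed, clientKeyExchange []byte) []byte {
	key := append(append([]byte{}, ecdheSecret...), pqSecret...)

	msg := append([]byte("hybrid"), classicalSeed...)
	msg = append(msg, clientKeyExchange...)

	mac := hmac.New(sha256.New, key)
	mac.Write(msg)
	return mac.Sum(nil)
}

func main() {
	out := hybridSecret(
		[]byte("ecdhe-shared-secret"),
		[]byte("pq-kem-shared-secret"),
		[]byte("client-and-server-randoms"), // placeholder for the classical PRF inputs
		[]byte("raw-client-key-exchange-message"),
	)
	fmt.Printf("derived secret: %x\n", out)
}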

Some background on post-quantum TLS

Today, all requests to AWS KMS use TLS with key exchange algorithms that provide perfect forward secrecy and use one of the following classical schemes:

  • Finite Field Ephemeral Diffie-Hellman (FFDHE)
  • Elliptic Curve Ephemeral Diffie-Hellman (ECDHE)

While existing FFDHE and ECDHE schemes use perfect forward secrecy to protect against the compromise of the server’s long-term secret key, these schemes don’t protect against large-scale quantum computers. In the future, a sufficiently capable large-scale quantum computer could run Shor’s Algorithm to recover the TLS session key of a recorded classical session, and thereby gain access to the data inside. Using a post-quantum key exchange algorithm during the TLS handshake protects against attacks from a large-scale quantum computer.

The possibility of large-scale quantum computing has spurred the development of new quantum-resistant cryptographic algorithms. NIST has started the process of standardizing post-quantum key encapsulation mechanisms (KEMs). A KEM is a type of key exchange that’s used to establish a shared symmetric key. AWS has chosen three NIST KEM submissions to adopt in our post-quantum efforts:

  • Kyber
  • BIKE (Bit Flipping Key Encapsulation)
  • SIKE (Supersingular Isogeny Key Encapsulation)

Hybrid mode ensures that the negotiated key is as strong as the weakest key agreement scheme. If one of the schemes is broken, the communications remain confidential. The Internet Engineering Task Force (IETF) Hybrid Post-Quantum Key Encapsulation Methods for Transport Layer Security 1.2 draft describes how to combine post-quantum KEMs with ECDHE to create new cipher suites for TLS 1.2.

These cipher suites use a hybrid key exchange that performs two independent key exchanges during the TLS handshake. The key exchange then cryptographically combines the keys from each into a single TLS session key. This strategy combines the proven security of a classical key exchange with the potential quantum-safe properties of new post-quantum key exchanges being analyzed by NIST.

The effect of hybrid post-quantum TLS on performance

Post-quantum cipher suites have a different performance profile and bandwidth usage from traditional cipher suites. AWS has measured bandwidth and latency across 2,000 TLS handshakes between an Amazon Elastic Compute Cloud (Amazon EC2) C5n.4xlarge client and the public AWS KMS endpoint, which were both in the us-west-2 Region. Your own performance characteristics might differ, and will depend on your environment, including your:

  • Hardware–CPU speed and number of cores.
  • Existing workloads–how often you call AWS KMS and what other work your application performs.
  • Network–location and capacity.

The following graphs and table show latency measurements performed by AWS for all newly supported Round 2 post-quantum algorithms, in addition to the classical ECDHE key exchange algorithm currently used by most customers.

Figure 2 shows the latency differences of all hybrid post-quantum algorithms compared with classical ECDHE alone, and shows that compared to ECDHE alone, SIKE adds approximately 101 milliseconds of overhead, BIKE adds approximately 9.5 milliseconds of overhead, and Kyber adds approximately 0.3 milliseconds of overhead.
 

Figure 2: TLS handshake latency at varying percentiles for four key exchange algorithms

Figure 3 shows the latency differences between ECDHE with Kyber, and ECDHE alone. The addition of Kyber adds approximately 0.3 milliseconds of overhead.
 

Figure 3: TLS handshake latency at varying percentiles, with only top two performing key exchange algorithms

The following table shows the total amount of data (in bytes) needed to complete the TLS handshake for each cipher suite, the average latency, and latency at varying percentiles. All measurements were gathered from 2,000 TLS handshakes. The time was measured on the client from the start of the handshake until the handshake was completed, and includes all network transfer time. All connections used RSA authentication with a 2048-bit key, and ECDHE used the secp256r1 curve. All hybrid post-quantum tests used the NIST Round 2 versions. The Kyber test used the Kyber-512 parameter, the BIKE test used the BIKE-1 Level 1 parameter, and the SIKE test used the SIKEp434 parameter.

Item               Bandwidth (bytes)   Total handshakes   Average (ms)   p0 (ms)   p50 (ms)   p90 (ms)   p99 (ms)
ECDHE (classic)    3,574               2,000              3.08           2.07      3.02       3.95       4.71
ECDHE + Kyber R2   5,898               2,000              3.36           2.38      3.17       4.28       5.35
ECDHE + BIKE R2    12,456              2,000              14.91          11.59     14.16      18.27      23.58
ECDHE + SIKE R2    4,628               2,000              112.40         103.22    108.87     126.80     146.56

By default, the AWS SDK client performs a TLS handshake once to set up a new TLS connection, and then reuses that TLS connection for multiple requests. This means that the increased cost of a hybrid post-quantum TLS handshake is amortized over multiple requests sent over the TLS connection. You should take the amortization into account when evaluating the overall additional cost of using post-quantum algorithms; otherwise performance data could be skewed.
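
As a rough worked example: if a hybrid handshake adds about 0.3 milliseconds (as with Kyber above) and the connection is then reused for 1,000 requests, the amortized overhead works out to roughly 0.3 microseconds per request.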

AWS KMS has chosen Kyber Round 2 as its highest-priority post-quantum algorithm, with BIKE Round 2 and SIKE Round 2 next in priority order. This is because Kyber’s performance is closest to the classical ECDHE performance that most AWS KMS customers are using today and are accustomed to.

How to use hybrid post-quantum cipher suites

To use the post-quantum cipher suites with AWS KMS, you need the preview release of the AWS Common Runtime (CRT) HTTP client for the AWS SDK for Java 2.x. Also, you will need to configure the AWS CRT HTTP client to use the s2n post-quantum hybrid cipher suites. Post-quantum TLS for AWS KMS is available in all AWS Regions except for AWS GovCloud (US-East), AWS GovCloud (US-West), AWS China (Beijing) Region operated by Beijing Sinnet Technology Co. Ltd (“Sinnet”), and AWS China (Ningxia) Region operated by Ningxia Western Cloud Data Technology Co. Ltd. (“NWCD”). Since NIST has not yet standardized post-quantum cryptography, connections that require Federal Information Processing Standards (FIPS) compliance cannot use the hybrid key exchange. For example, kms.<region>.amazonaws.com supports the use of post-quantum cipher suites, while kms-fips.<region>.amazonaws.com does not.

  1. If you’re using the AWS SDK for Java 2.x, you must add the preview release of the AWS Common Runtime client to your Maven dependencies.
    <dependency>
        <groupId>software.amazon.awssdk</groupId>
        <artifactId>aws-crt-client</artifactId>
        <version>2.14.13-PREVIEW</version>
    </dependency>
    

  2. You then must configure the new SDK and cipher suite in the existing initialization code of your application:
    if(!TLS_CIPHER_PREF_KMS_PQ_TLSv1_0_2020_07.isSupported()){
        throw new RuntimeException("Post Quantum Ciphers not supported on this Platform");
    }
    
    SdkAsyncHttpClient awsCrtHttpClient = AwsCrtAsyncHttpClient.builder()
              .tlsCipherPreference(TLS_CIPHER_PREF_KMS_PQ_TLSv1_0_2020_07)
              .build();
              
    KmsAsyncClient kms = KmsAsyncClient.builder()
             .httpClient(awsCrtHttpClient)
             .build();
             
    ListKeysResponse response = kms.listKeys().get();
    

Now, all connections made to AWS KMS in supported Regions will use the new hybrid post-quantum cipher suites! To see a complete example of everything set up, check out the example application here.

Things to try

Here are some ideas about how to use this post-quantum-enabled client:

  • Run load tests and benchmarks. These new cipher suites perform differently than traditional key exchange algorithms. You might need to adjust your connection timeouts to allow for the longer handshake times or, if you’re running inside an AWS Lambda function, extend the execution timeout setting.
  • Try connecting from different locations. Depending on the network path your request takes, you might discover that intermediate hosts, proxies, or firewalls with deep packet inspection (DPI) block the request. This could be due to the new cipher suites in the ClientHello or the larger key exchange messages. If this is the case, you might need to work with your security team or IT administrators to update the relevant configuration to unblock the new TLS cipher suites. We’d like to hear from you about how your infrastructure interacts with this new variant of TLS traffic. If you have questions or feedback, please start a new thread on the AWS KMS discussion forum.

Conclusion

In this blog post, I announced support for Round 2 hybrid post-quantum algorithms in AWS KMS, and showed you how to begin experimenting with hybrid post-quantum key exchange algorithms for TLS when connecting to AWS KMS endpoints.

More info

If you’d like to learn more about post-quantum cryptography, check out:

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Alex Weibel

Alex is a Senior Software Engineer on the AWS Crypto Algorithms team. He’s one of the maintainers for Amazon’s TLS Library s2n. Previously, Alex worked on TLS termination and request proxying for S3 and the Elastic Load Balancing Service developing new features for customers. Alex holds a Bachelor of Science degree in Computer Science from the University of Texas at Austin.