Tag Archives: crypto

Sizing Up Post-Quantum Signatures

Post Syndicated from Bas Westerbaan original https://blog.cloudflare.com/sizing-up-post-quantum-signatures/


Quantum computers are a boon and a bane. Originally conceived by Manin and Feynman to simulate nature efficiently, large-scale quantum computers will speed up innovation in materials science by orders of magnitude. Consider the technical advances enabled by the discovery of new materials (with bronze, iron, steel and silicon each ascribed their own age!); quantum computers could help unlock the next age of innovation. Unfortunately, they will also break the majority of the cryptography that’s currently used in TLS to protect our web browsing. That cryptography falls into two categories:

  1. Digital signatures, such as RSA, which ensure you’re talking to the right server.
  2. Key exchanges, such as Diffie–Hellman, which are used to agree on encryption keys.

A moderately sized, stable quantum computer will easily break the signatures and key exchanges currently used in TLS using Shor’s algorithm. Luckily this can be fixed: over the last two decades, there has been great progress in so-called post-quantum cryptography. “Post quantum”, abbreviated PQ, means secure against quantum computers. Five years ago, the standards institute NIST started a public process to standardise post-quantum signature schemes and key exchanges. The outcome is expected to be announced in early 2022.

At Cloudflare, we’re not just following this process closely, but are also testing the real-world performance of PQ cryptography. In our 2019 experiment with Google, we saw that we can switch to a PQ key exchange with little performance impact. Among the NIST finalists, there are many with even better performance. This is good news, as we would like to switch to PQ key exchanges as soon as possible — indeed, an attacker could intercept sensitive data today, then keep and decrypt it years into the future using a quantum computer.

Why worry about PQ signatures today?

One would think we can take it easy with signatures for TLS: we only need to have them replaced before a large quantum computer is built. The situation, however, is more complicated.

  • The lead time to change signatures is longer. Not only do we need to change the browsers and servers, we also need to change certificate authorities (CAs) and everyone’s certificate management.
  • TLS is addicted to small and fast signatures. For the page you’re viewing, we sent six signatures: two in the certificate chain; one handshake signature; one OCSP staple; and finally two SCTs used for certificate transparency.
  • PQ signature schemes have wildly varying performance trade-offs and quirks (as we’ll see below) which stack up quickly with six signatures, which all have slightly different requirements.

One might ask: can’t we be clever and get rid of some of these signatures? We think so! For instance, we can replace the handshake signature with a smaller key exchange or suppress intermediate certificates. Such fundamental changes take years to be adopted. That is why we are also investigating the performance of plain TLS with drop-in PQ signatures.

So, what are our options?

The zoo of PQ signatures

The three finalists of the NIST competition are Dilithium, Falcon and Rainbow. In the table below we compare them against RSA and ECDSA, both of which are in common use today, and a selection of other PQ schemes that might see standardisation in the future.

[Table: size and speed comparison of RSA, ECDSA and post-quantum signature schemes]
(* There are many caveats to this table. We compare instances of PQC security level 1. Signing and verification times vary considerably by hardware platform and implementation constraints. They should be taken as a rough indication only. The signing time of Falcon512 is discussed later on. We do not list all relevant variants of the NIST alternates or promising schemes. This instance of XMSS can only sign a million messages, is stateful, requires quite a bit of storage for quick signing, is not standardised and thus far from a drop-in replacement. Rainbow has one other variant, which has smaller private keys.)

None of these PQ signatures is a clear-cut drop-in replacement. To start, all have (much) larger signatures, except Rainbow, GeMSS and SQISign. Rainbow and GeMSS have huge public keys, and SQISign is very slow.

TLS signatures

To confuse matters even more, the signatures within TLS are not all the same:

  • Online. Only the handshake signature is created with every incoming TLS connection, and so signing needs to be fast. Dilithium fits this role well.
  • Offline. All other signatures are made months or years in advance, and so signing time is not that important. This group splits into two:
    • With a public key. The certificate chain includes signatures and their public keys. Here Falcon seems most suited.
    • Without a public key. The remaining three (SCTs and OCSP staple) are just signatures. For these, Rainbow seems optimal, as its large public keys are not transmitted.

Using Dilithium, Falcon, and Rainbow together allows us to optimize for both speed and size simultaneously, which seems like a great idea. However, combining different signatures has disadvantages:

  • A security issue in the design or implementation of one of the signatures compromises the whole.
  • Clients need to implement multiple cryptographic algorithms, in this case three of them, which is troublesome for smaller devices — especially if separate hardware support is needed for each of them.

So do we really need to eke out every byte and every cycle of performance? Or can we stick to a single signature scheme for simplicity and security?

Can we pick just one?

If we stick to one signature scheme, looking just at the numbers, Falcon512 seems like a reasonable option. It needs 5kB of extra space (compared to a classical handshake), about the same as the Dilithium–Falcon–Rainbow chimera from before. Unfortunately, Falcon comes with a caveat: creating signatures efficiently requires constant-time 64-bit floating-point arithmetic. Without it, signing is 20x slower. But speed alone is not enough; the arithmetic has to run in constant time. Without that, one can slowly learn the secret key by measuring the time it takes to create a signature.

Although PCs typically have a sufficiently constant-time floating-point unit, many smaller devices do not. Thus, Falcon seems ill-suited for general-purpose online signatures.

What about Dilithium2? It needs 17KB extra — let’s find out if that makes a big difference.

Evidence by Experiment

All the different variables and constraints clearly complicate an already challenging puzzle. The best thing is to just try the options. Over the last few years several interesting papers have appeared studying the various options, such as SKD20, PST20, SKD21 and PKNLN22. These are great starts, but don’t provide a complete picture:

  • SCTs and OCSP staples have yet to be considered. Leaving half (three) of the signatures out changes the results significantly.
  • The networks tested or emulated offer insights, but are far from representative of real-world conditions. All tests were conducted between two datacenters (which does not include real-world last-mile conditions such as Wi-Fi or spotty mobile connections); or a network was simulated with unrealistic packet loss rates.

Here, Cloudflare can contribute. One of the things we like to do is to put new ideas in the community to the test on a global scale.

In this case we’re just taking a first step. Setting up a real-world experiment with a modified browser is quite involved, especially when we consider the many possible variations. Instead, we decided to investigate the most striking variable, the size, and try to answer the question:

How do larger signatures affect the TLS handshake?

There are two parts to this: how fast are they, and, more importantly, do they work at all?

Experimental setup

We need some way to emulate bigger signatures without having to modify the clients. We considered several options. The first idea we had was to pad a valid certificate with a dummy extension. That would require a custom certificate for each size to test, which is cumbersome. Then we considered responding with a dummy ServerHello extension. This is, however, not allowed by TLS 1.2 without a corresponding ClientHello extension. In the end, we went for adding dummy certificates.

Dummy certificates

These dummy certificates are 1kB self-signed invalid certificates that have nothing to do with the certificate chain. To vary the size under test, we simply add more copies. Adding unrelated certificates to the certificate chain is a common misconfiguration, and clients have learnt to ignore them. In fact, TLS 1.3 stipulates that these (in RFC-speak) SHOULD be ignored by the client. Testing out hundreds of browsers, we saw no issues.
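To illustrate the mechanism, here is a minimal sketch in Go of padding a chain this way; the padChain helper and its dummyDER input are hypothetical, not the code used in the experiment:

package experiment

import "crypto/tls"

// padChain appends n copies of an unrelated ~1kB dummy certificate (raw DER
// bytes) after the real chain. Per TLS 1.3, clients SHOULD ignore these.
func padChain(cert tls.Certificate, dummyDER []byte, n int) tls.Certificate {
	for i := 0; i < n; i++ {
		cert.Certificate = append(cert.Certificate, dummyDER)
	}
	return cert
}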

Standards and reality don’t always agree: when inserting dummy certificates on actual traffic, we saw issues with a small, but not insignificant number of clients. We don’t want to ruin anyone’s connection, and so we decided to use separate connections for this purpose.

Using challenge pages to launch probes

So what did we actually do? On a small percentage of the challenge pages (those with the CAPTCHA), we pick a number n and a random key and send this key in two separate background requests to:

  • 0.tls-size-experiment-c.cloudflareresearch.com
  • [n].tls-size-experiment-1.cloudflareresearch.com

The first, the control, is a normal webpage that stores the TLS handshake time under the key that’s been sent. The real action happens at the second, the live, which adds the n dummy certificates to its chain. The live endpoint also stores the handshake time under the given key. We could call it “experimental” instead of “live”, but the benign control connection is also an important part of the experiment: it allows us to see when live connections go missing. These endpoints were a breeze to write using Cloudflare Workers and KV.
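As an aside, the handshake-time metric is easy to reproduce from the client side with Go’s net/http/httptrace; a minimal sketch against the control endpoint (the experiment itself stores the timing at the endpoint):

package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
	"net/http/httptrace"
	"time"
)

func main() {
	var start time.Time
	var handshake time.Duration
	trace := &httptrace.ClientTrace{
		TLSHandshakeStart: func() { start = time.Now() },
		TLSHandshakeDone: func(_ tls.ConnectionState, _ error) {
			handshake = time.Since(start)
		},
	}
	req, _ := http.NewRequest("GET", "https://0.tls-size-experiment-c.cloudflareresearch.com", nil)
	req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))
	resp, err := http.DefaultTransport.RoundTrip(req)
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	resp.Body.Close()
	fmt.Println("TLS handshake took", handshake)
}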

How much dummy data to test?

Before launching the experiment, we tested several libraries and browsers against the live endpoint to see whether they would error out due to the dummy certificates. None rejected a single certificate, but how far can we go? TLS 1.3 theoretically allows a certificate chain of 16MB, but in practice many clients reject a much shorter chain. OpenSSL, for instance, rejects one of 102kB. The stingiest we found is Go’s TLS client, which rejects a handshake larger than 64kB. Because of this, we tested with between 1 and 59 dummy certificates.

Intermezzo: TCP’s congestion window

So, what did we find? The graphs are in the next section; have a peek! Before diving right in, we would like to explain a crucial concept, the TCP congestion window, that helps us read the results.

Data sent over the Internet is broken down into packets of around 1.4kB that traverse many routers to reach their destination. Sometimes a router has more incoming packets than it can handle and has to drop them: this is called congestion. To avoid causing congestion, TCP initially sends just a few packets (typically ten, so ~14kB). Then, with every acknowledgement received in return, the TCP sender will very quickly ramp up the number of packets that it keeps in flight. This number is called the congestion window (cwnd). When it gets too high, congestion occurs, packets are dropped, and in response the sender backs off by dialing down the congestion window. TCP treats any dropped packet as a sign of congestion; for this reason, Wi-Fi has its own retransmission mechanism that is transparent to TCP.
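To get a feel for these numbers, here is a small Go sketch, a back-of-the-envelope model of our own (assuming 1.4kB packets, a window that doubles every round trip, no loss, and a ~4kB classical handshake), that counts the round trips needed to deliver a handshake:

package main

import "fmt"

// roundTrips estimates how many round trips are needed to send size bytes,
// assuming 1,400-byte packets, an initial congestion window of initCwnd
// packets, and a window that doubles every round trip (slow start, no loss).
func roundTrips(size, initCwnd int) int {
	packets := (size + 1399) / 1400
	trips, cwnd := 0, initCwnd
	for packets > 0 {
		packets -= cwnd
		cwnd *= 2
		trips++
	}
	return trips
}

func main() {
	for _, extra := range []int{0, 17_000, 35_000} {
		size := 4_000 + extra // ~4kB classical handshake plus dummy data (assumed)
		fmt.Printf("%2dkB extra: %d RTTs (cwnd 10), %d RTTs (cwnd 30)\n",
			extra/1000, roundTrips(size, 10), roundTrips(size, 30))
	}
}

Under these assumptions, 35kB of extra data still fits in a single round trip with a window of 30 packets, but costs an extra round trip with the default window of ten.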

Considering all this, we would expect to see two effects with larger signatures:

  • Gentle slope. Every single packet needs some extra time to transmit, due to limited bandwidth and possible physical-layer retransmissions. This slope isn’t so gentle if your internet connection is slow or spotty.
  • cwnd wall. Once we fill the congestion window, we have to wait for a whole roundtrip before we can continue. This effect is stronger if the roundtrip time (RTT) is higher.

The strength of the two effects can differ. With a fast connection and high RTT we expect to see the graph below on the left. With a slow connection and low RTT, we expect the one on the right.

[Graphs: expected handshake time versus added signature size, for a fast connection with high RTT (left) and a slow connection with low RTT (right)]

There might be other unknown effects. The best thing is to have a look.

In PQ research, the second effect has received the most attention. The larger signatures simply do not fit in the initial congestion windows used today. A common suggestion has been to simply increase the initial congestion window to accommodate the larger signatures. This is far from a simple change to make globally, and we first have to understand whether it would solve the problem at all.

Results

Over 24 days we’ve received 964,499 live connections from 454,218 different truncated IPs (truncated to 24 bits, “/24”, for IPv4 and to 48 bits for IPv6) and 11,239 different ASNs. First, let’s check how many clients had trouble with the bigger handshakes.

Can clients handle the larger handshakes?

The control connection was missing for 2.4% of the live connections. This is not alarming: we expect some connections to go missing for harmless reasons, such as the user browsing away from the challenge page. Live connections, however, were missing significantly more often: for 3.6% of the control connections there was no corresponding live connection.

In the graph below on the left, we break the number of received live connections down by the number of dummy certificates added. Because we pick the number of certificates randomly, the graph is noisy. To get a clearer picture, we started storing the number of certificates added in the corresponding control request, which gives us the graph on the right. The bumps at 10kB and 30kB suggest that some clients or middleboxes cannot handle these handshake sizes.

[Graphs: number of live connections received, by amount of dummy data added]

Handshake times with larger signatures

What is the effect on the handshake time? The graph on the left shows the weighted median and 75th percentile TLS handshake times for different amounts of dummy data added. We use the weight so that every truncated IP contributes equally. On the right we show the slowdowns for each size, relative to the handshake time of the control connection.

[Graphs: weighted median and 75th percentile TLS handshake times (left) and relative slowdown versus the control connection (right), by amount of dummy data added]

We can see the not-so-gentle slope up to 40kB, where we hit a little wall that corresponds to Cloudflare’s default initial congestion window of 30 packets.

Adding 35kB fits within our initial congestion window. Nonetheless, the median handshake with 35kB extra is 40% slower. The slowest 10% are even worse off, taking 60% more time. Thus, even though we stay within the congestion window, the added data is not free at all.

We can now translate these insights back to concrete PQ signatures. Using Dilithium2 as a drop-in replacement, for example, requires around 17kB extra. That also fits within our initial congestion window, with a median slowdown of 20% that gets worse for the tail end of users. With the normal initial congestion window of ten packets, we expect the slowdown to be much worse: around 60–80%.

There are several caveats to point out:

  • These experiments used an initial congestion window of 30 packets instead of ten. With a smaller initial congestion window of ten, which is the default for most systems, we would expect the wall to move from 40kB to around 10kB.
  • Because of our presence all across the world, our RTTs are fairly low. Thus the effect of the cwnd wall is smaller for us.
  • Challenge pages are served, by design, to clients that we suspect to be bots. This adds a significant bias, because bots are generally hosted at well-connected providers, and so are closer to our network than typical users.
  • HTTP/3 was not supported by the server we used for the endpoint. Support for IPv6 was only added ten days into the experiment and accounts for 10.9% of the measurements.
  • Actual TLS handshakes differ in size much more than tested in this setup due to differences in certificate sizes and extensions and other factors.

What have we learned?

The TLS handshake is just one step (~5–20%) in a long chain required to show you a webpage. Casually browsing, it would be hard to notice a TLS handshake that’s 60% slower. But such differences add up. To make a website really fast, you need many seemingly insignificant speedups. Browser developers take this seriously: only in exceptional cases does Chrome allow a change that slows down any microbenchmark by even a percent.

Because of the many parties and complexities involved, we should avoid waiting too long to adopt post-quantum signatures in TLS. That’s a hard sell if it comes at the price of a double-digit slowdown, not just for content servers but also for browser vendors and clients.

A timely adoption of PQ signatures on the web would be great. Our evidence so far suggests that this will be easiest if six signatures and two public keys fit in 9kB.

We will continue our efforts to help build a post-quantum secure Internet. To follow along, keep an eye on this blog or have a look at research.cloudflare.com.

Bas Westerbaan is co-submitter of the SPHINCS+ signature scheme.

References

SKD20: Sikeridis, Kampanakis, Devetsikiotis. Assessing the Overhead of Post-Quantum Cryptography in TLS 1.3 and SSH. CoNEXT 2020.
PST20: Paquin, Stebila, Tamvada. Benchmarking Post-Quantum Cryptography in TLS. PQCrypto 2020.
SKD21: Sikeridis, Kampanakis, Devetsikiotis. Post-Quantum Authentication in TLS 1.3: A Performance Study. NDSS 2020.
PKNLN22: Paul, Kuzovkova, Lahr, Niederhagen. Mixed Certificate Chains for the Transition to Post-Quantum Authentication in TLS 1.3. To appear in AsiaCCS 2022.

Using HPKE to Encrypt Request Payloads

Post Syndicated from Miguel de Moura original https://blog.cloudflare.com/using-hpke-to-encrypt-request-payloads/


The Managed Rules team was recently given the task of allowing Enterprise users to debug Firewall Rules by viewing the part of a request that matched the rule. This makes it easier to determine what specific attacks a rule is stopping or why a request was a false positive, and what possible refinements of a rule could improve it.

The fundamental problem, though, was how to securely store this debugging data, as it may contain sensitive data such as personally identifiable information from submissions, cookies, and other parts of the request. We needed to store this data in such a way that only the user who is allowed to access it can do so. Even Cloudflare shouldn’t be able to see the data, following our philosophy that any personally identifiable information that passes through our network is a toxic asset.

This means we needed to encrypt the data in such a way that the user can decrypt it, but Cloudflare cannot. That calls for public key encryption.

Now we needed to decide on which encryption algorithm to use. We came up with some questions to help us evaluate which one to use:

  • What requirements do we have for the algorithm?
  • What language do we implement it in?
  • How do we make this as secure as possible for users?

Here’s how we made those decisions.

Algorithm Requirements

While we knew we needed to use public key encryption, we also needed to keep an eye on performance. This led us to select Hybrid Public Key Encryption (HPKE) early on as it has a best-of-both-worlds approach to using symmetric as well as public-key cryptography to increase performance. While these best-of-both-worlds schemes aren’t new [1][2][3], HPKE aims to provide a single, future-proof, robust, interoperable combination of a general key encapsulation mechanism and a symmetric encryption algorithm.

HPKE is an emerging standard developed by the Crypto Forum Research Group (CFRG), the research body that supports the development of Internet standards at the IETF. The CFRG produces specifications called RFCs (such as RFC 7748 for elliptic curves) that are then used in higher level protocols including two we talked about previously: ODoH and ECH. Cloudflare has long been a supporter of Internet standards, so HPKE was a natural choice to use for this feature. Additionally, HPKE was co-authored by one of our colleagues at Cloudflare.

How HPKE Works

HPKE combines an asymmetric algorithm such as elliptic curve Diffie-Hellman and a symmetric cipher such as AES. One of the upsides of HPKE is that the algorithms aren’t dictated to the implementer, but making a combination that’s provably secure and meets the developer’s intuitive notions of security is important. All too often developers reach for a scheme without carefully understanding what it does, resulting in security vulnerabilities.

HPKE solves these problems by providing a high level of security in a generic manner and providing necessary hooks to tie messages to the context in which they are generated. This is the application of decades of research into the correct security notions and schemes.


HPKE is built in stages. First it turns a Diffie-Hellman key agreement into a Key Encapsulation Mechanism. A key encapsulation mechanism has two algorithms: Encap and Decap. The Encap algorithm creates a symmetric secret and wraps it in a public key, so that only the holder of the private key can unwrap it. An attacker with the encapsulation cannot recover the random key. Decap takes the encapsulation and the private key associated to the public key, and computes the same random key. This translation gives HPKE the flexibility to work almost unchanged with any kind of public key encryption or key agreement algorithm.
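In code, a KEM boils down to a small interface. The sketch below uses hypothetical Go types, not any real library’s API, to show the shape HPKE builds on:

package kem

// PublicKey and PrivateKey are opaque key encodings (hypothetical types).
type PublicKey []byte
type PrivateKey []byte

// KEM captures the two operations described above. Encap creates a fresh
// shared secret plus an encapsulation that only the private-key holder can
// open; Decap recovers the same shared secret from the encapsulation.
//
// Sender:   ss, enc, _ := k.Encap(receiverPK) // derive a key, send enc
// Receiver: ss, _ := k.Decap(enc, receiverSK) // same key, nothing secret on the wire
type KEM interface {
	Encap(pk PublicKey) (sharedSecret, encapsulation []byte, err error)
	Decap(encapsulation []byte, sk PrivateKey) (sharedSecret []byte, err error)
}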

HPKE mixes this key with an optional info argument, as well as information relating to the cryptographic parameters used by each side. This ensures that attackers cannot modify messages’ meaning by taking them out of context. A postcard marked “So happy to see you again soon” is ominous from the dentist and endearing from one’s grandmother.

The specification for HPKE is open and available on the IETF website. It is on its way to becoming an RFC after passing multiple rounds of review and analysis by cryptography experts at the CFRG. HPKE is already gaining adoption in IETF protocols like ODoH, ECH, and the new Messaging Layer Security (MLS) protocol. HPKE is also designed with the post-quantum future in mind, since it is built to work with any KEM, including all the NIST finalists for post-quantum public-key encryption.

Implementation Language

Once we had an encryption scheme selected, we needed to settle on an implementation. HPKE is still fairly new, so the libraries aren’t quite mature yet. There is a reference implementation, and we’re in the process of developing an implementation in Go as part of CIRCL. However, in the absence of a clear “go to” that is widely known to be the best, we decided to go with an implementation leveraging the same language already powering much of the Firewall code running at the Cloudflare edge – Rust.

Aside from this, the language benefits from features like native primitives, and crucially the ability to easily compile to WebAssembly (WASM).

As we mentioned in a previous blog post, customers are able to generate a key pair and decrypt payloads either from the dashboard UI or from a CLI. Instead of writing and maintaining two different codebases for these, we opted to reuse the same implementation across the edge component that encrypts the payloads and the UI and CLI that decrypt them. To achieve this we compile our library to target WASM so it can be used in the dashboard UI code that runs in the browser. While this approach may yield a slightly larger JavaScript bundle size and relatively small computational overhead, we found it preferable to spending a significant amount of time securely re-implementing HPKE using JavaScript WebCrypto primitives.

The HPKE implementation we decided on comes with the caveat of not yet being formally audited, so we performed our own internal security review. We analyzed the cryptography primitives being used and the corresponding libraries. Between the composition of said primitives and secure programming practices like correctly zeroing memory and safe usage of random number generators, we found no security issues.

Making It Secure For Users

To encrypt on behalf of users, we need them to provide us with a public key. To make this as easy as possible, we built a CLI tool along with the ability to do it right in the browser. Either option allows the user to generate a public/private key pair without needing to talk to Cloudflare servers at all.

In our API, we specifically do not accept the private key of the key pair — we don’t want it! We don’t need and don’t want to be able to decrypt the data we’re storing.

For the dashboard, once the user provides the private key for decryption, the key is held in a temporary JavaScript variable and used for in-browser decryption. This means the user doesn’t have to provide the key constantly while browsing the Firewall event logs. The private key is also not persisted in any way in the browser, so any action that reloads the page, such as a refresh or navigating away, will require the user to provide the key again. We believe this is an acceptable usability compromise for better security.

How Payload Extraction Works

After deciding how to encrypt the data, we just had to figure out the rest of the feature: what data to encrypt, how to store and transmit it, and how to allow users to decrypt it.

When an HTTP request reaches the L7 Firewall, it is evaluated against a set of rulesets. Each of these rulesets contains several rules written in the wirefilter syntax.

An example of one such rule would be:

http.request.version eq "HTTP/1.1"
and
(
    http.request.uri.path matches "\n+."
    or
    http.request.uri.query matches "\x00+."
)

This expression evaluates to a boolean “true” for HTTP/1.1 requests that either contain one or more newlines followed by a character in the request path or one or more NULL bytes followed by a character in the query string.

Say we had the following request that would match the rule above:

GET /cms/%0Aadmin?action=%00post HTTP/1.1
Host: example.com

If matched data logging is enabled, the rules that match would be executed again in a special context that tags all fields that are accessed during execution. We do this second execution because this tagging adds a noticeable computational overhead, and since the vast majority of requests don’t trigger a rule at all we would be unnecessarily adding overhead to each request. Requests that do match any rules will only match a few rules as well, so we don’t need to re-execute a large portion of the ruleset.

You may notice that although http.request.uri.query matches "\x00+." evaluates to true for this request, it won’t be executed, because the expression short-circuits with the first or condition that also matches. This results in only http.request.version and http.request.uri.path being tagged as accessed:

http.request.version -> HTTP/1.1
http.request.uri.path -> /cms/%0Aadmin

Having gathered the fields that were accessed, the Firewall engine does some post-processing: removing fields that are a subset of others (e.g., the query string and the full URI), and truncating fields that exceed a certain length.

Finally, these get serialized as JSON, encrypted with the customer’s public key, serialized again as a set of bytes, and prefixed with a version number in case we need to change or update the format in the future. To simplify consumption of these blobs, our APIs display a base64-encoded version of the bytes, as in the screenshot below.
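A sketch of that pipeline in Go; the hpkeSeal stand-in, the version byte value, and the names are illustrative assumptions, not the exact production format:

package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// hpkeSeal stands in for HPKE encryption under the customer's public key.
func hpkeSeal(publicKey, plaintext []byte) []byte {
	// ... a real implementation would return the HPKE ciphertext ...
	return plaintext
}

func main() {
	matched := map[string]string{
		"http.request.version":  "HTTP/1.1",
		"http.request.uri.path": "/cms/%0Aadmin",
	}
	plaintext, _ := json.Marshal(matched)          // 1. serialize as JSON
	var customerPublicKey []byte                   //    (provided by the customer)
	blob := hpkeSeal(customerPublicKey, plaintext) // 2. encrypt with HPKE
	versioned := append([]byte{0x01}, blob...)     // 3. prefix a version (byte value assumed)
	fmt.Println(base64.StdEncoding.EncodeToString(versioned)) // 4. base64 for the API
}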

[Screenshot: base64-encoded encrypted matched data as returned by the API]

Now that we have encrypted the data at the edge and persisted it in ClickHouse, we need to allow users to decrypt it. As part of the setup of turning this feature on, users generated a key-pair: the public key which was used to encrypt the payloads and a private key which is used to decrypt them. Decryption is done completely offline via either the command line using cloudflare/matched-data-cli:

$ MATCHED_DATA=AkjQDktMX4FudxeQhwa0UPNezhkgLAUbkglNQ8XVCHYqPgAAAAAAAACox6cEwqWQpFVE2gCFyOFsSdm2hCoE0/oWKXZJGa5UPd5mWSRxNctuXNtU32hcYNR/azLjsGO668Jwk+qCdFvmKjEqEMJgI+fvhwLQmm4=
$ matched-data-cli decrypt -d $MATCHED_DATA -k $PRIVATE_KEY
{"http.request.version": "HTTP/1.1", "http.request.uri.path": "/cms/%0Aadmin"}

Or the dashboard UI:

[Screenshot: decrypting the payload in the dashboard UI]

Since our CLI tool is open-source and HPKE is interoperable, it can also be used in other tooling as part of a user’s logging pipeline, for example in security information and event management (SIEM) software.

Conclusion

This was a team effort with help from our Research and Security teams throughout the process. We relied on them for recommendations on how best to evaluate the algorithms as well as vetting the libraries we wanted to use.

We’re very pleased with how HPKE has worked out for us from an ease-of-implementation and performance standpoint. It was also an easy choice for us to make due to its impending standardization and best-of-both-worlds approach to security.

Round 2 post-quantum TLS is now supported in AWS KMS

Post Syndicated from Alex Weibel original https://aws.amazon.com/blogs/security/round-2-post-quantum-tls-is-now-supported-in-aws-kms/

AWS Key Management Service (AWS KMS) now supports three new hybrid post-quantum key exchange algorithms for the Transport Layer Security (TLS) 1.2 encryption protocol that’s used when connecting to AWS KMS API endpoints. These new hybrid post-quantum algorithms combine the proven security of a classical key exchange with the potential quantum-safe properties of new post-quantum key exchanges undergoing evaluation for standardization. The fastest of these algorithms adds approximately 0.3 milliseconds of overhead compared to a classical TLS handshake. The new post-quantum key exchange algorithms added are Round 2 versions of Kyber, Bit Flipping Key Encapsulation (BIKE), and Supersingular Isogeny Key Encapsulation (SIKE). Each organization has submitted its algorithms to the National Institute of Standards and Technology (NIST) as part of NIST’s post-quantum cryptography standardization process. This process spans several rounds of evaluation over multiple years, and is likely to continue beyond 2021.

In our previous hybrid post-quantum TLS blog post, we announced that AWS KMS had launched hybrid post-quantum TLS 1.2 with Round 1 versions of BIKE and SIKE. The Round 1 post-quantum algorithms are still supported by AWS KMS, but at a lower priority than the Round 2 algorithms. You can choose to upgrade your client to enable negotiation of Round 2 algorithms.

Why post-quantum TLS is important

A large-scale quantum computer would be able to break the current public-key cryptography that’s used for key exchange in classical TLS connections. While a large-scale quantum computer isn’t available today, it’s still important to think about and plan for your long-term security needs. TLS traffic using classical algorithms recorded today could be decrypted by a large-scale quantum computer in the future. If you’re developing applications that rely on the long-term confidentiality of data passed over a TLS connection, you should consider a plan to migrate to post-quantum cryptography before the lifespan of the sensitivity of your data would be susceptible to an unauthorized user with a large-scale quantum computer. As an example, this means that if you believe that a large-scale quantum computer is 25 years away, and your data must be secure for 20 years, you should migrate to post-quantum schemes within the next 5 years. AWS is working to prepare for this future, and we want you to be prepared too.

We’re offering this feature now instead of waiting for standardization efforts to be complete so you have a way to measure the potential performance impact to your applications. Offering this feature now also gives you the protection afforded by the proposed post-quantum schemes today. While we believe that the use of this feature raises the already high security bar for connecting to AWS KMS endpoints, these new cipher suites will impact bandwidth utilization and latency. Using these new algorithms could also create connection failures for intermediate systems that proxy TLS connections. We’d like to get feedback from you on the effectiveness of our implementation or any issues found so we can improve it over time.

Hybrid post-quantum TLS 1.2

Hybrid post-quantum TLS is a feature that provides the security protections of both the classical and post-quantum key exchange algorithms in a single TLS handshake. Figure 1 shows the differences in the connection secret derivation process between classical and hybrid post-quantum TLS 1.2. Hybrid post-quantum TLS 1.2 has three major differences from classical TLS 1.2, sketched in code below Figure 1:

  • The negotiated post-quantum key is appended to the ECDHE key before being used as the hash-based message authentication code (HMAC) key.
  • The text hybrid in its ASCII representation is prepended to the beginning of the HMAC message.
  • The entire client key exchange message from the TLS handshake is appended to the end of the HMAC message.
Figure 1: Differences in the connection secret derivation process between classical and hybrid post-quantum TLS 1.2
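To make those three differences concrete, here is a simplified Go sketch of the hybrid derivation; the function name is illustrative, and many details of the IETF draft (negotiation, the exact HMAC message contents) are elided:

package hybridtls

import (
	"crypto/hmac"
	"crypto/sha256"
)

// combineSecrets sketches the hybrid derivation: the post-quantum secret is
// appended to the ECDHE secret to form the HMAC key, and the HMAC message is
// the ASCII text "hybrid" followed by the entire client key exchange message.
func combineSecrets(ecdheSecret, pqSecret, clientKeyExchange []byte) []byte {
	key := append(append([]byte{}, ecdheSecret...), pqSecret...)
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte("hybrid"))  // label prepended to the message
	mac.Write(clientKeyExchange) // handshake message appended to the end
	return mac.Sum(nil)
}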

Some background on post-quantum TLS

Today, all requests to AWS KMS use TLS with key exchange algorithms that provide perfect forward secrecy, based on classical FFDHE or ECDHE schemes.

While existing FFDHE and ECDHE schemes use perfect forward secrecy to protect against the compromise of the server’s long-term secret key, these schemes don’t protect against large-scale quantum computers. In the future, a sufficiently capable large-scale quantum computer could run Shor’s Algorithm to recover the TLS session key of a recorded classical session, and thereby gain access to the data inside. Using a post-quantum key exchange algorithm during the TLS handshake protects against attacks from a large-scale quantum computer.

The possibility of large-scale quantum computing has spurred the development of new quantum-resistant cryptographic algorithms. NIST has started the process of standardizing post-quantum key encapsulation mechanisms (KEMs). A KEM is a type of key exchange that’s used to establish a shared symmetric key. AWS has chosen three NIST KEM submissions to adopt in our post-quantum efforts: Kyber, BIKE, and SIKE.

Hybrid mode ensures that the negotiated key is as strong as the weakest key agreement scheme. If one of the schemes is broken, the communications remain confidential. The Internet Engineering Task Force (IETF) Hybrid Post-Quantum Key Encapsulation Methods for Transport Layer Security 1.2 draft describes how to combine post-quantum KEMs with ECDHE to create new cipher suites for TLS 1.2.

These cipher suites use a hybrid key exchange that performs two independent key exchanges during the TLS handshake. The key exchange then cryptographically combines the keys from each into a single TLS session key. This strategy combines the proven security of a classical key exchange with the potential quantum-safe properties of new post-quantum key exchanges being analyzed by NIST.

The effect of hybrid post-quantum TLS on performance

Post-quantum cipher suites have a different performance profile and bandwidth usage from traditional cipher suites. AWS has measured bandwidth and latency across 2,000 TLS handshakes between an Amazon Elastic Compute Cloud (Amazon EC2) C5n.4xlarge client and the public AWS KMS endpoint, which were both in the us-west-2 Region. Your own performance characteristics might differ, and will depend on your environment, including your:

  • Hardware: CPU speed and number of cores.
  • Existing workloads: how often you call AWS KMS and what other work your application performs.
  • Network: location and capacity.

The following graphs and table show latency measurements performed by AWS for all newly supported Round 2 post-quantum algorithms, in addition to the classical ECDHE key exchange algorithm currently used by most customers.

Figure 2 shows the latency differences of all hybrid post-quantum algorithms compared with classical ECDHE alone, and shows that compared to ECDHE alone, SIKE adds approximately 101 milliseconds of overhead, BIKE adds approximately 9.5 milliseconds of overhead, and Kyber adds approximately 0.3 milliseconds of overhead.
 

Figure 2: TLS handshake latency at varying percentiles for four key exchange algorithms

Figure 3 shows the latency differences between ECDHE with Kyber, and ECDHE alone. The addition of Kyber adds approximately 0.3 milliseconds of overhead.
 

Figure 3: TLS handshake latency at varying percentiles, with only the top two performing key exchange algorithms

The following table shows the total amount of data (in bytes) needed to complete the TLS handshake for each cipher suite, the average latency, and latency at varying percentiles. All measurements were gathered from 2,000 TLS handshakes. The time was measured on the client from the start of the handshake until the handshake was completed, and includes all network transfer time. All connections used RSA authentication with a 2048-bit key, and ECDHE used the secp256r1 curve. All hybrid post-quantum tests used the NIST Round 2 versions. The Kyber test used the Kyber-512 parameter, the BIKE test used the BIKE-1 Level 1 parameter, and the SIKE test used the SIKEp434 parameter.

Item              Bandwidth (bytes)  Total handshakes  Average (ms)  p0 (ms)  p50 (ms)  p90 (ms)  p99 (ms)
ECDHE (classic)   3,574              2,000             3.08          2.07     3.02      3.95      4.71
ECDHE + Kyber R2  5,898              2,000             3.36          2.38     3.17      4.28      5.35
ECDHE + BIKE R2   12,456             2,000             14.91         11.59    14.16     18.27     23.58
ECDHE + SIKE R2   4,628              2,000             112.40        103.22   108.87    126.80    146.56

By default, the AWS SDK client performs a TLS handshake once to set up a new TLS connection, and then reuses that TLS connection for multiple requests. This means that the increased cost of a hybrid post-quantum TLS handshake is amortized over multiple requests sent over the TLS connection. You should take the amortization into account when evaluating the overall additional cost of using post-quantum algorithms; otherwise performance data could be skewed.
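As a worked example using the numbers above: spread over 100 requests on one connection, Kyber’s ~0.3 ms of extra handshake time amounts to roughly 3 microseconds per request, and even SIKE’s ~101 ms comes down to about 1 ms per request.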

AWS KMS has chosen Kyber Round 2 as its highest-priority post-quantum algorithm, with BIKE Round 2 and SIKE Round 2 next in priority order. This is because Kyber’s performance is closest to that of the classical ECDHE key exchange that most AWS KMS customers use today and are accustomed to.

How to use hybrid post-quantum cipher suites

To use the post-quantum cipher suites with AWS KMS, you need the preview release of the AWS Common Runtime (CRT) HTTP client for the AWS SDK for Java 2.x. Also, you will need to configure the AWS CRT HTTP client to use the s2n post-quantum hybrid cipher suites. Post-quantum TLS for AWS KMS is available in all AWS Regions except for AWS GovCloud (US-East), AWS GovCloud (US-West), AWS China (Beijing) Region operated by Beijing Sinnet Technology Co. Ltd (“Sinnet”), and AWS China (Ningxia) Region operated by Ningxia Western Cloud Data Technology Co. Ltd. (“NWCD”). Since NIST has not yet standardized post-quantum cryptography, connections that require Federal Information Processing Standards (FIPS) compliance cannot use the hybrid key exchange. For example, kms.<region>.amazonaws.com supports the use of post-quantum cipher suites, while kms-fips.<region>.amazonaws.com does not.

  1. If you’re using the AWS SDK for Java 2.x, you must add the preview release of the AWS Common Runtime client to your Maven dependencies.
    <dependency>
        <groupId>software.amazon.awssdk</groupId>
        <artifactId>aws-crt-client</artifactId>
        <version>2.14.13-PREVIEW</version>
    </dependency>
    

  2. You then must configure the new SDK and cipher suite in the existing initialization code of your application:
    // Imports assumed for this snippet (AWS SDK for Java 2.x; verify the
    // exact paths for your SDK version):
    //   import software.amazon.awssdk.http.async.SdkAsyncHttpClient;
    //   import software.amazon.awssdk.http.crt.AwsCrtAsyncHttpClient;
    //   import software.amazon.awssdk.services.kms.KmsAsyncClient;
    //   import software.amazon.awssdk.services.kms.model.ListKeysResponse;
    //   import static software.amazon.awssdk.crt.io.TlsCipherPreference.TLS_CIPHER_PREF_KMS_PQ_TLSv1_0_2020_07;
    if(!TLS_CIPHER_PREF_KMS_PQ_TLSv1_0_2020_07.isSupported()){
        throw new RuntimeException("Post Quantum Ciphers not supported on this Platform");
    }
    
    SdkAsyncHttpClient awsCrtHttpClient = AwsCrtAsyncHttpClient.builder()
              .tlsCipherPreference(TLS_CIPHER_PREF_KMS_PQ_TLSv1_0_2020_07)
              .build();
              
    KmsAsyncClient kms = KmsAsyncClient.builder()
             .httpClient(awsCrtHttpClient)
             .build();
             
    ListKeysResponse response = kms.listKeys().get();
    

Now, all connections made to AWS KMS in supported Regions will use the new hybrid post-quantum cipher suites! To see a complete example of everything set up, check out the example application here.

Things to try

Here are some ideas about how to use this post-quantum-enabled client:

  • Run load tests and benchmarks. These new cipher suites perform differently than traditional key exchange algorithms. You might need to adjust your connection timeouts to allow for the longer handshake times or, if you’re running inside an AWS Lambda function, extend the execution timeout setting.
  • Try connecting from different locations. Depending on the network path your request takes, you might discover that intermediate hosts, proxies, or firewalls with deep packet inspection (DPI) block the request. This could be due to the new cipher suites in the ClientHello or the larger key exchange messages. If this is the case, you might need to work with your security team or IT administrators to update the relevant configuration to unblock the new TLS cipher suites. We’d like to hear from you about how your infrastructure interacts with this new variant of TLS traffic. If you have questions or feedback, please start a new thread on the AWS KMS discussion forum.

Conclusion

In this blog post, I announced support for Round 2 hybrid post-quantum algorithms in AWS KMS, and showed you how to begin experimenting with hybrid post-quantum key exchange algorithms for TLS when connecting to AWS KMS endpoints.

More info

If you’d like to learn more about post-quantum cryptography, check out:

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Alex Weibel

Alex is a Senior Software Engineer on the AWS Crypto Algorithms team. He’s one of the maintainers for Amazon’s TLS Library s2n. Previously, Alex worked on TLS termination and request proxying for S3 and the Elastic Load Balancing Service developing new features for customers. Alex holds a Bachelor of Science degree in Computer Science from the University of Texas at Austin.

Fall 2020 RPKI Update

Post Syndicated from Louis Poinsignon original https://blog.cloudflare.com/rpki-2020-fall-update/


The Internet is a network of networks. In order to find the path between two points and exchange data, the network devices rely on the information from their peers. This information consists of IP addresses and Autonomous Systems (AS) which announce the addresses using Border Gateway Protocol (BGP).

One problem arises from this design: what protects against a malevolent peer who decides to announce incorrect information? The damage caused by route hijacks can be major.

Resource Public Key Infrastructure (RPKI) is a framework created in 2008. Its goal is to provide a source of truth for Internet Resources (IP addresses) and ASes in cryptographically signed records called Route Origin Objects (ROAs).

Recently, we’ve seen the significant threshold of two hundred thousand ROAs passed. This represents a big step in making the Internet more secure against accidental and deliberate BGP tampering.

We have talked about RPKI in the past but we thought it would be a good time for an update.

In a more technical context, the RPKI framework consists of two parts:

  • IP addresses need to be cryptographically signed by their owners in a database managed by a Trust Anchor: Afrinic, APNIC, ARIN, LACNIC and RIPE. Those five organizations are in charge of allocating Internet resources. The ROA indicates which Network Operator is allowed to announce the addresses using BGP.
  • Network operators download the list of ROAs, perform the cryptographic checks and then apply filters on the prefixes they receive: this is called BGP Origin Validation.

The “Is BGP Safe Yet” website

The launch of the website isbgpsafeyet.com to test if your ISP correctly performs BGP Origin Validation was a success. Since launch, it has been visited more than five million times from 223 countries and 13,000 unique networks (20% of the entire Internet), generating half a million BGP Origin Validation tests.

Many providers subsequently indicated on social media (for example, here or here) that they had an RPKI deployment in the works. This increase in Origin Validation by networks is increasing the security of the Internet globally.

The site’s test for Origin Validation consists of queries toward two addresses, one behind an RPKI-invalid prefix and the other behind an RPKI-valid prefix. If the query towards the invalid prefix succeeds, the test fails, as the ISP does not implement Origin Validation; a simplified version of this check is sketched below. We counted the number of queries that failed to reach invalid.cloudflare.com. This also included a few thousand RIPE Atlas tests that were started by Cloudflare and various contributors, providing coverage for smaller networks.
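A stand-alone version of that check might look as follows in Go; the valid hostname here is an assumption added for symmetry, since the post only names invalid.cloudflare.com:

package main

import (
	"fmt"
	"net/http"
	"time"
)

func main() {
	client := &http.Client{Timeout: 5 * time.Second}
	// First make sure we can reach the valid prefix at all.
	if _, err := client.Get("https://valid.cloudflare.com"); err != nil {
		fmt.Println("inconclusive: the valid prefix is unreachable")
		return
	}
	// If the RPKI-invalid prefix is reachable, the ISP accepts invalid routes.
	if _, err := client.Get("https://invalid.cloudflare.com"); err != nil {
		fmt.Println("pass: your ISP drops RPKI-invalid routes")
	} else {
		fmt.Println("fail: your ISP does not implement BGP Origin Validation")
	}
}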

Every month since launch, we’ve seen around 10 to 20 networks deploying RPKI Origin Validation. Among the major providers, we can build the following table:

Month   Networks
August  Swisscom (Switzerland), Salt (Switzerland)
July    Telstra (Australia), Quadranet (USA), Videotron (Canada)
June    Colocrossing (USA), Get Norway (Norway), Vocus (Australia), Hurricane Electric (Worldwide), Cogent (Worldwide)
May     Sengked Fiber (Indonesia), Online.net (France), WebAfrica Networks (South Africa), CableNet (Cyprus), IDnet (Indonesia), Worldstream (Netherlands), GTT (Worldwide)

With the help of many contributors, we have compiled a list of network operators and public statements at the top of the isbgpsafeyet.com page.

We excluded providers that manually blocked the traffic towards the prefix instead of using RPKI. Among the techniques we see are firewall filtering and manual prefix rejection. The filtering is often propagated to other customer ISPs. In a unique case, an ISP generated a “more-specific” blackhole route that leaked to multiple peers over the Internet.

The deployment of RPKI by major transit providers (also known as Tier 1), such as Cogent, GTT, Hurricane Electric, NTT and Telia, made many downstream networks more secure without them having to deploy validation software themselves.

Overall, looking at the evolution of successful tests per ASN, we noticed a steady increase of 8% over recent months.

[Graph: evolution of successful Origin Validation tests per ASN]

Furthermore, when we probed the entire IPv4 space this month, using a technique similar to the isbgpsafeyet.com test, many more networks were unable to reach an RPKI-invalid prefix compared to the same period last year. This confirms an increase in RPKI Origin Validation deployment across all network operators. The picture below shows, in yellow, the IPv4 space behind a network with RPKI Origin Validation enabled, and, in blue, the active space. It uses a Hilbert curve to efficiently plot IP addresses: for example, one /20 prefix (4,096 IPs) is a pixel, and a /16 prefix (65,536 IPs) forms a 4×4-pixel square.
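The distance-to-coordinates conversion behind such a plot is the textbook Hilbert curve routine; here is a Go sketch of the standard iterative algorithm (the 2^20 /20 prefixes of IPv4 fill a 1024×1024 grid):

package hilbert

// D2XY converts a distance d along a Hilbert curve covering an n×n grid
// (n a power of two) into x, y coordinates.
func D2XY(n, d int) (x, y int) {
	t := d
	for s := 1; s < n; s *= 2 {
		rx := 1 & (t / 2)
		ry := 1 & (t ^ rx)
		if ry == 0 { // rotate the quadrant
			if rx == 1 {
				x, y = s-1-x, s-1-y
			}
			x, y = y, x
		}
		x += s * rx
		y += s * ry
		t /= 4
	}
	return
}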

The more the yellow spreads, the safer the Internet becomes.

[Image: IPv4 space plotted on a Hilbert curve; yellow shows space behind RPKI Origin Validation, blue shows active space]

What does it mean exactly? If someone were hijacking a prefix, the users behind the yellow space would likely not be affected. This also applies if you mis-sign your prefixes: you would not be able to reach the services or users behind the yellow space. Once RPKI is enabled everywhere, there will only be yellow squares.

Progression of signed prefixes

Owners of IP addresses indicate which networks are allowed to announce them. They do this by signing prefixes: they create Route Origin Objects (ROAs). As of today, there are more than 200,000 ROAs. The distribution shows that the RIPE region still leads in ROA count, followed by the APNIC region.

[Graph: ROA count by Trust Anchor region]

2020 started with 172,000 records, and the count is getting close to 200,000 at the beginning of November: approximately a quarter of all Internet routes. Since last year, the database of ROAs has grown by more than 70 percent, from 100,000 records, an average pace of 5% every month.

On the following graph of unique ROA count per day, we can see two points that were each followed by a change in the ROA creation rate: 140/day, then 231/day, and, since August, 351 new ROAs per day.

It is not yet clear what caused the increase in August.

[Graph: unique ROAs per day, showing the changes in creation rate]

Free services and software

In 2018 and 2019, Cloudflare was impacted by BGP route hijacks. Both could have been avoided with RPKI. Not long after the first incident, we started signing prefixes and developing RPKI software. It was necessary to make BGP safer, and we wanted to do more than talk about it. But we also needed enough networks to deploy RPKI as well. By making deployment easier for everyone, we hoped to increase adoption.

The following is a reminder of what we built over the years around RPKI and how it grew.

OctoRPKI is Cloudflare’s open source RPKI Validation software. It periodically generates a JSON document of validated prefixes that we pass onto our routers using GoRTR. It generates most of the data behind the graphs here.

The latest version, 1.2.0, of OctoRPKI was released at the end of October. It implements important security fixes, better memory management and extended logging. It is the first validator to report detailed information about cryptographically invalid records to Sentry and performance data to distributed tracing tools.

GoRTR remains heavily used in production, including by transit providers. It can natively connect to other validators like rpki-client.

When we released our public rpki.json endpoint in early 2019, the idea was to enable anyone to see what Cloudflare was filtering.

The file is also used as a bootstrap by GoRTR, so that users can test a deployment; an example of consuming the file follows below. The file is cached in more than 200 data centers, ensuring quick and secure delivery of a list of valid prefixes, making RPKI more accessible for smaller networks and developers.
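As an illustration, a few lines of Go suffice to consume it; the URL and JSON field names below are assumptions based on the published endpoint, so check them against the live file:

package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// Minimal view of the rpki.json schema (field names assumed; verify
// against the live file).
type roaList struct {
	ROAs []struct {
		Prefix    string `json:"prefix"`
		ASN       string `json:"asn"`
		MaxLength int    `json:"maxLength"`
	} `json:"roas"`
}

func main() {
	resp, err := http.Get("https://rpki.cloudflare.com/rpki.json")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	var list roaList
	if err := json.NewDecoder(resp.Body).Decode(&list); err != nil {
		panic(err)
	}
	fmt.Println("validated ROAs:", len(list.ROAs))
}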

Between March 2019 and November 2020, the number of queries more than doubled and there are five times more networks querying this file.

The growth of queries follows approximately the rate of ROA creation (~5% per month).

[Graph: queries to the public rpki.json endpoint over time]

A public RTR server is also available at rtr.rpki.cloudflare.com. It includes a plaintext endpoint on port 8282 and an SSH endpoint on port 8283. This allows us to test new versions of GoRTR before release.

Later in 2019, we also built a public dashboard where you can see in-depth RPKI validation. With a GraphQL API, you can now explore the validation data, test a list of prefixes, or see the status of the current routing table.

[Screenshot: the public RPKI validation dashboard]

Currently, the API is used by BGPalerter, an open-source tool that detects routing issues (including hijacks!) from a stream of BGP updates.

Additionally, starting in November, you can access historical data going back to May 2019. Data is computed daily and contains the unique records. The team behind the dashboard worked hard to provide a fast and accurate visualization of the daily ROA changes and the volume of files changed over the day.

[Screenshot: historical view of daily ROA changes on the dashboard]

The future

We believe RPKI is going to continue growing, and we would like to thank the hundreds of network engineers around the world who are making the Internet routing more secure by deploying RPKI.

25% of routes are signed and 20% of the Internet is doing origin validation, and those numbers grow every day. We believe BGP will be safer well before we reach 100% deployment; for instance, once the remaining transit providers enable Origin Validation, it is unlikely a BGP hijack will make it to the front page of world news outlets.

While difficult to quantify, we believe that critical mass of protected resources will be reached in late 2021.

We will keep improving the tooling; OctoRPKI and GoRTR are open-source and we welcome contributions. In the near future, we plan on releasing a packaged version of GoRTR that can be directly installed on certain routers. Stay tuned!

NTS is now an RFC

Post Syndicated from Watson Ladd original https://blog.cloudflare.com/nts-is-now-rfc/


Earlier today the document describing Network Time Security for NTP officially became RFC 8915. This means that Network Time Security (NTS) is officially part of the collection of protocols that makes the Internet work. We’ve changed our time service to use the officially assigned port of 4460 for NTS key exchange, so you can use our service with ease. This is big progress towards securing a ubiquitous Internet protocol.

Over the past months we’ve seen many users of our time service, but very few using Network Time Security. This leaves computers vulnerable to attacks that imitate the server they use to obtain NTP time. Part of the problem was the lack of available NTP daemons that supported NTS. That problem is now solved: chrony and ntpsec both support NTS.
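For example, with chrony (which added NTS support in version 4.0), opting in takes a single directive; the paths and version here are assumptions to check against your distribution:

# In /etc/chrony/chrony.conf (chrony 4.0+), point at an NTS-capable server:
server time.cloudflare.com iburst nts

# Then verify that NTS is actually in use:
$ chronyc -N authdata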

Time underlies the security of many of the protocols such as TLS that we rely on to secure our online lives. Without accurate time, there is no way to determine whether or not credentials have expired. The absence of an easily deployed secure time protocol has been a problem for Internet security.

Without NTS or symmetric key authentication, there is no guarantee that your computer is actually talking NTP with the computer you think it is. Symmetric key authentication is difficult and painful to set up, but until recently it was the only secure and standardized mechanism for authenticating NTP. NTS uses the work that has gone into the Web Public Key Infrastructure to authenticate NTP servers and ensure that when you set up your computer to talk to time.cloudflare.com, that’s the server your computer gets the time from.

Our involvement in developing and promoting NTS included making a specialized server and releasing the source code, participating in the standardization process, and a lot of work with implementers to hunt down bugs. We also set up our time service with support for NTS from the beginning, and it has been a useful resource for implementers to test interoperability.

[Diagram: NTS operation]

When Cloudflare rolled out support for TLS 1.3, browsers were actively updating, and so deployment quickly took hold. However, the long tail of legacy installs and extended support releases slowed adoption. Similarly, until Let’s Encrypt made encryption easy for web servers, most web traffic was not encrypted.

By contrast, ssh quickly displaced telnet as the way to access remote systems: the security benefits were substantial, and the experience was better. Adoption of protocols is slow, but when there is a real security need, it can be much faster. NTS is a real security improvement that is vital to adopt. We’re proud to continue making the Internet a better place by supporting secure protocols.

We hope that operating systems will incorporate NTS support and TLS 1.3 in their supplied NTP daemons. We also urge administrators to deploy NTS as quickly as possible, and NTP server operators to adopt NTS. With certificates provided by Let’s Encrypt, this is simpler than it has been in the past.

We’re continuing our work in this area with the continued development of the Roughtime protocol for even better security as well as engagement with the standardization process to help develop the future of Internet time.

Cloudflare allows any device to point to time.cloudflare.com, and our service supports NTS. Just as our Universal SSL made it easy for any website to get the security benefits of TLS, our time service makes it easy for any computer to get the benefits of secure time.