Tag Archives: Cryptography

The quantum solace and spectre

Post Syndicated from Sofía Celi original https://blog.cloudflare.com/quantum-solace-and-spectre/

Not only is the universe stranger than we think, but it is stranger than we can think
Werner Heisenberg

Even for a physicist as renowned as Heisenberg, the universe was strange. And it was strange because several phenomena could only be explained through the lens of quantum mechanics. This field changed the way we understood the world, challenged our imagination, and, since the Fifth Solvay Conference in 1927, has been integrated into every explanation of the physical world (it is, to this day, our best description of the inner workings of nature). Quantum mechanics created a rift: every physical phenomenon (from the most micro to the most macro) stopped being explained only by classical physics and started to be explained by quantum mechanics. There is another world in which quantum mechanics has not yet created this rift: the realm of computers (note, though, that manufacturers have been affected by quantum effects for a long time). That is about to change.

In the 80s, several physicists (including, for example, Richard Feynman and Yuri Manin) asked themselves these questions: are there computers that can, with high accuracy and in a reasonable amount of time, simulate physics? And, specifically, can they simulate quantum mechanics? These two questions launched a long line of work that we now call “quantum computing”. Note that you can simulate quantum phenomena with classical computers, but a quantum computer can simulate them much faster.

Any computer is already a quantum system, but a quantum computer uses the true computational power that quantum mechanics offers. It solves some problems much faster than a classical computer. Any computational problem solvable by a classical computer is also solvable by a quantum computer, as any physical phenomenon (such as the operations performed in classical computers) can be described using quantum mechanics. Conversely, any problem solvable by a quantum computer is also solvable by a classical computer: quantum computers provide no additional power over classical computers in terms of what they could theoretically compute. Quantum computers cannot solve undecidable problems, and they do not disprove the Church–Turing thesis. But they are faster for specific problems! Let’s explore now why this is the case.

In a classical computer, data is stored in bits: 0 or 1, current on or current off, true or false. Nature is not that binary: any physical bit (current or not; particle or not; magnetized or not) is in fact a qubit and exists on a spectrum (or rather a sphere) of superpositions between 0 and 1. We have mentioned two terms that might be new to the reader: qubit and superposition. Let’s first define the second term.

A superposition is a normalized linear combination of all the possible configurations of something. What is this something? If we take the Schrödinger equation to explain, “for any isolated region of the universe that you want to consider, that equation describes the evolution in time of the state of that region, which we represent as a normalized linear combination — a superposition — of all the possible configurations of elementary particles in that region”, as Aaronson put it. Something, then, refers to all elementary particles. Does it mean, then, that we all are in a superposition? The answer to that question is open to many interpretations (for a nice introduction on the topic, one can read “Quantum Computing Since Democritus” by Scott Aaronson).

As you see, we are no longer concerned with matter, energy, or waves (as in classical physics), but rather with information and probabilities. In a quantum system, we first initialize information (like a bit) to some “easy” states, like 0 and 1. This information will be in a superposition of both states at the same time rather than just one or the other. If we examine a classical bit, we will get either the state 0 or the state 1. Instead, with a qubit (quantum information) we can only get results with some probability. A qubit can exist in a continuum of states between |0⟩ and |1⟩ (using Dirac’s notation), all waiting to happen with some probability until it is observed.

A qubit stores a combination of two or more states.
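To make the probabilistic behaviour concrete, here is a minimal sketch in Go of a single qubit held as two complex amplitudes, together with a simulated measurement; the Qubit type and all names are ours, and it models only the measurement statistics, not interference.

package main

import (
    "fmt"
    "math"
    "math/cmplx"
    "math/rand"
)

// Qubit holds the amplitudes of the states 0 and 1; |a0|^2 + |a1|^2 must be 1.
type Qubit struct{ a0, a1 complex128 }

// Measure collapses the qubit: it returns 0 with probability |a0|^2 and 1 otherwise.
func (q Qubit) Measure(r *rand.Rand) int {
    p0 := cmplx.Abs(q.a0) * cmplx.Abs(q.a0)
    if r.Float64() < p0 {
        return 0
    }
    return 1
}

func main() {
    r := rand.New(rand.NewSource(42))
    // An equal superposition of 0 and 1: amplitudes 1/sqrt(2) each.
    q := Qubit{a0: complex(1/math.Sqrt2, 0), a1: complex(1/math.Sqrt2, 0)}
    var counts [2]int
    for i := 0; i < 1000; i++ {
        counts[q.Measure(r)]++
    }
    fmt.Println("zeros:", counts[0], "ones:", counts[1]) // roughly 500 each
}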

In day-to-day life, we don’t see this quantum weirdness because we’re looking at huge jittery ensembles of qubits where this weirdness gets lost in noise. However, if we turn the temperature way down and isolate some qubits from noise, we can do some proper quantum computation. For instance, to find a 256-bit decryption key, we can set 256 qubits to a superposition of all possible keys. Then, we can run the decryption algorithm with this superposed register as the key. We end up with a superposition of many failures and one success. We did 2²⁵⁶ computations in a single go! Unfortunately, we can’t read out this superposition. In fact, if we open up the quantum computer to inspect the qubits inside, the outside noise will collapse the superposition into one outcome, which will most certainly be a failure. The real art of quantum computing is interfering this superposition with itself in a clever way to amplify out the answer we want. For the case we are talking about, Lov Grover figured out how to do it. Unfortunately, it requires about 2¹²⁸ steps: a nice, but not exponential, speed-up.
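To put numbers on that speed-up: Grover's search needs roughly (π/4)·√N evaluations of the function being searched. The short Go sketch below computes that count for N = 2²⁵⁶ keys; the precision settings are arbitrary choices of ours.

package main

import (
    "fmt"
    "math"
    "math/big"
)

func main() {
    // N = 2^256 possible 256-bit keys.
    n := new(big.Float).SetPrec(300).SetInt(new(big.Int).Lsh(big.NewInt(1), 256))

    // Grover's optimal iteration count is about (pi/4) * sqrt(N).
    iters := new(big.Float).SetPrec(300).Sqrt(n)
    iters.Mul(iters, big.NewFloat(math.Pi/4))

    // Prints roughly 2.7e+38, i.e. on the order of 2^128 quantum steps.
    fmt.Printf("Grover iterations for a 256-bit key: %.3g\n", iters)
}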

Quantum computers do not provide any additional power over classical computers in terms of computability: they don’t solve unsolvable problems. But they provide efficiency: they find solutions faster, even for problems that are too “expensive” for a non-quantum (classical) computer. A classical computer can solve any known computational problem if it is given enough time and resources. But, on some occasions, it will need so much time and so many resources that trying is deemed infeasible: waiting millions of years with millions of computers working at the same time on one problem is not something we consider feasible. This idea of efficient and inefficient algorithms was made mathematically precise by the field of computational complexity. Roughly speaking, an efficient algorithm is one which runs in time polynomial in the size of the problem solved.

We can look at this differently. A classical computer can, in principle, represent the many quantum states, but because the states interfere with each other, statistical sampling is not enough to capture them: we have to track every possible configuration that the quantum system can be in. Tracking every possible configuration is theoretically possible for a classical computer, but would exceed the memory capacity of even the most powerful classical computers. So, we need quantum computers to efficiently solve these problems.

Quantum computers are powerful. They can efficiently simulate quantum mechanics experiments, and are able to perform a number of mathematical, optimization, and search tasks faster than classical computers. A sufficiently scaled quantum computer can, for example, factor large integers efficiently, as was shown by Shor. They also have significant advantages when it comes to searching unstructured databases or solving the discrete logarithm problem. These advances are the solace: they bring efficient solutions to computational problems that seemed stalled. But they are also a spectre, a threat: the problems they efficiently solve are the core of much of the cryptography in use today. They can, hence, break most of the secure connections we use nowadays. For a longer explanation of the matter, see our past blog post.

The realization of large-scale quantum computers will pose a threat to our connections and infrastructure. This threat is urgent and acts as a spectre of the future: an attacker can record and store secured traffic now and decrypt it later when a quantum computer exists. The spectre is ever more present if one recalls that migrating applications, architectures and protocols is a long procedure. Migrating these services to quantum-secure cryptography will take a lot of time.

In our other blog posts, we talk at length about what quantum computers are and what threats they pose. In this one, we want to give an overview of the advances in quantum computing and their threats to cryptography, as well as an update to what is considered quantum-secure cryptography.

The new upcoming decade of quantum computing

In our previous blog post, from mid-2019, we mainly covered improvements that reduce the cost of factoring integers and computing discrete logarithms. But recent years have brought new advances to the field. Google, for example, is aiming to build a “useful, error-corrected quantum computer” by the end of the decade. Their goal is to be able to better simulate nature and, specifically, to better simulate molecules. This research can then be used, for example, to create better batteries or generate better-targeted medicines.

Why will it take them (and, generally, anyone) so long to create this quantum computer? Classical models, like Newtonian physics or classical computers, provide great detail and prediction of events and phenomena at a macroscopic level, which makes them perfect for our everyday computational needs. Quantum mechanics, on the other hand, is better manifested and understood at the microscopic level. Quantum phenomena are, furthermore, very fragile: uncontrolled interaction with the environment can make quantum features disappear, a process referred to as decoherence. Due to this, quantum states have to be properly prepared, controlled, and isolated from their environment. Furthermore, measuring a quantum phenomenon disturbs its state, so reliable methods of computing with this information have to be developed.

Creating a computer that achieves computational tasks under these constraints is daunting. In fact, it may ultimately be impossible to completely eliminate errors in the computation: a certain amount of them will always persist. Reliable computation seems to be achievable in a quantum computational model only by employing error correction.

So, when will we know that a quantum device is ready for the “quantum” use cases? One can use the idea of “quantum supremacy”, which is defined as the ability of a quantum device to perform a computational task that would be practically impossible (in terms of efficiency and usage of computational power) for classical computers. Nevertheless, this idea can be criticized, as classical algorithms improve over time, and there is no clear definition of how to compare quantum computations with their classical counterparts (one must show that no classical means would allow us to perform the same computational task in comparable time) in order to say that the quantum computation of the task is indeed faster.

A molecule can indeed be claimed to be “quantum supreme” if researchers find it computationally prohibitive (unrealistically expensive) to solve the Schrödinger equation and calculate its ground state. That molecule constitutes a “useful quantum computer, for the task of simulating itself.” But, for computer science, this individual instance is not enough: “for any one molecule, the difficulty in simulating it classically might reflect genuine asymptotic hardness, but it might also reflect other issues (e.g., a failure to exploit special structure in the molecule, or the same issues of modeling error, constant-factor overheads, and so forth that arise even in simulations of classical physics)”, as stated by Aaronson and Chen.

By the end of 2019, Google argued they had reached quantum supremacy (with the quantum processor named “Sycamore”), but their claim was debated, mainly due to the perceived lack of criteria to properly define when “quantum supremacy” is reached¹. Google claimed that “a state-of-the-art supercomputer would require approximately 10,000 years to perform the equivalent task”, and it was refuted by IBM by saying that “an ideal simulation of the same task can be performed on a classical system in 2.5 days and with far greater fidelity”. Regardless of this lack of criteria, the result is in itself a remarkable achievement and shows the great interest of many parties in creating a quantum computer. Last November, it was claimed that the most challenging sampling task performed on Sycamore can be accomplished within one week on the Sunway supercomputer. A crucial point has to be understood on the matter, though: “more like a week still seems to be needed on the supercomputer”, as noted by Aaronson. Quantum supremacy, then, is still not a settled matter.

Google’s quantum computer has already been applied to simulating real physical phenomena. Early in 2021, it was used to demonstrate a “time crystal” (a quantum system of particles that moves in a regular, repeating pattern without dissipating energy into the environment). On the other hand, by late 2021, IBM unveiled its 127-qubit chip Eagle, but we don’t yet know how much gate fidelity (how close the implemented operations are to the ideal ones) it really attains, as outlined by Aaronson in his blog post.

What do all of these advances mean for cryptography? Does it mean that soon all secure connections will be broken? If not “soon”, when will it be? The answers to all of these questions are complex and unclear. The Quantum Threat Timeline Report of 2020 suggests that the majority of researchers on the subject think that a “cryptography-breaking” quantum computer will arrive in the next 15 to 40 years². It is worth noting that these predictions are more optimistic than those in the same report of 2019, which suggests that, due to the advances in the quantum computing field, researchers feel more assured of its imminent arrival.

The upcoming decades of breaking cryptography

Quantum computing will shine a light on new areas of development and advances, but it will also light up a darker side: faster mechanisms to solve mathematical problems and break cryptography. The metaphor of dawn can be used to show how close we are to quantum computers breaking some types of cryptography (how far we are from that twilight), which would be catastrophic for the privacy, communication and security needs we have.

It is worth noting again, though, that the advent of quantum computing is not a bad scenario in itself. Quantum computing will bring much progress in the form of advances to the understanding and mimicking of the fundamental way in which nature interacts. The inconvenience is that quantum computers break most of the cryptographic algorithms we use nowadays. They also have their limits: they cannot solve every problem, and in cryptography there are algorithms that will not be broken by them.

Let’s start by defining first what dawn is. Dawn is the arrival of stable quantum computers that break cryptography in a reasonable amount of time and space (explicitly: breaking the elliptic curve discrete logarithm problem underlying the P-256 algorithm; although some research defines this as breaking RSA-2048). It is also the time by which a new day arrives: a day that starts with new advances in medicine, for example, by better mimicking molecules’ nature with a computer.

How close to dawn are we? To answer this question, we have to look at the recent advances in the aim of breaking cryptography with a quantum computer. One of the most interesting advances is that the number 15 was factored by a quantum computer in 2001 (20 years ago already) and the number 21 in 2012. While this does not seem so exciting to us (we can factor these numbers in our heads), it is astonishing that a quantum computer was stable enough to do so.

Why do we care about computers factoring or not factoring integers? The reason is that factoring is the mathematical base of certain public key cryptography algorithms (like RSA): if large integers can be factored efficiently, the algorithm is broken.

Taking a step further, quantum computers can also attack symmetric encryption algorithms, such as AES, by using efficient quantum searching algorithms for the task. This searching algorithm, Grover’s algorithm, solves the task of function inversion, which speeds up collision and preimage attacks (attacks that you don’t want sped up against your encryption algorithms). These attacks are related to hash functions: a collision attack is one that tries to find two messages that produce the same hash; a preimage attack is one that tries to find a message given its hash value. A smart evolution of this algorithm, using hidden variables as part of a computational model, would theoretically make searching even more efficient.
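To make concrete the kind of unstructured search Grover speeds up, here is a toy classical preimage search in Go against a 24-bit truncation of SHA-256; the truncation length and counter encoding are illustrative choices of ours. Classically this takes about 2²⁴ hash evaluations on average; Grover would need on the order of 2¹² quantum evaluations for the same target.

package main

import (
    "crypto/sha256"
    "encoding/binary"
    "fmt"
)

// findPreimage brute-forces a message whose SHA-256 digest starts with the
// given 3 bytes: a plain, unstructured search over candidate messages.
func findPreimage(target [3]byte) uint64 {
    var msg [8]byte
    for i := uint64(0); ; i++ {
        binary.LittleEndian.PutUint64(msg[:], i)
        h := sha256.Sum256(msg[:])
        if h[0] == target[0] && h[1] == target[1] && h[2] == target[2] {
            return i
        }
    }
}

func main() {
    // Takes a few seconds classically (~2^24 hashes on average).
    fmt.Println("found preimage at counter:", findPreimage([3]byte{0, 0, 0}))
}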

So what are the tasks needed for a quantum computer to break cryptography? It should:

  • Be able to factor big integers, or,
  • Be able to efficiently search for collisions or pre-images, or,
  • Be able to solve the discrete logarithm problem.

Why do we care about these quantum computers breaking cryptographic algorithms that we rarely hear of? What has it to do with the connections we make? Cryptographic algorithms are the base of the secure protocols that we use today. Every time we visit a website, we are using a broad suite of cryptographic algorithms that preserve the security, privacy and integrity of connections. Without them, we don’t have the properties we are used to when connecting over digital means.

What can we do? Do we now stop using digital means altogether? There are solutions. Of course, the easy answer would be, for example, to increase the size of the integers so that a quantum computer cannot feasibly factor them. But would the result really be a usable algorithm? How slow would our connections be if we had to increase these parameters so drastically?
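As a rough illustration of why simply growing the numbers does not scale, the Go sketch below times RSA key generation at a few modulus sizes with the standard library; the sizes are our choice, and key generation is only one of the operations that slows down as parameters grow.

package main

import (
    "crypto/rand"
    "crypto/rsa"
    "fmt"
    "time"
)

func main() {
    // Each step up in modulus size makes key generation (and every
    // private-key operation) noticeably slower; the sizes a quantum
    // computer could not factor would be far beyond practical use.
    for _, bits := range []int{2048, 3072, 4096} {
        start := time.Now()
        if _, err := rsa.GenerateKey(rand.Reader, bits); err != nil {
            panic(err)
        }
        fmt.Printf("RSA-%d key generation took %v\n", bits, time.Since(start))
    }
}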

As we get closer to twilight, we also get closer to the dawn of new beginnings, the rise of post-quantum cryptography.

The rise and state of post-quantum cryptography

As dawn approaches, researchers everywhere have been busy looking for solutions. If the mathematical underpinnings of an algorithm are what is broken by a quantum computer, can we just change them? Are there mathematical problems that are hard even for quantum computers? The answer is yes and, even more interestingly, we already know how to create algorithms out of them.

The mathematical problems that we use in this new upcoming era, called post-quantum, are of many kinds:

  • Lattices, using the learning with errors (LWE) or ring learning with errors (rLWE) problems.
  • Isogenies, which rely on properties of supersingular elliptic curves and supersingular isogeny graphs.
  • Multivariate cryptography, which relies on the difficulty of solving systems of multivariate equations.
  • Code-based cryptography, which relies on the theory of error-correcting codes.

With these mathematical constructions, we can build algorithms that can be used to protect our connections and data. How do we know, nevertheless, that the algorithms are safe? Even if a mathematical construction that an algorithm uses is safe from the quantum spectre, is the interaction with other parts of the system or in regard to adversaries safe? Who decides on this?

Because it is difficult to answer these questions in isolation, the National Institute of Standards and Technology of the United States has been running, since 2016, a post-quantum standardization process that will define how these algorithms work, their design, what properties they provide and how they will be used. We now have a set of finalists that are close to being standardized. The process standardizes along two tracks: algorithms for key encapsulation mechanisms and public-key encryption (KEMs), and algorithms for authentication (mainly, digital signatures). The current finalists are:

For the public-key encryption and key-establishment algorithms:

  • Classic McEliece, a code-based cryptography scheme.
  • CRYSTALS-KYBER, a lattice based scheme.
  • NTRU, a lattice based scheme.
  • SABER, a lattice based scheme.

For the digital signature algorithms:

  • CRYSTALS-DILITHIUM, a lattice based scheme.
  • FALCON, a lattice based scheme.
  • Rainbow, a multivariate cryptography scheme.

What do these finalists show us? It seems that we can rely on the security given by lattice-based schemes (which is no slight to the security of any of the other schemes). It also shows us that, in the majority of cases, we prioritize schemes that can be used in the same way cryptography is used nowadays: with fast computation times and small parameter sizes. The migration to post-quantum cryptography is not only about using safe mathematical constructions and secure algorithms, but also about how they integrate into the systems, architectures, protocols, and expected speed and space efficiencies we currently depend on. More research seems to be needed on the signatures/authentication front and on the inclusion of new, innovative signature schemes (like SQISign or MAYO), which is why NIST will be calling for a fourth round of its competition.
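As a taste of how familiar the calling code stays, here is a minimal sketch of a key encapsulation round trip with Kyber768 using CIRCL (the Cloudflare library discussed later in this archive); the package path and the GenerateKeyPair/Encapsulate/Decapsulate methods reflect CIRCL's kem interface as we understand it, so check the documentation of the version you use.

package main

import (
    "bytes"
    "fmt"

    "github.com/cloudflare/circl/kem/kyber/kyber768"
)

func main() {
    scheme := kyber768.Scheme()

    // The receiver generates a key pair and publishes the public key.
    pk, sk, err := scheme.GenerateKeyPair()
    if err != nil {
        panic(err)
    }

    // The sender encapsulates against the public key, obtaining a
    // ciphertext to send and a locally kept shared secret.
    ct, sharedSender, err := scheme.Encapsulate(pk)
    if err != nil {
        panic(err)
    }

    // The receiver decapsulates the ciphertext and recovers the same secret.
    sharedReceiver, err := scheme.Decapsulate(sk, ct)
    if err != nil {
        panic(err)
    }

    fmt.Println("shared secrets match:", bytes.Equal(sharedSender, sharedReceiver))
}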

It is worth noting that there exists another flavor of upcoming cryptography called “quantum cryptography”, but its efficient dawn seems to be far away (though some Quantum Key Distribution — QKD — protocols are being deployed now). Quantum cryptography uses quantum mechanical properties to perform cryptographic tasks. It has mainly focused on QKD, but it seems to be too inefficient to be used in practice outside very limited settings. And even though, in theory, these schemes should be perfectly secure against any eavesdropper, in practice the physical machines may still fall victim to side-channel attacks.

Is dawn, then, the collapse of cryptography? It can be, on one hand, as it is the collapse of most classical cryptography. But, on the other hand, it is also the dawn of new, post-quantum cryptography and the start of quantum computing for the benefit of the world. Will the process of reaching dawn be simple? Certainly, not. Migrating systems, architectures and protocols to new algorithms is an intimidating task. But we at Cloudflare and as a community are taking the steps, as you will read in other blog posts throughout this week.

References:

¹ In the original paper, the estimate of 10,000 years is based on the observation that there is not enough Random Access Memory (RAM) to store the full state vector in a Schrödinger-type simulation, so a hybrid Schrödinger–Feynman algorithm is needed, which is memory-efficient but exponentially more computationally expensive. In contrast, IBM used a Schrödinger-style classical simulation approach that uses both RAM and hard drive space.
² This refers to a 50% or greater likelihood of “cryptography-breaking” quantum computers arriving. Slightly more than half of the researchers answered that they will arrive in the next 15 years. About 86% of the researchers answered that they will arrive in the next 20 years. All but one of the researchers answered that they will arrive in the next 30 years.
³ Although, in practice, this seems difficult to attain.

Breaking 256-bit Elliptic Curve Encryption with a Quantum Computer

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2022/02/breaking-245-bit-elliptic-curve-encryption-with-a-quantum-computer.html

Researchers have calculated the quantum computer size necessary to break 256-bit elliptic curve public-key cryptography:

Finally, we calculate the number of physical qubits required to break the 256-bit elliptic curve encryption of keys in the Bitcoin network within the small available time frame in which it would actually pose a threat to do so. It would require 317 × 10⁶ physical qubits to break the encryption within one hour using the surface code, a code cycle time of 1 μs, a reaction time of 10 μs, and a physical gate error of 10⁻³. To instead break the encryption within one day, it would require 13 × 10⁶ physical qubits.

In other words: no time soon. Not even remotely soon. IBM’s largest ever superconducting quantum computer is 127 physical qubits.

Integrity Guarantees of Blockchains In Case of Single Owner Or Colluding Owners

Post Syndicated from Bozho original https://techblog.bozho.net/integrity-guarantees-of-blockchains-in-case-of-single-owner-or-colluding-owners/

The title may sound like a paper title rather than a blog post, because it was originally an idea for one, but I’m unlikely to find the time to write a proper paper about it, so here it is – a blog post.

Blockchain has been touted as the ultimate integrity guarantee – if you “have blockchain”, nobody can tamper with your data. Of course, reality is more complicated, and even in the most distributed of ledgers, there are known attacks. But most organizations that are experimenting with blockchain rely on a private network, sometimes having themselves as the sole owner of the infrastructure, and sometimes sharing it with just a few partners.

The point of having the technology in the first place is to guarantee that once collected, data cannot be tampered with. So let’s review how that works in practice.

First, we have to define two terms – “tamper-resistant” (sometimes referred to as tamper-free) and “tamper-evident”. “Tamper-resistant” means nobody can ever tamper with the data, and the state of the data structure is always guaranteed to be without any modifications. “Tamper-evident”, on the other hand, means that a data structure can be validated for integrity violations, and it will be known that there have been modifications (alterations, deletions or back-dating of entries). Therefore, with tamper-evident structures you can prove that the data is intact, but if it’s not intact, you can’t know the original state. It’s still a very important property, as the ability to prove that data is not tampered with is crucial for compliance and legal aspects.

Blockchain is usually built on top of several main cryptographic primitives: cryptographic hashes, hash chains, Merkle trees, cryptographic timestamps and digital signatures. They all play a role in the integrity guarantees, but the most important ones are the Merkle tree (with all of its variations, like the Patricia Merkle tree) and the hash chain. The original Bitcoin paper describes a blockchain as a hash chain based on the roots of multiple Merkle trees (each of which forms a single block). Some blockchains rely on a single, ever-growing Merkle tree, but let’s not get into particular implementation details.
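To make the two central primitives concrete, here is a minimal Go sketch of a hash chain head and a binary Merkle root over a list of records; it is an illustration only, since real blockchains add block headers, timestamps and signatures on top.

package main

import (
    "crypto/sha256"
    "fmt"
)

// chainHead folds records into a hash chain: h_i = SHA256(h_{i-1} || record_i).
func chainHead(records [][]byte) [32]byte {
    var h [32]byte
    for _, r := range records {
        h = sha256.Sum256(append(h[:], r...))
    }
    return h
}

// merkleRoot hashes each record into a leaf and repeatedly pairs and hashes
// the nodes (duplicating the last one on odd levels) until one root remains.
func merkleRoot(records [][]byte) [32]byte {
    level := make([][32]byte, len(records))
    for i, r := range records {
        level[i] = sha256.Sum256(r)
    }
    for len(level) > 1 {
        if len(level)%2 == 1 {
            level = append(level, level[len(level)-1])
        }
        next := make([][32]byte, 0, len(level)/2)
        for i := 0; i < len(level); i += 2 {
            next = append(next, sha256.Sum256(append(level[i][:], level[i+1][:]...)))
        }
        level = next
    }
    return level[0]
}

func main() {
    records := [][]byte{[]byte("event-1"), []byte("event-2"), []byte("event-3")}
    fmt.Printf("chain head:  %x\n", chainHead(records))
    fmt.Printf("merkle root: %x\n", merkleRoot(records))
}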

In all cases, blockchains are considered tamper-resistant because they are distributed widely enough that a sufficient number of members have a copy of the data. If some node modifies that data, e.g. 5 blocks in the past, it has to prove to everyone else that this is the correct Merkle root for that block. You have to have more than 50% of the network capacity in order to do that (and it’s more complicated than just having it), but it’s still possible. In a way, tamper resistance = tamper evidence + distributed data.

But many of the practical applications of blockchain rely on private networks, serving one or several entities. They are often based on proof of authority, which means whoever has access to a set of private keys controls what the network agrees on. So let’s review the two cases:

  • Multiple owners – in case of multiple node owners, several of them can collude to rewrite the chain. The collusion can be based on mutual business interest (e.g. in a supply chain, several members may team up against the producer to report distorted data), or can be based on security compromise (e.g. multiple members are hacked by the same group). In that case, the remaining node owners can have a backup of the original data, but finding out whether the rest were malicious or the changes were a legitimate part of the business logic would require a complicated investigation.
  • Single owner – a single owner can have a nice Merkle tree or hash chain, but an admin with access to the underlying data store can regenerate the whole chain and it will look legitimate, while in reality it will be tampered with. Splitting access between multiple admins is one approach (or giving them access to separate nodes, none of whom has access to a majority), but they often drink beer together and collusion is again possible. But more importantly – you can’t prove to a 3rd party that your own employees haven’t colluded under orders from management in order to cover some tracks to present a better picture to a regulator.

In the case of a single owner, you don’t even have a tamper-evident structure – the chain can be fully rewritten and nobody will be able to tell. In the case of multiple owners, it depends on the implementation. There will be a record of the modification at the non-colluding party, but proving which side “cheated” would be next to impossible. Tamper evidence is only partially achieved, because you can’t prove whose data was modified and whose data wasn’t (you only know that one of the copies has tampered data).

The way to achieve a tamper-evident structure in both scenarios is to use anchoring. Checkpoints of the data need to be anchored externally, so that there is a clear record of what the state of the chain was at different points in time. Before blockchain, the recommended approach was to print it in newspapers (e.g. as an ad): because a newspaper has a large enough circulation, nobody can collect all copies and modify the published checkpoint hash. This published hash would be either the root of the Merkle tree, or the latest hash in a hash chain. An ever-growing Merkle tree would allow consistency and inclusion proofs to be validated.

When we have electronic distribution of data, we can use public blockchains to regularly anchor our internal ones, in order to achieve proper tamper-evident data. We, at LogSentinel, for example, do exactly that – we allow publishing the latest Merkle root and the latest hash chain to Ethereum. Then even if those with access to the underlying datastore manage to modify and regenerate the entire chain/tree, there will be no match with the publicly advertised values.

How to store data on public blockchains is a separate topic. In the case of Ethereum, you can put an arbitrary payload within a transaction, so you can put that hash in low-value transactions between two of your own addresses (or in self-transactions). You can use smart contracts as well, but that’s not necessary. For Bitcoin, you can use OP_RETURN. Other implementations may have different approaches to storing data within transactions.
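As a rough sketch of what the anchoring itself can look like with the go-ethereum client library, the snippet below puts a checkpoint hash into the data field of a zero-value self-transaction; the RPC endpoint, key handling and gas limit are placeholders of ours, and LogSentinel's actual mechanism may well differ.

package main

import (
    "context"
    "crypto/sha256"
    "log"
    "math/big"

    "github.com/ethereum/go-ethereum/core/types"
    "github.com/ethereum/go-ethereum/crypto"
    "github.com/ethereum/go-ethereum/ethclient"
)

// anchor publishes the given hash as the payload of a zero-value
// self-transaction, so the checkpoint becomes part of the public chain.
func anchor(hash []byte, rpcURL, hexKey string) error {
    client, err := ethclient.Dial(rpcURL)
    if err != nil {
        return err
    }
    key, err := crypto.HexToECDSA(hexKey)
    if err != nil {
        return err
    }
    from := crypto.PubkeyToAddress(key.PublicKey)

    ctx := context.Background()
    nonce, err := client.PendingNonceAt(ctx, from)
    if err != nil {
        return err
    }
    gasPrice, err := client.SuggestGasPrice(ctx)
    if err != nil {
        return err
    }
    chainID, err := client.NetworkID(ctx)
    if err != nil {
        return err
    }

    // Zero-value transaction to ourselves; the anchored hash rides in the data field.
    tx := types.NewTransaction(nonce, from, big.NewInt(0), 50000, gasPrice, hash)
    signed, err := types.SignTx(tx, types.NewEIP155Signer(chainID), key)
    if err != nil {
        return err
    }
    return client.SendTransaction(ctx, signed)
}

func main() {
    checkpoint := sha256.Sum256([]byte("latest merkle root or chain head"))
    // Placeholder endpoint and key: replace with real values before running.
    if err := anchor(checkpoint[:], "https://rpc.example.org", "hex-encoded-private-key"); err != nil {
        log.Fatal(err)
    }
}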

If we want to achieve tamper resistance, we just need to have several copies of the data, all subject to tamper-evidence guarantees, just as in a public network. What a public network gives us is a layer which we can trust to provide the necessary piece for achieving local tamper evidence. Of course, going to hardware, it’s easier to have write-only storage (WORM, write once, read many). The problem with it is that it’s expensive and that you can’t reuse it. It’s not well suited to use cases with short-lived data that requires tamper resistance.

So in summary, in order to have proper integrity guarantees and the ability to prove that the data in single-owner or multi-owner private blockchains hasn’t been tampered with, we have to publish the latest hash of whatever structure we are using (chain or tree). If not, we are only complicating our lives by integrating a complex piece of technology without getting the real benefit it can bring – proving the integrity of our data.

Pairings in CIRCL

Post Syndicated from Armando Faz-Hernández original https://blog.cloudflare.com/circl-pairings-update/

In 2019, we announced the release of CIRCL, an open-source cryptographic library written in Go that provides optimized implementations of several primitives for key exchange and digital signatures. We are pleased to announce a major update of our library: we have included more packages for elliptic curve-based cryptography (ECC), pairing-based cryptography, and quantum-resistant algorithms.

All of these packages are the foundation of work we’re doing on bringing the benefits of cutting edge research to Cloudflare. In the past we’ve experimented with post-quantum algorithms, used pairings to keep keys safe around the world, and implemented advanced elliptic curves. Now we’re continuing that work, and sharing the foundation with everyone.

In this blog post we’re going to focus on pairing-based cryptography and give you a brief overview of some properties that make this topic so pleasant. If you are not so familiar with elliptic curves, we recommend this primer on ECC.

Otherwise, let’s get ready, pairings have arrived!

What are pairings?

Elliptic curve cryptography enables an efficient instantiation of several cryptographic applications: public-key encryption, signatures, zero-knowledge proofs, and many other more exotic applications like oblivious transfer and OPRFs. With all of those applications you might wonder what is the additional value that pairings offer? To see that, we need first to understand the basic properties of an elliptic curve system, and from that we can highlight the big gap that pairings have.

Conventional elliptic curve systems work with a single group $\mathbb{G}$: the points of an elliptic curve $E$. In this group, usually denoted additively, we can add the points $P$ and $Q$ and get another point on the curve $R=P+Q$; also, we can multiply a point $P$ by an integer scalar $k$ and by repeatedly doing

$$ kP = \underbrace{P+P+\dots+P}_{k \text{ terms}} $$
This operation is known as scalar multiplication, which resembles exponentiation, and there are efficient algorithms for this operation. But given the point $Q=kP$, and $P$, it is very hard for an adversary that doesn’t know $k$ to find it. This is the Elliptic Curve Discrete Logarithm problem (ECDLP).

Now we show a property of scalar multiplication that can help us to understand the properties of pairings.

Scalar Multiplication is a Linear Map

Note the following equivalences:

$ (a+b)P = aP + bP $

$ b (aP) = a (bP) $.

These are very useful properties for many protocols: for example, the last identity allows Alice and Bob to arrive at the same value when following the Diffie-Hellman key-agreement protocol.
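As a small sketch of that last identity in code, here it is checked on the $\mathbb{G}_1$ group of CIRCL's BLS12-381 package (the same API used in the snippet at the end of this post); we assume $\mathbb{G}_1$ exposes an IsEqual method analogous to the one used on $\mathbb{G}_T$ below.

package main

import (
    "crypto/rand"
    "fmt"

    e "github.com/cloudflare/circl/ecc/bls12381"
)

func main() {
    P := e.G1Generator()
    a, b := new(e.Scalar), new(e.Scalar)
    a.Random(rand.Reader)
    b.Random(rand.Reader)

    // Alice publishes aP, Bob publishes bP.
    aP, bP := new(e.G1), new(e.G1)
    aP.ScalarMult(a, P)
    bP.ScalarMult(b, P)

    // Each applies its own scalar to the other's point: b(aP) == a(bP).
    shared1, shared2 := new(e.G1), new(e.G1)
    shared1.ScalarMult(b, aP)
    shared2.ScalarMult(a, bP)

    fmt.Println("same shared point:", shared1.IsEqual(shared2))
}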

But while point addition and scalar multiplication are nice, it’s also useful to be able to multiply points: if we had a point $P$ and $aP$ and $bP$, getting $abP$ out would be very cool and let us do all sorts of things. Unfortunately Diffie-Hellman would immediately be insecure, so we can’t get what we want.

Guess what? Pairings provide an efficient, useful sort of intermediary point multiplication.

It’s intermediate multiplication because although the operation takes two points as operands, the result of a pairing is not a point, but an element of a different group; thus, in a pairing there are more groups involved and all of them must contain the same number of elements.

Pairing is denoted as  $$ e \colon\; \mathbb{G}_1 \times \mathbb{G}_2 \rightarrow \mathbb{G}_T $$
Groups $\mathbb{G}_1$ and $\mathbb{G}_2$ contain points of an elliptic curve $E$. More specifically, they are the $r$-torsion points, for a fixed prime $r$. Some pairing instances fix $\mathbb{G}_1=\mathbb{G}_2$, but it is common to use disjoint sets for efficiency reasons. The third group $\mathbb{G}_T$ has notable differences. First, it is written multiplicatively, unlike the other two groups. $\mathbb{G}_T$ is not the set of points on an elliptic curve. It’s instead a subgroup of the multiplicative group over some larger finite field. It contains the elements that satisfy $x^r=1$, better known as the $r$-roots of unity.

Source: “Pairings are not dead, just resting” by Diego Aranha ECC-2017 (inspired by Avanzi’s talk at SPEED-2009).

While every elliptic curve has a pairing, very few have ones that are efficiently computable. Those that do, we call them pairing-friendly curves.

The Pairing Operation is a Bilinear Map

What makes pairings special is that $e$ is a bilinear map. Yes, the linear property of the scalar multiplication is present twice, once per group. Let’s see the first linear map.

For points $P, Q, R$ and scalars $a$ and $b$ we have:

$ e(P+Q, R) = e(P, R) * e(Q, R) $

$ e(aP, Q) = e(P, Q)^a $

So, a scalar $a$ acting in the first operand as $aP$, finds its way out and escapes from the input of the pairing and appears in the output of the pairing as an exponent in $\mathbb{G}_T$. The same linear map is observed for the second group:

$ e(P, Q+R) = e(P, Q) * e(P, R) $

$ e(P, bQ) = e(P, Q)^b $

Hence, the pairing is bilinear. We will see below how this property becomes useful.

Can bilinear pairings help solve ECDLP?

The MOV (by Menezes, Okamoto, and Vanstone) attack reduces the discrete logarithm problem on elliptic curves to finite fields. An attacker with knowledge of $kP$ and public points $P$ and $Q$ can recover $k$ by computing:

$ g_k = e(kP, Q) = e(P, Q)^k $,

$ g = e(P, Q) $

$ k = \log_g(g_k) $

Note that the discrete logarithm to be solved was moved from $\mathbb{G}_1$ to $\mathbb{G}_T$. So an attacker must ensure that the discrete logarithm is easier to solve in $\mathbb{G}_T$, and surprisingly, for some curves this is the case.

Fortunately, pairings do not present a threat for standard curves (such as the NIST curves or Curve25519) because these curves are constructed in such a way that $\mathbb{G}_T$ gets very large, which makes the pairing operation not efficient anymore.

This attacking strategy was one of the first applications of pairings in cryptanalysis as a tool to solve the discrete logarithm. Later, more people noticed that the properties of pairings are so useful, and can be used constructively to do cryptography. One of the fascinating truisms of cryptography is that one person’s sledgehammer is another person’s brick: while pairings yield a generic attack strategy for the ECDLP problem, it can also be used as a building block in a ton of useful applications.

Applications of Pairings

In the 2000s, a large wave of research was developed aiming to apply pairings to many practical problems. An iconic pairing-based system was created by Antoine Joux, who constructed a one-round Diffie-Hellman key exchange for three parties.

Let’s see first how a three-party Diffie-Hellman is done without pairings. Alice, Bob and Charlie want to agree on a shared key, so they compute, respectively, $aP$, $bP$ and $cP$ for a public point P. Then, they send to each other the points they computed. So Alice receives $cP$ from Charlie and sends $aP$ to Bob, who can then send $baP$ to Charlie and get $acP$ from Alice and so on. After all this is done, they can all compute $k=abcP$. Can this be performed in a single round trip?

Two round Diffie-Hellman without pairings.
One round Diffie-Hellman with pairings.

The 3-party Diffie-Hellman protocol needs two communication rounds (on the top), but with the use of pairings a one-round trip protocol is possible.

Affirmative! Antoine Joux showed how to agree on a shared secret in a single round of communication. Alice announces $aP$, gets $bP$ and $cP$ from Bob and Charlie respectively, and then computes $k= e(bP, cP)^a$. Likewise Bob computes $e(aP,cP)^b$ and Charlie does $e(aP,bP)^c$. It’s not difficult to convince yourself that all these values are equivalent, just by looking at the bilinear property.

$e(bP,cP)^a  = e(aP,cP)^b  = e(aP,bP)^c = e(P,P)^{abc}$

With pairings we’ve done in one round what would otherwise take two.
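Here is a sketch of Joux's protocol written against the CIRCL API shown later in this post. Because this pairing is asymmetric ($\mathbb{G}_1 \neq \mathbb{G}_2$), each party publishes its share in both groups; that detail aside, the single broadcast round is exactly the one described above.

package main

import (
    "crypto/rand"
    "fmt"

    e "github.com/cloudflare/circl/ecc/bls12381"
)

func main() {
    P, Q := e.G1Generator(), e.G2Generator()

    // Each party picks a secret scalar.
    a, b, c := new(e.Scalar), new(e.Scalar), new(e.Scalar)
    a.Random(rand.Reader)
    b.Random(rand.Reader)
    c.Random(rand.Reader)

    // One broadcast round: every party announces its share in G1 and G2.
    aP, bP, cP := new(e.G1), new(e.G1), new(e.G1)
    aP.ScalarMult(a, P)
    bP.ScalarMult(b, P)
    cP.ScalarMult(c, P)
    aQ, bQ, cQ := new(e.G2), new(e.G2), new(e.G2)
    aQ.ScalarMult(a, Q)
    bQ.ScalarMult(b, Q)
    cQ.ScalarMult(c, Q)

    // Each party pairs the two shares it received and raises the result
    // to its own secret; bilinearity makes all three values e(P,Q)^(abc).
    kAlice, kBob, kCharlie := new(e.Gt), new(e.Gt), new(e.Gt)
    kAlice.Exp(e.Pair(bP, cQ), a)
    kBob.Exp(e.Pair(cP, aQ), b)
    kCharlie.Exp(e.Pair(aP, bQ), c)

    fmt.Println("all three keys equal:", kAlice.IsEqual(kBob) && kBob.IsEqual(kCharlie))
}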

Another application in cryptography addresses a problem posed by Shamir in 1984: does there exist an encryption scheme in which the public key is an arbitrary string? Imagine if your public key was your email address. It would be easy to remember and certificate authorities and certificate management would be unnecessary.

A solution to this problem came some years later, in 2001, and is the Identity-based Encryption (IBE) scheme proposed by Boneh and Franklin, which uses bilinear pairings as the main tool.

Nowadays, pairings are used for the zk-SNARKS that make Zcash an anonymous currency, and are also used in drand to generate public-verifiable randomness. Pairings and the compact, aggregatable BLS signatures are used in Ethereum. We have used pairings to build Geo Key Manager: pairings let us implement a compact broadcast and negative broadcast scheme that together make Geo Key Manager work.

In order to make these schemes, we have to implement pairings, and to do that we need to understand the mathematics behind them.

Where do pairings come from?

In order to deeply understand pairings we must understand the interplay of geometry and arithmetic, and the origin of the group law. The starting point is the concept of a divisor, a formal combination of points on the curve.

$D = \sum n_i P_i$

The sum of all the coefficients $n_i$ is the degree of the divisor. If we have a function on the curve that has poles and zeros, we can count them with multiplicity to get a principal divisor. Poles are counted as negative, while zeros as positive. For example if we take the projective line, the function $x$ has the divisor $(0)-(\infty)$.

All principal divisors have degree $0$. The group of degree zero divisors modulo the principal divisors is the Jacobian. This means that we take all the degree zero divisors, and freely add or subtract principal divisors, constructing an abelian variety called the Jacobian.

Until now our constructions have worked for any curve.  Elliptic curves have a special property: since a line intersects the curve in three points, it’s always possible to turn an element of the Jacobian into one of the form $(P)-(O)$ for a point $P$. This is where the addition law of elliptic curves comes from.

The relation between addition on an elliptic curve and the geometry of the curve. Source file ECClines.svg

Given a function $f$ we can evaluate it on a divisor $D=\sum n_i P_i$ by taking the product $\prod f(P_i)^{n_i}$. And if two functions $f$ and $g$ have disjoint divisors, we have the Weil duality:

$ f(\text{div}(g)) = g(\text{div}(f)) $,

The existence of Weil duality is what gives us the bilinear map we seek. Given an $r$-torsion point $T$ we have a function $f$ whose divisor is $r(T)-r(O)$. We can write down an auxiliary function $g$ such that $f(rP)=g^r(P)$ for any $P$. We then get a pairing by taking:

$e_r(S,T)=\frac{g(X+S)}{g(X)}$.

The auxiliary point $X$ is any point that makes the numerator and denominator defined.

In practice, the pairing we have defined above, the Weil pairing, is little used. It was historically the first pairing and is extremely important in the mathematics of elliptic curves, but faster alternatives based on more complicated underlying mathematics are used today. These faster pairings have different $\mathbb{G}_1$ and $\mathbb{G}_2$, while the Weil pairing made them the same.

Shift in parameters

As we saw earlier, the discrete logarithm problem can be attacked either on the group of elliptic curve points or on the third group (an extension of a prime field) whichever is weaker. This is why the parameters that define a pairing must balance the security of the three groups.

Before 2015 a good balance between the extension degree, the size of the prime field, and the security of the scheme was achieved by the family of Barreto-Naehrig (BN) curves. For 128 bits of security, BN curves use an extension of degree 12, and have a prime of size 256 bits; as a result they are an efficient choice for implementation.

A breakthrough for pairings occurred in 2015 when Kim and Barbulescu published a result that accelerated the attacks in finite fields. This resulted in increasing the size of fields to comply with standard security levels. Just as short hashes like MD5 got deprecated as they became insecure and $2^{64}$ was no longer enough, and RSA-1024 was replaced with RSA-2048, we regularly change parameters to deal with improved attacks.

For pairings this implied the use of larger primes for all the groups. Roughly speaking, the previous 192-bit security level becomes the new 128-bit level after this attack. Also, this shift in parameters brings the family of Barreto-Lynn-Scott (BLS) curves to the stage because pairings on BLS curves are faster than BN in this new setting. Hence, currently BLS curves using an extension of degree 12, and primes of around 384 bits provide the equivalent to 128 bit security.

The IETF draft (currently in preparation) draft-irtf-cfrg-pairing-friendly-curves specifies secure pairing-friendly elliptic curves. It includes parameters for BN and BLS families of curves. It also targets different security levels to provide crypto agility for some applications relying on pairing-based cryptography.

Implementing Pairings in Go

Historically, notable examples of software libraries implementing pairings include PBC by Ben Lynn, Miracl by Michael Scott, and Relic by Diego Aranha. All of them are written in C/C++ and some ports and wrappers to other languages exist.

In the Go standard library we can find the golang.org/x/crypto/bn256 package by Adam Langley, an implementation of a pairing using a BN curve with 256-bit prime. Our colleague Brendan McMillion built github.com/cloudflare/bn256 that dramatically improves the speed of pairing operations for that curve. See the RWC-2018 talk to see our use case of pairing-based cryptography. This time we want to go one step further, and we started looking for alternative pairing implementations.

Although one can find many libraries that implement pairings, our goal is to rely on one that is efficient, includes protection against side channels, and exposes a flexible API oriented towards applications that permit generality, while avoiding common security pitfalls. This motivated us to include pairings in CIRCL. We followed best practices on secure code development and we want to share with you some details about the implementation.

We started by choosing a pairing-friendly curve. Due to the attack previously mentioned, the BN256 curve does not meet the 128-bit security level. Thus there is a need for using stronger curves. Such a stronger curve is the BLS12-381 curve that is widely used in the zk-SNARK protocols and short signature schemes. Using this curve allows us to make our Go pairing implementation interoperable with other implementations available in other programming languages, so other projects can benefit from using CIRCL too.

This code snippet tests the linearity property of pairings and shows how easy it is to use our library.

import (
    "crypto/rand"
    "fmt"
    e "github.com/cloudflare/circl/ecc/bls12381"
)

func ExamplePairing() {
    P,  Q := e.G1Generator(), e.G2Generator()
    a,  b := new(e.Scalar), new(e.Scalar)
    aP, bQ := new(e.G1), new(e.G2)
    ea, eb := new(e.Gt), new(e.Gt)

    a.Random(rand.Reader)
    b.Random(rand.Reader)

    aP.ScalarMult(a, P)
    bQ.ScalarMult(b, Q)

    g  := e.Pair( P, Q)
    ga := e.Pair(aP, Q)
    gb := e.Pair( P,bQ)

    ea.Exp(g, a)
    eb.Exp(g, b)
    linearLeft := ea.IsEqual(ga) // e(P,Q)^a == e(aP,Q)
    linearRight:= eb.IsEqual(gb) // e(P,Q)^b == e(P,bQ)

    fmt.Print(linearLeft && linearRight)
    // Output: true
}

We applied several optimizations that allowed us to improve on performance and security of the implementation. In fact, as the parameters of the curve are fixed, some other optimizations become easier to apply; for example, the code for prime field arithmetic and the towering construction for extension fields as we detail next.

Formally-verified arithmetic using fiat-crypto

One of the more difficult parts of a cryptography library to implement correctly is the prime field arithmetic. Typically people specialize it for speed, but there are many tedious constraints on what the inputs to operations can be to ensure correctness. Vulnerabilities have happened when people get it wrong, across many libraries. However, this code is perfect for machines to write and check. One such tool is fiat-crypto.

Using fiat-crypto to generate the prime field arithmetic means that we have a formal verification that the code does what we need. The fiat-crypto tool is invoked in this script, and produces Go code for addition, subtraction, multiplication, and squaring over the 381-bit prime field used in the BLS12-381 curve. Other operations are not covered, but those are much easier to check and analyze by hand.

Another advantage is that it avoids relying on the generic big.Int package, which is slow, frequently spills variables to the heap causing dynamic memory allocations, and most importantly, does not run in constant-time. Instead, the code produced is straight-line code, no branches at all, and relies on the math/bits package for accessing machine-level instructions. Automated code generation also means that it’s easier to apply new techniques to all primitives.

Tower Field Arithmetic

In addition to prime field arithmetic, say integers modulo a prime number p, a pairing also requires high-level arithmetic operations over extension fields.

To better understand what an extension field is, think of the analogous case of going from the reals to the complex numbers: the operations $(+, -, \times, /)$ are the usual addition, subtraction, multiplication and division, however they are computed quite differently.

The complex numbers are a quadratic extension over the reals, so imagine a two-level house. The first floor is where the real numbers live; however, they cannot access the second floor by themselves. On the other hand, the complex numbers can access the entire house through the use of a staircase. The equation $f(x)=x^2+1$ was not solvable over the reals, but is solvable over the complex numbers, since they have the number $i$ with $i^2=-1$. And because they have the number $i$, they also have to have numbers like $3i$ and $5+i$ that solve other equations that weren’t solvable over the reals either. This second story has given the roots of the polynomials a place to live.

Algebraically we can view the complex numbers as $\mathbb{R}[x]/(x^2+1)$, the space of polynomials where we consider $x^2=-1$. Given a polynomial like $x^3+5x+1$, we can turn it into $4x+1$, which is another way of writing $1+4i$. In this new field $x^2+1=0$ holds automatically, and we have added $x$ as a root of the polynomial we picked. This process of writing down a field extension by adding a root of a polynomial works over any field, including finite fields.

Following this analogy, one can construct not a house but a tower, say a $k=12$ floor building where the ground floor is the prime field $\mathbb{F}_p$. The reason to build such a large tower is because we want to host VIP guests: namely a group called $\mu_r$, the $r$-roots of unity. There are exactly $r$ members and they behave as an (algebraic) group, i.e. for all $x,y \in \mu_r$, it follows $x*y \in \mu_r$ and $x^r = 1$.

One peculiarity is that operations on the top floor can be costly. Assume that whenever an operation is needed on the top floor, members who live on the ground floor are required to provide their assistance. Hence an operation on the top floor needs to go all the way down to the ground floor. For this reason, our tower needs more than a staircase; it needs an efficient way to move between the levels, something even better than an elevator.

What if we use portals? Imagine anyone on the twelfth floor using a portal to go down immediately to the sixth floor, and then use another one connecting the sixth to the second floor, and finally another portal connecting to the ground floor. Thus, one can get faster to the first floor rather than descending through a long staircase.

The building analogy can be used to understand and construct a tower of finite fields. We use only some extensions to build a twelfth extension from the (ground) prime field $\mathbb{F}_p$.

$\mathbb{F}_{p}$ ⇒  $\mathbb{F}_{p^2}$ ⇒  $\mathbb{F}_{p^4}$ ⇒  $\mathbb{F}_{p^{12}}$

In fact, the extension of finite fields is as follows:

  • $\mathbb{F}_{p^2}$ is built as polynomials in $\mathbb{F}_p[u]$ reduced modulo $u^2+1=0$.
  • $\mathbb{F}_{p^4}$ is built as polynomials in $\mathbb{F}_{p^2}[v]$ reduced modulo $v^2+u+1=0$ (alternatively, one can build $\mathbb{F}_{p^6}$ as a cubic extension of $\mathbb{F}_{p^2}$).
  • $\mathbb{F}_{p^{12}}$ is built as polynomials in $\mathbb{F}_{p^6}[w]$ reduced modulo $w^2+v=0$, or as polynomials in $\mathbb{F}_{p^4}[w]$ reduced modulo $w^3+v=0$.

The portals here are the polynomials used as modulus, as they allow us to move from one extension to the other.
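As a toy illustration of the first extension above the ground floor, here is a Go sketch of multiplication in $\mathbb{F}_{p^2} = \mathbb{F}_p[u]/(u^2+1)$; the small prime is ours, chosen purely for readability, whereas BLS12-381 works over a 381-bit prime.

package main

import (
    "fmt"
    "math/big"
)

// Fp2 represents a + b·u with u^2 = -1, i.e. an element of F_p[u]/(u^2+1).
type Fp2 struct{ a, b *big.Int }

var p = big.NewInt(1000003) // toy prime, for illustration only

// Mul computes (a1 + b1·u)(a2 + b2·u) = (a1·a2 - b1·b2) + (a1·b2 + a2·b1)·u,
// using the "portal" u^2 = -1 to reduce back to ground-floor arithmetic.
func (x Fp2) Mul(y Fp2) Fp2 {
    a := new(big.Int).Sub(new(big.Int).Mul(x.a, y.a), new(big.Int).Mul(x.b, y.b))
    b := new(big.Int).Add(new(big.Int).Mul(x.a, y.b), new(big.Int).Mul(x.b, y.a))
    return Fp2{a.Mod(a, p), b.Mod(b, p)}
}

func main() {
    x := Fp2{big.NewInt(3), big.NewInt(5)}
    y := Fp2{big.NewInt(7), big.NewInt(2)}
    z := x.Mul(y)
    fmt.Printf("(3+5u)(7+2u) = %v + %v·u (mod %v)\n", z.a, z.b, p) // 11 + 41·u
}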

Different constructions for higher extensions have an impact on the number of operations performed. Thus, we implemented the latter tower field for $\mathbb{F}_{p^{12}}$ as it results in a lower number of operations. The arithmetic operations are quite easy to implement and manually verify, so at this level formal verification is not as effective as in the case of prime field arithmetic. However, having an automated tool that generates code for this arithmetic would be useful for developers not familiar with the internals of field towering. The fiat-crypto tool keeps track of this idea [Issue 904, Issue 851].

Now, we describe more details about the main core operations of a bilinear pairing.

The Miller loop and the final exponentiation

The pairing function we implemented is the optimal r-ate pairing, which is defined as:

$ e(P,Q) = f_Q(P)^{\text{exp}} $

That is the construction of a function $f$ based on $Q$, then evaluated on a point $P$, and the result of that is raised to a specific power. The efficient function evaluation is performed using “the Miller loop”, which is an algorithm devised by Victor Miller and has a similar structure to a double-and-add algorithm for scalar multiplication.

After having computed $f_Q(P)$, this value is an element of $\mathbb{F}_{p^{12}}$, however it is not yet an $r$-root of unity; the final exponentiation accomplishes this task. Since the exponent is constant for each curve, special algorithms can be tuned for it.

One interesting acceleration opportunity presents itself: in the Miller loop the elements of $\mathbb{F}_{p^{12}}$ that we have to multiply by are special — as polynomials, they have no linear term and their constant term lives in $\mathbb{F}_{p^{2}}$. We created a specialized multiplication that skips the partial products whose inputs are known to be zero. This specialization accelerated the pairing computation by 12%.

So far, we have described how the internal operations of a pairing are performed. There are still some other functions floating around regarding the use of pairings in cryptographic protocols. It is also important to optimize these functions and now we will discuss some of them.

Product of Pairings

Often protocols will want to evaluate a product of pairings, rather than a single pairing. This is the case if we’re evaluating multiple signatures, or if the protocol uses cancellation between different equations to ensure security, as in the dual system encoding approach to designing protocols. If each pairing was evaluated individually, this would require multiple evaluations of the final exponentiation. However, we can evaluate the product first, and then evaluate the final exponentiation once. This requires a different interface that can take vectors of points.

Occasionally, there is a sign or an exponent in the factors of the product. It’s very easy to deal with a sign explicitly by negating one of the input points, almost for free. General exponents are more complicated, especially when considering the need for side channel protection. But since we expose the interface, later work on the library will accelerate it without changing applications.

Regarding API exposure, one of the trickiest and most error prone aspects of software engineering is input validation. So we must check that raw binary inputs decode correctly as the points used for a pairing. Part of this verification includes subgroup membership testing which is the topic we discuss next.

Subgroup Membership Testing

Checking that a point is on the curve is easy, but checking that it has the right order is not: the classical way to do this is an entire, expensive scalar multiplication. But implementing pairings involves many clever tricks that help make things run faster.
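For reference, the naive check looks like the following sketch (with a hypothetical point interface, not CIRCL’s API): it requires a full scalar multiplication by the subgroup order $r$, which is the cost the tricks described next reduce.

package pairing

import "math/big"

// CurvePoint is a hypothetical point interface used only for this sketch.
type CurvePoint interface {
    IsOnCurve() bool
    ScalarMult(k *big.Int) CurvePoint
    IsIdentity() bool
}

// isInSubgroup is the classical membership test: the point must lie on the
// curve, and multiplying it by the subgroup order r must give the identity.
// The scalar multiplication by the large order r is what makes this expensive.
func isInSubgroup(p CurvePoint, r *big.Int) bool {
    return p.IsOnCurve() && p.ScalarMult(r).IsIdentity()
}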

One example is twisting: the points of $\mathbb{G}_2$ have coordinates in $\mathbb{F}_{p^{12}}$; however, one can use a smaller field to reduce the number of operations. The trick here is using an associated curve $E’$, which is a twist of the original curve $E$. This allows us to work over the subfield $\mathbb{F}_{p^{2}}$, which has cheaper operations.

Additionally, the twisted curve underlying $\mathbb{G}_2$ carries some efficiently computable endomorphisms coming from the field structure. For the cost of two field multiplications, we can compute an additional endomorphism, dramatically decreasing the cost of scalar multiplication.

By searching for the smallest combination of scalars that could zero out the $r$-torsion points, Sean Bowe came up with a much more efficient way to do subgroup checks. We implement his trick, with a big reduction in the complexity of some applications.

As can be seen, implementing a pairing is full of subtleties. We just saw that point validation in the pairing setting is a bit more challenging than in the conventional case of elliptic curve cryptography. This kind of reformulation also applies to other operations that require special care in their implementation. Another example is how to encode binary strings as elements of the group $\mathbb{G}_1$ (or $\mathbb{G}_2$). Although this operation might sound simple, implementing it securely requires taking several aspects into consideration; we expand on this topic next.

Hash to Curve

An important piece of the Boneh-Franklin Identity-based Encryption scheme is a special hash function that maps an arbitrary string — the identity, e.g., an email address — to a point on an elliptic curve, and that still behaves like a conventional cryptographic hash function (such as SHA-256): hard to invert and collision-resistant. This operation is commonly known as hashing to curve.

Boneh and Franklin found a particular way to perform hashing to curve: apply a conventional hash function to the input bitstring and interpret the result as the $y$-coordinate of a point; then, from the curve equation $y^2=x^3+b$, find the $x$-coordinate as $x=\sqrt[3]{y^2-b}$. The cube root always exists in fields of characteristic $p\equiv 2 \bmod{3}$. But this algorithm does not apply to other fields in general, which restricts the parameters that can be used.
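Here is a minimal, runnable sketch of this map in Go over a toy curve $y^2 = x^3 + b$ with a small prime $p \equiv 2 \bmod 3$. The parameters are illustrative only, nothing like the real Boneh-Franklin ones, and SHA-256 stands in for the conventional hash.

package main

import (
    "crypto/sha256"
    "fmt"
    "math/big"
)

// hashToPointBF sketches the Boneh-Franklin map-to-curve for y^2 = x^3 + b
// over F_p with p = 2 (mod 3): hash to get y, then take the unique cube root
// of y^2 - b to get x. Toy parameters only.
func hashToPointBF(msg []byte, p, b *big.Int) (x, y *big.Int) {
    h := sha256.Sum256(msg)
    y = new(big.Int).SetBytes(h[:])
    y.Mod(y, p)

    // Cube roots are unique when p = 2 (mod 3): cbrt(a) = a^((2p-1)/3) mod p.
    e := new(big.Int).Mul(big.NewInt(2), p)
    e.Sub(e, big.NewInt(1))
    e.Div(e, big.NewInt(3))

    a := new(big.Int).Mul(y, y) // a = y^2 - b mod p
    a.Sub(a, b)
    a.Mod(a, p)

    x = new(big.Int).Exp(a, e, p)
    return x, y
}

func main() {
    p := big.NewInt(1019) // toy prime with p = 2 (mod 3)
    b := big.NewInt(7)
    x, y := hashToPointBF([]byte("alice@example.com"), p, b)
    fmt.Printf("point: (%v, %v)\n", x, y)
}

Note that the output is fully determined by the input and runs in a fixed number of field operations, which is part of what makes this construction attractive.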

Another popular algorithm, though one we must stress is an insecure way of performing hash to curve, is the following. Let the hash of the input be the $x$-coordinate, and from it find the $y$-coordinate by computing a square root $y= \sqrt{x^3+b}$. Note that not every $x$-coordinate yields a value whose square root exists, which means the algorithm may fail; thus, it’s a probabilistic algorithm. To make sure it always succeeds, a counter can be added to $x$ and increased every time a square root is not found. Although this procedure always finds a point on the curve, it also makes the algorithm run in variable time, i.e., it’s a non-constant-time algorithm. The lack of this property in implementations of cryptographic algorithms makes them susceptible to timing attacks. The DragonBlood attack is an example of how a non-constant-time hashing algorithm resulted in a full key recovery of WPA3 Wi-Fi passwords.
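The following is a sketch of that insecure “try-and-increment” variant, again over toy parameters. The number of loop iterations, and hence the running time, depends on the input, which is exactly the leak the DragonBlood-style attacks exploit.

package main

import (
    "crypto/sha256"
    "encoding/binary"
    "fmt"
    "math/big"
)

// tryAndIncrement sketches the insecure, variable-time hash-to-curve for
// y^2 = x^3 + b over F_p: derive a candidate x from the message plus a
// counter, and retry until x^3 + b is a square. Toy parameters only.
func tryAndIncrement(msg []byte, p, b *big.Int) (x, y *big.Int) {
    for ctr := uint32(0); ; ctr++ {
        buf := make([]byte, 4)
        binary.BigEndian.PutUint32(buf, ctr)
        h := sha256.Sum256(append(msg, buf...))

        x = new(big.Int).SetBytes(h[:])
        x.Mod(x, p)

        // rhs = x^3 + b mod p
        rhs := new(big.Int).Exp(x, big.NewInt(3), p)
        rhs.Add(rhs, b)
        rhs.Mod(rhs, p)

        // ModSqrt returns nil when rhs is not a square mod p: try again.
        // The retry is what makes the running time input-dependent.
        if y = new(big.Int).ModSqrt(rhs, p); y != nil {
            return x, y
        }
    }
}

func main() {
    p := big.NewInt(1019) // toy prime
    b := big.NewInt(7)
    x, y := tryAndIncrement([]byte("alice@example.com"), p, b)
    fmt.Printf("point: (%v, %v)\n", x, y)
}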

Secure hash to curve algorithms must guarantee several properties. It must be ensured that any input passed to the hash produces a point in the targeted group: no special input may trigger exceptional cases, and the output point must belong to the correct group. We emphasize the correct group because in certain applications the target group is the entire set of points of an elliptic curve, while in other cases, such as the pairing setting, the target group is a subgroup of the entire curve (recall that $\mathbb{G}_1$ and $\mathbb{G}_2$ are groups of $r$-torsion points). Finally, some cryptographic protocols are proven secure provided that the hash to curve function behaves as a random oracle of points. This requirement adds another level of complexity to the hash to curve function.

Fortunately, several researchers have addressed most of these problems, and others have been involved in efforts to define a concrete specification of secure algorithms for hashing to curves by extending the sort of geometric trick that worked for the Boneh-Franklin curve. We have participated in the Crypto Forum Research Group (CFRG) at the IETF on the work-in-progress Internet-Draft draft-irtf-cfrg-hash-to-curve. This document specifies secure algorithms for hashing to several elliptic curves, including the BLS12-381 curve. At Cloudflare, we are actively collaborating in several IETF working groups; see Jonathan Hoyland’s post to learn more about it.

Our implementation complies with the recommendations given in the hash to curve draft and also includes many implementation techniques and tricks derived from a vast number of academic research articles in pairing-based cryptography. A good compilation of most of these tricks is delightfully explained by Craig Costello.

We hope this post sheds some light on, and offers guidance for, the development of pairing-based cryptography, which has become much more relevant these days. We will soon share an interesting use case in which pairing-based cryptography helps us harden the security of our infrastructure.

What’s next?

We invite you to use our CIRCL library, now equipped with bilinear pairings. But there is more: take a look at the other primitives already available, such as HPKE, VOPRF, and post-quantum algorithms. On our side, we will continue improving the performance and security of the library. Let us know if any of your projects uses CIRCL; we would like to hear about your use case. Reach us at research.cloudflare.com.

You’ll soon hear more about how we’re using CIRCL across Cloudflare.

Exported Authenticators: The long road to RFC

Post Syndicated from Jonathan Hoyland original https://blog.cloudflare.com/exported-authenticators-the-long-road-to-rfc/

Our earlier blog post talked in general terms about how we work with the IETF. In this post we’re going to talk about a particular IETF project we’ve been working on, Exported Authenticators (EAs). Exported Authenticators is a new extension to TLS that we think will prove really exciting. It unlocks all sorts of fancy new authentication possibilities, from TLS connections with multiple certificates attached, to logging in to a website without ever revealing your password.

Now, you might have thought that, given the innumerable hours that went into the design of TLS 1.3, it couldn’t possibly be improved, but it turns out that there are a number of places where the design falls a little short. TLS allows us to establish a secure connection between a client and a server. During the handshake, the server presents a certificate to the browser, which proves the server is authorised to use the name written on the certificate, for example blog.cloudflare.com. One of the most common things we use that ability for is delivering webpages. In fact, if you’re reading this, your browser has already done this for you. The Cloudflare Blog is delivered over TLS, and by presenting a certificate for blog.cloudflare.com the server proves that it’s allowed to deliver Cloudflare’s blog.

When your browser requests blog.cloudflare.com you receive a big blob of HTML that your browser then starts to render. In the dim and distant past, this might have been the end of the story. Your browser would render the HTML, and display it. Nowadays, the web has become more complex, and the HTML your browser receives often tells it to go and load lots of other resources. For example, when I loaded the Cloudflare blog just now, my browser made 73 subrequests.

As we mentioned in our connection coalescing blog post, sometimes those resources are also served by Cloudflare, but on a different domain. In our connection coalescing experiment, we acquired certificates with a special extension, called a Subject Alternative Name (SAN), that tells the browser that the owner of the certificate can act as two different websites. Along with some further shenanigans that you can read about in our blog post, this lets us serve the resources for both the domains over a single TLS connection.

Cloudflare, however, services millions of domains, and we have millions of certificates. It’s possible to generate certificates that cover lots of domains, and in fact this is what Cloudflare used to do. We used to use so-called “cruise-liner” certificates, with dozens of names on them. But for connection coalescing this quickly becomes impractical, as we would need to know what sub-resources each webpage might request, and acquire certificates to match. We switched away from this model because issues with individual domains could affect other customers.

What we’d like to be able to do is serve as much content as possible down a single connection. When a user requests a resource from a different domain they need to perform a new TLS handshake, costing valuable time and resources. Our connection coalescing experiment showed the benefits when we know in advance what resources are likely to be requested, but most of the time we don’t know what subresources are going to be requested until the requests actually arrive. What we’d rather do is attach extra identities to a connection after it’s been established, and we know what extra domains the client actually wants. Because the TLS connection is just a transport mechanism and doesn’t understand the information being sent across it, it doesn’t actually know what domains might subsequently be requested. This is only available to higher-layer protocols such as HTTP. However, we don’t want any website to be able to impersonate another, so we still need to have strong authentication.

Exported Authenticators

Enter Exported Authenticators. They give us even more than we asked for. They allow us to do application layer authentication that’s just as strong as the authentication you get from TLS, and then tie it to the TLS channel. Now that’s a pretty complicated idea, so let’s break it down.

To understand application layer authentication we first need to explain what the application layer is. The application layer is a reference to the OSI model. The OSI model describes the various layers of abstraction we use to make things work across the Internet. When you’re developing your latest web application you don’t want to have to worry about how light is flickered down a fibre optic cable, or even how the TLS handshake is encoded (although that’s a fascinating topic in its own right, let’s leave that for another time).

All you want to care about is having your content delivered to your end-user, and using TLS gives you a guaranteed in-order, reliable, authenticated channel over which you can communicate. You just shove bits in one end of the pipe, and after lots of blinky lights, fancy routing, maybe a touch of congestion control, and a little decoding, *poof*, your data arrives at the end-user.

The application layer is the top of the OSI stack, and contains things like HTTP. Because the TLS handshake is lower in the stack, the application is oblivious to this process. So, what Exported Authenticators give us is the ability for the very top of the stack to reliably authenticate its partner.

The seven-layered OSI model

Now let’s jump back a bit, and discuss what we mean when we say that EAs give us authentication that’s as strong as TLS authentication. TLS, as we know, is used to create a secure connection between two endpoints, but lots of us are hazy when we try and pin down exactly what we mean by “secure”. The TLS standard makes eight specific promises, but rather than get buried in that particular ocean of weeds, let’s just pick out the one guarantee that we care about most: Peer Authentication.

Peer authentication: The client's view of the peer identity should reflect the server's identity. [...]

In other words, if the client thinks that it’s talking to example.com then it should, in fact, be talking to example.com.

What we want from EAs is that if I receive an EA then I have cryptographic proof that the person I’m talking to is the person I think I’m talking to. Now at this point you might be wondering what an EA actually looks like, and what it has to do with certificates. Well, an EA is actually a trio of messages, the first of which is a Certificate. The second is a CertificateVerify, a cryptographic proof that the sender knows the private key for the certificate. Finally there is a Finished message, which acts as a MAC, and proves the first two messages haven’t been tampered with. If this structure sounds familiar to you, it’s because it’s the same structure used by the server in the TLS handshake to prove it is the owner of the certificate.
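In rough Go terms, an authenticator bundles these three pieces. The field types are simplified stand-ins; the real messages are the TLS handshake structures defined in the Exported Authenticators specification.

package ea

// Authenticator sketches the three messages that make up an Exported
// Authenticator. The byte-slice fields stand in for full TLS handshake
// structures.
type Authenticator struct {
    Certificate       [][]byte // the certificate chain being attached
    CertificateVerify []byte   // proof of possession of the certificate's private key
    Finished          []byte   // MAC over the previous messages
}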

The final piece of unpacking we need to do is explaining what we mean by tying the authentication to the TLS channel. Because EAs are an application layer construct they don’t provide any transport mechanism. So, whilst I know that the EA was created by the server I want to talk to, without binding the EA to a TLS connection I can’t be sure that I’m talking directly to the server I want.

Without protection, a malicious server can move Exported Authenticators from one connection to another.

For all I know, the TLS server I’m talking to is creating a new TLS connection to the EA Server, and relaying my request, and then returning the response. This would be very bad, because it would allow a malicious server to impersonate any server that supports EAs.

Because EAs are bound to a single TLS connection, if a malicious server copies an EA from one connection to another it will fail to verify.

EAs therefore have an extra security feature. They use the fact that every TLS connection is guaranteed to produce a unique set of keys. EAs take one of these keys and use it to construct the EA. This means that if some malicious third-party copies an EA from one TLS session to another, the recipient wouldn’t be able to validate it. This technique is called channel binding, and is another fascinating topic, but this post is already getting a bit long, so we’ll have to revisit channel binding in a future blog post.
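As a sketch of the idea in Go: the standard library already exposes the TLS exporter interface that produces per-connection keying material. The label and output length below are placeholders, not the values fixed by the EA specification.

package eabind

import "crypto/tls"

// exporterKey sketches the channel-binding idea: derive keying material that
// is unique to this TLS connection via the exporter interface, and mix it
// into the construction and verification of an Exported Authenticator.
func exporterKey(conn *tls.Conn) ([]byte, error) {
    state := conn.ConnectionState()
    // An authenticator keyed this way cannot be replayed on another
    // connection, because that connection exports different material.
    return state.ExportKeyingMaterial("EXPORTER-example-authenticator", nil, 32)
}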

How the sausage is made

OK, now we know what EAs do, let’s talk about how they were designed and built. EAs are going through the IETF standardisation process. Draft standards move through the IETF process starting as Internet Drafts (I-Ds), and ending up as published Requests For Comment (RFCs). RFCs are voluntary standards that underpin much of the global Internet plumbing, and not just for security protocols like TLS. RFCs define DNS, UDP, TCP, and many, many more.

The first step in producing a new IETF standard is coming up with a proposal. Designing security protocols is a very conservative business, firstly because it’s very easy to introduce really subtle bugs, and secondly, because if you do introduce a security issue, things can go very wrong, very quickly. A flaw in the design of a protocol can be especially problematic as it can be replicated across multiple independent implementations — for example the TLS renegotiation vulnerabilities reported in 2009 and the custom EC(DH) parameters vulnerability from 2012. To minimise the risks of design issues, EAs hew closely to the design of the TLS 1.3 handshake.

Security and Assurance

Before making a big change to how authentication works on the Internet, we want as much assurance as possible that we’re not going to break anything. To give us more confidence that EAs are secure, they reuse parts of the design of TLS 1.3. The TLS 1.3 design was carefully examined by dozens of experts, and underwent multiple rounds of formal analysis — more on that in a moment. Using well understood design patterns is a super important part of security protocols. Making something secure is incredibly difficult, because security issues can be introduced in thousands of ways, and an attacker only needs to find one. By starting from a well understood design we can leverage the years of expertise that went into it.

Another vital step in catching design errors early is baked into the IETF process: achieving rough consensus. Although the ins and outs of the IETF process are worthy of their own blog post, suffice it to say the IETF works to ensure that all technical objections get addressed, and even if they aren’t solved they are given due care and attention. Exported Authenticators were proposed way back in 2016, and after many rounds of comments, feedback, and analysis the TLS Working Group (WG) at the IETF has finally reached consensus on the protocol. All that’s left before the EA I-D becomes an RFC is for a final revision of the text to be submitted and sent to the RFC Editors, leading hopefully to a published standard very soon.

As we just mentioned, the WG has to come to a consensus on the design of the protocol. One thing that can hold up achieving consensus is worry about security. After the Snowden revelations there was a barrage of attacks on TLS 1.2, not to mention some even earlier attacks from academia. Changing how trust works on the Internet can be pretty scary, and the TLS WG didn’t want to be caught flat-footed. Luckily, this coincided with the maturation of some tools and techniques we can use to get mathematical guarantees that a protocol is secure. This class of techniques is known as formal methods. To help ensure that people are confident in the security of EAs, I performed a formal analysis.

Formal Analysis

Formal analysis is a special technique that can be used to examine security protocols. It creates a mathematical description of the protocol, the security properties we want it to have, and a model attacker. Then, aided by some sophisticated software, we create a proof that the protocol has the properties we want even in the presence of our model attacker. This approach is able to catch incredibly subtle edge cases, which, if not addressed, could lead to attacks, as has happened before. Trotting out a formal analysis gives us strong assurances that we haven’t missed any horrible issues. By sticking as closely as possible to the design of TLS 1.3 we were able to repurpose much of the original analysis for EAs, giving us a big leg up in our ability to prove their security. Our EA model is available in Bitbucket, along with the proofs. You can check it out using Tamarin, a theorem prover for security protocols.

Formal analysis, and formal methods in general, give very strong guarantees that rule out entire classes of attack. However, they are not a panacea. TLS 1.3 was subject to a number of rounds of formal analysis, and yet an attack was still found. However, this attack in many ways confirms our faith in formal methods. The attack was found in a blind spot of the proof, showing that attackers have been pushed to the very edges of the protocol. As our formal analyses get more and more rigorous, attackers will have fewer and fewer places to search for attacks. As formal analysis has become more and more practical, more and more groups at the IETF have been asking to see proofs of security before standardising new protocols. This hopefully will mean that future attacks on protocol design will become rarer and rarer.

Once the EA I-D becomes an RFC, then all sorts of cool stuff gets unlocked — for example OPAQUE-EAs, which will allow us to do password-based login on the web without the server ever seeing the password! Watch this space.

More Military Cryptanalytics, Part III

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/08/more-military-cryptanalytics-part-iii.html

Late last year, the NSA declassified and released a redacted version of Lambros D. Callimahos’s Military Cryptanalytics, Part III. We just got most of the index. It’s hard to believe that there are any real secrets left in this 44-year-old volume.

Apple’s NeuralHash Algorithm Has Been Reverse-Engineered

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/08/apples-neuralhash-algorithm-has-been-reverse-engineered.html

Apple’s NeuralHash algorithm — the one it’s using for client-side scanning on the iPhone — has been reverse-engineered.

Turns out it was already in iOS 14.3, and someone noticed:

Early tests show that it can tolerate image resizing and compression, but not cropping or rotations.

We also have the first collision: two images that hash to the same value.

The next step is to generate innocuous images that NeuralHash classifies as prohibited content.

This was a bad idea from the start, and Apple never seemed to consider the adversarial context of the system as a whole, not just the cryptography.

No, RSA Is Not Broken

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/03/no-rsa-is-not-broken.html

I have been seeing this paper by cryptographer Claus Peter Schnorr making the rounds: “Fast Factoring Integers by SVP Algorithms.” It describes a new factoring method, and its abstract ends with the provocative sentence: “This destroys the RSA cryptosystem.”

It does not. At best, it’s an improvement in factoring — and I’m not sure it’s even that. The paper is a preprint: it hasn’t been peer reviewed. Be careful taking its claims at face value.

Some discussion here.

I’ll append more analysis links to this post when I find them.

KEMTLS: Post-quantum TLS without signatures

Post Syndicated from Sofía Celi original https://blog.cloudflare.com/kemtls-post-quantum-tls-without-signatures/

The Transport Layer Security protocol (TLS), which secures most Internet connections, has mainly been a protocol consisting of a key exchange authenticated by digital signatures and used to encrypt data in transit[1]. Even though it has undergone major changes since 1994, when SSL 1.0 was introduced by Netscape, its main mechanism has remained the same. The key exchange was first based on RSA, and later on traditional Diffie-Hellman (DH) and Elliptic-curve Diffie-Hellman (ECDH). The signatures used for authentication have almost always been RSA-based, though in recent years other kinds of signatures have been adopted, mainly ECDSA and Ed25519. This recent change to elliptic curve cryptography at both the key exchange and the signature level has resulted in considerable speed and bandwidth benefits in comparison to traditional Diffie-Hellman and RSA.

TLS is the main protocol that protects the connections we use everyday. It’s everywhere: we use it when we buy products online, when we register for a newsletter — when we access any kind of website, IoT device, API for mobile apps and more, really. But with the imminent threat of the arrival of quantum computers (a threat that seems to be getting closer and closer), we need to reconsider the future of TLS once again. A wide-scale post-quantum experiment was carried out by Cloudflare and Google: two post-quantum key exchanges were integrated into our TLS stack and deployed at our edge servers as well as in Chrome Canary clients. The goal of that experiment was to evaluate the performance and feasibility of deployment of two post-quantum key exchanges in TLS.

Similar experiments have been proposed for introducing post-quantum algorithms into the TLS handshake itself. Unfortunately, it seems infeasible to replace both the key exchange and signature with post-quantum primitives, because post-quantum cryptographic primitives are bigger, or slower (or both), than their predecessors. The proposed algorithms under consideration in the NIST post-quantum standardization process use mathematical objects that are larger than the ones used for elliptic curves, traditional Diffie-Hellman, or RSA. As a result, the overall size of public keys, signatures and key exchange material is much bigger than those from elliptic curves, Diffie-Hellman, or RSA.

How can we solve this problem? How can we use post-quantum algorithms as part of the TLS handshake without making the material too big to be transmitted? In this blogpost, we will introduce a new mechanism for making this happen. We’ll explain how it can be integrated into the handshake and we’ll cover implementation details. The key observation in this mechanism is that, while post-quantum algorithms have bigger communication size than their predecessors, post-quantum key exchanges have somewhat smaller sizes than post-quantum signatures, so we can try to replace signatures with key exchanges in some places to save space.  We will only focus on the TLS 1.3 handshake as it is the TLS version that should be currently used.

Past experiments: making the TLS 1.3 handshake post-quantum

TLS 1.3 was introduced in August 2018, and it brought many security and performance improvements (notably, having only one round-trip to complete the handshake). But TLS 1.3 is designed for a world with classical computers, and some of its functionality will be broken by quantum computers when they do arrive.

The primary goal of TLS 1.3 is to provide authentication (the server side of the channel is always authenticated, the client side is optionally authenticated), confidentiality, and integrity by using a handshake protocol and a record protocol. The handshake protocol, the one of interest for us today, establishes the cryptographic parameters for securing and authenticating a connection. It can be thought of as having three main phases, as defined in RFC8446:

–  The Parameter Negotiation phase (referred to as ‘Server Parameters’ in RFC8446), which establishes other handshake parameters (whether the client is authenticated, application-layer protocol support, etc).

–  The Key Exchange phase, which establishes shared keying material and selects the cryptographic parameters to be used. Everything after this phase will be encrypted.

–  The Authentication phase, which authenticates the server (and, optionally, the client) and provides key confirmation and handshake integrity.

The main idea of past experiments that introduced post-quantum algorithms into the handshake of TLS 1.3 was to use them in place of classical algorithms by advertising them as part of the supported groups[2] and key share[3] extensions, and, therefore, establishing with them the negotiated connection parameters. Key encapsulation mechanisms (KEMs) are an abstraction of the basic key exchange primitive, and were used to generate the shared secrets. When using a pre-shared key, its symmetric algorithms can be easily replaced by post-quantum KEMs as well; and, in the case of password-authenticated TLS, some ideas have been proposed on how to use post-quantum algorithms with them.
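To make the KEM abstraction concrete, it can be thought of as the following minimal interface. This is a sketch; CIRCL’s actual kem package exposes a richer API.

package kemtls

// KEM is a minimal, hypothetical key encapsulation mechanism interface.
type KEM interface {
    // GenerateKeyPair returns a public key (sent to the peer) and a
    // private key (kept secret).
    GenerateKeyPair() (publicKey, privateKey []byte, err error)
    // Encapsulate uses the peer's public key to produce a ciphertext and
    // a shared secret; only the holder of the private key can recover it.
    Encapsulate(publicKey []byte) (ciphertext, sharedSecret []byte, err error)
    // Decapsulate recovers the shared secret from the ciphertext.
    Decapsulate(privateKey, ciphertext []byte) (sharedSecret []byte, err error)
}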

Most of the above ideas only provide what is often defined as ‘transitional security’, because its main focus is to provide quantum-resistant confidentiality, and not to take quantum-resistant authentication into account. It is possible to use post-quantum signatures for TLS authentication, but the post-quantum signatures are larger than traditional ones. Furthermore, it is worth noting that using post-quantum signatures is much more expensive than using post-quantum KEMs.

We can estimate the impact of such a replacement on network traffic by simply looking at the sum of the sizes of the cryptographic objects transmitted during the handshake. A typical TLS 1.3 handshake using elliptic curve X25519 and RSA-2048 would transmit 1,376 bytes, corresponding to the public keys for key exchange, the certificate, the signature of the handshake, and the certificate chain. If we were to replace X25519 by the post-quantum KEM Kyber512 and RSA by the post-quantum signature Dilithium II, two of the more efficient proposals, the size of the transmitted data would increase to 10,036 bytes[4]. The increase is mostly due to the size of the post-quantum signature algorithm.

The question then is: how can we achieve full post-quantum security with a handshake that is still efficient to use?

A more efficient proposal: KEMTLS

There is a long history of other mechanisms, besides signatures, being used for authentication. Modern protocols, such as the Signal protocol, the Noise framework, or WireGuard, rely on key exchange mechanisms for authentication; but they are unsuitable for the TLS 1.3 case as they expect the long-term key material to be known in advance by the interested parties.

The OPTLS proposal by Krawczyk and Wee authenticates the TLS handshake without signatures by using a non-interactive key exchange (NIKE). However, the only somewhat efficient construction for a post-quantum NIKE is CSIDH, the security of which is the subject of an ongoing debate. But we can build on this idea, and use KEMs for authentication. KEMTLS, the current proposed experiment, replaces the handshake signature by a post-quantum KEM key exchange. It was designed and introduced by Peter Schwabe, Douglas Stebila and Thom Wiggers in the publication ‘Post-Quantum TLS Without Handshake Signatures’.

KEMTLS, therefore, achieves the same goals as TLS 1.3 (authentication, confidentiality and integrity) in the face of quantum computers. But there’s one small difference compared to the TLS 1.3 handshake. KEMTLS allows the client to send encrypted application data in the second client-to-server TLS message flow when client authentication is not required, and in the third client-to-server TLS message flow when mutual authentication is required. Note that with TLS 1.3, the server is able to send encrypted and authenticated application data in its first response message (although, in most uses of TLS 1.3, this feature is not actually used). With KEMTLS, when client authentication is not required, the client is able to send its first encrypted application data after the same number of handshake round trips as in TLS 1.3.

Intuitively, the handshake signature in TLS 1.3 proves possession of the private key corresponding to the public key certified in the TLS 1.3 server certificate. For these signature schemes, this is the straightforward way to prove possession; another way to prove possession is through key exchanges. By carefully considering the key derivation sequence, a server can decrypt any messages sent by the client only if it holds the private key corresponding to the certified public key. Therefore, implicit authentication is fulfilled. It is worth noting that KEMTLS still relies on signatures by certificate authorities to authenticate the long-term KEM keys.
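Here is a sketch of that possession proof, using the hypothetical KEM interface above. The real KEMTLS key schedule mixes the decapsulated secret into the handshake keys rather than comparing secrets directly; the comparison here only illustrates that both sides end up with the same value exactly when the server holds the certified private key.

// implicitAuth sketches how KEMTLS authenticates the server without a
// handshake signature. The client encapsulates against the long-term public
// key from the server's certificate; only a server holding the matching
// private key can recover the shared secret and hence derive the keys needed
// to read the client's subsequent messages.
func implicitAuth(k KEM, certifiedServerPK, serverSK []byte) (bool, error) {
    // Client side: derive a secret that only the genuine server can recover.
    ct, clientSecret, err := k.Encapsulate(certifiedServerPK)
    if err != nil {
        return false, err
    }
    // Server side: decapsulate with the private half of the certified key.
    // In KEMTLS this secret feeds the key schedule; here we just compare.
    serverSecret, err := k.Decapsulate(serverSK, ct)
    if err != nil {
        return false, err
    }
    return string(clientSecret) == string(serverSecret), nil
}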

With KEMTLS, application data transmitted during the handshake is implicitly authenticated rather than explicitly (as in TLS 1.3), and has slightly weaker downgrade resilience and forward secrecy; but full downgrade resilience and forward secrecy are achieved once the KEMTLS handshake completes.

By replacing the handshake signature by a KEM key exchange, we reduce the size of the data transmitted in the example handshake to 8,344 bytes, using Kyber512 and Dilithium II — a significant reduction. We can reduce the handshake size even for algorithms such as the NTRU-assumption based KEM NTRU and signature algorithm Falcon, which have a less-pronounced size gap. Typically, KEM operations are computationally much lighter than signing operations, which makes the reduction even more significant.

KEMTLS was presented at ACM CCS 2020. You can read more about its details in the paper. It was initially implemented in the Rustls library by Thom Wiggers, using optimized C/assembly implementations of the post-quantum algorithms provided by the PQClean and Open Quantum Safe projects.

Cloudflare and KEMTLS: the implementation

As part of our effort to show that TLS can be completely post-quantum safe, we implemented the full KEMTLS handshake in Golang’s TLS 1.3 suite. The implementation was done in several steps:

  1. We first needed to clone our own version of Golang, so we could add different post-quantum algorithms to it. You can find our own version here. This code gets constantly updated with every release of Golang, following these steps.
  2. We needed to implement post-quantum algorithms in Golang, which we did on our own cryptographic library, CIRCL.
  3. As we cannot force certificate authorities to use certificates with long-term post-quantum KEM keys, we decided to use Delegated Credentials. A delegated credential is a short-lasting key that the certificate’s owner has delegated for use in TLS. Therefore, they can be used for post-quantum KEM keys. See its implementation in our Golang code here.
  4. We implemented mutual auth (client and server authentication) KEMTLS by using Delegated Credentials for the authentication process. See its implementation in our Golang code here. You can also check its test for an overview of how it works.

Implementing KEMTLS was a straightforward process, although it did require changes to the way Golang handles a TLS 1.3 handshake and how the key schedule works.

A “regular” TLS 1.3 handshake in Golang (from the server perspective) looks like this:

func (hs *serverHandshakeStateTLS13) handshake() error {
    c := hs.c

    // For an overview of the TLS 1.3 handshake, see RFC 8446, Section 2.
    if err := hs.processClientHello(); err != nil {
   	 return err
    }
    if err := hs.checkForResumption(); err != nil {
   	 return err
    }
    if err := hs.pickCertificate(); err != nil {
   	 return err
    }
    c.buffering = true
    if err := hs.sendServerParameters(); err != nil {
   	 return err
    }
    if err := hs.sendServerCertificate(); err != nil {
   	 return err
    }
    if err := hs.sendServerFinished(); err != nil {
   	 return err
    }
    // Note that at this point we could start sending application data without
    // waiting for the client's second flight, but the application might not
    // expect the lack of replay protection of the ClientHello parameters.
    if _, err := c.flush(); err != nil {
   	 return err
    }
    if err := hs.readClientCertificate(); err != nil {
   	 return err
    }
    if err := hs.readClientFinished(); err != nil {
   	 return err
    }

    atomic.StoreUint32(&c.handshakeStatus, 1)

    return nil
}

We had to interrupt the process when the server sends the Certificate (sendServerCertificate()) in order to send the KEMTLS-specific messages. In the same way, we had to add the appropriate KEMTLS messages to the client’s handshake. And, as we didn’t want to change the way Golang handles TLS 1.3 too much, we only added one new constant to the configuration that a server can use to ask for the client’s certificate (the constant is serverConfig.ClientAuth = RequestClientKEMCert).

The implementation is easy to work with: if a delegated credential or a certificate has a public key of a supported post-quantum KEM algorithm, the handshake will proceed with KEMTLS. If the server requests a Client KEMTLS Certificate, the handshake will use client KEMTLS authentication.

Running the Experiment

So, what’s next? We’ll take the code we have produced and run it on actual Cloudflare infrastructure to measure how efficiently it works.

Thanks

Many thanks to everyone involved in the project: Chris Wood, Armando Faz-Hernández, Thom Wiggers, Bas Westerbaan, Peter Wu, Peter Schwabe, Goutam Tamvada, Douglas Stebila, Thibault Meunier, and the whole Cloudflare Research team.

1It is worth noting that the RSA key transport in TLS ≤1.2 has the server only authenticated by RSA public key encryption, although the server’s RSA public key is certified using RSA signatures by Certificate Authorities.
2An extension used by the client to indicate which named groups (Elliptic Curve Groups, Finite Field Groups) it supports for key exchange.
3An extension which contains the endpoint’s cryptographic parameters.
4These numbers, as noted in the paper, are based on the round-2 submissions.

Best practices and advanced patterns for Lambda code signing

Post Syndicated from Cassia Martin original https://aws.amazon.com/blogs/security/best-practices-and-advanced-patterns-for-lambda-code-signing/

Amazon Web Services (AWS) recently released Code Signing for AWS Lambda. By using this feature, you can help enforce the integrity of your code artifacts and make sure that only trusted developers can deploy code to your AWS Lambda functions. Today, let’s review a basic use case along with best practices for Lambda code signing. Then, let’s dive deep and talk about two advanced patterns: one for centralized signing and one for cross-account layer validation. You can use these advanced patterns to use code signing in a distributed ownership model, where you have separate groups of developers writing code and groups responsible for enforcing specific signing profiles or for publishing layers.

Secure software development lifecycle

For context on what this capability gives you, let’s look at the secure software development lifecycle (SDLC). You need different kinds of security controls for each of your development phases. An overview of the secure SDLC stages (code, build, test, deploy, and monitor), along with applicable security controls, can be found in Figure 1. You can use code signing for Lambda to protect the deployment stage and give a cryptographically strong hash verification.

Figure 1: Code signing provides hash verification in the deployment phase of a secure SDLC

Adding Security into DevOps and Implementing DevSecOps Using AWS CodePipeline provide additional information on building a secure SDLC, with a particular focus on the code analysis controls.

Basic pattern

Figure 2 shows the basic pattern described in Code signing for AWS Lambda and in the documentation. The basic code signing pattern uses AWS Signer on a ZIP file and calls a create API to install the signed artifact in Lambda.

Figure 2: The basic code signing pattern

The basic pattern illustrated in Figure 2 is as follows:

  1. An administrator creates a signing profile in AWS Signer. A signing profile is analogous to a code signing certificate and represents a publisher identity. Administrators can provide access via AWS Identity and Access Management (IAM) for developers to use the signing profile to sign their artifacts.
  2. Administrators create a code signing configuration (CSC)—a new resource in Lambda that specifies the signing profiles that are allowed to sign code and the signature validation policy that defines whether to warn or reject deployments that fail the signature checks. CSC can be attached to existing or new Lambda functions to enable signature validations on deployment.
  3. Developers use one of the allowed signing profiles to sign the deployment artifact—a ZIP file—in AWS Signer.
  4. Developers deploy the signed deployment artifact to a function using either the CreateFunction API or the UpdateFunctionCode API.

Lambda performs signature checks before accepting the deployment. The deployment fails if the signature checks fail and you have set the signature validation policy in the CSC to reject deployments using ENFORCE mode.

Code signing checks

Code signing for Lambda provides four signature checks. First, the integrity check confirms that the deployment artifact hasn’t been modified after it was signed using AWS Signer. Lambda performs this check by matching the hash of the artifact with the hash from the signature. The second check is the source mismatch check, which detects if a signature isn’t present or if the artifact is signed by a signing profile that isn’t specified in the CSC. The third, expiry check, will fail if a signature is past its point of expiration. The fourth is the revocation check, which is used to see if anyone has explicitly marked the signing profile used for signing or the signing job as invalid by revoking it.

The integrity check must succeed or Lambda will not run the artifact. The other three checks can be configured to either block invocation or generate a warning. These checks are performed in order until one check fails or all checks succeed. As a security leader concerned about the security of code deployments, you can use the Lambda code signing checks to satisfy different security assurances:

  • Integrity – Provides assurance that code has not been tampered with, by ensuring that the signature on the build artifact is cryptographically valid.
  • Source mismatch – Provides assurance that only trusted entities or developers can deploy code.
  • Expiry – Provides assurance that code running in your environment is not stale, by making sure that signatures were created within a certain date and time.
  • Revocation – Allows security administrators to remove trust by invalidating signatures after the fact so that they cannot be used for code deployment if they have been exposed or are otherwise no longer trusted.

The last three checks are enforced only if you have set the signature validation policy—UntrustedArtifactOnDeployment parameter—in the CSC to ENFORCE. If the policy is set to WARN, then failures in any of the mismatch, expiry, and revocation checks will log a metric called a signature validation error in Amazon CloudWatch. The best practice for this setting is to initially set the policy to WARN. Then, you can monitor the warnings, if any, and update the policy to enforce when you’re confident in the findings in CloudWatch.

Centralized signing enforcement

In this scenario, you have a security administrators team that centrally manages and approves signing profiles. The team centralizes signing profiles in order to enforce that all code running on Lambda is authored by a trusted developer and isn’t tampered with after it’s signed. To do this, the security administrators team wants to enforce that developers—in the same account—can only create Lambda functions with signing profiles that the team has approved. By owning the signing profiles used by developer teams, the security team controls the lifecycle of the signatures and the ability to revoke the signatures. Here are instructions for creating a signing profile and CSC, and then enforcing their use.

Create a signing profile

To create a signing profile, you’ll use the AWS Command Line Interface (AWS CLI). Start by logging in to your account as the central security role. This is an administrative role that is scoped with permissions needed for setting up code signing. You’ll create a signing profile to use for an application named ABC. These example commands are written with prepopulated values for things like profile names, IDs, and descriptions. Change those as appropriate for your application.

To create a signing profile

  1. Run this command:
    aws signer put-signing-profile --platform-id "AWSLambda-SHA384-ECDSA" --profile-name profile_for_application_ABC
    

    Running this command will give you a signing profile version ARN. It will look something like arn:aws:signer:sa-east-1:XXXXXXXXXXXX:/signing-profiles/profile_for_application_ABC/XXXXXXXXXX. Make a note of this value to use in later commands.

    As the security administrator, you must grant the developers access to use the profile for signing. You do that by using the add-profile-permission command. Note that in this example, you are explicitly only granting permission for the signer:StartSigningJob action. You might want to grant permissions to other actions, such as signer:GetSigningProfile or signer:RevokeSignature, by making additional calls to add-profile-permission.

  2. Run this command, replacing <role-name> with the principal you’re using:
    aws signer add-profile-permission \
    --profile-name profile_for_application_ABC \
    --action signer:StartSigningJob \
    --principal <role-name> \
    --statement-id testStatementId
    

Create a CSC

You also want to make a CSC with the signing profile that you, as the security administrator, want all your developers to use.

To create a CSC

Run this command, replacing <signing-profile-version-arn> with the output from Step 1 of the preceding procedure—Create a signing profile:

aws lambda create-code-signing-config \
--description "Application ABC CSC" \
--allowed-publishers SigningProfileVersionArns=<signing-profile-version-arn> \
--code-signing-policies "UntrustedArtifactOnDeployment"="Enforce"

Running this command will give you a CSC ARN that will look something like arn:aws:lambda:sa-east-1:XXXXXXXXXXXX:code-signing-config:approved-csc-XXXXXXXXXXXXXXXXX. Make a note of this value to use later.

Write an IAM policy using the new CSC

Now that the security administrators team has created this CSC, how do they ensure that all the developers use it? Administrators can use IAM to grant access to the CreateFunction API, while using the new lambda:CodeSigningConfig condition key with the CSC ARN you created. This will ensure that developers can create functions only if code signing is enabled.

This IAM policy will allow the developer roles to create Lambda functions, but only when they are using the approved CSC. The additional Deny clause prevents the developers from creating their own signing profiles or CSCs, so that they are forced to use the ones provided by the central team.

To write an IAM policy

Create the following policy document. Replace <code-signing-config-arn> with the CSC ARN you created previously.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "lambda:CreateFunction",
        "lambda:PutFunctionCodeSigningConfig"
      ],
      "Resource": "*",
      "Condition": {
        "ForAnyValue:StringEquals": {
          "lambda:CodeSigningConfig": ["<code-signing-config-arn>"]
        }
      }
    },
    {
      "Effect": "Deny",
      "Action": [
        "signer:PutSigningProfile",
        "lambda:DeleteFunctionCodeSigningConfig",
        "lambda:UpdateCodeSigningConfig",
        "lambda:DeleteCodeSigningConfig",
        "lambda:CreateCodeSigningConfig"
      ],
      "Resource": "*"
    }
  ]
}

Create a signed Lambda function

Now, the developers have permission to create new Lambda functions, but only if the functions are configured with the approved CSC. The approved CSC specifies the settings for Lambda signing policies and lists exactly which profiles are approved to sign the function code. This means that developers in that account will only be able to create functions if the functions are signed with a profile approved by the central team and the developer permissions have been added to the signing profile used.

To create a signed Lambda function

  1. Upload any Lambda code file to an Amazon Simple Storage Service (Amazon S3) bucket with the name main-function.zip. Note that your S3 bucket must have versioning enabled.
  2. Sign the zipped Lambda function using AWS Signer and the following command, replacing <lambda-bucket> and <version-string> with the correct details from your uploaded main-function.zip.
    aws signer start-signing-job \ 
    --source 's3={bucketName=<lambda-bucket>, version=<version-string>, key=main-function.zip}' \
    --destination 's3={bucketName=<lambda-bucket>, prefix=signed-}' \
    --profile-name profile_for_application_ABC
    

  3. Download the newly created ZIP file from your Lambda bucket. It will be called something like signed-XXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX.zip.
  4. For convenience, rename it to signed-main-function.zip.
  5. Run the following command, replacing <lambda-role> with the ARN of your Lambda execution role, and replacing <code-signing-config-arn> with the result of the earlier procedure Create a CSC.
    aws lambda create-function \
        --function-name "signed-main-function" \
        --runtime "python3.8" \
        --role <lambda-role> \
        --zip-file "fileb://signed-main-function.zip" \
        --handler lambda_function.lambda_handler \ 
        --code-signing-config-arn <code-signing-config-arn>
    

Cross-account centralization

This pattern supports the use case where the security administrators and the developers are working in the same account. You might want to implement this across different accounts, which requires creating CSCs in specific accounts where developers need to deploy and update Lambda functions. To do this, you can use AWS CloudFormation StackSets to deploy CSCs. Stack sets allow you to roll out CloudFormation stacks across multiple AWS accounts. Use AWS CloudFormation StackSets for Multiple Accounts in an AWS Organization illustrates how to use an AWS CloudFormation template for deployment to multiple accounts.

The security administrators can detect and react to any changes to the stack set deployed CSCs by using drift detection. Drift detection is an AWS CloudFormation feature that detects unmanaged changes to the resources deployed using StackSets. To complete the solution, Implement automatic drift remediation for AWS CloudFormation using Amazon CloudWatch and AWS Lambda shares a solution for taking automated remediation when drift is detected in a CloudFormation stack.

Cross-account validation for Lambda layers

So far, you have the tools to sign your own Lambda code so that no one can tamper with it, and you’ve reviewed a pattern where one team creates and owns the signing profiles to be used by different developers. Let’s look at one more advanced pattern where you publish code as a signed Lambda layer in one account, and you then use it in a Lambda function in a separate account. A Lambda layer is an archive containing additional code that you can include in a function.

For this, let’s consider how to set up code signing when you’re using layers across two accounts. Layers allow you to use libraries in your function without needing to include them in your deployment package. It’s also possible to publish a layer in one account, and have a different account consume that layer. Let’s act as a publisher of a layer. In this use case, you want to use code signing so that consumers of your layer can have the security assurance that no one has tampered with the layer. Note that if you enable code signing to verify signatures on a layer, Lambda will also verify the signatures on the function code. Therefore, all of your deployment artifacts must be signed, using a profile listed in the CSC attached to the function.

Figure 3 illustrates the cross-account layer pattern, where you sign a layer in a publishing account and a function uses that layer in another consuming account.

Figure 3: This advanced pattern supports cross-account layers

Here are the steps to build this setup. You’ll be logging in to two different accounts, your publishing account and your consuming account.

Make a publisher signing profile

Running this command will give you a profile version ARN. Make a note of the value returned to use in a later step.

To make a publisher signing profile

  1. In the AWS CLI, log in to your publishing account.
  2. Run this command to make a signing profile for your publisher:
    aws signer put-signing-profile --platform-id "AWSLambda-SHA384-ECDSA" --profile-name publisher_approved_profile1
    

Sign your layer code using signing profile

Next, you want to sign your layer code with this signing profile. For this example, use the blank layer code from this GitHub project. You can make your own layer by creating a ZIP file with all your code files included in a directory supported by your Lambda runtime. AWS Lambda layers has instructions for creating your own layer.

You can then sign your layer code using the signing profile.

To sign your layer code

  1. Name your Lambda layer code file blank-python.zip and upload it to your S3 bucket.
  2. Sign the zipped Lambda function using AWS Signer with the following command. Replace <lambda-bucket> and <version-string> with the details from your uploaded blank-python.zip.
    aws signer start-signing-job \ 
    --source 's3={bucketName=<lambda-bucket>, version=<version-string>, key=blank-python.zip}' \
    --destination 's3={bucketName=<lambda-bucket>, prefix=signed-}' \
    --profile-name publisher_approved_profile1
    

Publish your signed layer

Now publish the resulting, signed layer. Note that the layers themselves don’t have signature validation on deployment. However, the signatures will be checked when they’re added to a function.

To publish your signed layer

  1. Download your new signed ZIP file from your S3 bucket, and rename it signed-layer.zip.
  2. Run the following command to publish your layer:
    aws lambda publish-layer-version \
    --layer-name lambda_signing \
    --zip-file "fileb://signed-layer.zip" \
    --compatible-runtimes python3.8 python3.7        
    

This command will return information about your newly published layer. Search for the LayerVersionArn and make a note of it for use later.

Grant read access

For the last step in the publisher account, you must grant read access to the layer using the add-layer-version-permission command. In the following command, you’re granting access to an individual account using the principal parameter.

(Optional) You could instead choose to grant access to all accounts in your organization by using “*” as the principal and adding the organization-id parameter.

To grant read access

  • Run the following command to grant read access to your layer, replacing <consuming-account-id> with the account ID of your second account:
    aws lambda add-layer-version-permission \
    --layer-name lambda_signing \
    --version-number 1 \
    --statement-id for-consuming-account \
    --action lambda:GetLayerVersion \
    --principal <consuming-account-id> 	
    

Create a CSC

It’s time to switch your AWS CLI to work with the consuming account. This consuming account can create a CSC for their Lambda functions that specifies what signing profiles are allowed.

To create a CSC

  1. In the AWS CLI, log out from your publishing account and into your consuming account.
  2. The consuming account will need a signing profile of its own to sign the main Lambda code. Run the following command to create one:
    aws signer put-signing-profile --platform-id "AWSLambda-SHA384-ECDSA" --profile-name consumer_approved_profile1
    

  3. Run the following command to create a CSC that allows code to be signed either by the publisher or the consumer. Replace <consumer-signing-profile-version-arn> with the profile version ARN you created in the preceding step. Replace <publisher-signing-profile-version-arn> with the signing profile from the Make a publisher signing profile procedure. Make a note of the CSC returned by this command to use in later steps.
    aws lambda create-code-signing-config \
    --description "Allow layers from publisher" \
    --allowed-publishers SigningProfileVersionArns="<publisher-signing-profile-version-arn>,<consumer-signing-profile-version-arn>" \
    --code-signing-policies "UntrustedArtifactOnDeployment"="Enforce"
    

Create a Lambda function using the CSC

When creating the function that uses the signed layer, you can pass in the CSC that you created. Lambda will check the signature on the function code in this step.

To create a Lambda function

  1. Use your own Lambda function code, or make a copy of blank-python.zip and rename it consumer-main-function.zip. Upload consumer-main-function.zip to a versioned S3 bucket in your consumer account.

    Note: If the S3 bucket doesn’t have versioning enabled, the procedure will fail.

  2. Sign the function with the signing profile of the consumer account. Replace <consumers-lambda-bucket> and <version-string> in the following command with the name of the S3 bucket you uploaded the consumer-main-function.zip to and the version.
    aws signer start-signing-job \ 
    --source 's3={bucketName=<consumers-lambda-bucket>, version=<version-string>, key=consumer-main-function.zip}' \
    --destination 's3={bucketName=<consumers-lambda-bucket>, prefix=signed-}' \
    --profile-name consumer_approved_profile1
    

  3. Download your new file and rename it to signed-consumer-main-function.zip.
  4. Run the following command to create a new Lambda function, replacing <lambda-role> with a valid Lambda execution role and <code-signing-config-arn> with the value returned from the preceding procedure, Create a CSC.
    aws lambda create-function \
        --function-name "signed-consumer-main-function" \
        --runtime "python3.8" \
        --role <lambda-role> \
        --zip-file "fileb://signed-consumer-main-function.zip" \
        --handler lambda_function.lambda_handler \ 
        --code-signing-config <code-signing-config-arn>
    

  5. Finally, add the signed layer from the publishing account into the configuration of that function. Run the following command, replacing <lambda-layer-arn> with the LayerVersionArn you noted in the preceding procedure, To publish your signed layer.
    aws lambda update-function-configuration \
    --function-name "signed-consumer-main-function" \
    --layers "<lambda-layer-arn>"   
    

Lambda will check the signature on the layer code in this step. If the signature of any deployed layer artifact is corrupt, Lambda stops you from attaching the layer and deploying your code. This is true regardless of the mode you choose—WARN or ENFORCE. If you add multiple layers to your function, you must sign all of them.
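
To confirm that the code signing configuration and the layer are attached as expected, you can inspect the function configuration. This is a quick sketch assuming the function name used above; the signing-related fields appear in the output once signed code has been deployed:

    aws lambda get-function-configuration \
    --function-name "signed-consumer-main-function" \
    --query "{SigningProfile: SigningProfileVersionArn, SigningJob: SigningJobArn, Layers: Layers[].Arn}"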

This capability allows layer publishers to share signed layers. A publisher can sign all layers using a specific signing profile and ask all the layer consumers to use that signing profile as one of the allowed profiles in their CSCs. When someone uses the layer, they can trust that the layer comes from that publisher and hasn’t been tampered with.

Conclusion

You’ve learned some best practices and patterns for using code signing for AWS Lambda. You know how code signing fits in the secure SDLC, and what value you get from each of the code signing checks. You also learned two patterns for using code signing for distributed ownership—one for centralized signing and one for cross account layer validation. No matter your role—as a developer, as a central security team, or as a layer publisher—you can use these tools to help enforce the integrity of code artifacts in your organization.

You can learn more about Lambda code signing in Configure code signing for AWS Lambda.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Lambda forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Cassia Martin

Cassia is a Security Solutions Architect in New York City. She works with large financial institutions to solve security architecture problems and to teach them cloud tools and patterns. Cassia has worked in security for over 10 years, and she has a strong background in application security.

Cloning Google Titan 2FA keys

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/01/cloning-google-titan-2fa-keys.html

This is a clever side-channel attack:

The cloning works by using a hot air gun and a scalpel to remove the plastic key casing and expose the NXP A700X chip, which acts as a secure element that stores the cryptographic secrets. Next, an attacker connects the chip to hardware and software that take measurements as the key is being used to authenticate on an existing account. Once the measurement-taking is finished, the attacker seals the chip in a new casing and returns it to the victim.

Extracting and later resealing the chip takes about four hours. It takes another six hours to take measurements for each account the attacker wants to hack. In other words, the process would take 10 hours to clone the key for a single account, 16 hours to clone a key for two accounts, and 22 hours for three accounts.

By observing the local electromagnetic radiations as the chip generates the digital signatures, the researchers exploit a side channel vulnerability in the NXP chip. The exploit allows an attacker to obtain the long-term elliptic curve digital signature algorithm (ECDSA) private key designated for a given account. With the crypto key in hand, the attacker can then create her own key, which will work for each account she targeted.

The attack isn’t free, but it’s not expensive either:

A hacker would first have to steal a target’s account password and also gain covert possession of the physical key for as many as 10 hours. The cloning also requires up to $12,000 worth of equipment and custom software, plus an advanced background in electrical engineering and cryptography. That means the key cloning—were it ever to happen in the wild—would likely be done only by a nation-state pursuing its highest-value targets.

That last line about “nation-state pursuing its highest-value targets” is just not true. There are many other situations where this attack is feasible.

Note that the attack isn’t against the Google system specifically. It exploits a side-channel attack in the NXP chip. Which means that other systems are probably vulnerable:

While the researchers performed their attack on the Google Titan, they believe that other hardware that uses the A700X, or chips based on the A700X, may also be vulnerable. If true, that would include Yubico’s YubiKey NEO and several 2FA keys made by Feitian.

Extracting Personal Information from Large Language Models Like GPT-2

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/01/extracting-personal-information-from-large-language-models-like-gpt-2.html

Researchers have been able to find all sorts of personal information within GPT-2. This information was part of the training data, and can be extracted with the right sorts of queries.

Paper: “Extracting Training Data from Large Language Models.”

Abstract: It has become common to publish large (billion parameter) language models that have been trained on private datasets. This paper demonstrates that in such settings, an adversary can perform a training data extraction attack to recover individual training examples by querying the language model.

We demonstrate our attack on GPT-2, a language model trained on scrapes of the public Internet, and are able to extract hundreds of verbatim text sequences from the model’s training data. These extracted examples include (public) personally identifiable information (names, phone numbers, and email addresses), IRC conversations, code, and 128-bit UUIDs. Our attack is possible even though each of the above sequences are included in just one document in the training data.

We comprehensively evaluate our extraction attack to understand the factors that contribute to its success. For example, we find that larger models are more vulnerable than smaller models. We conclude by drawing lessons and discussing possible safeguards for training large language models.

From a blog post:

We generated a total of 600,000 samples by querying GPT-2 with three different sampling strategies. Each sample contains 256 tokens, or roughly 200 words on average. Among these samples, we selected 1,800 samples with abnormally high likelihood for manual inspection. Out of the 1,800 samples, we found 604 that contain text which is reproduced verbatim from the training set.

The rest of the blog post discusses the types of data they found.

Brexit Deal Mandates Old Insecure Crypto Algorithms

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2020/12/brexit-deal-mandates-old-insecure-crypto-algorithms.html

In what is surely an unthinking cut-and-paste issue, page 921 of the Brexit deal mandates the use of SHA-1 and 1024-bit RSA:

The open standard s/MIME as extension to de facto e-mail standard SMTP will be deployed to encrypt messages containing DNA profile information. The protocol s/MIME (V3) allows signed receipts, security labels, and secure mailing lists… The underlying certificate used by s/MIME mechanism has to be in compliance with X.509 standard…. The processing rules for s/MIME encryption operations… are as follows:

  1. the sequence of the operations is: first encryption and then signing,
  2. the encryption algorithm AES (Advanced Encryption Standard) with 256 bit key length and RSA with 1,024 bit key length shall be applied for symmetric and asymmetric encryption respectively,
  3. the hash algorithm SHA-1 shall be applied.
  4. s/MIME functionality is built into the vast majority of modern e-mail software packages including Outlook, Mozilla Mail as well as Netscape Communicator 4.x and inter-operates among all major e-mail software packages.

And s/MIME? Bleah.

Helping build the next generation of privacy-preserving protocols

Post Syndicated from Nick Sullivan original https://blog.cloudflare.com/next-generation-privacy-protocols/

Over the last ten years, Cloudflare has become an important part of Internet infrastructure, powering websites, APIs, and web services to help make them more secure and efficient. The Internet is growing in terms of its capacity and the number of people using it and evolving in terms of its design and functionality. As a player in the Internet ecosystem, Cloudflare has a responsibility to help the Internet grow in a way that respects and provides value for its users. Today, we’re making several announcements around improving Internet protocols with respect to something important to our customers and Internet users worldwide: privacy.

These initiatives are:

  • Encrypted Client Hello (ECH), which encrypts sensitive TLS metadata such as the hostname of the site you’re visiting
  • Oblivious DNS-over-HTTPS (ODoH), which hides the client’s IP address from the DNS resolver
  • OPAQUE, a password-authentication protocol that keeps passwords hidden even from the server

Each of these projects impacts an aspect of the Internet that influences our online lives and digital footprints. Whether we know it or not, there is a lot of private information about us and our lives floating around online. This is something we can help fix.

For over a year, we have been working through standards bodies like the IETF and partnering with the biggest names in Internet technology (including Mozilla, Google, Equinix, and more) to design, deploy, and test these new privacy-preserving protocols at Internet scale. Each of these three protocols touches on a critical aspect of our online lives, and we expect them to help make real improvements to privacy online as they gain adoption.

A continuing tradition at Cloudflare

One of Cloudflare’s core missions is to support and develop technology that helps build a better Internet. As an industry, we’ve made exceptional progress in making the Internet more secure and robust. Cloudflare is proud to have played a part in this progress through multiple initiatives over the years.

Here are a few highlights:

  • Universal SSL™. We’ve been one of the driving forces for encrypting the web. We launched Universal SSL in 2014 to give website encryption to our customers for free and have actively been working along with certificate authorities like Let’s Encrypt, web browsers, and website operators to help remove mixed content. Before Universal SSL launched to give all Cloudflare customers HTTPS for free, only 30% of connections to websites were encrypted. Through the industry’s efforts, that number is now 80% — and a much more significant proportion of overall Internet traffic. Along with doing our part to encrypt the web, we have supported the Certificate Transparency project via Nimbus and Merkle Town, which has improved accountability for the certificate ecosystem HTTPS relies on for trust.
  • TLS 1.3 and QUIC. We’ve also been a proponent of upgrading existing security protocols. Take Transport Layer Security (TLS), the underlying protocol that secures HTTPS. Cloudflare engineers helped contribute to the design of TLS 1.3, the latest version of the standard, and in 2016 we launched support for an early version of the protocol. This early deployment helped lead to improvements to the final version of the protocol. TLS 1.3 is now the most widely used encryption protocol on the web and a vital component of the emerging QUIC standard, of which we were also early adopters.
  • Securing Routing, Naming, and Time. We’ve made major efforts to help secure other critical components of the Internet. Our efforts to help secure Internet routing through our RPKI toolkit, measurement studies, and “Is BGP Safe Yet” tool have significantly improved the Internet’s resilience against disruptive route leaks. Our time service (time.cloudflare.com) has helped keep people’s clocks in sync with more secure protocols like NTS and Roughtime. We’ve also made DNS more secure by supporting DNS-over-HTTPS and DNS-over-TLS in 1.1.1.1 at launch, along with one-click DNSSEC in our authoritative DNS service and registrar.

Continuing to improve the security of the systems of trust online is critical to the Internet’s growth. However, there is a more fundamental principle at play: respect. The infrastructure underlying the Internet should be designed to respect its users.

Building an Internet that respects users

When you sign in to a specific website or service with a privacy policy, you know what that site is expected to do with your data. It’s explicit. There is no such visibility to the users when it comes to the operators of the Internet itself. You may have an agreement with your Internet Service Provider (ISP) and the site you’re visiting, but it’s doubtful that you even know which networks your data is traversing. Most people don’t have a concept of the Internet beyond what they see on their screen, so it’s hard to imagine that people would accept or even understand what a privacy policy from a transit wholesaler or an inspection middlebox would even mean.

Without encryption, Internet browsing information is implicitly shared with countless third parties online as information passes between networks. Without secure routing, users’ traffic can be hijacked and disrupted. Without privacy-preserving protocols, users’ online life is not as private as they would think or expect. The infrastructure of the Internet wasn’t built in a way that reflects their expectations.

Normal network flow
Network flow with malicious route leak

The good news is that the Internet is continuously evolving. One of the groups that help guide that evolution is the Internet Architecture Board (IAB). The IAB provides architectural oversight to the Internet Engineering Task Force (IETF), the Internet’s main standard-setting body. The IAB recently published RFC 8890, which states that individual end-users should be prioritized when designing Internet protocols. It says that if there’s a conflict between the interests of end-users and the interest of service providers, corporations, or governments, IETF decisions should favor end users. One of the prime interests of end-users is the right to privacy, and the IAB published RFC 6973 to indicate how Internet protocols should take privacy into account.

Today’s technical blog posts are about improvements to the Internet designed to respect user privacy. Privacy is a complex topic that spans multiple disciplines, so it’s essential to clarify what we mean by “improving privacy.” We are specifically talking about changing the protocols that handle privacy-sensitive information exposed “on-the-wire” and modifying them so that this data is exposed to fewer parties. This data continues to exist. It’s just no longer available or visible to third parties without building a mechanism to collect it at a higher layer of the Internet stack, the application layer. These changes go beyond website encryption; they go deep into the design of the systems that are foundational to making the Internet what it is.

The toolbox: cryptography and secure proxies

Two tools for making sure data can be used without being seen are cryptography and secure proxies.

Cryptography allows information to be transformed into a format that a very limited number of people (those with the key) can understand. Some describe cryptography as a tool that transforms data security problems into key management problems. This is a humorous but fair description. Cryptography makes it easier to reason about privacy because only key holders can view data.

Another tool for protecting access to data is isolation/segmentation. By physically limiting which parties have access to information, you effectively build privacy walls. A popular architecture is to rely on policy-aware proxies to pass data from one place to another. Such proxies can be configured to strip sensitive data or block data transfers between parties according to what the privacy policy says.

Both these tools are useful individually, but they can be even more effective if combined. Onion routing (the cryptographic technique underlying Tor) is one example of how proxies and encryption can be used in tandem to enforce strong privacy. Broadly, if party A wants to send data to party B, they can encrypt the data with party B’s key and encrypt the metadata with a proxy’s key and send it to the proxy.

Platforms and services built on top of the Internet can build in consent systems, like privacy policies presented through user interfaces. The infrastructure of the Internet relies on layers of underlying protocols. Because these layers of the Internet are so far below where the user interacts with them, it’s almost impossible to build a concept of user consent. In order to respect users and protect them from privacy issues, the protocols that glue the Internet together should be designed with privacy enabled by default.

Data vs. metadata

The transition from a mostly unencrypted web to an encrypted web has done a lot for end-user privacy. For example, the “coffeeshop stalker” is no longer an issue for most sites. When accessing the majority of sites online, users are no longer broadcasting every aspect of their web browsing experience (search queries, browser versions, authentication cookies, etc.) over the Internet for any participant on the path to see. Suppose a site is configured correctly to use HTTPS. In that case, users can be confident their data is secure from onlookers and reaches only the intended party because their connections are both encrypted and authenticated.

However, HTTPS only protects the content of web requests. Even if you only browse sites over HTTPS, that doesn’t mean that your browsing patterns are private. This is because HTTPS fails to encrypt a critical aspect of the exchange: the metadata. When you make a phone call, the metadata is the phone number, not the call’s contents. Metadata is the data about the data.

To illustrate the difference and why it matters, here’s a diagram of what happens when you visit a website like an imageboard. Say you’re going to a specific page on that board (https://<imageboard>.com/room101/) that has specific embedded images hosted on <embarassing>.com.

Page load for an imageboard, returning an HTML page with an image from an embarassing site
Subresource fetch for the image from an embarassing site

The space inside the dotted line here represents the part of the Internet that your data needs to transit. These parties include your local area network or coffee shop, your ISP, an Internet transit provider, and possibly the network portion of the cloud provider that hosts the server. Users often don’t have a relationship with these entities or a contract that prevents them from doing anything with the user’s data. And even if those entities don’t look at the data, a well-placed observer intercepting Internet traffic could see anything sent unencrypted. It would be best if they just didn’t see it at all. In this example, an observer can see that the user visited <imageboard>.com, which is expected. However, even though the page content is encrypted, it’s still possible to learn which specific page was visited, because <embarassing>.com is also visible.

It’s a general rule that if data is available to on-path parties on the Internet, some of these on-path parties will use this data. It’s also true that these on-path parties need some metadata in order to facilitate the transport of this data. This balance is explored in RFC 8558, which explains how protocols should be designed thoughtfully with respect to the balance between too much metadata (bad for privacy) and too little metadata (bad for operations).

In an ideal world, Internet protocols would be designed with the principle of least privilege. They would provide the minimum amount of information needed for the on-path parties (the pipes) to do the job of transporting the data to the right place and keep everything else confidential by default. Current protocols, including TLS 1.3 and QUIC, are important steps towards this ideal but fall short with respect to metadata privacy.

Knowing both who you are and what you do online can lead to profiling

Today’s announcements reflect two metadata protection levels: the first involves limiting the amount of metadata available to third-party observers (like ISPs). The second involves restricting the amount of metadata that users share with service providers themselves.

Hostnames are an example of metadata that needs to be protected from third-party observers, which DoH and ECH intend to do. However, it doesn’t make sense to hide the hostname from the site you’re visiting. It also doesn’t make sense to hide it from a directory service like DNS. A DNS server needs to know which hostname you’re resolving to resolve it for you!

A privacy issue arises when a service provider knows about both what sites you’re visiting and who you are. Individual websites do not have this dangerous combination of information (except in the case of third party cookies, which are going away soon in browsers), but DNS providers do. Thankfully, it’s not actually necessary for a DNS resolver to know *both* the hostname of the service you’re going to and which IP you’re coming from. Disentangling the two, which is the goal of ODoH, is good for privacy.
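
To make the distinction concrete, here is what an ordinary DoH query looks like against the 1.1.1.1 JSON endpoint (a small sketch; example.com is just a placeholder name). The query is encrypted in transit, so on-path observers can’t read it, but the resolver itself still learns both the hostname and the client’s IP address, which is exactly the pairing that ODoH breaks apart:

    curl -s -H 'accept: application/dns-json' \
      'https://cloudflare-dns.com/dns-query?name=example.com&type=A'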

The Internet is part of ‘our’ Infrastructure

Roads should be well-paved, well lit, have accurate signage, and be optimally connected. They aren’t designed to stop a car based on who’s inside it. Nor should they be! Like transportation infrastructure, Internet infrastructure is responsible for getting data where it needs to go, not for looking inside packets and making judgments. But the Internet is made of computers and software, and software tends to be written to make decisions based on the data it has available to it.

Privacy-preserving protocols attempt to eliminate the temptation for infrastructure providers and others to peek inside and make decisions based on personal data. A non-privacy preserving protocol like HTTP keeps data and metadata, like passwords, IP addresses, and hostnames, as explicit parts of the data sent over the wire. The fact that they are explicit means that they are available to any observer to collect and act on. A protocol like HTTPS improves upon this by making some of the data (such as passwords and site content) invisible on the wire using encryption.

The three protocols we are exploring today extend this concept.

  • ECH takes most of the unencrypted metadata in TLS (including the hostname) and encrypts it with a key that was fetched ahead of time.
  • ODoH (a new variant of DoH co-designed by Apple, Cloudflare, and Fastly engineers) uses proxies and onion-like encryption to make the source of a DNS query invisible to the DNS resolver. This protects the user’s IP address when resolving hostnames.
  • OPAQUE uses a new cryptographic technique to keep passwords hidden even from the server. Utilizing a construction called an Oblivious Pseudo-Random Function (as seen in Privacy Pass), the server does not learn the password; it only learns whether or not the user knows the password.

By making sure Internet infrastructure acts more like physical infrastructure, user privacy is more easily protected. The Internet is more private if private data can only be collected where the user has a chance to consent to its collection.

Doing it together

As much as we’re excited about working on new ways to make the Internet more private, innovation at a global scale doesn’t happen in a vacuum. Each of these projects is the output of a collaborative group of individuals working out in the open in organizations like the IETF and the IRTF. Protocols must come about through a consensus process that involves all the parties that make up the interconnected set of systems that power the Internet. From browser builders to cryptographers, from DNS operators to website administrators, this is truly a global team effort.

We also recognize that sweeping technical changes to the Internet will inevitably also impact the technical community. Adopting these new protocols may have legal and policy implications. We are actively working with governments and civil society groups to help educate them about the impact of these potential changes.

We’re looking forward to sharing our work today and hope that more interested parties join in developing these protocols. The projects we are announcing today were designed by experts from academia, industry, and the hobbyist community, and they were built by engineers from Cloudflare Research (including the work of interns, which we will highlight) with support from across Cloudflare.

If you’re interested in this type of work, we’re hiring!

Three common cloud encryption questions and their answers on AWS

Post Syndicated from Peter M. O'Donnell original https://aws.amazon.com/blogs/security/three-common-cloud-encryption-questions-and-their-answers-on-aws/

At Amazon Web Services (AWS), we encourage our customers to take advantage of encryption to help secure their data. Encryption is a core component of a good data protection strategy, but people sometimes have questions about how to manage encryption in the cloud to meet the growth pace and complexity of today’s enterprises. Encryption can seem like a difficult task—people often think they need to master complicated systems to encrypt data—but the cloud can simplify it.

In response to frequently asked questions from executives and IT managers, this post provides an overview of how AWS makes encryption less difficult for everyone. In it, I describe the advantages to encryption in the cloud, common encryption questions, and some AWS services that can help.

Cloud encryption advantages

The most important thing to remember about encryption on AWS is that you always own and control your data. This is an extension of the AWS shared responsibility model, which makes the secure delivery and operation of your applications the responsibility of both you and AWS. You control security in the cloud, including encryption of content, applications, systems, and networks. AWS manages security of the cloud, meaning that we are responsible for protecting the infrastructure that runs all of the services offered in the AWS Cloud.

Encryption in the cloud offers a number of advantages in addition to the options available in on-premises environments. These advantages include on-demand access to managed services that enable you to more easily create and control the keys used for cryptographic operations, integrated identity and access management, and automated encryption in transit and at rest. With the cloud, you don’t manage physical security or the lifecycle of hardware. Instead of the need to procure, configure, deploy, and decommission hardware, AWS offers you a managed service backed by hardware that meets the security requirements of FIPS 140-2. If you need to use a key tens of thousands of times per second, the elastic capacity of AWS services can scale to meet your demands. Finally, you can use integrated encryption capabilities with the AWS services that you use to store and process your data. You pay only for what you use and can instead focus on configuring and monitoring logical security, and innovating on behalf of your business.

Addressing three common encryption questions

For many of the technology leaders I work with, agility and risk mitigation are top IT business goals. An enterprise-wide cloud encryption and data protection strategy helps define how to achieve fine-grained access controls while maintaining nearly continuous visibility into your risk posture. In combination with the wide range of AWS services that integrate directly with AWS Key Management Service (AWS KMS), AWS encryption services help you to achieve greater agility and additional control of your data as you move through the stages of cloud adoption.

The configuration of AWS encryption services is part of your portion of the shared responsibility model. You’re responsible for your data, AWS Identity and Access Management (IAM) configuration, operating systems and networks, and encryption on the client-side, server-side, and network. AWS is responsible for protecting the infrastructure that runs all of the services offered in AWS.

That still leaves you with responsibilities around encryption—which can seem complex, but AWS services can help. Three of the most common questions we get from customers about encryption in the cloud are:

  • How can I use encryption to prevent unauthorized access to my data in the cloud?
  • How can I use encryption to meet compliance requirements in the cloud?
  • How do I demonstrate compliance with company policies or other standards to my stakeholders in the cloud?

Let’s look closely at these three questions and some ways you can address them in AWS.

How can I use encryption to prevent unauthorized access to my data in the cloud?

Start with IAM

The primary way to protect access to your data is access control. On AWS, this often means using IAM to describe which users or roles can access resources like Amazon Simple Storage Service (Amazon S3) buckets. IAM allows you to tightly define the access for each user—whether human or system—and set the conditions in which that access is allowed. This could mean requiring the use of multi-factor authentication, or making the data accessible only from your Amazon Virtual Private Cloud (Amazon VPC).

Encryption allows you to introduce an additional authorization condition before granting access to data. When you use AWS KMS with other services, you can get further control over access to sensitive data. For example, with S3 objects that are encrypted by KMS, each IAM user must not only have access to the storage itself but also have authorization to use the KMS key that protects the data. This works similarly for Amazon Elastic Block Store (Amazon EBS). For example, you can allow an entire operations team to manage Amazon EBS volumes and snapshots, but, for certain Amazon EBS volumes that contain sensitive data, you can use a different KMS master key with different permissions that are granted only to the individuals you specify. This ability to define more granular access control through independent permission on encryption keys is supported by all AWS services that integrate with KMS.
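
As a rough sketch of what this looks like in practice, the following command uploads an object with server-side encryption under a specific KMS key; <your-bucket> and <your-kms-key-arn> are placeholders. The caller needs s3:PutObject permission on the bucket and, because the object is protected with SSE-KMS, permission to use that KMS key as well:

    aws s3 cp sensitive-report.csv s3://<your-bucket>/reports/ \
      --sse aws:kms \
      --sse-kms-key-id <your-kms-key-arn>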

When you configure IAM for your users to access your data and resources, it’s critical that you consider the principle of least privilege. This means you grant only the access necessary for each user to do their work and no more. For example, instead of granting users access to an entire S3 bucket, you can use IAM policy language to specify the particular Amazon S3 prefixes that are required and no others. This is important when thinking about the difference between using a service—data plane events—and managing a service—management plane events. An application might store and retrieve objects in an S3 bucket, but it’s rarely the case that the same application needs to list all of the buckets in an account or configure the bucket’s settings and permissions.

Making clear distinctions between who can use resources and who can manage resources is often referred to as the principle of separation of duties. Consider the circumstance of having a single application with two identities that are associated with it—an application identity that uses a key to encrypt and decrypt data and a manager identity that can make configuration changes to the key. By using AWS KMS together with services like Amazon EBS, Amazon S3, and many others, you can clearly define which actions can be used by each persona. This prevents the application identity from making configuration or permission changes while allowing the manager to make those changes but not use the services to actually access the data or use the encryption keys.

Use AWS KMS and key policies with IAM policies

AWS KMS provides you with visibility and granular permissions control of a specific key in the hierarchy of keys used to protect your data. Controlling access to the keys in KMS is done using IAM policy language. The customer master key (CMK) has its own policy document, known as a key policy. AWS KMS key policies can work together with IAM identity policies or you can manage the permissions for a KMS CMK exclusively with key policies. This gives you greater flexibility to separately assign permissions to use the key or manage the key, depending on your business use case.
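
To see how the two policy types fit together, you can start by reviewing the key policy attached to a CMK. This is a minimal sketch; <your-key-id> is a placeholder, and "default" is the only key policy name that AWS KMS currently supports:

    aws kms get-key-policy \
      --key-id <your-key-id> \
      --policy-name default \
      --output text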

Encryption everywhere

AWS recommends that you encrypt as much as possible. This means encrypting data while it’s in transit and while it’s at rest.

For customers seeking to encrypt data in transit for their public facing applications, our recommended best practice is to use AWS Certificate Manager (ACM). This service automates the creation, deployment, and renewal of public TLS certificates. If you’ve been using SSL/TLS for your websites and applications, then you’re familiar with some of the challenges related to dealing with certificates. ACM is designed to make certificate management easier and less expensive.

One way ACM does this is by generating a certificate for you. Because AWS operates a certificate authority that’s already trusted by industry-standard web browsers and operating systems, public certificates created by ACM can be used with public websites and mobile applications. ACM can create a publicly trusted certificate that you can then deploy into API Gateway, Elastic Load Balancing, or Amazon CloudFront (a globally distributed content delivery network). You don’t have to handle the private key material or figure out complicated tooling to deploy the certificates to your resources. ACM helps you to deploy your certificates either through the AWS Management Console or with automation that uses AWS Command Line Interface (AWS CLI) or AWS SDKs.
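
As an illustration, requesting a publicly trusted certificate from the CLI can be as simple as the following sketch, where www.example.com is a placeholder domain and DNS validation is used so that renewal can be automated:

    aws acm request-certificate \
      --domain-name www.example.com \
      --subject-alternative-names example.com \
      --validation-method DNS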

One of the challenges related to certificates is regularly rotating and renewing them so they don’t unexpectedly expire and prevent your users from using your website or application. Fortunately, ACM has a feature that updates the certificate before it expires and automatically deploys the new certificate to the resources associated with it. No more needing to make a calendar entry to remind your team to renew certificates and, most importantly, no more outages because of expired certificates.

Many customers want to secure data in transit for services by using privately trusted TLS certificates instead of publicly trusted TLS certificates. For this use case, you can use AWS Certificate Manager Private Certificate Authority (ACM PCA) to issue certificates for both clients and servers. ACM PCA provides an inexpensive solution for issuing internally trusted certificates and it can be integrated with ACM with all of the same integrative benefits that ACM provides for public certificates, including automated renewal.

For encrypting data at rest, I strongly encourage using AWS KMS. There is a broad range of AWS storage and database services that support KMS integration so you can implement robust encryption to protect your data at rest within AWS services. This lets you have the benefit of the KMS capabilities for encryption and access control to build complex solutions with a variety of AWS services without compromising on using encryption as part of your data protection strategy.
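
As one concrete example of this integration, you can set a default encryption configuration on an S3 bucket so that every new object is encrypted under a KMS key without the uploader having to ask for it. This is a sketch with placeholder values for the bucket name and key ARN:

    aws s3api put-bucket-encryption \
      --bucket <your-bucket> \
      --server-side-encryption-configuration '{
        "Rules": [{
          "ApplyServerSideEncryptionByDefault": {
            "SSEAlgorithm": "aws:kms",
            "KMSMasterKeyID": "<your-kms-key-arn>"
          }
        }]
      }'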

How can I use encryption to meet compliance requirements in the cloud?

The first step is to identify your compliance requirements. This can often be done by working with your company’s risk and compliance team to understand the frameworks and controls that your company must abide by. While the requirements vary by industry and region, the most common encryption compliance requirements are to encrypt your data and make sure that the access control for the encryption keys (for example by using AWS KMS CMK key policies) is separate from the access control to the encrypted data itself (for example through Amazon S3 bucket policies).

Another common requirement is to have separate encryption keys for different classes of data, or for different tenants or customers. This is directly supported by AWS KMS as you can have as many different keys as you need within a single account. If you need to use even more than the 10,000 keys AWS KMS allows by default, contact AWS Support about raising your quota.

For compliance-related concerns, there are a few capabilities that are worth exploring as options to increase your coverage of security controls.

  • Amazon S3 can automatically encrypt all new objects placed into a bucket, even when the user or software doesn’t specify encryption.
  • You can use batch operations in Amazon S3 to encrypt existing objects that weren’t originally stored with encryption.
  • You can use the Amazon S3 inventory report to generate a list of all S3 objects in a bucket, including their encryption status.

AWS services that track encryption configurations to comply with your requirements

Anyone who has pasted a screenshot of a configuration into a word processor at the end of the year to memorialize compliance knows how brittle traditional on-premises forms of compliance attestation can be. Everything looked right the day it was installed and still looked right at the end of the year—but how can you be certain that everything was correctly configured at all times?

AWS provides several different services to help you configure your environment correctly and monitor its configuration over time. AWS services can also be configured to perform automated remediation to correct any deviations from your desired configuration state. AWS helps automate the collection of compliance evidence and provides nearly continuous, rather than point-in-time, compliance snapshots.

AWS Config is a service that enables you to assess, audit, and evaluate the configurations of your AWS resources. AWS Config continuously monitors and records your AWS resource configurations and helps you to automate the evaluation of recorded configurations against desired configurations. One of the most powerful features of AWS Config is AWS Config Rules. While AWS Config continuously tracks the configuration changes that occur among your resources, it checks whether these changes violate any of the conditions in your rules. If a resource violates a rule, AWS Config flags the resource and the rule as noncompliant. AWS Config comes with a wide range of prewritten managed rules to help you maintain compliance for many different AWS services. The managed rules include checks for encryption status on a variety of resources, ACM certificate expiration, IAM policy configurations, and many more.
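
As a small sketch of what deploying one of these managed rules looks like, the following commands enable the managed rule that flags unencrypted Amazon EBS volumes and then query its findings; the rule name used here is arbitrary:

    aws configservice put-config-rule \
      --config-rule '{
        "ConfigRuleName": "encrypted-volumes",
        "Source": { "Owner": "AWS", "SourceIdentifier": "ENCRYPTED_VOLUMES" }
      }'

    aws configservice describe-compliance-by-config-rule \
      --config-rule-names encrypted-volumes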

For additional monitoring capabilities, consider Amazon Macie and AWS Security Hub. Amazon Macie is a service that helps you understand the contents of your S3 buckets by analyzing and classifying the data contained within your S3 objects. It can also be used to report on the encryption status of your S3 buckets, giving you a central view into the configurations of all buckets in your account, including default encryption settings. Amazon Macie also integrates with AWS Security Hub, which can perform automated checks of your configurations, including several checks that focus on encryption settings.

Another critical service for compliance outcomes is AWS CloudTrail. CloudTrail enables governance, compliance, operational auditing, and risk auditing of your AWS account. With CloudTrail, you can log, continuously monitor, and retain account activity related to actions across your AWS infrastructure. AWS KMS records all of its activity in CloudTrail, allowing you to identify who used the encryption keys, in what context, and with which resources. This information is useful for operational purposes and to help you meet your compliance needs.
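
For a quick look at recent AWS KMS activity without setting up anything else, you can query the 90-day CloudTrail event history. This sketch filters management events by event source and trims the output to a few fields:

    aws cloudtrail lookup-events \
      --lookup-attributes AttributeKey=EventSource,AttributeValue=kms.amazonaws.com \
      --max-results 20 \
      --query "Events[].{Time: EventTime, Name: EventName, User: Username}"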

How do I demonstrate compliance with company policy to my stakeholders in the cloud?

You probably have internal and external stakeholders that care about compliance and require that you document your system’s compliance posture. These stakeholders include a range of possible entities and roles, including internal and external auditors, risk management departments, industry and government regulators, diligence teams related to funding or acquisition, and more.

Unfortunately, the relationship between technical staff and audit and compliance staff is sometimes contentious. AWS believes strongly that these two groups should work together—they want the same things. The same services and facilities that engineering teams use to support operational excellence can also provide output that answers stakeholders’ questions about security compliance.

You can provide access to the console for AWS Config and CloudTrail to your counterparts in audit and risk management roles. Use AWS Config to continuously monitor your configurations and produce periodic reports that can be delivered to the right stakeholders. The evolution towards continuous compliance makes compliance with your company policies on AWS not just possible, but often better than is possible in traditional on-premises environments. AWS Config includes several managed rules that check for encryption settings in your environment. CloudTrail contains an ongoing record of every time AWS KMS keys are used to either encrypt or decrypt your resources. The contents of the CloudTrail entry include the KMS key ID, letting your stakeholders review and connect the activity recorded in CloudTrail with the configurations and permissions set in your environment. You can also use the reports produced by Security Hub automated compliance checks to verify and validate your encryption settings and other controls.

Your stakeholders might have further requirements for compliance that are beyond your scope of control because AWS is operating those controls for you. AWS provides System and Organization Controls (SOC) Reports that are independent, third-party examination reports that demonstrate how AWS achieves key compliance controls and objectives. The purpose of these reports is to help you and your auditors understand the AWS controls established to support operations and compliance. You can consult the AWS SOC2 report, available through AWS Artifact, for more information about how AWS operates in the cloud and provides assurance around AWS security procedures. The SOC2 report includes several AWS KMS-specific controls that might be of interest to your audit-minded colleagues.

Summary

Encryption in the cloud is easier than encryption on-premises, powerful, and can help you meet the highest standards for controls and compliance. The cloud provides more comprehensive data protection capabilities for customers looking to rapidly scale and innovate than are available for on-premises systems. This post provides guidance for how to think about encryption in AWS. You can use IAM, AWS KMS, and ACM to provide granular access control to your most sensitive data, and support protection of your data in transit and at rest. Once you’ve identified your compliance requirements, you can use AWS Config and CloudTrail to review your compliance with company policy over time, rather than point-in-time snapshots obtained through traditional audit methods. AWS can provide on-demand compliance evidence, with tools such as reporting from CloudTrail and AWS Config, and attestations such as SOC reports.

I encourage you to review your current encryption approach against the steps I’ve outlined in this post. While every industry and company is different, I believe the core concepts presented here apply to all scenarios. I want to hear from you. If you have any comments or feedback on the approach discussed here, or how you’ve used it for your use case, leave a comment on this post.

And for more information on encryption in the cloud and on AWS, check out the following resources, in addition to our collection of encryption blog posts.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Peter M. O’Donnell

Peter is an AWS Principal Solutions Architect, specializing in security, risk, and compliance with the Strategic Accounts team. Formerly dedicated to a major US commercial bank customer, Peter now supports some of AWS’s largest and most complex strategic customers in security and security-related topics, including data protection, cryptography, identity, threat modeling, incident response, and CISO engagement.

Author

Supriya Anand

Supriya is a Senior Digital Strategist at AWS, focused on marketing, encryption, and emerging areas of cybersecurity. She has worked to drive large scale marketing and content initiatives forward in a variety of regulated industries. She is passionate about helping customers learn best practices to secure their AWS cloud environment so they can innovate faster on behalf of their business.

re:Invent 2020 – Your guide to AWS Identity and Data Protection sessions

Post Syndicated from Marta Taggart original https://aws.amazon.com/blogs/security/reinvent-2020-your-guide-to-aws-identity-and-data-protection-sessions/

AWS re:Invent will certainly be different in 2020! Instead of seeing you all in Las Vegas, this year re:Invent will be a free, three-week virtual conference. One thing that will remain the same is the variety of sessions, including many Security, Identity, and Compliance sessions. As we developed sessions, we looked to customers—asking where they would like to expand their knowledge. One way we did this was shared in a recent Security blog post, where we introduced a new customer polling feature that provides us with feedback directly from customers. The initial results of the poll showed that Identity and Access Management and Data Protection are top-ranking topics for customers. We wanted to highlight some of the re:Invent sessions for these two important topics so that you can start building your re:Invent schedule. Each session is offered at multiple times, so you can sign up for the time that works best for your location and schedule.

Managing your Identities and Access in AWS

AWS identity: Secure account and application access with AWS SSO
Ron Cully, Principal Product Manager, AWS

Dec 1, 2020 | 12:00 PM – 12:30 PM PST
Dec 1, 2020 | 8:00 PM – 8:30 PM PST
Dec 2, 2020 | 4:00 AM – 4:30 AM PST

AWS SSO provides an easy way to centrally manage access at scale across all your AWS Organizations accounts, using identities you create and manage in AWS SSO, Microsoft Active Directory, or external identity providers (such as Okta Universal Directory or Azure AD). This session explains how you can use AWS SSO to manage your AWS environment, and it covers key new features to help you secure and automate account access authorization.

Getting started with AWS identity services
Becky Weiss, Senior Principal Engineer, AWS

Dec 1, 2020 | 1:30 PM – 2:00 PM PST
Dec 1, 2020 | 9:30 PM – 10:00 PM PST
Dec 2, 2020 | 5:30 AM – 6:00 AM PST

The number, range, and breadth of AWS services are large, but the set of techniques that you need to secure them is not. Your journey as a builder in the cloud starts with this session, in which practical examples help you quickly get up to speed on the fundamentals of becoming authenticated and authorized in the cloud, as well as on securing your resources and data correctly.

AWS identity: Ten identity health checks to improve security in the cloud
Cassia Martin, Senior Security Solutions Architect, AWS

Dec 2, 2020 | 9:30 AM – 10:00 AM PST
Dec 2, 2020 | 5:30 PM – 6:00 PM PST
Dec 3, 2020 | 1:30 AM – 2:00 AM PST

Get practical advice and code to help you achieve the principle of least privilege in your existing AWS environment. From enabling logs to disabling root, the provided checklist helps you find and fix permissions issues in your resources, your accounts, and throughout your organization. With these ten health checks, you can improve your AWS identity and achieve better security every day.

AWS identity: Choosing the right mix of AWS IAM policies for scale
Josh Du Lac, Principal Security Solutions Architect, AWS

Dec 2, 2020 | 11:00 AM – 11:30 AM PST
Dec 2, 2020 | 7:00 PM – 7:30 PM PST
Dec 3, 2020 | 3:00 AM – 3:30 AM PST

This session provides both a strategic and tactical overview of various AWS Identity and Access Management (IAM) policies that provide a range of capabilities for the security of your AWS accounts. You probably already use a number of these policies today, but this session will dive into the tactical reasons for choosing one capability over another. This session zooms out to help you understand how to manage these IAM policies across a multi-account environment, covering their purpose, deployment, validation, limitations, monitoring, and more.

Zero Trust: An AWS perspective
Quint Van Deman, Principal WW Identity Specialist, AWS

Dec 2, 2020 | 12:30 PM – 1:00 PM PST
Dec 2, 2020 | 8:30 PM – 9:00 PM PST
Dec 3, 2020 | 4:30 AM – 5:00 AM PST

AWS customers have continuously asked, “What are the optimal patterns for ensuring the right levels of security and availability for my systems and data?” Increasingly, they are asking how patterns that fall under the banner of Zero Trust might apply to this question. In this session, you learn about the AWS guiding principles for Zero Trust and explore the larger subdomains that have emerged within this space. Then the session dives deep into how AWS has incorporated some of these concepts, and how AWS can help you on your own Zero Trust journey.

AWS identity: Next-generation permission management
Brigid Johnson, Senior Software Development Manager, AWS

Dec 3, 2020 | 11:00 AM – 11:30 AM PST
Dec 3, 2020 | 7:00 PM – 7:30 PM PST
Dec 4, 2020 | 3:00 AM – 3:30 AM PST

This session is for central security teams and developers who manage application permissions. This session reviews a permissions model that enables you to scale your permissions management with confidence. Learn how to set your organization up for access management success with permission guardrails. Then, learn about granting workforce permissions based on attributes, so they scale as your users and teams adjust. Finally, learn about the access analysis tools and how to use them to identify and reduce broad permissions and give users and systems access to only what they need.

How Goldman Sachs administers temporary elevated AWS access
Harsha Sharma, Solutions Architect, AWS
Chana Garbow Pardes, Associate, Goldman Sachs
Jewel Brown, Analyst, Goldman Sachs

Dec 16, 2020 | 2:00 PM – 2:30 PM PST
Dec 16, 2020 | 10:00 PM – 10:30 PM PST
Dec 17, 2020 | 6:00 AM – 6:30 AM PST

Goldman Sachs takes security and access to AWS accounts seriously. While empowering teams with the freedom to build applications autonomously is critical for scaling cloud usage across the firm, guardrails and controls need to be set in place to enable secure administrative access. In this session, learn how the company built its credential brokering workflow and administrator access for its users. Learn how, with its simple application that uses proprietary and AWS services, including Amazon DynamoDB, AWS Lambda, AWS CloudTrail, Amazon S3, and Amazon Athena, Goldman Sachs is able to control administrator credentials and monitor and report on actions taken for audits and compliance.

Data Protection

Do you need an AWS KMS custom key store?
Tracy Pierce, Senior Consultant, AWS

Dec 15, 2020 | 9:45 AM – 10:15 AM PST
Dec 15, 2020 | 5:45 PM – 6:15 PM PST
Dec 16, 2020 | 1:45 AM – 2:15 AM PST

AWS Key Management Service (AWS KMS) has integrated with AWS CloudHSM, giving you the option to create your own AWS KMS custom key store. In this session, you learn more about how a KMS custom key store is backed by an AWS CloudHSM cluster and how it enables you to generate, store, and use your KMS keys in the hardware security modules that you control. You also learn when and if you really need a custom key store. Join this session to learn why you might choose not to use a custom key store and instead use the AWS KMS default.

Using certificate-based authentication on containers & web servers on AWS
Josh Rosenthol, Senior Product Manager, AWS
Kevin Rioles, Manager, Infrastructure & Security, BlackSky

Dec 8, 2020 | 12:45 PM – 1:15 PM PST
Dec 8, 2020 | 8:45 PM – 9:15 PM PST
Dec 9, 2020 | 4:45 AM – 5:15 AM PST

In this session, BlackSky talks about its experience using AWS Certificate Manager (ACM) end-entity certificates for the processing and distribution of real-time satellite geospatial intelligence and monitoring. Learn how BlackSky uses certificate-based authentication on containers and web servers within its AWS environment to help make TLS ubiquitous in its deployments. The session details the implementation, architecture, and operations best practices that the company chose and how it was able to operate ACM at scale across multiple accounts and regions.

The busy manager’s guide to encryption
Spencer Janyk, Senior Product Manager, AWS

Dec 9, 2020 | 11:45 AM – 12:15 PM PST
Dec 9, 2020 | 7:45 PM – 8:15 PM PST
Dec 10, 2020 | 3:45 AM – 4:15 AM PST

In this session, explore the functionality of AWS cryptography services and learn when and where to deploy each of the following: AWS Key Management Service, AWS Encryption SDK, AWS Certificate Manager, AWS CloudHSM, and AWS Secrets Manager. You also learn about defense-in-depth strategies including asymmetric permissions models, client-side encryption, and permission segmentation by role.

Building post-quantum cryptography for the cloud
Alex Weibel, Senior Software Development Engineer, AWS

Dec 15, 2020 | 12:45 PM – 1:15 PM PST
Dec 15, 2020 | 8:45 PM – 9:15 PM PST
Dec 16, 2020 | 4:45 AM – 5:15 AM PST

This session introduces post-quantum cryptography and how you can use it today to secure TLS communication. Learn about recent updates on standards and existing deployments, including the AWS post-quantum TLS implementation (pq-s2n). A description of the hybrid key agreement method shows how you can combine a new post-quantum key encapsulation method with a classical key exchange to secure network traffic today.

Data protection at scale using Amazon Macie
Neel Sendas, Senior Technical Account Manager, AWS

Dec 17, 2020 | 7:15 AM – 7:45 AM PST
Dec 17, 2020 | 3:15 PM – 3:45 PM PST
Dec 17, 2020 | 11:15 PM – 11:45 PM PST

Data Loss Prevention (DLP) is a common topic among companies that work with sensitive data. If an organization can’t identify its sensitive data, it can’t protect it. Amazon Macie is a fully managed data security and data privacy service that uses machine learning and pattern matching to discover and protect your sensitive data in AWS. In this session, we will share details of the design and architecture you can use to deploy Macie at large scale.

While sessions are virtual this year, they will be offered at multiple times with live moderators and “Ask the Expert” sessions available to help answer any questions that you may have. We look forward to “seeing” you in these sessions. Please see the re:Invent agenda for more details and to build your schedule.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Marta Taggart

Marta is a Seattle-native and Senior Program Manager in AWS Security, where she focuses on privacy, content development, and educational programs. Her interest in education stems from two years she spent in the education sector while serving in the Peace Corps in Romania. In her free time, she’s on a global hunt for the perfect cup of coffee.

Author

Himanshu Verma

Himanshu is a Worldwide Specialist for AWS Security Services. In this role, he leads the go-to-market creation and execution for AWS Data Protection and Threat Detection & Monitoring services, field enablement, and strategic customer advisement. Prior to AWS, he held roles as Director of Product Management, engineering and development, working on various identity, information security and data protection technologies.

Indistinguishability Obfuscation

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2020/11/indistinguishability-obfuscation.html

Quanta magazine recently published a breathless article on indistinguishability obfuscation — calling it the “‘crown jewel’ of cryptography” — and saying that it had finally been achieved, based on a recently published paper. I want to add some caveats to the discussion.

Basically, obfuscation makes a computer program “unintelligible” while preserving its functionality. Indistinguishability obfuscation is a more relaxed notion: it just means that two different programs that perform the same functionality can’t be distinguished from each other. A good definition is in this paper.

This is a pretty amazing theoretical result, and one to be excited about. We can now do obfuscation, and we can do it using assumptions that make real-world sense. The proofs are kind of ugly, but that’s okay — it’s a start. What it means in theory is that we have a fundamental theoretical result that we can use to derive a whole bunch of other cryptographic primitives.

But — and this is a big one — this result is not even remotely close to being practical. We’re talking multiple days to perform pretty simple calculations, using massively large blocks of computer code. And this is likely to remain true for a very long time. Unless researchers increase performance by many orders of magnitude, nothing in the real world will make use of this work anytime soon.

But but, consider fully homomorphic encryption. It, too, was initially theoretically interesting and completely impractical. And now, after decades of work, it seems to be almost just-barely maybe approaching practically useful. This could very well be on the same trajectory, and perhaps in twenty to thirty years we will be celebrating this early theoretical result as the beginning of a new theory of cryptography.

Round 2 post-quantum TLS is now supported in AWS KMS

Post Syndicated from Alex Weibel original https://aws.amazon.com/blogs/security/round-2-post-quantum-tls-is-now-supported-in-aws-kms/

AWS Key Management Service (AWS KMS) now supports three new hybrid post-quantum key exchange algorithms for the Transport Layer Security (TLS) 1.2 encryption protocol that’s used when connecting to AWS KMS API endpoints. These new hybrid post-quantum algorithms combine the proven security of a classical key exchange with the potential quantum-safe properties of new post-quantum key exchanges undergoing evaluation for standardization. The fastest of these algorithms adds approximately 0.3 milliseconds of overhead compared to a classical TLS handshake. The new post-quantum key exchange algorithms added are Round 2 versions of Kyber, Bit Flipping Key Encapsulation (BIKE), and Supersingular Isogeny Key Encapsulation (SIKE). The organizations behind these algorithms have submitted them to the National Institute of Standards and Technology (NIST) as part of NIST’s post-quantum cryptography standardization process. This process spans several rounds of evaluation over multiple years, and is likely to continue beyond 2021.

In our previous hybrid post-quantum TLS blog post, we announced that AWS KMS had launched hybrid post-quantum TLS 1.2 with Round 1 versions of BIKE and SIKE. The Round 1 post-quantum algorithms are still supported by AWS KMS, but at a lower priority than the Round 2 algorithms. You can choose to upgrade your client to enable negotiation of Round 2 algorithms.

Why post-quantum TLS is important

A large-scale quantum computer would be able to break the current public-key cryptography that’s used for key exchange in classical TLS connections. While a large-scale quantum computer isn’t available today, it’s still important to think about and plan for your long-term security needs. TLS traffic using classical algorithms recorded today could be decrypted by a large-scale quantum computer in the future. If you’re developing applications that rely on the long-term confidentiality of data passed over a TLS connection, you should plan to migrate to post-quantum cryptography before the period during which your data must stay confidential overlaps with the arrival of a large-scale quantum computer. As an example, if you believe that a large-scale quantum computer is 25 years away and your data must be secure for 20 years, you should migrate to post-quantum schemes within the next 5 years. AWS is working to prepare for this future, and we want you to be prepared too.
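
To make the timeline arithmetic above concrete, here is a minimal sketch, in Java, of that planning rule (sometimes called Mosca’s inequality). The year values are illustrative assumptions only, not predictions or AWS guidance.

    // Minimal sketch of the migration-timeline arithmetic described above.
    // All year values below are illustrative assumptions, not predictions.
    public final class PqMigrationPlanner {
        public static void main(String[] args) {
            int yearsDataMustStaySecret = 20; // how long recorded traffic must remain confidential
            int yearsToMigrate          = 5;  // how long a migration to post-quantum TLS takes
            int yearsUntilQuantum       = 25; // assumed arrival of a large-scale quantum computer

            // If the confidentiality lifetime plus the migration time reaches the
            // assumed arrival date, traffic recorded near the end of the migration
            // could still be sensitive when a large-scale quantum computer exists.
            boolean startNow = yearsDataMustStaySecret + yearsToMigrate >= yearsUntilQuantum;
            System.out.println("Start the post-quantum migration now? " + startNow);
        }
    }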

We’re offering this feature now, instead of waiting for standardization efforts to be complete, so that you have a way to measure the potential performance impact on your applications. Offering this feature now also gives you the protection afforded by the proposed post-quantum schemes today. While we believe that this feature raises the already high security bar for connecting to AWS KMS endpoints, the new cipher suites do increase bandwidth utilization and latency, and they could also cause connection failures for intermediate systems that proxy TLS connections. We’d like to hear about the effectiveness of our implementation and any issues you find so that we can improve it over time.

Hybrid post-quantum TLS 1.2

Hybrid post-quantum TLS is a feature that provides the security protections of both the classical and post-quantum key exchange algorithms in a single TLS handshake. Figure 1 shows the differences in the connection secret derivation process between classical and hybrid post-quantum TLS 1.2. Hybrid post-quantum TLS 1.2 has three major differences from classical TLS 1.2:

  • The negotiated post-quantum key is appended to the ECDHE key before being used as the hash-based message authentication code (HMAC) key.
  • The ASCII text “hybrid” is prepended to the beginning of the HMAC message.
  • The entire client key exchange message from the TLS handshake is appended to the end of the HMAC message.
Figure 1: Differences in the connection secret derivation process between classical and hybrid post-quantum TLS 1.2
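
The following is a minimal sketch, in Java, of how the three differences listed above change the secret derivation. It is illustrative only: the class, method, and parameter names are assumptions for this example, and the exact inputs, ordering, and hash choice are defined by the IETF hybrid key exchange draft and the s2n implementation, not by this sketch.

    import javax.crypto.Mac;
    import javax.crypto.spec.SecretKeySpec;
    import java.nio.charset.StandardCharsets;

    // Illustrative sketch of the hybrid secret derivation described above.
    public final class HybridDerivationSketch {

        static byte[] concat(byte[] a, byte[] b) {
            byte[] out = new byte[a.length + b.length];
            System.arraycopy(a, 0, out, 0, a.length);
            System.arraycopy(b, 0, out, a.length, b.length);
            return out;
        }

        static byte[] hybridSecret(byte[] ecdheSecret,
                                   byte[] pqSharedSecret,
                                   byte[] classicalHmacMessage,
                                   byte[] clientKeyExchange) throws Exception {
            // Difference 1: append the post-quantum key to the ECDHE key and use
            // the result as the HMAC key.
            SecretKeySpec hmacKey =
                    new SecretKeySpec(concat(ecdheSecret, pqSharedSecret), "HmacSHA256");

            // Difference 2: prepend the ASCII text "hybrid" to the HMAC message.
            // Difference 3: append the client key exchange message to the end.
            byte[] message = concat("hybrid".getBytes(StandardCharsets.US_ASCII),
                                    concat(classicalHmacMessage, clientKeyExchange));

            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(hmacKey);
            return mac.doFinal(message); // feeds the TLS key schedule in the real protocol
        }
    }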

Some background on post-quantum TLS

Today, all requests to AWS KMS use TLS with key exchange algorithms that provide perfect forward secrecy and use one of the following classical schemes:

  • Elliptic Curve Diffie-Hellman Ephemeral (ECDHE)
  • Finite Field Diffie-Hellman Ephemeral (FFDHE)

While existing FFDHE and ECDHE schemes use perfect forward secrecy to protect against the compromise of the server’s long-term secret key, these schemes don’t protect against large-scale quantum computers. In the future, a sufficiently capable large-scale quantum computer could run Shor’s Algorithm to recover the TLS session key of a recorded classical session, and thereby gain access to the data inside. Using a post-quantum key exchange algorithm during the TLS handshake protects against attacks from a large-scale quantum computer.

The possibility of large-scale quantum computing has spurred the development of new quantum-resistant cryptographic algorithms. NIST has started the process of standardizing post-quantum key encapsulation mechanisms (KEMs). A KEM is a type of key exchange that’s used to establish a shared symmetric key. AWS has chosen three NIST KEM submissions to adopt in our post-quantum efforts:

  • Kyber
  • Bit Flipping Key Encapsulation (BIKE)
  • Supersingular Isogeny Key Encapsulation (SIKE)

Hybrid mode ensures that the negotiated key is at least as strong as the strongest of the component key agreement schemes: even if one of the schemes is broken, the communications remain confidential. The Internet Engineering Task Force (IETF) Hybrid Post-Quantum Key Encapsulation Methods for Transport Layer Security 1.2 draft describes how to combine post-quantum KEMs with ECDHE to create new cipher suites for TLS 1.2.

These cipher suites use a hybrid key exchange that performs two independent key exchanges during the TLS handshake. The key exchange then cryptographically combines the keys from each into a single TLS session key. This strategy combines the proven security of a classical key exchange with the potential quantum-safe properties of new post-quantum key exchanges being analyzed by NIST.
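
As a structural sketch only (the interface and method names below are assumptions for illustration, not the s2n or AWS SDK API), the hybrid handshake can be modeled as a classical ECDHE agreement plus a KEM encapsulation, whose two shared secrets are combined as in the derivation sketch shown earlier:

    // Structural sketch of a key encapsulation mechanism (KEM); all names are
    // illustrative assumptions, not a real library API.
    final class KemKeyPair {
        final byte[] publicKey;
        final byte[] secretKey;
        KemKeyPair(byte[] publicKey, byte[] secretKey) { this.publicKey = publicKey; this.secretKey = secretKey; }
    }

    final class KemEncapsulation {
        final byte[] ciphertext;
        final byte[] sharedSecret;
        KemEncapsulation(byte[] ciphertext, byte[] sharedSecret) { this.ciphertext = ciphertext; this.sharedSecret = sharedSecret; }
    }

    interface Kem {
        KemKeyPair generateKeyPair();                            // one side creates an ephemeral key pair
        KemEncapsulation encapsulate(byte[] publicKey);          // the peer derives a shared secret plus a ciphertext
        byte[] decapsulate(byte[] ciphertext, byte[] secretKey); // the first side recovers the same shared secret
    }

    // In hybrid mode, the KEM shared secret and the independently negotiated
    // ECDHE shared secret are both fed into the derivation shown earlier to
    // produce a single TLS session key.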

The effect of hybrid post-quantum TLS on performance

Post-quantum cipher suites have a different performance profile and bandwidth usage from traditional cipher suites. AWS has measured bandwidth and latency across 2,000 TLS handshakes between an Amazon Elastic Compute Cloud (Amazon EC2) C5n.4xlarge client and the public AWS KMS endpoint, which were both in the us-west-2 Region. Your own performance characteristics might differ, and will depend on your environment, including your:

  • Hardware–CPU speed and number of cores.
  • Existing workloads–how often you call AWS KMS and what other work your application performs.
  • Network–location and capacity.

The following graphs and table show latency measurements performed by AWS for all newly supported Round 2 post-quantum algorithms, in addition to the classical ECDHE key exchange algorithm currently used by most customers.

Figure 2 compares the latency of all hybrid post-quantum algorithms with classical ECDHE alone: SIKE adds approximately 101 milliseconds of overhead, BIKE adds approximately 9.5 milliseconds, and Kyber adds approximately 0.3 milliseconds.

Figure 2: TLS handshake latency at varying percentiles for four key exchange algorithms

Figure 3 shows the latency differences between ECDHE with Kyber and ECDHE alone. The addition of Kyber adds approximately 0.3 milliseconds of overhead.

Figure 3: TLS handshake latency at varying percentiles, with only top two performing key exchange algorithms

The following table shows the total amount of data (in bytes) needed to complete the TLS handshake for each cipher suite, the average latency, and latency at varying percentiles. All measurements were gathered from 2,000 TLS handshakes. The time was measured on the client from the start of the handshake until the handshake was completed, and includes all network transfer time. All connections used RSA authentication with a 2048-bit key, and ECDHE used the secp256r1 curve. All hybrid post-quantum tests used the NIST Round 2 versions. The Kyber test used the Kyber-512 parameter, the BIKE test used the BIKE-1 Level 1 parameter, and the SIKE test used the SIKEp434 parameter.

Item                 Bandwidth (bytes)   Total handshakes   Average (ms)   p0 (ms)   p50 (ms)   p90 (ms)   p99 (ms)
ECDHE (classic)      3,574               2,000              3.08           2.07      3.02       3.95       4.71
ECDHE + Kyber R2     5,898               2,000              3.36           2.38      3.17       4.28       5.35
ECDHE + BIKE R2      12,456              2,000              14.91          11.59     14.16      18.27      23.58
ECDHE + SIKE R2      4,628               2,000              112.40         103.22    108.87     126.80     146.56

By default, the AWS SDK client performs a TLS handshake once to set up a new TLS connection, and then reuses that TLS connection for multiple requests. This means that the increased cost of a hybrid post-quantum TLS handshake is amortized over multiple requests sent over the TLS connection. You should take the amortization into account when evaluating the overall additional cost of using post-quantum algorithms; otherwise performance data could be skewed.
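
As a rough, back-of-the-envelope illustration of that amortization, here is a small Java sketch using the average handshake times measured above; the requests-per-connection value is an assumption for illustration only.

    // Back-of-the-envelope amortization using the averages from the table above.
    public final class HandshakeAmortization {
        public static void main(String[] args) {
            double ecdheAverageMs     = 3.08;  // ECDHE (classic) average handshake time
            double kyberAverageMs     = 3.36;  // ECDHE + Kyber R2 average handshake time
            int requestsPerConnection = 100;   // assumed number of requests reusing one TLS connection

            double extraPerHandshakeMs = kyberAverageMs - ecdheAverageMs;
            double extraPerRequestMs   = extraPerHandshakeMs / requestsPerConnection;
            System.out.printf("Extra cost per request: %.4f ms%n", extraPerRequestMs); // roughly 0.003 ms
        }
    }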

AWS KMS has chosen Kyber Round 2 as its highest priority post-quantum algorithm, with BIKE Round 2 and SIKE Round 2 next in priority order. This is because Kyber’s performance is closest to the classical ECDHE performance that most AWS KMS customers use today and are accustomed to.

How to use hybrid post-quantum cipher suites

To use the post-quantum cipher suites with AWS KMS, you need the preview release of the AWS Common Runtime (CRT) HTTP client for the AWS SDK for Java 2.x. Also, you will need to configure the AWS CRT HTTP client to use the s2n post-quantum hybrid cipher suites. Post-quantum TLS for AWS KMS is available in all AWS Regions except for AWS GovCloud (US-East), AWS GovCloud (US-West), AWS China (Beijing) Region operated by Beijing Sinnet Technology Co. Ltd (“Sinnet”), and AWS China (Ningxia) Region operated by Ningxia Western Cloud Data Technology Co. Ltd. (“NWCD”). Since NIST has not yet standardized post-quantum cryptography, connections that require Federal Information Processing Standards (FIPS) compliance cannot use the hybrid key exchange. For example, kms.<region>.amazonaws.com supports the use of post-quantum cipher suites, while kms-fips.<region>.amazonaws.com does not.

  1. If you’re using the AWS SDK for Java 2.x, you must add the preview release of the AWS Common Runtime client to your Maven dependencies.
    <dependency>
        <groupId>software.amazon.awssdk</groupId>
        <artifactId>aws-crt-client</artifactId>
        <version>2.14.13-PREVIEW</version>
    </dependency>
    

  2. You then must configure the new SDK and cipher suite in the existing initialization code of your application:
    // Imports needed by this snippet (locations assumed from the AWS SDK for
    // Java 2.x and the AWS Common Runtime for Java):
    //   import static software.amazon.awssdk.crt.io.TlsCipherPreference.TLS_CIPHER_PREF_KMS_PQ_TLSv1_0_2020_07;
    //   import software.amazon.awssdk.http.async.SdkAsyncHttpClient;
    //   import software.amazon.awssdk.http.crt.AwsCrtAsyncHttpClient;
    //   import software.amazon.awssdk.services.kms.KmsAsyncClient;
    //   import software.amazon.awssdk.services.kms.model.ListKeysResponse;

    // Fail fast if this platform's CRT build doesn't support the post-quantum cipher preference.
    if(!TLS_CIPHER_PREF_KMS_PQ_TLSv1_0_2020_07.isSupported()){
        throw new RuntimeException("Post Quantum Ciphers not supported on this Platform");
    }

    // Build an AWS CRT-based async HTTP client that prefers the hybrid post-quantum cipher suites.
    SdkAsyncHttpClient awsCrtHttpClient = AwsCrtAsyncHttpClient.builder()
              .tlsCipherPreference(TLS_CIPHER_PREF_KMS_PQ_TLSv1_0_2020_07)
              .build();

    // Give the KMS async client the CRT HTTP client so its TLS settings are used.
    KmsAsyncClient kms = KmsAsyncClient.builder()
             .httpClient(awsCrtHttpClient)
             .build();

    // Any request made with this client now negotiates hybrid post-quantum TLS.
    ListKeysResponse response = kms.listKeys().get();

Now, all connections made to AWS KMS in supported Regions will use the new hybrid post-quantum cipher suites! To see a complete example of everything set up, check out the example application here.

Things to try

Here are some ideas about how to use this post-quantum-enabled client:

  • Run load tests and benchmarks. These new cipher suites perform differently from traditional key exchange algorithms. You might need to adjust your connection timeouts to allow for the longer handshake times or, if you’re running inside an AWS Lambda function, extend the execution timeout setting (see the sketch after this list).
  • Try connecting from different locations. Depending on the network path your request takes, you might discover that intermediate hosts, proxies, or firewalls with deep packet inspection (DPI) block the request. This could be due to the new cipher suites in the ClientHello or the larger key exchange messages. If this is the case, you might need to work with your security team or IT administrators to update the relevant configuration to unblock the new TLS cipher suites. We’d like to hear from you about how your infrastructure interacts with this new variant of TLS traffic. If you have questions or feedback, please start a new thread on the AWS KMS discussion forum.
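
For example, one way to allow for the longer handshakes is to raise the SDK call timeouts on the client built earlier. This is a minimal sketch; the timeout values are illustrative assumptions that you should replace with numbers from your own load tests.

    import java.time.Duration;
    import software.amazon.awssdk.core.client.config.ClientOverrideConfiguration;
    import software.amazon.awssdk.services.kms.KmsAsyncClient;

    // Reuses the awsCrtHttpClient configured in the earlier snippet; the timeout
    // values below are illustrative assumptions, not recommendations.
    KmsAsyncClient kms = KmsAsyncClient.builder()
            .httpClient(awsCrtHttpClient)
            .overrideConfiguration(ClientOverrideConfiguration.builder()
                    .apiCallAttemptTimeout(Duration.ofSeconds(2))
                    .apiCallTimeout(Duration.ofSeconds(10))
                    .build())
            .build();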

Conclusion

In this blog post, I announced support for Round 2 hybrid post-quantum algorithms in AWS KMS, and showed you how to begin experimenting with hybrid post-quantum key exchange algorithms for TLS when connecting to AWS KMS endpoints.

More info

If you’d like to learn more about post-quantum cryptography, check out:

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Alex Weibel

Alex is a Senior Software Engineer on the AWS Crypto Algorithms team. He’s one of the maintainers for Amazon’s TLS Library s2n. Previously, Alex worked on TLS termination and request proxying for S3 and the Elastic Load Balancing Service, developing new features for customers. Alex holds a Bachelor of Science degree in Computer Science from the University of Texas at Austin.