Tag Archives: Privacy

Using Machine Learning to Guess PINs from Video

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/10/using-machine-learning-to-guess-pins-from-video.html

Researchers trained a machine-learning system on videos of people typing their PINs into ATMs:

By using three tries, which is typically the maximum allowed number of attempts before the card is withheld, the researchers reconstructed the correct sequence for 5-digit PINs 30% of the time, and reached 41% for 4-digit PINs.

This works even if the person is covering the pad with their hands.

The article doesn’t contain a link to the original research. If someone knows it, please put it in the comments.

Slashdot thread.

Recovering Real Faces from Face-Generation ML System

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/10/recovering-real-faces-from-face-generation-ml-system.html

New paper: “This Person (Probably) Exists. Identity Membership Attacks Against GAN Generated Faces.”

Abstract: Recently, generative adversarial networks (GANs) have achieved stunning realism, fooling even human observers. Indeed, the popular tongue-in-cheek website http://thispersondoesnotexist.com, taunts users with GAN generated images that seem too real to believe. On the other hand, GANs do leak information about their training data, as evidenced by membership attacks recently demonstrated in the literature. In this work, we challenge the assumption that GAN faces really are novel creations, by constructing a successful membership attack of a new kind. Unlike previous works, our attack can accurately discern samples sharing the same identity as training samples without being the same samples. We demonstrate the interest of our attack across several popular face datasets and GAN training procedures. Notably, we show that even in the presence of significant dataset diversity, an over represented person can pose a privacy concern.

News article. Slashdot post.

Coalescing Connections to Improve Network Privacy and Performance

Post Syndicated from Talha Paracha original https://blog.cloudflare.com/connection-coalescing-experiments/

Web pages typically have a large number of embedded subresources (e.g., JavaScript, CSS, image files, ads, beacons) that are fetched by a browser on page loads. Requests for these subresources can prompt browsers to perform further DNS lookups, TCP connections, and TLS handshakes, which can have a significant impact on how long it takes for the user to see the content and interact with the page. Further, each additional request exposes metadata (such as plaintext DNS queries, or unencrypted SNI in TLS handshake) which can have potential privacy implications for the user. With these factors in mind, we carried out a measurement study to understand how we can leverage Connection Coalescing (aka Connection Reuse) to address such concerns, and study its feasibility.


The web has come a long way and initially consisted of very simple protocols. One of them was HTTP/1.0, which required browsers to make a separate connection for every subresource on the page. This design was quickly recognized as having significant performance bottlenecks and was extended with HTTP pipelining and persistent connections in HTTP/1.1 revision, which allowed HTTP requests to reuse the same TCP connection. But, yet again, this was no silver bullet: while multiple requests could share the same connection, they still had to be serialized one after the other, so a client and server could only execute a single request/response exchange at any given time for each connection. As time passed, websites became more complex in structure and dynamic in nature, and HTTP/1.1 was identified as a major bottleneck. The only way to gain concurrency at the network layer was to use multiple TCP connections to the same origin in parallel, but this meant losing most benefits of persistent connections and ended up overloading the origin servers which were unable to meet the concurrency demand.

To address these performance limitations, the SPDY protocol was introduced over a decade later. SPDY supported stream multiplexing, where requests to and responses from the server were interleaved on a single TCP connection, and allowed browsers to prioritize requests for critical subresources that were blocking page rendering. The IETF adopted a modified variant of SPDY as the basis for HTTP/2 in 2012 and published the result as RFC 7540 in 2015.

HTTP/2 and onwards retained this new standard for connection reuse. More specifically, all subresources on the same domain were able to reuse the same TCP/TLS (or UDP/QUIC) connection without any head-of-line blocking (at least on the application layer). This resulted in a single connection for all the subresources — reducing extraneous requests on page loads — potentially speeding up some websites and applications.

Interestingly, the protocol has a lesser-known feature to also enable subresources at different hostnames to be fetched over the same connection. We studied the real-world feasibility and benefits of this technique as an effort to improve users’ experience for websites across our network.

Connection Coalescing allows reusing a TLS connection across different domains

Connection Coalescing

The technique is often referred to as Connection Coalescing and, to put it simply, is a way to access resources from different hostnames that are accessible from the same web server.

There are several reasons why a single server could handle requests for different hosts, ranging from low-cost virtual hosting to the usage of CDNs and cloud providers (including Cloudflare, which acts as a reverse proxy for approximately 25 million Internet properties). Before going into the technical conditions required to enable connection coalescing, we should take a look at some benefits such a strategy can provide.

  • Privacy. When resources at different hostnames are loaded via separate TLS connections, those connections expose metadata about the destinations being contacted to ISPs and other observers via the Server Name Indication (SNI) field (i.e., in the absence of encrypted SNI). This set of exposed SNI values can allow an on-path adversary to fingerprint traffic and possibly determine user interactions on the webpage. On the other hand, coalescing requests for more than one hostname on a single connection exposes only one destination, and helps avoid such threats.
  • Performance. Additional TLS handshakes and TCP connections can incur significant costs in terms of CPU, memory and other resources. Thus, coalescing requests to use the same connection can optimize resource utilization.
  • Resource Prioritization. Multiplexing requests on a single connection means that applications have better visibility and more direct control over how related resources are prioritized and scheduled. In the absence of coalescing, the network properties (for example, route congestion) can interfere with the intended order of delivery for resources. This reliability gained through connection coalescing opens up new optimization opportunities to improve web page load times, among other things.

However, along with all these potential benefits, connection coalescing also has some associated risk factors that need to be considered in practice. First, TCP incorporates “fair” congestion control mechanisms — if there are ten connections on the same route, each gets approximately 1/10th of the total bandwidth. So with a route congested and bandwidth restricted, a client relying on multiple connections might be better off (for example, if they have five of the ten connections, their total share of bandwidth would be half). Second, browsers will use different parallelization routines for scheduling requests on multiple connections versus the same connection — it is not immediately clear whether the former or latter would perform better. Third, multiple connections exhibit an inherent form of load balancing for TLS-termination processes. That’s because multiple requests on the same connection must be answered by the same TLS-termination process that holds the session keys (often on the same physical server). So, it is important to study connection coalescing carefully before rolling it out widely.
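The bandwidth arithmetic in the first risk can be made concrete with a small sketch. It assumes the idealized model from the paragraph above, in which every connection on a congested route gets an equal share; the function name `bandwidth_share` and the follow-on scenario are illustrative assumptions, not something measured in the study.

```python
from fractions import Fraction


def bandwidth_share(own_connections: int, total_connections: int) -> Fraction:
    """Idealized share of a congested route under per-connection fairness."""
    if total_connections <= 0 or not 0 <= own_connections <= total_connections:
        raise ValueError("need 0 <= own_connections <= total_connections, total > 0")
    return Fraction(own_connections, total_connections)


# The example from the text: a client owns five of the ten connections on the route.
before = bandwidth_share(5, 10)

# Hypothetical follow-on: the same client coalesces its five connections into one,
# so the route now carries 10 - 5 + 1 = 6 connections in total.
after = bandwidth_share(1, 10 - 5 + 1)

print(before, after)
```

Under this simplified model the coalescing client's share drops from one half to one sixth, which is why the effect had to be checked empirically rather than assumed away.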

With this context in mind, we studied the feasibility of connection coalescing on real-world traffic. More specifically, the two questions we wanted to answer were: (a) can we empirically demonstrate and quantify the theoretical benefits of connection coalescing? and (b) could coalescing cause unintended side effects, such as performance degradation, due to the risks highlighted above?

In order to answer these questions, we first made the observation that a large number of Cloudflare customers request subresources from cdnjs — which is also powered by Cloudflare. For context, cdnjs has public JavaScript and CSS libraries (like jQuery), and is used by more than 12% of all websites on the Internet. One popular way these websites include resources from cdnjs is by using <script src="https://cdnjs.cloudflare.com/..." ></script> HTML tags. But there are other ways as well, such as the usage of XMLHttpRequest or Fetch APIs. Regardless of the way these resources are included, browsers will need to fetch them for completely loading a website.

We then identified a list of approximately four thousand websites using Cloudflare (on the Free plan) that likely used cdnjs. We divided this list of sites into evenly-sized and randomly-picked control and experiment groups. Our plan was to enable coalescing only for the experiment group, so that subresource requests generated from their web pages for cdnjs could reuse existing connections. In this way, we were able to compare results obtained on the experiment group, with the ones for the control group, and attribute any differences observed to connection coalescing.
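A random, evenly sized split like the one described above could look roughly like the following. This is only a sketch: the post does not say how the groups were drawn, and the function name `split_groups`, the seed, and the example hostnames are assumptions made for illustration.

```python
import random


def split_groups(sites, seed=2021):
    """Shuffle the sites with a fixed seed and cut the list in half."""
    shuffled = sorted(sites)  # sort first so the result depends only on the seed
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical site list; the real study used roughly four thousand zones.
sites = [f"site{i}.example" for i in range(10)]
control, experiment = split_groups(sites)
print(len(control), len(experiment))
```

Fixing the seed keeps the assignment reproducible, so a site stays in the same group for the whole experiment.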

In order to signal browsers that the requests can be coalesced, we served cdnjs and the sites from the same IP address in a few regions around the world. This meant the same DNS responses for all the zones that were part of the study — eventually load balanced by our Anycast network. These sites also had TLS certificates that included cdnjs.

The above two conditions (same IP and compatible certificate) are required to achieve coalescing as per the HTTP/2 spec. However, the QUIC spec allows coalescing even if only the second condition is met. Major web browsers are yet to adopt the QUIC coalescing mechanism, and currently use only the HTTP/2 coalescing logic for both protocols.

Requests to Experiment Group Zones and cdnjs being coalesced on the same TLS connection
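The two conditions described above (same IP and compatible certificate) can be sketched as a simple check. This is a simplified model rather than any browser's actual logic: the `Connection` type, the function names, and the example addresses are assumptions, and the wildcard handling covers only a single left-most label.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    remote_ip: str      # address the existing connection was opened to
    cert_names: tuple   # DNS names listed on the server certificate


def cert_covers(hostname: str, cert_names) -> bool:
    """True if the certificate lists the hostname, directly or via '*.'."""
    host = hostname.lower().rstrip(".")
    for pattern in cert_names:
        pattern = pattern.lower()
        if pattern == host:
            return True
        if pattern.startswith("*."):
            head, _, rest = host.partition(".")
            if head and rest == pattern[2:]:
                return True
    return False


def can_coalesce(conn, hostname, resolved_ips, require_ip_match=True) -> bool:
    """HTTP/2-style check; require_ip_match=False models the certificate-only rule."""
    if not cert_covers(hostname, conn.cert_names):
        return False
    if require_ip_match and conn.remote_ip not in resolved_ips:
        return False
    return True


# Hypothetical connection to a customer site whose certificate also lists cdnjs.
conn = Connection("203.0.113.7", ("site.example", "cdnjs.cloudflare.com"))
print(can_coalesce(conn, "cdnjs.cloudflare.com", {"203.0.113.7"}))
```

Passing `require_ip_match=False` corresponds to the more permissive rule that, per the text, the QUIC spec allows but major browsers had not yet adopted.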


We started noticing evidence of real-world coalescing from the day our experiment was launched. The following graph shows that approximately 50% of requests to cdnjs from our experiment group sites are coalesced (i.e., their TLS SNI does not equal cdnjs) as compared to 0% of requests from the control group sites.

Coalesced Requests to cdnjs from Control and Experimental Group Zones
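The classification rule behind this graph (a cdnjs request counts as coalesced when the TLS SNI of its connection is not cdnjs) can be sketched as follows. The log record shape with `host` and `sni` keys is an assumption made for illustration, not the format of Cloudflare's logs.

```python
CDNJS = "cdnjs.cloudflare.com"


def coalesced_fraction(requests, target_host=CDNJS) -> float:
    """Share of requests for target_host that arrived on another name's connection."""
    relevant = [r for r in requests if r["host"] == target_host]
    if not relevant:
        return 0.0
    coalesced = sum(1 for r in relevant if r["sni"] != target_host)
    return coalesced / len(relevant)


# Hypothetical log records: HTTP host requested and SNI of the carrying connection.
logs = [
    {"host": CDNJS, "sni": "site.example"},   # reused the site's connection
    {"host": CDNJS, "sni": CDNJS},            # dedicated connection to cdnjs
    {"host": "site.example", "sni": "site.example"},
]
print(coalesced_fraction(logs))
```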

In addition, we conducted active measurements using our private WebPageTest instances at the landing pages of experiment and control sites, with two well-supported browsers: Google Chrome and Firefox. From our results, Chrome created about 78% fewer TLS connections to cdnjs for our experiment group sites, as compared to the control group. But surprisingly, Firefox created just roughly 22% fewer connections. As TLS handshakes are computationally expensive because they involve cryptographic signatures and key exchange algorithms, fewer handshakes meant fewer CPU cycles spent by both the client and the server.

Upon further analysis, we were able to make two observations from the data:

  • A fraction of sites that never coalesced connections with either browser appeared to load subresources with CORS enabled (i.e., <script src="https://cdnjs.cloudflare.com/..." integrity="sha512-894Y..." crossorigin="anonymous">). This is the default way cdnjs recommends inclusion of subresources, as CORS is needed for integrity checks that provide substantial mitigations against script-manipulation attacks. We do not recommend removing this attribute. Our testing also revealed that using XMLHttpRequest or Fetch APIs to load subresources disabled coalescing as well. It is unclear why browsers choose to not coalesce such connections, and we are in contact with the vendors to find out.
  • Although both Firefox and Chrome coalesced requests for cdnjs on existing connections, the discrepancy in the number of TLS connections to cdnjs (approximately 78% vs roughly 22%) arises because Firefox appears to open new connections even if it does not end up using them.

After evaluating the potential benefits of coalescing, we wanted to understand if coalescing caused any unintended side effects. Hence, the final measurement we conducted was to check whether our experiments were detrimental to a website’s performance. We tracked Page Load Times (PLT) and Largest Contentful Paint (LCP) across a variety of simulated network conditions using both Chrome and Firefox and found no statistically significant difference between the experiment and control groups.

Page load times for control and experiment group sites. Each site was loaded once, and the “fullyLoaded” metric from WebPageTest is reported
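The post does not say which statistical test was used to compare the two groups. As one possible approach, a permutation test on the difference in mean page load time could be sketched like this; the function name, the number of rounds, and the sample values are all assumptions.

```python
import random
from statistics import mean


def permutation_p_value(control, experiment, rounds=2000, seed=0) -> float:
    """Two-sided permutation test on the difference of group means."""
    rng = random.Random(seed)
    observed = abs(mean(experiment) - mean(control))
    pooled = list(control) + list(experiment)
    n = len(control)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = abs(mean(pooled[n:]) - mean(pooled[:n]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (rounds + 1)


# Hypothetical page load times in seconds for a handful of sites per group.
control_plt = [3.1, 2.9, 3.4, 3.0, 2.8, 3.3]
experiment_plt = [3.0, 3.2, 2.9, 3.1, 3.3, 2.7]
p = permutation_p_value(control_plt, experiment_plt)
print(p)
```

A large p-value here would mean the observed difference is the kind that random group assignment produces on its own, which is the sense in which a result is "not statistically significant".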


We consider our experimentation successful in determining the feasibility of connection coalescing and highlighting its potential benefits in terms of privacy and performance. More specifically, we observed the privacy benefits of coalescing in more than 50% of requests to cdnjs from real-world traffic. In addition, our active testing demonstrated that browsers create fewer TLS connections with coalescing enabled. Interestingly, our results also revealed that the benefits might not always occur (i.e., CORS-enabled requests, Firefox creating additional TLS connections despite coalescing). Finally, we did not find any evidence that coalescing can cause harm to real-world users’ experience on the Internet.

Some future directions we would like to explore include:

  • More aggressive connection reuse with multiple hostnames, while identifying conditions most suitable for coalescing.
  • Understanding how different connection reuse methods compare, e.g., IP-based coalescing vs. use of Origin Frames, and what effects they have on user experience over the Internet.
  • Evaluating coalescing support among different browser vendors, and encouraging adoption of HTTP/3 QUIC based coalescing.
  • Reaping the full benefits of connection coalescing by experimenting with custom priority schemes for requests within the same connection.

Please send questions and feedback to [email protected]. We’re excited to continue this line of work in our effort to help build a better Internet! For those interested in joining our team please visit our Careers Page.

Handshake Encryption: Endgame (an ECH update)

Post Syndicated from Christopher Wood original https://blog.cloudflare.com/handshake-encryption-endgame-an-ech-update/

Privacy and security are fundamental to Cloudflare, and we believe in and champion the use of cryptography to help provide these fundamentals for customers, end-users, and the Internet at large. In the past, we helped specify, implement, and ship TLS 1.3, the latest version of the transport security protocol underlying the web, to all of our users. TLS 1.3 vastly improved upon prior versions of the protocol with respect to security, privacy, and performance: simpler cryptographic algorithms, more handshake encryption, and fewer round trips are just a few of the many great features of this protocol.

TLS 1.3 was a tremendous improvement over TLS 1.2, but there is still room for improvement. Sensitive metadata relating to application or user intent is still visible in plaintext on the wire. In particular, all client parameters, including the name of the target server the client is connecting to, are visible in plaintext. For obvious reasons, this is problematic from a privacy perspective: Even if your application traffic to crypto.cloudflare.com is encrypted, the fact you’re visiting crypto.cloudflare.com can be quite revealing.

And so, in collaboration with other participants in the standardization community and members of industry, we embarked towards a solution for encrypting all sensitive TLS metadata in transit. The result: TLS Encrypted ClientHello (ECH), an extension to protect this sensitive metadata during connection establishment.

Last year, we described the current status of this standard and its relation to the TLS 1.3 standardization effort, as well as ECH’s predecessor, Encrypted SNI (ESNI). The protocol has come a long way since then, but when will we know when it’s ready? There are many ways by which one can measure a protocol. Is it implementable? Is it easy to enable? Does it seamlessly integrate with existing protocols or applications? In order to assess these questions and see if the Internet is ready for ECH, the community needs deployment experience. Hence, for the past year, we’ve been focused on making the protocol stable, interoperable, and, ultimately, deployable. And today, we’re pleased to announce that we’ve begun our initial deployment of TLS ECH.

What does ECH mean for connection security and privacy on the network? How does it relate to similar technologies and concepts such as domain fronting? In this post, we’ll dig into ECH details and describe what this protocol does to move the needle to help build a better Internet.

Connection privacy

For most Internet users, connections are made to perform some type of task, such as loading a web page, sending a message to a friend, purchasing some items online, or accessing bank account information. Each of these connections reveals some limited information about user behavior. For example, a connection to a messaging platform reveals that one might be trying to send or receive a message. Similarly, a connection to a bank or financial institution reveals when the user typically makes financial transactions. Individually, this metadata might seem harmless. But consider what happens when it accumulates: does the set of websites you visit on a regular basis uniquely identify you as a user? The safe answer is: yes.

This type of metadata is privacy-sensitive, and ultimately something that should only be known by two entities: the user who initiates the connection, and the service which accepts the connection. However, the reality today is that this metadata is known to more than those two entities.

Making this information private is no easy feat. The nature or intent of a connection, i.e., the name of the service such as crypto.cloudflare.com, is revealed in multiple places during the course of connection establishment: during DNS resolution, wherein clients map service names to IP addresses; and during connection establishment, wherein clients indicate the service name to the target server. (Note: there are other small leaks, though DNS and TLS are the primary problems on the Internet today.)

As is common in recent years, the solution to this problem is encryption. DNS-over-HTTPS (DoH) is a protocol for encrypting DNS queries and responses to hide this information from on-path observers. Encrypted Client Hello (ECH) is the complementary protocol for TLS.

The TLS handshake begins when the client sends a ClientHello message to the server over a TCP connection (or, in the context of QUIC, over UDP) with relevant parameters, including those that are sensitive. The server responds with a ServerHello, encrypted parameters, and all that’s needed to finish the handshake.

The goal of ECH is as simple as its name suggests: to encrypt the ClientHello so that privacy-sensitive parameters, such as the service name, are unintelligible to anyone listening on the network. The client encrypts this message using a public key it learns by making a DNS query for a special record known as the HTTPS resource record. This record advertises the server’s various TLS and HTTPS capabilities, including ECH support. The server decrypts the encrypted ClientHello using the corresponding secret key.

Conceptually, DoH and ECH are somewhat similar. With DoH, clients establish an encrypted connection (HTTPS) to a DNS recursive resolver and, within that connection, perform DNS transactions.

With ECH, clients establish an encrypted connection to a TLS-terminating server such as crypto.cloudflare.com, and within that connection, request resources for an authorized domain such as cloudflareresearch.com.

There is one very important difference between DoH and ECH that is worth highlighting. Whereas a DoH recursive resolver is specifically designed to allow queries for any domain, a TLS server is configured to allow connections for a select set of authorized domains. Typically, the set of authorized domains for a TLS server are those which appear on its certificate, as these constitute the set of names for which the server is authorized to terminate a connection.

Basically, this means the DNS resolver is open, whereas the ECH client-facing server is closed. And this closed set of authorized domains is informally referred to as the anonymity set. (This will become important later on in this post.) Moreover, the anonymity set is assumed to be public information. Anyone can query DNS to discover what domains map to the same client-facing server.

Why is this distinction important? It means that one cannot use ECH for the purposes of connecting to an authorized domain and then interacting with a different domain, a practice commonly referred to as domain fronting. When a client connects to a server using an authorized domain but then tries to interact with a different domain within that connection, e.g., by sending HTTP requests for an origin that does not match the domain of the connection, the request will fail.
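The closed-set behavior described above can be illustrated with a minimal sketch, assuming the server simply checks the requested host against the names it is authorized for. The two authorized names are taken from the examples in this post; the function name is hypothetical, and 421 (Misdirected Request) is the HTTP status code defined for requests a server is not willing to answer on a given connection.

```python
# Names the (hypothetical) client-facing server is authorized to serve.
AUTHORIZED = frozenset({"crypto.cloudflare.com", "cloudflareresearch.com"})


def route_request(http_host: str, authorized=AUTHORIZED) -> int:
    """Return an HTTP status: 200 for an authorized host, 421 otherwise."""
    host = http_host.lower().rstrip(".")
    return 200 if host in authorized else 421


print(route_request("cloudflareresearch.com"))
print(route_request("fronted.example"))
```

A real server does considerably more than a set lookup, but the sketch shows why a request for a name outside the authorized set fails regardless of which name was used to open the connection.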

From a high level, encrypting names in DNS and TLS may seem like a simple feat. However, as we’ll show, ECH demands a different look at security and an updated threat model.

A changing threat model and design confidence

The typical threat model for TLS is known as the Dolev-Yao model, in which an active network attacker can read, write, and delete packets from the network. This attacker’s goal is to derive the shared session key. There has been a tremendous amount of research analyzing the security of TLS to gain confidence that the protocol achieves this goal.

The threat model for ECH is somewhat stronger than considered in previous work. Not only should it be hard to derive the session key, it should also be hard for the attacker to determine the identity of the server from a known anonymity set. That is, ideally, it should have no more advantage in identifying the server than if it simply guessed from the set of servers in the anonymity set. And recall that the attacker is free to read, write, and modify any packet as part of the TLS connection. This means, for example, that an attacker can replay a ClientHello and observe the server’s response. It can also extract pieces of the ClientHello — including the ECH extension — and use them in its own modified ClientHello.

The design of ECH makes this sort of attack virtually impossible by ensuring the server certificate can only be decrypted by either the client or client-facing server.

Something else an attacker might try is to masquerade as the server and actively interfere with the client to observe its behavior. If the client reacted differently based on whether the server-provided certificate was correct, this would allow the attacker to test whether a given connection using ECH was for a particular name.

ECH also defends against this attack by ensuring that an attacker without access to the private ECH key material cannot actively inject anything into the connection.

The attacker can also be entirely passive and try to infer encrypted information from other visible metadata, such as packet sizes and timing. (Indeed, traffic analysis is an open problem for ECH and in general for TLS and related protocols.) Passive attackers simply sit and listen to TLS connections, and use what they see and, importantly, what they know to make determinations about the connection contents. For example, if a passive attacker knows that the name of the client-facing server is crypto.cloudflare.com, and it sees a ClientHello with ECH to crypto.cloudflare.com, it can conclude, with reasonable certainty, that the connection is to some domain in the anonymity set of crypto.cloudflare.com.

The number of potential attack vectors is astonishing, and something that the TLS working group has tripped over in prior iterations of the ECH design. Before any sort of real world deployment and experiment, we needed confidence in the design of this protocol. To that end, we are working closely with external researchers on a formal analysis of the ECH design which captures the following security goals:

  1. Use of ECH does not weaken the security properties of TLS without ECH.
  2. TLS connection establishment to a host in the client-facing server’s anonymity set is indistinguishable from a connection to any other host in that anonymity set.

We’ll write more about the model and analysis when they’re ready. Stay tuned!

There are plenty of other subtle security properties we desire for ECH, and some of these drill right into the most important question for a privacy-enhancing technology: Is this deployable?

Focusing on deployability

With confidence in the security and privacy properties of the protocol, we then turned our attention towards deployability. In the past, significant protocol changes to fundamental Internet protocols such as TCP or TLS have been complicated by some form of benign interference. Network software, like any software, is prone to bugs, and sometimes these bugs manifest in ways that we only detect when there’s a change elsewhere in the protocol. For example, TLS 1.3 unveiled middlebox ossification bugs that ultimately led to the middlebox compatibility mode for TLS 1.3.

While ECH is itself just an extension, the risk of it exposing (or introducing!) similar bugs is real. To combat this problem, ECH supports a variant of GREASE whose goal is to ensure that all ECH-capable clients produce syntactically equivalent ClientHello messages. In particular, if a client supports ECH but does not have the corresponding ECH configuration, it uses GREASE. Otherwise, it produces a ClientHello with real ECH support. In both cases, the syntax of the ClientHello messages is equivalent.

This hopefully avoids network bugs that would otherwise trigger upon real or fake ECH. Or, in other words, it helps ensure that all ECH-capable client connections are treated similarly in the presence of benign network bugs or otherwise passive attackers. Interestingly, active attackers can easily distinguish — with some probability — between real or fake ECH. Using GREASE, the ClientHello carries an ECH extension, though its contents are effectively randomized, whereas a real ClientHello using ECH has information that will match what is contained in DNS. This means an active attacker can simply compare the ClientHello against what’s in the DNS. Indeed, anyone can query DNS and use it to determine if a ClientHello is real or fake:

$ dig +short crypto.cloudflare.com TYPE65
\# 134 0001000001000302683200040008A29F874FA29F884F000500480046 FE0D0042D500200020E3541EC94A36DCBF823454BA591D815C240815 77FD00CAC9DC16C884DF80565F0004000100010013636C6F7564666C 6172652D65736E692E636F6D00000006002026064700000700000000 0000A29F874F260647000007000000000000A29F884F

Despite this obvious distinguisher, the end result isn’t that interesting. If a server is capable of ECH and a client is capable of ECH, then the connection most likely used ECH, and whether clients and servers are capable of ECH is assumed public information. Thus, GREASE is primarily intended to ease deployment against benign network bugs and otherwise passive attackers.
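To see what anyone querying DNS can learn from the dig output above, the hex RDATA can be decoded by hand. The sketch below splits the HTTPS record into its priority, target name, and service parameters, then pulls the public name out of the ECH configuration. The field layout assumed for the ECH value is the one that matches version 0xfe0d in this particular record; the function names are illustrative, and other ECH versions may be laid out differently.

```python
import ipaddress
import struct

# RDATA copied from the dig output above (generic encoding, 134 bytes).
RDATA_HEX = (
    "0001000001000302683200040008A29F874FA29F884F000500480046"
    "FE0D0042D500200020E3541EC94A36DCBF823454BA591D815C240815"
    "77FD00CAC9DC16C884DF80565F0004000100010013636C6F7564666C"
    "6172652D65736E692E636F6D00000006002026064700000700000000"
    "0000A29F874F260647000007000000000000A29F884F"
)


def parse_https_rdata(rdata: bytes) -> dict:
    """Split HTTPS (TYPE65) RDATA into priority, target name and parameters."""
    (priority,) = struct.unpack_from("!H", rdata, 0)
    pos, labels = 2, []
    while rdata[pos] != 0:  # target name: length-prefixed labels ending in 0
        length = rdata[pos]
        labels.append(rdata[pos + 1 : pos + 1 + length].decode("ascii"))
        pos += 1 + length
    pos += 1
    params = {}
    while pos < len(rdata):  # each parameter: 2-byte key, 2-byte length, value
        key, length = struct.unpack_from("!HH", rdata, pos)
        params[key] = rdata[pos + 4 : pos + 4 + length]
        pos += 4 + length
    return {"priority": priority, "target": ".".join(labels) or ".", "params": params}


def ech_public_name(ech_value: bytes) -> str:
    """Public name of the first ECH config, assuming the 0xfe0d layout."""
    (version,) = struct.unpack_from("!H", ech_value, 2)  # after 2-byte list length
    if version != 0xFE0D:
        raise ValueError("unexpected ECH config version")
    pos = 6 + 1 + 2  # list length, version, config length, config_id, kem_id
    (key_len,) = struct.unpack_from("!H", ech_value, pos)
    pos += 2 + key_len  # skip the public key
    (suites_len,) = struct.unpack_from("!H", ech_value, pos)
    pos += 2 + suites_len + 1  # skip cipher suites and maximum_name_length
    name_len = ech_value[pos]
    return ech_value[pos + 1 : pos + 1 + name_len].decode("ascii")


record = parse_https_rdata(bytes.fromhex(RDATA_HEX))
alpn = record["params"][1]  # key 1 = alpn
v4 = record["params"][4]    # key 4 = ipv4hint
ipv4_hints = [str(ipaddress.IPv4Address(v4[i : i + 4])) for i in range(0, len(v4), 4)]
public_name = ech_public_name(record["params"][5])  # key 5 = ech
print(record["priority"], record["target"], alpn, ipv4_hints, public_name)
```

An observer holding this decoded configuration can compare it with the ECH extension seen in a ClientHello, which is the distinguisher discussed above.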

Note, importantly, that GREASE (or fake) ECH ClientHello messages are semantically different from real ECH ClientHello messages. This presents a real problem for networks such as enterprise settings or school environments that otherwise use plaintext TLS information for the purposes of implementing various features like filtering or parental controls. (Encrypted DNS protocols like DoH also encountered similar obstacles in their deployment.) Fundamentally, this problem reduces to the following: How can networks securely disable features like DoH and ECH? Fortunately, there are a number of approaches that might work, with the more promising one centered around DNS discovery. In particular, if clients could securely discover encrypted recursive resolvers that can perform filtering in lieu of it being done at the TLS layer, then TLS-layer filtering might be wholly unnecessary. (Other approaches, such as the use of canary domains to give networks an opportunity to signal that certain features are not permitted, may work, though it’s not clear if these could or would be abused to disable ECH.)

We are eager to collaborate with browser vendors, network operators, and other stakeholders to find a feasible deployment model that works well for users without ultimately stifling connection privacy for everyone else.

Next steps

ECH is rolling out for some FREE zones on our network in select geographic regions. We will continue to expand the set of zones and regions that support ECH slowly, monitoring for failures in the process. Ultimately, the goal is to work with the rest of the TLS working group and IETF towards updating the specification based on this experiment in hopes of making it safe, secure, usable, and, ultimately, deployable for the Internet.

ECH is one part of the connection privacy story. Like a leaky boat, it’s important to look for and plug all the gaps before taking on lots of passengers! Cloudflare Research is committed to these narrow technical problems and their long-term solutions. Stay tuned for more updates on this and related protocols.

The European Parliament Voted to Ban Remote Biometric Surveillance

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/10/the-european-parliament-voted-to-ban-remote-biometric-surveillance.html

It’s not actually banned in the EU yet — the legislative process is much more complicated than that — but it’s a step: a total ban on biometric mass surveillance.

To respect “privacy and human dignity,” MEPs said that EU lawmakers should pass a permanent ban on the automated recognition of individuals in public spaces, saying citizens should only be monitored when suspected of a crime.

The parliament has also called for a ban on the use of private facial recognition databases — such as the controversial AI system created by U.S. startup Clearview (also already in use by some police forces in Europe) — and said predictive policing based on behavioural data should also be outlawed.

MEPs also want to ban social scoring systems which seek to rate the trustworthiness of citizens based on their behaviour or personality.

Web3 — A vision for a decentralized web

Post Syndicated from Thibault Meunier original https://blog.cloudflare.com/what-is-web3/

By reading this, you are a participant in the web. It’s amazing that we can write this blog and have it appear to you without operating a server or writing a line of code. In general, the web of today empowers us to participate more than we could at any point in the past.

Last year, we mentioned the next phase of the Internet would be always on, always secure, always private. Today, we dig into a similar trend for the web, referred to as Web3. In this blog we’ll start to explain Web3 in the context of the web’s evolution, and how Cloudflare might help to support it.

Going from Web 1.0 to Web 2.0

When Sir Tim Berners-Lee wrote his seminal 1989 document “Information Management: A Proposal”, he outlined a vision of the “web” as a network of information systems interconnected via hypertext links. The web is often conflated with the Internet, which is the computer network it operates on. Key practical requirements for this web included being able to access the network in a decentralized manner through remote machines and allowing systems to be linked together without requiring any central control or coordination.

The original proposal for what we know as the web, fitting in one diagram – Source: w3

This vision materialized into an initial version of the web that was composed of interconnected static resources delivered via a distributed network of servers and accessed primarily on a read-only basis from the client side — “Web 1.0”. Usage of the web soared with the number of websites growing well over 1,000% in the ~2 years following the introduction of the Mosaic graphical browser in 1993, based on data from the World Wide Web Wanderer.

The early 2000s marked an inflection point in the growth of the web and a key period of its development, as technology companies that survived the dot-com crash evolved to deliver value to customers in new ways amidst heightened skepticism around the web:

  • Desktop browsers like Netscape became commoditized and paved the way for native web services for discovering content like search engines.
  • Network effects that were initially driven by hyperlinks in web directories like Yahoo! were hyperscaled by platforms that enabled user engagement and harnessed collective intelligence like review sites.
  • The massive volume of data generated by Internet activity and the growing realization of its competitive value forced companies to become experts at database management.

O’Reilly Media coined the concept of Web 2.0 in an attempt to capture such shifts in design principles, which were transformative to the usability and interactiveness of the web and continue to be core building blocks for Internet companies nearly two decades later.

However, in the midst of the web 2.0 transformation, the web fell out of touch with one of its initial core tenets — decentralization.

Decentralization: No permission is needed from a central authority to post anything on the web, there is no central controlling node, and so no single point of failure … and no “kill switch”!
— History of the web by Web Foundation

A new paradigm for the Internet

This is where Web3 comes in. The last two decades have proven that building a scalable system that decentralizes content is a challenge. While the technology to build such systems exists, no content platform achieves decentralization at scale.

There is one notable exception: Bitcoin. Bitcoin was conceptualized in a 2008 whitepaper by Satoshi Nakamoto as a type of distributed ledger known as a blockchain designed so that a peer-to-peer (P2P) network could transact in a public, consistent, and tamper-proof manner.

That’s a lot said in one sentence. Let’s break it down by term:

  • A peer-to-peer network is a network architecture. It consists of a set of computers, called nodes, that store and relay information. Each node is equally privileged, preventing one node from becoming a single point of failure. In the Bitcoin case, nodes can send, receive, and process Bitcoin transactions.
  • A ledger is a collection of accounts in which transactions are recorded. For Bitcoin, the ledger records Bitcoin transactions.
  • A distributed ledger is a ledger that is shared and synchronized among multiple computers. This happens through a consensus, so each computer holds a similar replica of the ledger. With Bitcoin, the consensus process is performed over a P2P network, the Bitcoin network.
  • A blockchain is a type of distributed ledger that stores data in “blocks” that are cryptographically linked together into an immutable chain that preserves their chronological order. Bitcoin leverages blockchain technology to establish a shared, single source of truth of transactions and the sequence in which they occurred, thereby mitigating the double-spending problem.
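The "cryptographically linked" part of the definitions above can be sketched in a few lines. This is a minimal illustration, not Bitcoin's actual block format: the field names, the JSON encoding, and the all-zero genesis hash are assumptions made for the example, and consensus, proof of work, and networking are left out entirely.

```python
import hashlib
import json


def block_hash(index, prev_hash, transactions):
    """Hash a block's contents, including the previous block's hash."""
    payload = json.dumps(
        {"index": index, "prev_hash": prev_hash, "transactions": transactions},
        sort_keys=True,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain, transactions):
    """Append a block that is linked to the current tip of the chain."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    chain.append({
        "index": index,
        "prev_hash": prev_hash,
        "transactions": transactions,
        "hash": block_hash(index, prev_hash, transactions),
    })
    return chain


def is_valid(chain):
    """Recompute every hash and check each link to the previous block."""
    prev_hash = "0" * 64
    for index, block in enumerate(chain):
        if block["index"] != index or block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(index, prev_hash, block["transactions"]):
            return False
        prev_hash = block["hash"]
    return True


chain = []
append_block(chain, [{"from": "alice", "to": "bob", "amount": 5}])
append_block(chain, [{"from": "bob", "to": "carol", "amount": 2}])
print(is_valid(chain))  # True

chain[0]["transactions"][0]["amount"] = 500  # tamper with history
print(is_valid(chain))  # False
```

Because each block's hash covers the previous block's hash, changing an old transaction invalidates that block and every block after it. In a real network, the consensus process is what stops an attacker from simply recomputing all the later hashes.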

Bitcoin — which currently has over 40,000 nodes in its network and processes over $30B in transactions each day — demonstrates that an application can be run in a distributed manner at scale, without compromising security. It inspired the development of other blockchain projects such as Ethereum which, in addition to transactions, allows participants to deploy code that can verifiably run on each of its nodes.

Today, these programmable blockchains are seen as ideal open and trustless platforms to serve as the infrastructure of a distributed Internet. They are home to a rich and growing ecosystem of nearly 7,000 decentralized applications (“Dapps”) that do not rely on any single entity to be available. This provides them with greater flexibility on how to best serve their users in all jurisdictions.

The web is for the end user

Distributed systems are inherently different from centralized systems. They should not be thought about in the same way. Distributed systems allow data and its processing to be spread across several parties rather than held by a single one. This is useful for companies to provide resilience, but it’s also useful for P2P-based networks where data can stay in the hands of the participants.

For instance, if you were to host a blog the old-fashioned way, you would put up a server, expose it to the Internet (via Cloudflare 😀), et voilà. Nowadays, your blog would be hosted on a platform like WordPress, Ghost, Notion, or even Twitter. If one of these companies were to have an outage, it would affect many more people than an outage of your single server. In a distributed model, via IPFS for instance, your blog content can be hosted and served from multiple locations operated by different entities.

[Figures: Web 1.0; Web 2.0]

Each participant in the network can choose what they host/provide and can be home to different content. Similar to your home network, you are in control of what you share, and you don’t share everything.

This is a core tenet of decentralized identity. The same cryptographic principles underpinning cryptocurrencies like Bitcoin and Ethereum are being leveraged by applications to provide secure, cross-platform identity services. This is fundamentally different from other authentication systems such as OAuth 2.0, where a trusted party has to be reached to assess one’s identity. This materializes in the form of “Login with <Big Cloud provider>” buttons. These cloud providers are the only ones with enough data, resources, and technical expertise.

In a decentralized web, each participant holds a secret key. They can then use it to identify each other. You can learn about this cryptographic system in a previous blog post. In a Web3 setting where web participants own their data, they can selectively share this data with applications they interact with. Participants can also leverage this system to prove interactions they had with one another. For example, if a college issues you a Decentralized Identifier (DID), you can later prove you have been registered at this college without reaching out to the college again. Decentralized Identities can also serve as a placeholder for a public profile, where participants agree to use a blockchain as a source of trust. This is what projects such as ENS or Unlock aim to provide: a way to verify your identity online based on your control over a public key.
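To make "prove control of a public key" concrete, here is a minimal sketch using a Lamport one-time signature. It is a substitute chosen only because it needs nothing beyond a hash function from the standard library. Bitcoin and Ethereum use elliptic-curve signatures instead, and a Lamport key pair must never sign more than one message. All function names are illustrative.

```python
import hashlib
import secrets


def sha256(data):
    return hashlib.sha256(data).digest()


def generate_keypair():
    """Secret key: 256 pairs of random values. Public key: their hashes."""
    secret_key = [
        (secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)
    ]
    public_key = [(sha256(a), sha256(b)) for a, b in secret_key]
    return secret_key, public_key


def message_bits(message):
    """The 256 bits of the message's SHA-256 digest, most significant first."""
    digest = sha256(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def sign(secret_key, message):
    """Reveal one secret value per bit of the message digest."""
    return [secret_key[i][bit] for i, bit in enumerate(message_bits(message))]


def verify(public_key, message, signature):
    """Check each revealed value hashes to the published value for that bit."""
    if len(signature) != 256:
        return False
    return all(
        sha256(signature[i]) == public_key[i][bit]
        for i, bit in enumerate(message_bits(message))
    )


# The holder publishes public_key (for example, on a blockchain) and keeps
# secret_key private. A verifier sends a fresh challenge to be signed.
secret_key, public_key = generate_keypair()
challenge = secrets.token_bytes(16)
signature = sign(secret_key, challenge)
print(verify(public_key, challenge, signature))  # True
```

The verifier never sees the secret key and does not need to contact a third party. It only needs the public key from a source it trusts, which is the role the shared ledger plays in the projects mentioned above.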

This trend of proving ownership via a shared source of trust is key to the NFT craze. We have discussed NFTs before on this blog. Blockchain-based NFTs are a medium of conveying ownership. Blockchain enables this information to be publicly verified and updated. If the blockchain states a public key I control is the owner of an NFT, I can refer to it on other platforms to prove ownership of it. For instance, if my profile picture on social media is a cat, I can prove the said cat is associated with my public key. What this means depends on what I want to prove, especially with the proliferation of NFT contracts. If you want to understand how an NFT contract works, you can build your own.

How does Cloudflare fit in Web3?

Decentralization and privacy are challenges we are tackling at Cloudflare as part of our mission to help build a better Internet.

In a previous post, Nick Sullivan described Cloudflare’s contributions to enabling privacy on the web. We launched initiatives to fix information leaks in HTTPS through Encrypted Client Hello (ECH), make DNS even more private by supporting Oblivious DNS-over-HTTPS (ODoH), and develop OPAQUE which makes password breaches less likely to occur. We have also released our data localization suite to help businesses navigate the ever evolving regulatory landscape by giving them control over where their data is stored without compromising performance and security. We’ve even built a privacy-preserving attestation that is based on the same zero-knowledge proof techniques that are core to distributed systems such as ZCash and Filecoin.

It’s exciting to think that there are already ways we can change the web to improve the experience for its users. However, there are limits to what can be built on top of the existing infrastructure. This is why projects such as Ethereum and IPFS build on their own architecture. They still rely on the Internet but do not operate with the web as we know it. To ease the transition, Cloudflare operates distributed web gateways. These gateways provide an HTTP interface to Web3 protocols: Ethereum and IPFS. Since HTTP is core to the web we know today, distributed content can be accessed securely and easily without requiring the user to operate experimental software.

Where do we go next?

The journey to a different web is long but exciting. The infrastructure built over the last two decades is truly stunning. The Internet and the web are now part of 4.6 billion people’s lives. At the same time, the top 35 websites had more visits than all others (circa 2014). Users have less control over their data and are even more reliant on a few players.

The early Web was static. Then Web 2.0 brought the interactivity and services we use daily, at the cost of centralization. Web3 is a trend that tries to challenge this. With distributed networks built on open protocols, users of the web are empowered to participate.

At Cloudflare, we are embracing this distributed future. Applying the knowledge and experience we have gained from running one of the largest edge networks, we are making it easier for users and businesses to benefit from Web3. This includes operating a distributed web product suite, contributing to open standards, and moving privacy forward.

If you would like to help build a better web with us, we are hiring.

Disaster recovery compliance in the cloud, part 2: A structured approach

Post Syndicated from Dan MacKay original https://aws.amazon.com/blogs/security/disaster-recovery-compliance-in-the-cloud-part-2-a-structured-approach/

Compliance in the cloud is fraught with myths and misconceptions. This is particularly true when it comes to something as broad as disaster recovery (DR) compliance where the requirements are rarely prescriptive and often based on legacy risk-mitigation techniques that don’t account for the exceptional resilience of modern cloud-based architectures. For regulated entities subject to principles-based supervision such as many financial institutions (FIs), the responsibility lies with the FI to determine what’s necessary to adequately recover from a disaster event. Without clear instructions, FIs are susceptible to making incorrect assumptions regarding their compliance requirements for DR.

In Part 1 of this two-part series, I provided some examples of common misconceptions FIs have about compliance requirements for disaster recovery in the cloud. In Part 2, I outline five steps you can take to avoid these misconceptions when architecting DR-compliant workloads for deployment on Amazon Web Services (AWS).

1. Identify workloads planned for deployment

It’s common for FIs to have a portfolio of workloads they are considering deploying to the cloud, and they often want to know that they can be compliant across the board. But compliance isn’t a one-size-fits-all domain; it’s based on the characteristics of each workload. For example, does the workload contain personally identifiable information (PII)? Will it be used to store, process, or transmit credit card information? Compliance is dependent on the answers to questions such as these and must be assessed on a case-by-case basis. Therefore, the first step in architecting for compliance is to identify the specific workloads you plan to deploy to the cloud. This way, you can assess the requirements of these specific workloads and not be distracted by aspects of compliance that might not be relevant.

2. Define the workload’s resiliency requirements

Resiliency is the ability of a workload to recover from infrastructure or service disruptions. DR is an important part of your resiliency strategy and concerns how your workload responds to a disaster event. DR strategies on AWS range from simple, low cost options such as backup and restore, to more complex options such as multi-site active-active, as shown in Figure 1.

For more information, I encourage you to read Seth Eliot’s blog series on DR Architecture on AWS as well as the AWS whitepaper Disaster Recovery of Workloads on AWS: Recovery in the Cloud.

The DR strategy you choose for a particular workload is dependent on your organization’s requirements for avoiding loss of data, known as the recovery point objective (RPO), and reducing downtime where the workload isn’t available, known as the recovery time objective (RTO). RPO and RTO are key factors for determining the minimum architectural specifications necessary to meet the workload’s resiliency requirements. For example, can the workload’s RPO and RTO be achieved using a multi-AZ architecture in a single AWS Region, or do the resiliency requirements necessitate deploying the workload across multiple AWS Regions? Even if your workload is not subject to explicit compliance requirements for resiliency, understanding these requirements is necessary for assessing other aspects of DR compliance, including data residency and geodiversity.
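As a rough illustration of how RPO and RTO act as a filter on candidate strategies, the sketch below compares worst-case estimates for a strategy against a workload's objectives. The class names, the function, and all of the numbers are hypothetical. Real estimates would come from your own testing and design review.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class Objectives:
    rpo: timedelta  # maximum tolerable data loss, measured in time
    rto: timedelta  # maximum tolerable downtime


@dataclass
class StrategyEstimate:
    name: str
    worst_case_data_loss: timedelta      # e.g. the interval between backups
    worst_case_recovery_time: timedelta  # e.g. time to restore and redeploy


def meets_objectives(estimate, objectives):
    """A strategy qualifies only if it satisfies both RPO and RTO."""
    return (
        estimate.worst_case_data_loss <= objectives.rpo
        and estimate.worst_case_recovery_time <= objectives.rto
    )


# Hypothetical workload objectives and hypothetical strategy estimates.
objectives = Objectives(rpo=timedelta(minutes=15), rto=timedelta(hours=4))
candidates = [
    StrategyEstimate("backup and restore", timedelta(hours=24), timedelta(hours=8)),
    StrategyEstimate("multi-site active-active", timedelta(0), timedelta(minutes=5)),
]

for candidate in candidates:
    print(candidate.name, meets_objectives(candidate, objectives))
```

With these made-up figures, a nightly backup fails the 15-minute RPO regardless of how quickly it can be restored, which is why both objectives have to be checked together.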

3. Confirm the workload’s data residency requirements

As I mentioned in Part 1, data residency requirements might restrict which AWS Region or Regions you can deploy your workload to. Therefore, you need to confirm whether the workload is subject to any data residency requirements within applicable laws and regulations, corporate policies, or contractual obligations.

In order to properly assess these requirements, you must review the explicit language of the requirements so as to understand the specific constraints they impose. You should also consult legal, privacy, and compliance subject-matter specialists to help you interpret these requirements based on the characteristics of the workload. For example, do the requirements specifically state that the data cannot leave the country, or can the requirement be met so long as the data can be accessed from that country? Does the requirement restrict you from storing a copy of the data in another country—for example, for backup and recovery purposes? What if the data is encrypted and can only be read using decryption keys kept within the home country? Consulting subject-matter specialists to help interpret these requirements can help you avoid making overly restrictive assumptions and imposing unnecessary constraints on the workload’s architecture.

4. Confirm the workload’s geodiversity requirements

A single Region, multiple-AZ architecture is often sufficient to meet a workload’s resiliency requirements. However, if the workload is subject to geodiversity requirements, the distance between the AZs in an AWS Region might not conform to the minimum distance between individual data centers specified by the requirements. Therefore, it’s critical to confirm whether any geodiversity requirements apply to the workload.

As with data residency, it’s important to assess the explicit language of geodiversity requirements. Are they written down in a regulation or corporate policy, or are they just a recommended practice? Can the requirements be met if the workload is deployed across three or more AZs even if the minimum distance between those AZs is less than the specified minimum distance between the primary and backup data centers? If it’s a corporate policy, does it allow for exceptions if an alternative method provides equal or greater resiliency than asynchronous replication between two geographically distant data centers? Or perhaps the corporate policy is outdated and should be revised to reflect modern risk mitigation techniques. Understanding these parameters can help you avoid unnecessary constraints as you assess architectural options for your workloads.

5. Assess architectural options to meet the workload’s requirements

Now that you understand the workload’s requirements for resiliency, data residency, and geodiversity, you can assess the architectural options that meet these requirements in the cloud.

As per AWS Well-Architected best practices, you should strive for the simplest architecture necessary to meet your requirements. This includes assessing whether the workload can be accommodated within a single AWS Region. If the workload is constrained by explicit geographic diversity requirements or has resiliency requirements that cannot be accommodated by a single AWS Region, then you might need to architect the workload for deployment across multiple AWS Regions. If the workload is also constrained by explicit data residency requirements, then it might not be possible to deploy to multiple AWS Regions. In cases such as these, you can work with our AWS Solution Architects to assess hybrid options that might meet your compliance requirements, such as using AWS Outposts, Amazon Elastic Container Service (Amazon ECS) Anywhere, or Amazon Elastic Kubernetes Service (Amazon EKS) Anywhere. Another option may be to consider a DR solution in which your on-premises infrastructure is used as a backup for a workload running on AWS. In some cases, this might be a long-term solution. In others, it might be an interim solution until certain constraints can be removed—for example, a change to corporate policy or the introduction of additional AWS Regions in a particular country.
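The reasoning in the paragraph above can be summarized as a small decision function. This is only a sketch of the order in which the questions are asked. The function name, parameters, and return labels are invented for illustration, and a real assessment involves the specialists and nuances described in the earlier steps.

```python
def recommend_architecture(
    single_region_meets_resiliency: bool,
    single_region_meets_geodiversity: bool,
    residency_allows_multiple_regions: bool,
) -> str:
    """Return the simplest option that satisfies the stated constraints."""
    # Start with the simplest architecture: one AWS Region.
    if single_region_meets_resiliency and single_region_meets_geodiversity:
        return "single-region"
    # A second Region is only an option if data residency permits it.
    if residency_allows_multiple_regions:
        return "multi-region"
    # Otherwise, look at hybrid options or an on-premises backup.
    return "hybrid-or-on-premises-backup"


print(recommend_architecture(True, True, True))    # single-region
print(recommend_architecture(False, True, True))   # multi-region
print(recommend_architecture(True, False, False))  # hybrid-or-on-premises-backup
```

Checking the single-Region case first reflects the guidance to strive for the simplest architecture that meets the requirements.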


Let’s recap by summarizing some guiding principles for architecting compliant DR workloads as outlined in this two-part series:

  • Avoid assumptions; confirm the facts. If it’s not written down, it’s unlikely to be considered a mandatory compliance requirement.
  • Consult the experts. Engage legal, privacy, and compliance teams, as well as AWS Solutions Architects, AWS security and compliance specialists, and other subject-matter specialists.
  • Avoid generalities; focus on the specifics. There is no one-size-fits-all approach.
  • Strive for simplicity, not zero risk. Don’t use multiple AWS Regions when one will suffice.
  • Don’t get distracted by exceptions. Focus on your current requirements, not workloads you’re not yet prepared to deploy to the cloud.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.


Dan MacKay

Dan is the Financial Services Compliance Specialist for AWS Canada. As a member of the Worldwide Financial Services Security & Compliance team, Dan advises financial services customers on best practices and practical solutions for cloud-related governance, risk, and compliance. He specializes in helping AWS customers navigate financial services and privacy regulations applicable to the use of cloud technology in Canada.

Disaster recovery compliance in the cloud, part 1: Common misconceptions

Post Syndicated from Dan MacKay original https://aws.amazon.com/blogs/security/disaster-recovery-compliance-in-the-cloud-part-1-common-misconceptions/

Compliance in the cloud can seem challenging, especially for organizations in heavily regulated sectors such as financial services. Regulated financial institutions (FIs) must comply with laws and regulations (often in multiple jurisdictions), global security standards, their own corporate policies, and even contractual obligations with their customers and counterparties. These various compliance requirements may impose constraints on how their workloads can be architected for the cloud, and may require interpretation on what FIs must do in order to be compliant. It’s common for FIs to make assumptions regarding their compliance requirements, which can result in unnecessary costs and increased complexity, and might not align with their strategic objectives. A modern, rationalized approach to compliance can help FIs avoid imposing unnecessary constraints while meeting their mandatory requirements.

In my role as an Amazon Web Services (AWS) Compliance Specialist, I work with our financial services customers to identify, assess, and determine solutions to address their compliance requirements as they move to the cloud. One of the most common challenges customers ask me about is how to comply with disaster recovery (DR) requirements for workloads they plan to run in the cloud. In this blog post, I share some of the typical misconceptions FIs have about DR compliance in the cloud. In Part 2, I outline a structured approach to designing compliant architectures for your DR workloads. As my primary market is Canada, the examples in this blog post largely pertain to FIs operating in Canada, but the principles and best practices are relevant to regulated organizations in any country.

“Why isn’t there a checklist for compliance in the cloud?”

Compliance requirements are sometimes prescriptive: “if X, then you must do Y.” When requirements are prescriptive, it’s usually clear what you must do in order to be compliant. For example, the Payment Card Industry Data Security Standard (PCI DSS) requirement 8.2.4 obliges companies that process, store, or transmit credit card information to “change user passwords/passphrases at least once every 90 days.” But in the financial services sector, compliance requirements for managing operational risks can be subjective. When regulators take what is known as a principles-based approach to setting regulatory expectations, each FI is required to assess their specific risks and determine the mitigating controls necessary to conform with the organization’s tolerance for operational risk. Because the rules aren’t prescriptive, there is no “checklist for achieving compliance.” Instead, principles-based requirements are guidelines that FIs are expected to consider as they design and implement technology solutions. They are, by definition, subject to interpretation and can be prone to myths and misconceptions among FIs and their service providers. To illustrate this, let’s look at two aspects of DR that are frequently misunderstood within the Canadian financial services industry: data residency and geodiversity.

“My data has to stay in country X”

Data residency or data localization is a requirement for specific data-sets processed and stored in an IT system to remain within a specific jurisdiction (for example, a country). As discussed in our Policy Perspectives whitepaper, contrary to historical perspectives, data residency doesn’t provide better security. Most cyber-attacks are perpetrated remotely and attackers aren’t deterred by the physical location of their victims. In fact, data residency can run counter to an organization’s objectives for security and resilience. For example, data residency requirements can limit the options our customers have when choosing the AWS Region or Regions in which to run their production workloads. This is especially challenging for customers who want to use multiple Regions for backup and recovery purposes.

It’s common for FIs operating in Canada to assume that they’re required to keep their data—particularly customer data—in Canada. In reality, there’s very little from a statutory perspective that imposes such a constraint. None of the private sector privacy laws include data residency requirements, nor do any of the financial services regulatory guidelines. There are some place of records requirements in Canadian federal financial services legislation such as The Bank Act and The Insurance Companies Act, but these are relatively narrow in scope and apply primarily to corporate records. For most Canadian FIs, their requirements are more often a result of their own corporate policies or contractual obligations, not externally imposed by public policies or regulations.

“My data centers have to be X kilometers apart”

Geodiversity (short for geographic diversity) is the concept of maintaining a minimum distance between primary and backup data processing sites. Geodiversity is based on the principle that requiring a certain distance between data centers mitigates the risk of location-based disruptions such as natural disasters. The principle is still relevant in a cloud computing context, but is not the only consideration when it comes to planning for DR. The cloud allows FIs to define operational resilience requirements instead of limiting themselves to antiquated business continuity planning and DR concepts like physical data center implementation requirements. Lifting and shifting legacy DR solutions and architectures into the cloud can diminish the potential benefits of using the cloud to improve operational resilience. Modernizing your information technology also means modernizing your organization’s approach to DR.

In the cloud, vast physical distance separation is an anti-pattern: it’s an arbitrary metric that does little to help organizations achieve availability and recovery objectives. At AWS, we design our global infrastructure so that the Availability Zones (AZs) within an AWS Region are far enough apart to support high availability, but close enough to facilitate synchronous replication across those AZs (an AZ being a cluster of data centers). Figure 1 shows the relationship between Regions, AZs, and data centers.

Synchronous replication across multiple AZs enables you to minimize data loss (defined as the recovery point objective or RPO) and reduce the amount of time that workloads are unavailable (defined as the recovery time objective or RTO). However, the low latency required for synchronous replication becomes less achievable as the distance between data centers increases. Therefore, a geodiversity requirement that mandates a minimum distance between data centers that’s too far for synchronous replication might prohibit you from taking advantage of AWS’s multiple-AZ architecture. A multiple-AZ architecture can achieve RTOs and RPOs that aren’t possible with a simple geodiversity mitigation strategy. For more information, refer to the AWS whitepaper Disaster Recovery of Workloads on AWS: Recovery in the Cloud.
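A back-of-the-envelope calculation shows why distance and synchronous replication pull against each other. Light travels through optical fiber at roughly 200,000 km/s, and a synchronous write cannot be acknowledged before at least one round trip to the replica. The sketch below computes that physical lower bound. The distances are arbitrary examples rather than AWS figures, and real latency is higher once routing and equipment are included.

```python
# Approximate speed of light in optical fiber: ~200,000 km/s = 200 km per ms.
FIBER_KM_PER_MS = 200.0


def min_round_trip_ms(distance_km):
    """Lower bound on round-trip time over fiber for a given distance."""
    return 2 * distance_km / FIBER_KM_PER_MS


for distance_km in (50, 100, 1000):
    rtt = min_round_trip_ms(distance_km)
    print(f"{distance_km:>5} km -> at least {rtt:.1f} ms added to every synchronous write")
```

At 100 km the floor is about 1 ms per write. At 1,000 km it is about 10 ms, which adds up quickly for write-heavy workloads and is one reason distant sites usually rely on asynchronous replication.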

Again, it’s a common perception among Canadian FIs that the disaster recovery architecture for their production workloads must comply with specific geodiversity requirements. However, there are no statutory requirements applicable to FIs operating in Canada that mandate a minimum distance between data centers. Some FIs might have corporate policies or contractual obligations that impose geodiversity requirements, but for most FIs I’ve worked with, geodiversity is usually a recommended practice rather than a formal policy. Informal corporate guidelines can have some value, but they aren’t absolute rules and shouldn’t be treated the same as mandatory compliance requirements. Otherwise, you might be unintentionally restricting yourself from taking advantage of more effective risk management techniques.

“But if it is a compliance requirement, doesn’t that mean I have no choice?”

Both of the previous examples illustrate the importance of not only confirming your compliance requirements, but also recognizing the source of those requirements. It might be infeasible to obtain an exception to an externally-imposed obligation such as a regulatory requirement, but exceptions or even revisions to corporate policies aren’t out of the question if you can demonstrate that modern approaches provide equal or greater protection against a particular risk—for example, the high availability and rapid recoverability supported by a multiple-AZ architecture. Consider whether your compliance requirements provide for some level of flexibility in their application.

Also, because many of these requirements are principles-based, they might be subject to interpretation. You have to consider the specific language of the requirement in the context of the workload. For example, a data residency requirement might not explicitly prohibit you from storing a copy of the content in another country for backup and recovery purposes. For this reason, I recommend that you consult applicable specialists from your legal, privacy, and compliance teams to aid in the interpretation of compliance requirements. Once you understand the legal boundaries of your compliance requirements, AWS Solutions Architects and other financial services industry specialists such as myself can help you assess viable options to meet your needs.


In this first part of a two-part series, I provided some examples of common misconceptions FIs have about compliance requirements for disaster recovery in the cloud. The key is to avoid making assumptions that might impose greater constraints on your architecture than are necessary. In Part 2, I show you a structured approach for architecting compliant DR workloads that can help you to avoid these preventable missteps.

ProtonMail Now Keeps IP Logs

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/09/protonmail-now-keeps-ip-logs.html

After being compelled by a Swiss court to monitor IP logs for a particular user, ProtonMail no longer claims that “we do not keep any IP logs.”

EDITED TO ADD (9/14): This seems to be more complicated. ProtonMail is not yet saying that they keep logs. Their privacy policy still states that they do not keep logs except in certain circumstances, and outlines those circumstances. And ProtonMail’s warrant canary has an interesting list of data orders they have received from various authorities, whether they complied, and why or why not.

How to securely create and store your CRL for ACM Private CA

Post Syndicated from Tracy Pierce original https://aws.amazon.com/blogs/security/how-to-securely-create-and-store-your-crl-for-acm-private-ca/

In this blog post, I show you how to protect your Amazon Simple Storage Service (Amazon S3) bucket while still allowing access to your AWS Certificate Manager (ACM) Private Certificate Authority (CA) certificate revocation list (CRL).

A CRL is a list of certificates that have been revoked by the CA. Certificates can be revoked because they might have inadvertently been shared, or to discontinue their use, such as when someone leaves the company or an IoT device is decommissioned. In this solution, you use a combination of separate AWS accounts, Amazon S3 Block Public Access (BPA) settings, and a new parameter created by ACM Private CA called S3ObjectAcl to mark the CRL as private. This new parameter allows you to set the privacy of your CRL as PUBLIC_READ or BUCKET_OWNER_FULL_CONTROL. If you choose PUBLIC_READ, the CRL will be accessible over the internet. If you choose BUCKET_OWNER_FULL_CONTROL, then only the CRL S3 bucket owner can access it, and you will need to use Amazon CloudFront to serve the CRL stored in Amazon S3 using origin access identity (OAI). This is because most TLS implementations expect a public endpoint for access.

A best practice for Amazon S3 is to apply the principle of least privilege. To support least privilege, make sure the BPA settings for Amazon S3 are enabled. These settings block public access to your S3 objects that would otherwise be granted through ACLs, bucket policies, or access point policies. I’m going to walk you through setting up your CRL as a private object in an isolated secondary account with BPA settings enabled, and a CloudFront distribution with OAI settings enabled. This ensures that the CRL can be reached only through the CloudFront distribution and not directly from your S3 bucket. It also lets you maintain your private CA in your primary account, accessible only by your public key infrastructure (PKI) security team.

As part of the private infrastructure setup, you will create a CloudFront distribution to provide access to your CRL. While not required, it allows access to private CRLs, and is helpful in the event you want to move the CRL to a different location later. However, this does come with an extra cost, so that’s something to consider when choosing to make your CRL private instead of public.


For this walkthrough, you should have the following resources ready to use:

CRL solution overview

The solution consists of creating an S3 bucket in an isolated secondary account, enabling all BPA settings, creating a CloudFront OAI, and a CloudFront distribution.

Figure 1: Solution flow diagram


As shown in Figure 1, the steps in the solution are as follows:

  1. Set up the S3 bucket in the secondary account with BPA settings enabled.
  2. Create the CloudFront distribution and point it to the S3 bucket.
  3. Create your private CA in AWS Certificate Manager (ACM).

In this post, I walk you through each of these steps.

Deploying the CRL solution

In this section, you walk through each item in the solution overview above. This will allow access to your CRL stored in an isolated secondary account, away from your private CA.

To create your S3 bucket

  1. Sign in to the AWS Management Console of your secondary account. For Services, select S3.
  2. In the S3 console, choose Create bucket.
  3. Give the bucket a unique name. For this walkthrough, I named my bucket example-test-crl-bucket-us-east-1, as shown in Figure 2. Because S3 bucket names are unique across all of AWS and not just within your account, you must choose your own unique bucket name when completing this tutorial. Remember to follow the S3 naming conventions when choosing your bucket name.
    Figure 2: Creating an S3 bucket


  4. Choose Next, and then choose Next again.
  5. For Block Public Access settings for this bucket, make sure the Block all public access check box is selected, as shown in Figure 3.
    Figure 3: S3 block public access bucket settings


  6. Choose Create bucket.
  7. Select the bucket you just created, and then choose the Permissions tab.
  8. For Bucket Policy, choose Edit, and in the text field, paste the following policy (remember to replace each <user input placeholder> with your own value).
      "Version": "2012-10-17",
      "Statement": [
          "Effect": "Allow",
          "Principal": {
            "Service": "acm-pca.amazonaws.com"
          "Action": [
          "Resource": [

  9. Choose Save changes.
  10. Next to Object Ownership choose Edit.
  11. Select Bucket owner preferred, and then choose Save changes.
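The bucket policy listing in step 8 is incomplete as reproduced here: the braces and the contents of the Action and Resource lists are missing. As a rough sketch of what a complete policy could look like, the following Python builds one using the four S3 actions that the ACM Private CA documentation lists for CRL buckets. The action list, the function name build_crl_bucket_policy, and the example bucket name are assumptions for illustration, and might not match the original listing exactly.

```python
import json


def build_crl_bucket_policy(bucket_name: str) -> dict:
    """Return a bucket policy that lets ACM Private CA write CRLs to the bucket.

    Assumption: the action list follows the ACM Private CA documentation for
    CRL buckets, because the listing in the walkthrough is truncated.
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "acm-pca.amazonaws.com"},
                "Action": [
                    "s3:PutObject",
                    "s3:PutObjectAcl",
                    "s3:GetBucketAcl",
                    "s3:GetBucketLocation",
                ],
                # Object-level actions need the "/*" ARN; bucket-level
                # actions need the bare bucket ARN.
                "Resource": [f"{bucket_arn}/*", bucket_arn],
            }
        ],
    }


if __name__ == "__main__":
    # Bucket name taken from the walkthrough; replace it with your own.
    policy = build_crl_bucket_policy("example-test-crl-bucket-us-east-1")
    print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the Bucket Policy editor. Compare it against the current ACM Private CA documentation before relying on it.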

To create your CloudFront distribution

  1. Still in the console of your secondary account, from the Services menu, switch to the CloudFront console.
  2. Choose Create Distribution.
  3. For Select a delivery method for your content, under Web, choose Get Started.
  4. On the Origin Settings page, do the following, as shown in Figure 4:
    1. For Origin Domain Name, select the bucket you created earlier. In this example, my bucket name is example-test-crl-bucket-us-east-1.s3.amazonaws.com.
    2. For Restrict Bucket Access, select Yes.
    3. For Origin Access Identity, select Create a New Identity.
    4. For Comment enter a name. In this example, I entered access-identity-crl.
    5. For Grant Read Permissions on Bucket, select Yes, Update Bucket Policy.
    6. Leave all other defaults.
      Figure 4: CloudFront Origin Settings page

  5. Choose Create Distribution.

To create your private CA

  1. (Optional) If you have already created a private CA, you can update your CRL pointer by using the update-certificate-authority API. You must do this step from the CLI because you can’t select an S3 bucket in a secondary account for the CRL home when you create the CRL through the console. If you haven’t already created a private CA, follow the remaining steps in this procedure.
  2. Use a text editor to create a file named ca_config.txt that holds your CA configuration information. In the following example ca_config.txt file, replace each <user input placeholder> with your own value.
    {
        "KeyAlgorithm": "<RSA_2048>",
        "SigningAlgorithm": "<SHA256WITHRSA>",
        "Subject": {
            "Country": "<US>",
            "Organization": "<Example LLC>",
            "OrganizationalUnit": "<Security>",
            "DistinguishedNameQualifier": "<Example.com>",
            "State": "<Washington>",
            "CommonName": "<Example LLC>",
            "Locality": "<Seattle>"
        }
    }

  3. From the CLI configured with a credential profile for your primary account, use the create-certificate-authority command to create your CA. In the following example, replace each <user input placeholder> with your own value.
    aws acm-pca create-certificate-authority --certificate-authority-configuration file://ca_config.txt --certificate-authority-type "ROOT" --profile <primary_account_credentials>

  4. With the CA created, use the describe-certificate-authority command to verify success. In the following example, replace each <user input placeholder> with your own value.
    aws acm-pca describe-certificate-authority --certificate-authority-arn <arn:aws:acm-pca:us-east-1:111122223333:certificate-authority/12345678-1234-1234-1234-123456789012> --profile <primary_account_credentials>

  5. You should see the CA in the PENDING_CERTIFICATE state. Use the get-certificate-authority-csr command to retrieve the certificate signing request (CSR), and sign it with your ACM private CA. In the following example, replace each <user input placeholder> with your own value.
    aws acm-pca get-certificate-authority-csr --certificate-authority-arn <arn:aws:acm-pca:us-east-1:111122223333:certificate-authority/12345678-1234-1234-1234-123456789012> --output text --profile <primary_account_credentials> > <cert_1.csr>

  6. Now that you have your CSR, use it to issue a certificate. Because this example sets up a ROOT CA, you will issue a self-signed RootCACertificate. You do this by using the issue-certificate command. In the following example, replace each <user input placeholder> with your own value. You can find all allowable values in the ACM PCA documentation.
    aws acm-pca issue-certificate --certificate-authority-arn <arn:aws:acm-pca:us-east-1:111122223333:certificate-authority/12345678-1234-1234-1234-123456789012> --template-arn arn:aws:acm-pca:::template/RootCACertificate/V1 --csr fileb://<cert_1.csr> --signing-algorithm SHA256WITHRSA --validity Value=365,Type=DAYS --profile <primary_account_credentials>

  7. Now that the certificate is issued, you can retrieve it. You do this by using the get-certificate command. In the following example, replace each <user input placeholder> with your own value.
    aws acm-pca get-certificate --certificate-authority-arn <arn:aws:acm-pca:us-east-1:111122223333:certificate-authority/12345678-1234-1234-1234-123456789012> --certificate-arn <arn:aws:acm-pca:us-east-1:111122223333:certificate-authority/12345678-1234-1234-1234-123456789012/certificate/6707447683a9b7f4055627ffd55cebcc> --output text --profile <primary_account_credentials> > ca_cert.pem

  8. Import the certificate ca_cert.pem into your CA to move it into the ACTIVE state for further use. You do this by using the import-certificate-authority-certificate command. In the following example, replace each <user input placeholder> with your own value.
    aws acm-pca import-certificate-authority-certificate --certificate-authority-arn <arn:aws:acm-pca:us-east-1:111122223333:certificate-authority/12345678-1234-1234-1234-123456789012> --certificate fileb://ca_cert.pem --profile <primary_account_credentials>

  9. Use a text editor to create a file named revoke_config.txt that holds your CRL information pointing to your CloudFront distribution domain name. In the following example revoke_config.txt, replace each <user input placeholder> with your own value.
    {
        "CrlConfiguration": {
            "Enabled": <true>,
            "ExpirationInDays": <365>,
            "CustomCname": "<example1234.cloudfront.net>",
            "S3BucketName": "<example-test-crl-bucket-us-east-1>",
            "S3ObjectAcl": "<BUCKET_OWNER_FULL_CONTROL>"
        }
    }

  10. Update your CA CRL CNAME to point to the CloudFront distribution you created. You do this by using the update-certificate-authority command. In the following example, replace each <user input placeholder> with your own value.
    aws acm-pca update-certificate-authority --certificate-authority-arn <arn:aws:acm-pca:us-east-1:111122223333:certificate-authority/12345678-1234-1234-1234-123456789012> --revocation-configuration file://revoke_config.txt --profile <primary_account_credentials>
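If you would rather generate revoke_config.txt than edit it by hand, a small script can build the same structure and reject an S3ObjectAcl value other than the two described earlier. This is only a sketch: the function name build_revocation_config and its defaults are assumptions for illustration, and the positive-number check on the expiration is a simple sanity check, not the service's actual limits.

```python
import json

# The two S3ObjectAcl values described in this post.
ALLOWED_ACLS = {"PUBLIC_READ", "BUCKET_OWNER_FULL_CONTROL"}


def build_revocation_config(
    custom_cname: str,
    bucket_name: str,
    acl: str = "BUCKET_OWNER_FULL_CONTROL",
    expiration_days: int = 365,
) -> dict:
    """Build the CrlConfiguration structure used in revoke_config.txt."""
    if acl not in ALLOWED_ACLS:
        raise ValueError(f"S3ObjectAcl must be one of {sorted(ALLOWED_ACLS)}")
    if expiration_days < 1:
        raise ValueError("ExpirationInDays must be a positive number of days")
    return {
        "CrlConfiguration": {
            "Enabled": True,
            "ExpirationInDays": expiration_days,
            "CustomCname": custom_cname,
            "S3BucketName": bucket_name,
            "S3ObjectAcl": acl,
        }
    }


if __name__ == "__main__":
    # Example values from the walkthrough; replace them with your own.
    config = build_revocation_config(
        "example1234.cloudfront.net",
        "example-test-crl-bucket-us-east-1",
    )
    print(json.dumps(config, indent=4))
```

Redirecting the output to revoke_config.txt gives a file you can pass to the update-certificate-authority command through --revocation-configuration.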

You can use the describe-certificate-authority command to verify that your CA is in the ACTIVE state. After the CA is active, ACM generates your CRL periodically for you, and places it into your specified S3 bucket. It also generates a new CRL shortly after you revoke any certificate, so you have the most up-to-date copy.

Now that the PCA, CRL, and CloudFront distribution are all set up, you can test to verify the CRL is served appropriately.

To test that the CRL is served appropriately

  1. Create a CSR to issue a new certificate from your PCA. In the following example, replace each <user input placeholder> with your own value. Enter a secure PEM password when prompted and provide the appropriate field data.

    Note: Do not enter any values for the unused attributes, just press Enter with no value.

    openssl req -new -newkey rsa:2048 -days 365 -keyout <test_cert_private_key.pem> -out <test_csr.csr>

  2. Issue a new certificate using the issue-certificate command. In the following example, replace each <user input placeholder> with your own value. You can find all allowable values in the ACM PCA documentation.
    aws acm-pca issue-certificate --certificate-authority-arn <arn:aws:acm-pca:us-east-1:111122223333:certificate-authority/12345678-1234-1234-1234-123456789012> --csr fileb://<test_csr.csr> --signing-algorithm <SHA256WITHRSA> --validity Value=<31>,Type=<DAYS> --idempotency-token 1 --profile <primary_account_credentials>

  3. After issuing the certificate, you can use the get-certificate command to retrieve it, parse it, and then get the CRL URL from the certificate just like a PKI client would. In the following example, replace each <user input placeholder> with your own value. This command uses the jq package.
    aws acm-pca get-certificate --certificate-authority-arn <arn:aws:acm-pca:us-east-1:111122223333:certificate-authority/12345678-1234-1234-1234-123456789012> --certificate-arn <arn:aws:acm-pca:us-east-1:111122223333:certificate-authority/12345678-1234-1234-1234-123456789012/certificate/6707447683a9b7f4055example1234> | jq -r '.Certificate' > cert.pem
    openssl x509 -in cert.pem -text -noout | grep crl

    You should see an output similar to the following, but with the domain names of your CloudFront distribution and your CRL file:

    URI:http://example1234.cloudfront.net/crl/7215e983-3828-435c-a458-b9e4dd16bab1.crl


  4. Run the curl command to download your CRL file. In the following example, replace each <user input placeholder> with your own value.
    curl http://<example1234.cloudfront.net>/crl/<7215e983-3828-435c-a458-b9e4dd16bab1.crl>
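The grep in step 3 can also be done programmatically. The sketch below pulls CRL URLs out of the text that openssl x509 -text -noout prints, and flags any that point straight at an S3 endpoint instead of your distribution. The function names are hypothetical, and the .amazonaws.com check is only a heuristic.

```python
import re
from typing import List
from urllib.parse import urlparse


def extract_crl_urls(openssl_text: str) -> List[str]:
    """Return the CRL URLs listed in `openssl x509 -text -noout` output."""
    # CRL distribution points are printed as lines of the form "URI:http://...".
    return re.findall(r"URI:(\S+\.crl)\b", openssl_text)


def points_directly_at_s3(url: str) -> bool:
    """Heuristic: True if the host looks like an S3 endpoint, not a CDN or custom name."""
    host = urlparse(url).hostname or ""
    return host.endswith(".amazonaws.com")


if __name__ == "__main__":
    # Sample text shaped like the relevant part of the openssl output.
    sample = (
        "            X509v3 CRL Distribution Points:\n"
        "                Full Name:\n"
        "                  URI:http://example1234.cloudfront.net/crl/"
        "7215e983-3828-435c-a458-b9e4dd16bab1.crl\n"
    )
    for url in extract_crl_urls(sample):
        print(url, "direct S3:", points_directly_at_s3(url))
```

If a URL is flagged as pointing directly at S3, the CustomCname in your revocation configuration might not have been applied to the CA.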

Security best practices

The following are some of the security best practices for setting up and maintaining your private CA in ACM Private CA.

  • Place your root CA in its own account. You want your root CA to be the ultimate authority for your private certificates, so limiting access to it is key to keeping it secure.
  • Minimize access to the root CA. This is one of the best ways of reducing the risk of intentional or unintentional inappropriate access or configuration. If the root CA were to be inappropriately accessed, all subordinate CAs and certificates would need to be revoked and recreated.
  • Keep your CRL in a separate account from the root CA. Some external entities (such as customers or users who aren’t part of your AWS organization, or external applications) might need to access the CRL to check for revocation. To provide that access, the CRL object and the S3 bucket need to be accessible, so you don’t want to place your CRL in the same account as your private CA.

For more information, see ACM Private CA best practices in the AWS Private CA User Guide.


You’ve now successfully set up your private CA and have stored your CRL in an isolated secondary account. You configured your S3 bucket with Block Public Access settings, created a CloudFront distribution, and enabled OAI settings. This restricts access to your S3 bucket to CloudFront and your OAI only. You walked through each step, from bucket configuration to distribution setup and, finally, private CA configuration. You can now keep your private CA in an account with limited access, while your CRL is hosted in a separate account that allows external entity access.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Certificate Manager forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.


Tracy Pierce

Tracy is a Senior Security Consultant for Engagement Security. She enjoys the peculiar culture of Amazon and uses that to ensure that every day is exciting for her fellow engineers and customers alike. Customer obsession is her highest priority both internally and externally. She has her AS in Computer Security and Forensics from Sullivan College of Technology and Design, Systems Security Certified Practitioner (SSCP) certification, AWS Developer Associate certification, AWS Solutions Architect Associates certificate, and AWS Security Specialist certification. Outside of work, she enjoys time with friends, her fiancé, her Great Dane, and three cats. She also reads (a lot), builds Legos, and loves glitter.

Surveillance of the Internet Backbone

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/08/surveillance-of-the-internet-backbone.html

Vice has an article about how data brokers sell access to the Internet backbone. This is netflow data. It’s useful for cybersecurity forensics, but can also be used for things like tracing VPN activity.

At a high level, netflow data creates a picture of traffic flow and volume across a network. It can show which server communicated with another, information that may ordinarily only be available to the server owner or the ISP carrying the traffic. Crucially, this data can be used for, among other things, tracking traffic through virtual private networks, which are used to mask where someone is connecting to a server from, and by extension, their approximate physical location.

In the hands of some governments, that could be dangerous.

More on Apple’s iPhone Backdoor

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/08/more-on-apples-iphone-backdoor.html

In this post, I’ll collect links on Apple’s iPhone backdoor for scanning CSAM images. Previous links are here and here.

Apple says that hash collisions in its CSAM detection system were expected, and not a concern. I’m not convinced that this secondary system was originally part of the design, since it wasn’t discussed in the original specification.

Good op-ed from a group of Princeton researchers who developed a similar system:

Our system could be easily repurposed for surveillance and censorship. The design wasn’t restricted to a specific category of content; a service could simply swap in any content-matching database, and the person using that service would be none the wiser.

EDITED TO ADD (8/30): Good essays by Matthew Green and Alex Stamos, Ross Anderson, Edward Snowden, and Susan Landau. And also Kurt Opsahl.

Capturing Purpose Justification in Cloudflare Access

Post Syndicated from Molly Cinnamon original https://blog.cloudflare.com/access-purpose-justification/


The digital world often takes its cues from the real world. For example, there’s a standard question every guard or agent asks when you cross a border—whether it’s a building, a neighborhood, or a country: “What’s the purpose of your visit?” It’s a logical question: sure, the guard knows some information—like who you are (thanks to your ID) and when you’ve arrived—but the context of “why” is equally important. It can set expectations around behavior during your visit, as well as what spaces you should or should not have access to.

The purpose justification prompt appears upon login, asking users to specify their use case before hitting submit and proceeding.

Digital access follows suit. Recent data protection regulations, such as the GDPR, have formalized concepts of purpose limitation and data proportionality: people should only access data necessary for a specific stated reason. System owners know people need access to do their job, but especially for particularly sensitive applications, knowing why a login was needed is just as vital as knowing who, when, and how.

Starting today, Cloudflare for Teams administrators can prompt users to enter a justification for accessing an application prior to login. Administrators can add this prompt to any existing or new Access application with just two clicks, giving them the ability to:

  • Log and review employee justifications for accessing sensitive applications
  • Add additional layers of security to applications they deem sensitive
  • Customize modal text to communicate data use & sharing principles
  • Help meet regulatory requirements for data access control (such as GDPR)

Starting with Zero Trust access control

Cloudflare Access has been built with access management at its core: rather than trusting anyone on a private network, Access checks for identity, context and device posture every time someone attempts to reach an application or resource.

Behind the scenes, administrators build rules to decide who should be able to reach the tools protected by Access. When users need to connect to those tools, they are prompted to authenticate with one of the identity provider options. Cloudflare Access checks their login against the list of allowed users and, if permitted, allows the request to proceed.

Some applications and workflows contain data so sensitive that the user should have to prove who they are and why they need to reach that service. In this next phase of Zero Trust security, access to data should be limited to specific business use cases or needs, rather than generic all-or-nothing access.

Deploying Zero Trust purpose justification

We created this functionality because we, too, wanted to make sure we had these provisions in place at Cloudflare. We have sensitive internal tools that help our team members serve our customers, and we’ve written before about how we use Cloudflare Access to lock down those tools in a Zero Trust manner.

However, we were not satisfied with just restricting access under a least-privilege model. We are accountable for the trust our customers put in our services, and we feel it is important to always have an explicit business reason when connecting to some data sets or tools.

We built purpose justification capture in Cloudflare Access to solve that problem. When team members connect to certain resources, Access prompts them to justify why. Cloudflare’s network logs that rationale and allows the user to proceed.

Purpose justification capture in Access helps fulfill policy requirements, but even for enterprises who don’t need to comply with specific regulations, it also enables a thoughtful privacy and security framework for access controls. Prompting employees to justify their use case helps solve the data management challenge of balancing transparency with security — helping to ensure that sensitive data is used the right way.

Purpose justification capture adds an additional layer of context for enterprise administrators.

Distinguishing Sensitive Domains

So how do you decide whether something is sensitive? Two questions help identify applications that may be considered “sensitive.” First, does the application contain personally identifiable information or sensitive financials? Second, do all the employees who have access actually need it? The flexibility of Access policy configuration helps distinguish sensitive domains for specific user groups.

Purpose justification in Cloudflare Access enables Teams administrators to configure the language of the prompt itself by domain. This is a helpful place to remind employees of the sensitivity of the data, such as, “This application contains PII. Please be mindful of company policies and provide a justification for access,” or “Please enter the case number corresponding to your need for access.” The language can proactively ensure that employees with access to an internal tool are using it as intended.

Additionally, Access identity management allows Teams customers to configure purpose capture for only specific, more sensitive employee groups. For example, some employees need daily access to an application and should be considered “trusted.” Other employees may still have access but only rarely need to use the tool, and security teams or data protection officers may view their access as higher risk. The policies enable flexible logical constructions that equate to actions such as “ask everyone but the following employees for a purpose.”

Distinguishing sensitive applications and “trusted” employees in this way puts friction where it benefits data protection, rather than imposing a loss of efficiency on all employees.
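The “ask everyone but the following employees for a purpose” rule can be pictured as a small piece of logic. The sketch below is a hypothetical model, not Cloudflare’s API or policy syntax: the class name JustificationPolicy, its fields, and the email-based exemption list are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class JustificationPolicy:
    """Hypothetical model of a purpose-justification rule for one application."""

    sensitive: bool  # whether the application is treated as sensitive
    exempt_emails: Set[str] = field(default_factory=set)  # "trusted" daily users

    def __post_init__(self) -> None:
        # Normalize once so lookups are case-insensitive.
        self.exempt_emails = {e.strip().lower() for e in self.exempt_emails}

    def requires_purpose(self, email: str) -> bool:
        """Return True if this user should be prompted for a justification."""
        if not self.sensitive:
            return False
        return email.strip().lower() not in self.exempt_emails


if __name__ == "__main__":
    policy = JustificationPolicy(
        sensitive=True,
        exempt_emails={"daily.user@example.com"},
    )
    for user in ("daily.user@example.com", "rare.user@example.com"):
        print(user, "-> prompt for purpose:", policy.requires_purpose(user))
```

In this model only the rarely active user is prompted, which mirrors the idea of adding friction where access is considered higher risk.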

Purpose justification is configurable as an Access policy, allowing for maximum flexibility in configuring and layering rules to protect sensitive applications.

Auditing justification records

As a Teams administrator, enterprise data protection officer, or security analyst, you can view purpose justification logs for a specific application to better understand how it has been accessed and used. Auditing the logs can reveal insights about security threats, the need for improved data classification training, or even potential application development to more appropriately address employees’ use cases.

The justifications are seamlessly integrated with other Access audit logs — they are viewable in the Teams dashboard as an additional column in the table of login events, and exportable to a SIEM for further data analysis.

Teams administrators can review the purpose justifications submitted upon application login by their employees.

Getting started

You can start adding purpose justification prompts to your application access policies in Cloudflare Access today. The purpose justification feature is available in all plans, and with the Cloudflare for Teams free plan, you can use it for up to 50 users at no cost.

We’re excited to continue adding new features that give you more flexibility over purpose justification in Access. Have feedback for us? Let us know in this community post.

Introducing Zero-Knowledge Proofs for Private Web Attestation with Cross/Multi-Vendor Hardware

Post Syndicated from Watson Ladd original https://blog.cloudflare.com/introducing-zero-knowledge-proofs-for-private-web-attestation-with-cross-multi-vendor-hardware/



A few weeks ago we introduced Cryptographic Attestation of Personhood to replace CAPTCHAs with USB security keys, and today we announced additional support for on-device biometric hardware. While doing that work, it occurred to us that hardware attestation, proving identity or other properties of a user with a piece of hardware, could have many wider applications beyond just CAPTCHA alternatives and user authentication via WebAuthn. Really, why should someone have to have an account to prove they exist, when their own trusted device can do so?

Attestation in the WebAuthn standard lets websites know that your security key is authentic. It was designed to have good privacy properties baked into policies that must be followed by device manufacturers. The information your security key sends to websites is indistinguishable from that of myriad other keys.  Even so, we wanted to do better. If we’re taking attestation out of authentication, then we need to learn only that your security key is authentic — and we’ve designed a new Zero-Knowledge Proof for the browser to do that.

This is part of our work to improve privacy across the Internet. We’ve yet to put this proof of personhood in production, but you can see a demonstration of the technique in action. We’ve seen it work with YubiKeys among others. Most importantly, we’re open-sourcing the code so everyone can benefit and contribute. Read through below for details, as well as next steps.


WebAuthn attestation identifies the manufacturer of your hardware security key to the website that wants the attestation. It was intended for deployment in closed settings like financial institutions and internal services where the website already has a preexisting relationship with you. Since logging in identifies you, the privacy impact was minimal. In contrast, any open website that uses attestation, like we do for proof of personhood, learns the make and model of the key you used.

Make and model information doesn’t seem that sensitive, just like the make and model of your car doesn’t seem that sensitive. There are a lot of 2015 Priuses out there, so knowing you drive one doesn’t help identify you. But when paired with information such as user agent, language preferences, time of day, etc., it can contribute to building up a picture of the user — just as demographic details, height, weight and clothing together with the make and model of a car combine to make it easier to pinpoint a particular car on the highway. Therefore, browsers have a dialogue when a website obtains this attestation, to make sure users understand that the website is learning information that may help identify them. We take privacy seriously at Cloudflare, and want to avoid learning any information that could identify you.

An example browser warning.

The information that we see from attestation is a proof that the manufacturer of your security key really did make that key. It’s a digital signature using a private key held on your security key in a secure enclave, together with a certificate chain that leads to the manufacturer. These chains enable any server to see that the hardware security key is authentic. All we want for the Cryptographic Attestation of Personhood is a single bit: that you own a hardware security key that is trustworthy, and none of the details about the manufacturer or model.

Historically, attestation has been used in environments where only a few manufacturers were considered acceptable. For example, large financial institutions are understandably conservative. In this environment, revealing the manufacturer is necessary. In an open vendor design, we don’t want to privilege any particular manufacturer, but instead just know that the keys are trustworthy.

Trustworthiness is determined by the FIDO MetaData Service, a service from the FIDO Alliance, which maintains root certificates for the manufacturers. When keys are compromised, they are listed as such in the FIDO system. We have automated scripts to download these roots and insert them into releases of our software. This ensures that we are always up-to-date as new manufacturers emerge, or as older devices are compromised because attackers extract the keys or the keys get mishandled.

To their credit, the FIDO consortium requires that no fewer than 100,000 devices all share an attestation key, setting a lower bound on the device anonymity set size to minimize the impact of information collection. Alas, this hasn’t always happened. Some manufacturers might not have the volume necessary to ever get that batch size, and users shouldn’t have to flock to the biggest ones to have their privacy protected. At Cloudflare, we have a strong privacy policy that governs how we use this information, but we’d prefer not to know your key’s manufacturer at all. Just as we’ve removed cookies that we no longer needed, and just as we log the data customers need to debug their firewall rules without being able to see it ourselves, we’re always looking for ways to reduce the information that we can see.

At the same time, we need to make sure that the device that’s responding to our request is a genuine security key and not some software emulation run by a bot. While we don’t care which key it is, we’d like to know that it actually is a key that meets our security requirements and hasn’t been compromised. In essence, we’d like to prove the legitimacy of the credential without learning anything else about it. This problem of anonymous credentials isn’t new, and lots of solutions have been proposed and some even deployed.

However, these schemes typically require that the hardware or software that implements the credential attestation was designed with the specific scheme in mind. We can’t go out and convince manufacturers to add features in a few months, let alone replace all the hardware authentication security keys in the world. We have to instead search for solutions that work with existing hardware.

A high-level introduction to Zero-Knowledge Proof

At first glance, it seems that we have an impossible task. How can I demonstrate that I know something without telling you what it is? But sometimes this is possible. If I claim to have the key to a mailbox, you can put a letter inside the mailbox, walk away, and ask me to read the letter. If I claim to know your telephone number, you can ask me to call you. Such a proof is known as a zero-knowledge proof, often abbreviated ZKP.

A classic example of a zero-knowledge proof is showing to someone that you know where Waldo is in Where’s Waldo. While you could point to Waldo on the page, this would tell the person exactly where Waldo is. If however you were to cover up the page with a big piece of paper that has a small hole that only shows Waldo, then the person could only see that Waldo was somewhere on the page, and would be unable to figure out where. They would know that you know where Waldo is, but not know where Waldo is themselves.

Introducing Zero-Knowledge Proofs for Private Web Attestation with Cross/Multi-Vendor Hardware
Caption: Sometimes finding Waldo isn’t the problem (Source: https://commons.wikimedia.org/wiki/File:Where%E2%80%99s_Wally_World_Record_(5846729480).jpg)

Cryptographers have designed numerous zero-knowledge proofs and ways to hook them together. The central piece in hooking them together is a commitment, a cryptographic envelope. A commitment prevents tampering with the value that was placed inside when it was made, and later can be opened to show what was placed in it. We use commitment schemes in real life. A magician might seal a piece of paper in an envelope to assure the audience that he cannot touch or tamper with it, and later have someone open the envelope to reveal his prediction. A silent auction involves people entering sealed envelopes containing bids that are then opened together, making sure that no one can adjust their bid after they see what others have bid.

One very simple cryptographic protocol that makes use of commitments is coin flipping. Suppose Alice and Bob want to flip a coin, but one of them has a few double-headed quarters and the other can flip a coin so it comes up on the side they want every time. The only way to get a fair flip is for both of them to be involved in a way that makes sure if either is honest, the result is a fair flip. But if they simply each flip a coin and trade the results, whoever goes last could pretend they had gotten the result that would make the desired outcome.

Using a commitment scheme solves this problem. Instead of Alice and Bob saying what their results are out loud, they trade commitments to the results. Then they open the commitments. Because they traded the commitments, neither of them can pretend to have gotten a different result based on what they learned, as then they will be detected when they open the commitments.
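The coin-flipping protocol above can be sketched in a few lines of Python. This is only an illustration: it assumes a simple hash-based commitment (SHA-256 over a random nonce and the bit), which is not necessarily the commitment scheme used in the system described here, and the names `commit`, `opens_to`, and `fair_flip` are hypothetical.

```python
import hashlib
import secrets

def commit(bit):
    """Seal a bit in an 'envelope': return (commitment, opening nonce)."""
    nonce = secrets.token_bytes(32)  # random nonce hides the bit from the other party
    return hashlib.sha256(nonce + bytes([bit])).digest(), nonce

def opens_to(commitment, nonce, bit):
    """Check that (nonce, bit) really is what was sealed in the commitment."""
    return hashlib.sha256(nonce + bytes([bit])).digest() == commitment

def fair_flip(alice_bit, bob_bit):
    # Phase 1: both sides exchange commitments before learning anything.
    alice_com, alice_nonce = commit(alice_bit)
    bob_com, bob_nonce = commit(bob_bit)
    # Phase 2: both sides open; each checks the other's opening.
    if not opens_to(alice_com, alice_nonce, alice_bit):
        raise ValueError("Alice's opening does not match her commitment")
    if not opens_to(bob_com, bob_nonce, bob_bit):
        raise ValueError("Bob's opening does not match his commitment")
    # The XOR is uniformly random as long as at least one input was.
    return alice_bit ^ bob_bit
```

Because the result is the XOR of the two bits, a single honest participant who picks their bit at random is enough to make the outcome fair.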

Commitments are like wires that tie zero-knowledge proofs together, making bigger and more complicated ones from simple ones. By proving that a value in a commitment has two different properties with different zero-knowledge proofs we can prove both properties hold for the value. This lets us link together proofs for statements like “the value in a commitment is a sum of values in two other commitments” and “the value in a commitment appears in a list” into much more complicated and useful statements.  Since we know how to prove statements like “this commitment is one if and only if both of these other commitments are one” and “this commitment is one if either of these two commitments is one” we have the building blocks to prove any statement. These generic techniques can produce a zero-knowledge proof for any statement in NP, although it will be quite slow and complicated by default.
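To make a statement like “the value in a commitment is a sum of values in two other commitments” concrete, here is a minimal sketch assuming a Pedersen-style commitment in a toy group. The choice of Pedersen commitments and the tiny parameters (`P`, `Q`, `G`, `H`) are assumptions for illustration only; the group is far too small to be secure, and in a real system nobody may know the discrete logarithm of `H` with respect to `G`.

```python
# Toy parameters (illustrative assumptions, not secure):
P = 23  # prime modulus
Q = 11  # prime order of the subgroup we work in
G = 2   # element of order 11 modulo 23
H = 3   # second element of order 11 modulo 23

def commit(value, blind):
    """Pedersen-style commitment G^value * H^blind mod P."""
    return pow(G, value % Q, P) * pow(H, blind % Q, P) % P

def add_commitments(c1, c2):
    """Multiplying commitments adds the committed values and the blinds."""
    return c1 * c2 % P

# Anyone can check that c3 commits to the sum without learning 3 or 4.
c1 = commit(3, 5)
c2 = commit(4, 9)
c3 = commit(3 + 4, 5 + 9)
print(add_commitments(c1, c2) == c3)
```

The verifier only ever sees `c1`, `c2`, and `c3`; the relation between them is checked on the sealed envelopes themselves.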

Our Zero-Knowledge Proof system for the browser

In Cryptographic Attestation of Personhood the server sends a message to the browser that the hardware security key signs, demonstrating its authenticity. Just as a paper signature ensures that the person making it saw it and signed it, a digital signature ensures the identity of the signer. When we use our zero-knowledge proof, instead of sending the signature, the client sends a proof that the signature was generated by a key on a server-provided list.

Because we only send the proof to the server, the server learns only that the attestation exists, and not which hardware security key generated it. This guarantees privacy as the identifying information about the security key never leaves the browser. But we need to make sure that proving and verification are efficient enough to carry out at scale, to have a deployable solution.

We investigated many potential schemes, including SNARKs. Unfortunately the code size, toolchain requirements, and proving complexity of a SNARK proved prohibitive. The security of SNARKs also relies on more complicated assumptions than the scheme we ultimately went with. This is an area of active research, and the best technology today is not necessarily the best technology of the future.

For the hardware security keys we support, the digital signature in the attestation was produced by the Elliptic Curve Digital Signature Algorithm (ECDSA). ECDSA is itself similar to many of the zero-knowledge proofs we use. It starts with the signer computing a point \(R=kG\) on an elliptic curve for a random value \(k\). Then the signer takes the \(x\) coordinate of the point, which is written as \(r\), and their private key, and the hash of the message, and computes a value \(s\). The pair \((r, s)\) is the signature. The verifier uses \(r\) and \(s\) and the public key to recompute \(R\), and then checks that the \(x\) coordinate matches \(r\).
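The signing and verification steps can be sketched as follows. This is a toy illustration, not the production code: it assumes the small textbook curve \(y^2=x^3+2x+2\) over the field with 17 elements (a group of 19 points) instead of P-256, takes the message hash `z` and the nonce `k` as plain integers, and uses hypothetical function names. A real signer must pick `k` secretly and at random for every signature.

```python
# Toy curve y^2 = x^3 + 2x + 2 over F_17 (an assumption for illustration; NOT P-256).
p, a = 17, 2
G = (5, 1)   # generator
n = 19       # prime order of G

def add(P1, P2):
    """Add two curve points; None stands for the point at infinity."""
    if P1 is None:
        return P2
    if P2 is None:
        return P1
    (x1, y1), (x2, y2) = P1, P2
    if x1 == x2 and (y1 + y2) % p == 0:
        return None  # P + (-P) is the point at infinity
    if P1 == P2:
        lam = (3 * x1 * x1 + a) * pow(2 * y1 % p, -1, p) % p  # tangent slope
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p       # chord slope
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def mul(k, P1):
    """Double-and-add scalar multiplication k * P1."""
    result = None
    while k:
        if k & 1:
            result = add(result, P1)
        P1 = add(P1, P1)
        k >>= 1
    return result

def sign(z, d, k):
    """Sign hash z with private key d and nonce k; returns (r, s)."""
    r = mul(k, G)[0] % n                  # x coordinate of R = kG
    s = pow(k, -1, n) * (z + r * d) % n
    if r == 0 or s == 0:
        raise ValueError("unusable nonce, pick another k")
    return r, s

def verify(z, Q, r, s):
    """Recompute R from (r, s), the hash and the public key Q, then compare x with r."""
    if not (1 <= r < n and 1 <= s < n):
        return False
    w = pow(s, -1, n)
    R = add(mul(z * w % n, G), mul(r * w % n, Q))
    return R is not None and R[0] % n == r

d = 7                 # private key
Q = mul(d, G)         # public key
r, s = sign(10, d, 3)
print(verify(10, Q, r, s), verify(11, Q, r, s))
```

The final comparison of an \(x\) coordinate with \(r\) is the step that mixes values living in two different settings, which is what makes the standard verification equation awkward inside a zero-knowledge proof.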

Unfortunately, the verification equation as commonly presented involves some operations that would need to convert values from one form to another. In a zero knowledge proof these operations are complex and expensive, with many steps. We had to sidestep this limitation to make our system work. To transform ECDSA into a scheme we can work with, our prover sends \(R\) instead, and commits to a value \(z\) computed from \(r\) and \(s\) that simplifies the verification equation. Anyone can take an ECDSA signature and turn it into a signature for our tweaked scheme and vice versa without using any secret knowledge, so it is just as secure as ECDSA.

Since the statement we want to prove has two parts — “the message was signed by a key” and “the key is on the list” — it is natural to break up the problem of proving that statement into two pieces. First, the prover demonstrates that the key inside of a commitment signed the message, and then the prover demonstrates the committed key is on a list. The verifier likewise checks these two parts and if both parts work, indicates that the proof is valid.

To prove that the signature verifies under a key, we had to use a proof that one elliptic curve point is a known power of another. This proof is a fairly straightforward zero-knowledge proof, but some steps themselves require zero-knowledge protocols for proving that points were added correctly and arithmetic was done correctly. This proof consumes the bulk of the time in proof generation and verification. Once this proof is verified, the verifier knows that the message was signed by the committed public key.

The next step is for the prover to find where their key is on the list, and then prove that their key is in the list. To do this we use the zero-knowledge proof developed by Groth and Kohlweiss. Their proof first commits to the binary expansion of the place of the commitment in the list. The prover then proves that binary expansion is made out of bits, and supplies some extra information about how they proved it. With the extra info and the proofs, both sides can compute polynomials that evaluate to zero if the commitment is to a value on the list. This code is surprisingly short for such a complex task.

By evaluating a polynomial, we show our committed value is a zero.

The verifier then checks the Groth-Kohlweiss proof that the committed key is on the list, and then makes sure the message that was signed is what it should be. This is a very efficient proof, even as the list size grows: the work done per list element is a multiplication. If all matches, then we know that the signature was generated by a sufficiently secure security key, and nothing else. If it does not match we know that something is wrong.

Engineering a more efficient curve

We turned statements about ECDSA signatures into statements about points on the P-256 elliptic curve, and then into statements about arithmetic in the field that P-256 is defined over. These statements are easiest to prove if we have a group with a size matching the size of a field, and so we had to find one. This posed an interesting challenge as it’s the reverse of how we normally do things in cryptography. If you’d like to see how we solved it read on, otherwise skip ahead.

Most of the time in elliptic curve cryptography we start with a convenient base field, and search for elliptic curves of prime or nearly prime order with the right properties for our application. This way we can pick primes with properties convenient for computer hardware. When it comes to wanting pairing friendly curves, we typically do computer searches for curves whose parameters are given by polynomials that are known.

But here we wanted a curve with a given number of points, and so we would have to use some fairly advanced number theoretic machinery to determine this curve. Our doing so was a big part in getting our zero-knowledge attestation as efficient as it is.

Elliptic Curves and the Complex Plane

Elliptic curves are particularly nice over the complex numbers. An elliptic curve is isomorphic to a torus. All complex curves are isomorphic to tori over the complex numbers, but some have more than one hole.

Different elliptic curves are distinguished by how fat or thin the two directions around the torus are with respect to one another. If we imagine slicing around the holes in the torus, we see that we can get a torus from taking a rectangle and gluing up the sides. There is an illustrative video of what this looks like: Gluing a torus.

Instead of taking one rectangle and gluing it up, we can imagine taking the entire plane, and then folding it up so that every rectangle lines up. In doing so the corners of these rectangles all line up over the origin. The corners form what we call a lattice, and we can always scale and rotate to have one of the generators of the lattice be 1.

Viewed this way, addition of complex numbers becomes addition on the torus, just as addition of the integers modulo 2 is addition of integers, then reduced mod 2. But with elliptic curves we’re used to having algebraic equations for addition and multiplication, and also for the curves themselves. How does this analytic picture interact with the algebraic one?

Via a great deal of classical complex geometry we find they are closely related. The field of meromorphic functions that are periodic with respect to the lattice is generated by the Weierstrass \(\wp\) function and its derivative. These satisfy an algebraic equation of the form \(y^2=x^3+ax+b\), and the parameters \(a\) and \(b\) are functions of the lattice parameter \(\tau\), the generator of the lattice other than 1. The classical formulas for the algebraic addition of points on elliptic curves emerge from this.

One of the parameters is the \(j\) invariant, which is a more arithmetically meaningful invariant than \(\tau\). The \(j\) invariant is an invariant of the elliptic curve: values of \(\tau\) that give rise to the same lattice produce the same \(j\) invariant, while they may have different \(a\) and \(b\). There are many expressions for \(j\), with one being

\[j = 1728\,\frac{4a^3}{4a^3+27b^2}.\]

Complex multiplication and class number

Suppose we take the lattice \(\{1, i\}\). If we multiply this lattice by \(i\), we get back \(\{i, -1\}\), which generates the same set of points. This is exceptional, and can only happen when the number we multiply by satisfies a quadratic equation. The elements of the lattice are then closely related to the solutions of that quadratic equation.

Associated with such a lattice is a discriminant: the discriminant of the quadratic field associated with the example. For our example with \(i\) the discriminant is \(-4\), the discriminant of the quadratic equation \(x^2+1\). If, for instance, we were to take \(\sqrt{-5}\) instead and consider the lattice \( \{1, \sqrt{-5}\} \), the discriminant would be \(-20\), the discriminant of \(x^2+5\). Note that there are different definitions of the discriminant, which change the sign and add various powers of \(2\).

Maps from elliptic curves to themselves are called endomorphisms. Most elliptic curves just have multiplication by integers as endomorphisms. But some curves have additional endomorphisms. For instance, if we turn the lattice \(\{1, i\}\) into an elliptic curve, we obtain \(y^2=x^3+x\). Now this curve has an extra endomorphism: if I send \(y\) to \(iy\) and \(x\) to \(-x\), I get a point that satisfies the curve equation, as \((iy)^2=(-x)^3-x\). Doing this map twice produces the same effect as inverting a point, and it’s no coincidence that multiplying twice by \(i\) sends a complex number \(z\) to \(-z\). So this extra endomorphism and multiplying by \(i\) satisfy the same equation. Having an extra endomorphism is called complex multiplication, as it’s multiplication by a complex number. When the lattice an elliptic curve comes from has complex multiplication, the elliptic curve also has complex multiplication and vice versa.

Any set of mathematical objects comes with questions, and elliptic curves with complex multiplication are no exception. We can ask how many elliptic curves with complex multiplication there are for a given discriminant. How does that number grow as the discriminant grows? Some of these questions are still open today, despite years of research and computer experimentation. Key to approaching them is a link between lattices and arithmetic.

Early in the 19th century Gauss studied binary quadratic forms, expressions of the form \(ax^2+bxy+cy^2\). Such forms are said to be equivalent if there is an invertible substitution with integer coefficients for \(x\) and \(y\) that takes one into the other. This is a core notion, and in addition to algorithms for reducing a binary quadratic form, Gauss demonstrated that there was a composition law that made the classes of binary quadratic forms of a given discriminant into a group.

Later number theorists would develop the concept of an ideal, tying quadratic forms to the failure of factorization to be unique. \(x^2+5y^2\) and \(2x^2+2xy+3y^2\) are both quadratic forms of discriminant \(-20\), and this is connected to the failure of unique factorization in \(\mathbb{Z}[\sqrt{-5}]\).

When we consider binary quadratic forms as the lengths of vectors in a plane, each lattice gives a binary quadratic form up to equivalence. The elliptic curves with complex multiplication by a given discriminant thus correspond to the classes of binary quadratic forms with a given discriminant, which connects to the arithmetic in the quadratic field. Three very different looking questions thus all have the same answer.
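One way to see this correspondence in action is to count classes of forms directly. The sketch below, with the hypothetical helper `reduced_forms`, enumerates the reduced primitive forms \(ax^2+bxy+cy^2\) of a negative discriminant \(b^2-4ac\), using the standard reduction conditions \(-a < b \le a \le c\) (with \(b \ge 0\) when \(a = c\)). Each class contains exactly one reduced form, so the length of the list is the class number.

```python
from math import gcd

def reduced_forms(D):
    """List reduced primitive forms (a, b, c) with b*b - 4*a*c == D, for D < 0."""
    if D >= 0 or D % 4 not in (0, 1):
        raise ValueError("D must be a negative discriminant (0 or 1 mod 4)")
    forms = []
    a = 1
    while 3 * a * a <= -D:               # reduced forms satisfy 3a^2 <= |D|
        for b in range(-a + 1, a + 1):   # -a < b <= a
            numerator = b * b - D        # equals 4ac
            if numerator % (4 * a):
                continue
            c = numerator // (4 * a)
            if c < a:
                continue                 # need a <= c
            if a == c and b < 0:
                continue                 # tie-break when a == c
            if gcd(gcd(a, b), c) != 1:
                continue                 # keep primitive forms only
            forms.append((a, b, c))
        a += 1
    return forms

print(reduced_forms(-4))    # the lattice {1, i}
print(reduced_forms(-20))   # the lattice {1, sqrt(-5)} and one more class
```

For discriminant \(-20\) the two forms found are exactly \(x^2+5y^2\) and \(2x^2+2xy+3y^2\) from the text, so there are two classes of lattices, and correspondingly two elliptic curves with complex multiplication of that discriminant.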

Why complex multiplication matters for finding curves

When we take an elliptic curve with integral coefficients and consider it over a prime, it gets an extra endomorphism: the Frobenius endomorphism that sends \(x\rightarrow x^p\) and \(y\rightarrow y^p\). This endomorphism satisfies a quadratic equation, and the linear term of that quadratic equation is the number of points minus \(p+1\).

If the elliptic curve has complex multiplication, there is another endomorphism, namely the one we get from complex multiplication. But an elliptic curve can only have one extra endomorphism unless it is supersingular. Supersingular curves are very rare. So the Frobenius endomorphism and the endomorphism from complex multiplication must be the same. Because we started out with complex multiplication, we know the quadratic equation the Frobenius must satisfy, and hence the number of points.
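A small experiment illustrates the link between complex multiplication and the number of points. The sketch below assumes the curve \(y^2=x^3+x\) from the earlier example (discriminant \(-4\)) and counts its points by brute force over a few small primes; the helper name `count_points` is hypothetical. Writing \(t=p+1-N\) for the trace of Frobenius, the quantity \(4p-t^2\) turns out to be four times a perfect square whenever \(p \equiv 1 \pmod 4\), exactly as the quadratic equation coming from multiplication by \(i\) predicts. For \(p \equiv 3 \pmod 4\) the curve is supersingular and has \(p+1\) points.

```python
from math import isqrt

def count_points(a, b, p):
    """Count points on y^2 = x^3 + a*x + b over F_p, including infinity."""
    square_roots = {}                      # value -> number of y with y^2 == value
    for y in range(p):
        v = y * y % p
        square_roots[v] = square_roots.get(v, 0) + 1
    total = 1                              # the point at infinity
    for x in range(p):
        total += square_roots.get((x * x * x + a * x + b) % p, 0)
    return total

for p in (5, 13, 17, 29, 37):              # primes congruent to 1 mod 4
    N = count_points(1, 0, p)
    t = p + 1 - N                          # trace of Frobenius
    y_squared = (4 * p - t * t) // 4
    print(p, N, t, isqrt(y_squared) ** 2 == y_squared)
```

Brute-force counting is only feasible for tiny primes; the point of the complex multiplication method is that it predicts the count without ever enumerating points.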

This is the conclusion of our saga: to find an elliptic curve with a given order \(N\), find integer solutions to the equation \(t^2+Dy^2=4N\) for small \(D\), let \(p=N-t+1\), and see if it is prime. Then factor the Hilbert class polynomial of \(D\) over \(p\), and take one of the roots modulo \(p\) as the \(j\) invariant of our elliptic curve. We may need to take a quadratic twist to get the right number of points, since \(t\) is only identified up to sign.
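The first half of that recipe, the search for \(D\), \(t\), \(y\) and a prime \(p\), can be sketched as below. This is a toy under stated assumptions: the function names are hypothetical, primality is tested by trial division (fine for small numbers, whereas a real search over 256-bit values needs a proper primality test), and both signs of \(t\) are tried. The Hilbert class polynomial step is not shown.

```python
from math import isqrt

def is_prime(n):
    """Trial division; adequate only for the small numbers in this sketch."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def cm_candidates(N, max_D):
    """Find (D, t, y, p) with t^2 + D*y^2 == 4*N and p = N + 1 -/+ t prime."""
    found = []
    for D in range(3, max_D + 1):
        if D % 4 not in (0, 3):
            continue                        # -D must be 0 or 1 mod 4 to be a discriminant
        y = 1
        while D * y * y <= 4 * N:
            rest = 4 * N - D * y * y
            t = isqrt(rest)
            if t * t == rest:               # rest is a perfect square, so t is an integer
                for p in sorted({N + 1 - t, N + 1 + t}):
                    if is_prime(p):
                        found.append((D, t, y, p))
            y += 1
    return found

# Toy target: a curve with exactly 19 points.
for D, t, y, p in cm_candidates(19, 70):
    print(f"D={D} t={t} y={y} p={p}")
```

For the toy target of 19 points, the search reports among others \(D=67\) with \(p=17\), which matches the fact that \(y^2=x^3+2x+2\) has 19 points over the field with 17 elements.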

This gave us the curve we needed for efficient proving of relations over the base field of P-256. All of this mathematics produces a script that runs in a few minutes and produces the curve with the desired order that we needed.

The results of our labor

After all this work, and much additional engineering work needed to make the proof run faster through optimizing little bits of it, we can generate a proof in a few seconds, and verify it in a few hundred milliseconds. This is fast enough to be practical, and means that websites that want to verify the security of security keys can do so without negative privacy impacts.

All a website that uses our technique learns is whether or not the signature was generated by a token whose attestation key is on the list they provided. Unlike using WebAuthn directly they do not get any more detailed information, even if the manufacturer has accidentally made the batch too small. Instead of having a policy-based approach to guarding user privacy, we’ve removed the troublesome information.

Next steps — a community effort!

Our demonstration shows that we have a working privacy enhancement based on a zero-knowledge proof. We’re continuing to improve it, adding more performance and security features. But our task isn’t done until we have a privacy-preserving WebAuthn extension that works in every browser, giving users assurance that their data is not leaving the device.

What we have now is a demonstration of what is possible: using zero-knowledge proofs to turn WebAuthn attestation into a system that treats every manufacturer equally by design, protects user privacy, and can be used by every website. The challenges around user privacy that are created by using attestation on a wide scale are solvable.

There’s much more that goes into a high-quality, reliable system than the core cryptographic insights. In order to make the user experience not involve warnings about the information our zero-knowledge proof discards, we need more integration with the browser. We also need a safe way for users of devices not on our list to send their key to us and demonstrate that it should be trusted, and a way to make sure that the list isn’t being abused to try to pinpoint particular keys.

In addition, this verification is more heavyweight than the older verification methods, so servers that implement it need to incorporate rate limiting and other protections against abuse. SNARKs would be a big advantage here, but come at a cost in code size for the demonstration. Ultimately bringing these improvements into a core part of the web ecosystem requires working with users, browsers, and other participants to find a solution that works for them. We would like to hear from you at [email protected] if you would like to contribute to the process.

Our Cryptographic Attestation of Personhood gives users an easier way to demonstrate their humanity, and one which is more privacy-preserving than many CAPTCHA alternatives or providers. But we weren’t satisfied with the state of the art and saw a way to apply advanced cryptography techniques to improve the privacy of our users. Our work shows zero-knowledge proofs can enhance the privacy offered by real world protocols.

Apple Adds a Backdoor to iMessage and iCloud Storage

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/08/apple-adds-a-backdoor-to-imesssage-and-icloud-storage.html

Apple’s announcement that it’s going to start scanning photos for child abuse material is a big deal. (Here are five news stories.) I have been following the details, and discussing it in several different email lists. I don’t have time right now to delve into the details, but wanted to post something.

EFF writes:

There are two main features that the company is planning to install in every Apple device. One is a scanning feature that will scan all photos as they get uploaded into iCloud Photos to see if they match a photo in the database of known child sexual abuse material (CSAM) maintained by the National Center for Missing & Exploited Children (NCMEC). The other feature scans all iMessage images sent or received by child accounts — that is, accounts designated as owned by a minor — for sexually explicit material, and if the child is young enough, notifies the parent when these images are sent or received. This feature can be turned on or off by parents.

This is pretty shocking coming from Apple, which is generally really good about privacy. It opens the door for all sorts of other surveillance, since now that the system is built it can be used for all sorts of other messages. And it breaks end-to-end encryption, despite Apple’s denials:

Does this break end-to-end encryption in Messages?

No. This doesn’t change the privacy assurances of Messages, and Apple never gains access to communications as a result of this feature. Any user of Messages, including those with communication safety enabled, retains control over what is sent and to whom. If the feature is enabled for the child account, the device will evaluate images in Messages and present an intervention if the image is determined to be sexually explicit. For accounts of children age 12 and under, parents can set up parental notifications which will be sent if the child confirms and sends or views an image that has been determined to be sexually explicit. None of the communications, image evaluation, interventions, or notifications are available to Apple.

Notice Apple changing the definition of “end-to-end encryption.” No longer is the message a private communication between sender and receiver. A third party is alerted if the message meets certain criteria.

This is a security disaster. Read tweets by Matthew Green and Edward Snowden. Also this. I’ll post more when I see it.

Beware the Four Horsemen of the Information Apocalypse. They’ll scare you into accepting all sorts of insecure systems.

EDITED TO ADD: This is a really good write-up of the problems.

EDITED TO ADD: Alex Stamos comments.

An open letter to Apple criticizing the project.

A leaked Apple memo responding to the criticisms. (What are the odds that Apple did not intend this to leak?)

EDITED TO ADD: John Gruber’s excellent analysis.

EDITED TO ADD (8/11): Paul Rosenzweig wrote an excellent policy discussion.

EDITED TO ADD (8/13): Really good essay by EFF’s Kurt Opsahl. Ross Anderson did an interview with Glenn Beck. And this news article talks about dissent within Apple about this feature.

The Economist has a good take. Apple responds to criticisms. (It’s worth watching the Wall Street Journal video interview as well.)

EDITED TO ADD (8/14): Apple released a threat model

EDITED TO ADD (8/20): Follow-on blog posts here and here.

Paragon: Yet Another Cyberweapons Arms Manufacturer

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/08/paragon-yet-another-cyberweapons-arms-manufacturer.html

Forbes has the story:

Paragon’s product will also likely get spyware critics and surveillance experts alike rubbernecking: It claims to give police the power to remotely break into encrypted instant messaging communications, whether that’s WhatsApp, Signal, Facebook Messenger or Gmail, the industry sources said. One other spyware industry executive said it also promises to get longer-lasting access to a device, even when it’s rebooted.


Two industry sources said they believed Paragon was trying to set itself apart further by promising to get access to the instant messaging applications on a device, rather than taking complete control of everything on a phone. One of the sources said they understood that Paragon’s spyware exploits the protocols of end-to-end encrypted apps, meaning it would hack into messages via vulnerabilities in the core ways in which the software operates.

Read that last sentence again: Paragon uses unpatched zero-day exploits in the software to hack messaging apps.

Certifying our Commitment to Your Right to Information Privacy

Post Syndicated from Emily Hancock original https://blog.cloudflare.com/certifying-our-commitment-to-your-right-to-information-privacy/

Cloudflare recognizes privacy in personal data as a fundamental human right and has taken a number of steps, including certifying to international standards, to demonstrate our commitment to privacy.

Privacy has long been recognized as a fundamental human right. The United Nations included a right to privacy in its 1948 Universal Declaration of Human Rights (Article 12) and in the 1976 International Covenant on Civil and Political Rights (Article 17). A number of other jurisdiction-specific laws and treaties also recognize privacy as a fundamental right.

Cloudflare shares the belief that privacy is a fundamental right. We believe that our mission to help build a better Internet means building a privacy-respecting Internet, so people don’t feel they have to sacrifice their personal information — where they live, their ages and interests, their shopping habits, or their religious or political beliefs — in order to navigate the online world.

But talk is cheap. Anyone can say they value privacy. We show it. We demonstrate our commitment to privacy not only in the products and services we build and the way we run our privacy program, but also in the examinations we perform of our processes and products  to ensure they work the way we say they do.

Certifying to International Privacy and Security Standards

Cloudflare has a multi-faceted privacy program that incorporates critical privacy principles such as being transparent about our privacy practices, practicing privacy by design when we build our products and services, using the minimum amount of personal data necessary for our services to work, and only processing personal data for the purposes specified. We were able to demonstrate our holistic approach to privacy when, earlier this year, Cloudflare became one of the first organizations in our industry to certify to a new international privacy standard for protecting and managing the processing of personal data — ISO/IEC 27701:2019.

This standard took the concepts in global data protection laws like the EU’s watershed General Data Protection Regulation (“GDPR”) and adapted them into an international standard for how to manage privacy. This certification provides assurance to our customers that a third party has independently verified that Cloudflare’s privacy program meets GDPR-aligned industry standards. Having this certification helps our customers have confidence in the way we handle and protect our customer information, as both processor and controller of personal information.

The standard contains 31 controls identified for organizations that are personal data controllers, and 18 additional controls identified for organizations that are personal data processors.[1] The controls are essentially a set of best practices that data controllers and processors must meet in terms of data handling practices and transparency about those practices, documenting a legal basis for processing and for transfer of data to third countries (outside the EU), and handling data subject rights, among others.

For example, the standard requires that an organization maintain policies and document specific procedures related to the international transfer of personal data.

Cloudflare has implemented this requirement by maintaining an internal policy restricting the transfer of personal data between jurisdictions unless that transfer meets defined criteria. Customers, whether free or paid, enter into a standard Data Processing Addendum with Cloudflare which is available on the Cloudflare Customer Dashboard and which sets out the restrictions we must adhere to when processing personal data on behalf of customers, including when transferring personal data between jurisdictions. Additionally, Cloudflare publishes a list of sub-processors that we may use when processing personal data, and in which countries or jurisdictions that processing may take place.

The standard also requires that organizations should maintain documented personal data minimization objectives, including what mechanisms are used to meet those objectives.

Personal data minimization objective

Cloudflare maintains internal policies on how we manage data throughout its full lifecycle, including data minimization objectives. In fact, our commitment to privacy starts with the objective of minimizing personal data. That’s why, if we don’t have to collect certain personal data in order to deliver our service to customers, we’d prefer not to collect it at all in the first place. Where we do have to, we collect the minimum amount necessary to achieve the identified purpose and process it for the minimum amount necessary, transparently documenting the processing in our public privacy policy.

We’re also proud to have developed a Privacy by Design policy, which rigorously sets out the high standards and evaluations that must be undertaken if products and services are to collect and process personal data. We use these mechanisms to ensure our collection and use of personal data is limited and transparently documented.

Demonstrating our adherence to laws and policies designed to protect the privacy of personal information is only one way to show how we value people’s right to privacy. Another critical element of our privacy approach is the high level of security we apply to the data on our systems in order to keep that data private. We’ve demonstrated our commitment to data security through a number of certifications:

  • ISO 27001:2013: This is an industry-wide accepted information security certification that focuses on the implementation of an Information Security Management System (ISMS) and security risk management processes. Cloudflare has been ISO 27001 certified since 2019.
  • SOC 2 Type II:  Cloudflare has undertaken the AICPA SOC 2 Type II certification to attest that Security, Confidentiality, and Availability controls are in place in accordance with the AICPA Trust Service Criteria. Cloudflare’s SOC 2 Type II report covers security, confidentiality, and availability controls to protect customer data.
  • PCI DSS 3.2.1: Cloudflare maintains PCI DSS Level 1 compliance and has been PCI compliant since 2014. Cloudflare’s Web Application Firewall (WAF), Cloudflare Access, Content Delivery Network (CDN), and Time Service are PCI compliant solutions. Cloudflare is audited annually by a third-party Qualified Security Assessor (QSA).
  • BSI Qualification: Cloudflare has been recognized by the German government’s Federal Office for Information Security as a qualified provider of DDoS mitigation services.

More information about these certifications is available on our Certifications and compliance resources page.

In addition, we are continuing to look for other opportunities to demonstrate our compliance with data privacy best practices. For example, we are following the European Union’s approval of the first official GDPR codes of conduct in May 2021, and we are considering other privacy standards, such as the ISO 27018 cloud privacy certification.

Building Tools to Deliver Privacy

We think one of the most impactful ways we can respect people’s privacy is by not collecting or processing unnecessary personal data in the first place. We not only build our own network with this principle in mind, but we also believe in empowering individuals and entities of all sizes with technological tools to easily build privacy-respecting applications and minimize the amount of personal information transiting the Internet.

One such tool is our public DNS resolver — the Internet’s fastest, privacy-first public DNS resolver. When we launched our resolver, we committed that we would not retain any personal data about requests made using our resolver. And because we baked anonymization best practices into the resolver when we built it, we were able to demonstrate that we didn’t have any personal data to sell when we asked independent accountants to conduct a privacy examination of the resolver. While we haven’t made changes to how the product works since then, if we ever do so in the future, we’ll go back and commission another examination to demonstrate that when someone uses our public resolver, we can’t tell who is visiting any given website.

In addition to our resolver, we’ve built a number of other privacy-enhancing technologies, such as:

  • Cloudflare’s Web Analytics, which does not use any client-side state, such as cookies or localStorage, to collect usage metrics, and never ‘fingerprints’ individual users.
  • Supporting Oblivious DoH (ODoH), a proposed DNS standard (co-authored by engineers from Cloudflare, Apple, and Fastly) that separates IP addresses from DNS queries, so that no single entity can see both at the same time. For example, with ODoH no single entity can see that a particular IP address sent a request for the website example.com.
  • Universal SSL, which we made available to all of our customers, paying and free. (The SSL protocol itself is now called Transport Layer Security, or TLS.) Supporting SSL means that we support encrypting the content of web pages, which had previously been sent as plain text over the Internet. It’s like sending your private, personal information in a locked box instead of on a postcard.
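
To make the ODoH idea above concrete, here is a minimal toy sketch of the separation of knowledge it aims for. It is not Cloudflare's implementation or the real protocol: the `Proxy` and `Target` classes, the `toy_seal` function, the key, and the IP addresses are all hypothetical, and the XOR keystream is only a stand-in for the public-key encryption that real ODoH uses. The point is simply which party gets to see what.

```python
import hashlib
from dataclasses import dataclass, field


def toy_seal(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256-derived keystream.

    Toy stand-in for real encryption; NOT secure. Applying it twice
    with the same key returns the original data.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


@dataclass
class Target:
    """Resolver side: can open the query, but only ever sees the proxy's IP."""
    key: bytes
    seen: list = field(default_factory=list)

    def handle(self, sender_ip: str, sealed_query: bytes) -> bytes:
        query = toy_seal(self.key, sealed_query).decode()
        self.seen.append((sender_ip, query))
        answer = f"answer-for-{query}".encode()
        # Seal the answer too, so the proxy cannot read it on the way back.
        return toy_seal(self.key + b"/response", answer)


@dataclass
class Proxy:
    """Relay side: sees the client's IP, but only an opaque blob as the query."""
    ip: str
    target: Target
    seen: list = field(default_factory=list)

    def forward(self, client_ip: str, sealed_query: bytes) -> bytes:
        self.seen.append((client_ip, sealed_query))
        return self.target.handle(self.ip, sealed_query)


def client_lookup(client_ip: str, name: str, proxy: Proxy, key: bytes) -> str:
    sealed_query = toy_seal(key, name.encode())
    sealed_answer = proxy.forward(client_ip, sealed_query)
    return toy_seal(key + b"/response", sealed_answer).decode()


# Hypothetical values: documentation-range IPs and a made-up key.
key = b"hypothetical-target-key"
target = Target(key)
proxy = Proxy("198.51.100.7", target)
answer = client_lookup("192.0.2.10", "example.com", proxy, key)

print(proxy.seen)   # client IP plus unreadable bytes
print(target.seen)  # proxy IP plus the plaintext query
```

In this sketch the proxy's log holds the client address next to bytes it cannot read, while the target's log holds the query next to the proxy's address. Linking a person to a lookup would require both logs.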

Building Trust

Cloudflare’s subscription-based business model has always been about offering an incredible suite of products that help make the Internet faster, more efficient, more secure, and more private for our users. Our business model has never been about selling users’ data or tracking individuals as they go about their digital lives. We don’t think people should have to trade their private information just to get access to Internet applications. We work every day to earn and maintain our users’ trust by respecting their right to privacy in their personal data as it transits our network, and by being transparent about how we handle and secure that data. You can find out more about the policies, privacy-enhancing technologies, and certifications that help us earn that trust by visiting the Cloudflare Trust Hub at www.cloudflare.com/trust-hub.

[1] The GDPR defines a “data controller” as the “natural or legal person (…) or other body which, alone or jointly with others, determines the purposes and means of the processing of personal data”; and a “data processor” as “a natural or legal person (…) which processes personal data on behalf of the controller.”

De-anonymization Story

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/07/de-anonymization-story.html

This is important:

Monsignor Jeffrey Burrill was general secretary of the US Conference of Catholic Bishops (USCCB), effectively the highest-ranking priest in the US who is not a bishop, before records of Grindr usage obtained from data brokers were correlated with his apartment, place of work, vacation home, family members’ addresses, and more.


The data that resulted in Burrill’s ouster was reportedly obtained through legal means. Mobile carriers sold, and still sell, location data to brokers who aggregate it and sell it to a range of buyers, including advertisers, law enforcement, roadside services, and even bounty hunters. Carriers were caught in 2018 selling real-time location data to brokers, drawing the ire of Congress. But after carriers issued public mea culpas and promises to reform the practice, investigations have revealed that phone location data is still popping up in places it shouldn’t. This year, T-Mobile even broadened its offerings, selling customers’ web and app usage data to third parties unless people opt out.

The publication that revealed Burrill’s private app usage, The Pillar, a newsletter covering the Catholic Church, did not say exactly where or how it obtained Burrill’s data. But it did say how it de-anonymized aggregated data to correlate Grindr app usage with a device that appears to be Burrill’s phone.

The Pillar says it obtained 24 months’ worth of “commercially available records of app signal data” covering portions of 2018, 2019, and 2020, which included records of Grindr usage and locations where the app was used. The publication zeroed in on addresses where Burrill was known to frequent and singled out a device identifier that appeared at those locations. Key locations included Burrill’s office at the USCCB, his USCCB-owned residence, and USCCB meetings and events in other cities where he was in attendance. The analysis also looked at other locations farther afield, including his family lake house, his family members’ residences, and an apartment in his Wisconsin hometown where he reportedly has lived.

Location data is not anonymous. It cannot be made anonymous. I hope stories like these will teach people that.
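
A rough sketch of why this kind of correlation works, using entirely synthetic data. This is not The Pillar's actual method, which the article does not fully describe; the device identifiers, place labels, and the `devices_at_all` helper are all made up for illustration. The idea is only that each additional known location shrinks the set of "anonymous" device IDs that fit.

```python
from collections import defaultdict


def devices_at_all(pings, known_places):
    """Return device IDs that were observed at every one of the known places."""
    places_by_device = defaultdict(set)
    for device_id, place in pings:
        places_by_device[device_id].add(place)
    wanted = set(known_places)
    return sorted(d for d, places in places_by_device.items() if wanted <= places)


# Synthetic (device_id, place) records; no names attached anywhere.
pings = [
    ("dev-a", "office"), ("dev-a", "residence"), ("dev-a", "lake_house"),
    ("dev-b", "office"), ("dev-b", "cafe"),
    ("dev-c", "residence"), ("dev-c", "gym"),
    ("dev-d", "office"), ("dev-d", "residence"),
]

print(devices_at_all(pings, ["office"]))
# ['dev-a', 'dev-b', 'dev-d']
print(devices_at_all(pings, ["office", "residence"]))
# ['dev-a', 'dev-d']
print(devices_at_all(pings, ["office", "residence", "lake_house"]))
# ['dev-a']
```

Three places that are publicly tied to one person are enough, in this toy dataset, to leave a single device identifier standing.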