Cloudflare shares the belief that privacy is a fundamental right. We believe that our mission to help build a better Internet means building a privacy-respecting Internet, so people don’t feel they have to sacrifice their personal information — where they live, their ages and interests, their shopping habits, or their religious or political beliefs — in order to navigate the online world.
But talk is cheap. Anyone can say they value privacy. We show it. We demonstrate our commitment to privacy not only in the products and services we build and the way we run our privacy program, but also in the examinations we perform of our processes and products to ensure they work the way we say they do.
Certifying to International Privacy and Security Standards
Cloudflare has a multi-faceted privacy program that incorporates critical privacy principles such as being transparent about our privacy practices, practicing privacy by design when we build our products and services, using the minimum amount of personal data necessary for our services to work, and only processing personal data for the purposes specified. We were able to demonstrate our holistic approach to privacy when, earlier this year, Cloudflare became one of the first organizations in our industry to certify to a new international privacy standard for protecting and managing the processing of personal data — ISO/IEC 27701:2019.
This standard took the concepts in global data protection laws like the EU’s watershed General Data Protection Regulation (“GDPR”) and adapted them into an international standard for how to manage privacy. This certification provides assurance to our customers that a third party has independently verified that Cloudflare’s privacy program meets GDPR-aligned industry standards. Having this certification helps our customers have confidence in the way we handle and protect our customer information, as both processor and controller of personal information.
The standard contains 31 controls identified for organizations that are personal data controllers, and 18 additional controls identified for organizations that are personal data processors. The controls are essentially a set of best practices that data controllers and processors must meet in terms of data handling practices and transparency about those practices, documenting a legal basis for processing and for transfer of data to third countries (outside the EU), and handling data subject rights, among others.
For example, the standard requires that an organization maintain policies and document specific procedures related to the international transfer of personal data.
Cloudflare has implemented this requirement by maintaining an internal policy restricting the transfer of personal data between jurisdictions unless that transfer meets defined criteria. Customers, whether free or paid, enter into a standard Data Processing Addendum with Cloudflare which is available on the Cloudflare Customer Dashboard and which sets out the restrictions we must adhere to when processing personal data on behalf of customers, including when transferring personal data between jurisdictions. Additionally, Cloudflare publishes a list of sub-processors that we may use when processing personal data, and in which countries or jurisdictions that processing may take place.
The standard also requires organizations to maintain documented personal data minimization objectives, including the mechanisms used to meet those objectives.
Personal data minimization objective
We’re also proud to have developed a Privacy by Design policy, which rigorously sets out the high standards that must be met, and the evaluations that must be undertaken, if products and services are to collect and process personal data. We use these mechanisms to ensure our collection and use of personal data is limited and transparently documented.
Demonstrating our adherence to laws and policies designed to protect the privacy of personal information is only one way to show how we value people’s right to privacy. Another critical element of our privacy approach is the high level of security we apply to the data on our systems in order to keep that data private. We’ve demonstrated our commitment to data security through a number of certifications:
ISO 27001:2013: This is an industry-wide accepted information security certification that focuses on the implementation of an Information Security Management System (ISMS) and security risk management processes. Cloudflare has been ISO 27001 certified since 2019.
SOC 2 Type II: Cloudflare has undergone an AICPA SOC 2 Type II examination to attest that Security, Confidentiality, and Availability controls are in place in accordance with the AICPA Trust Services Criteria. The resulting report covers the controls we use to protect customer data.
PCI DSS 3.2.1: Cloudflare maintains PCI DSS Level 1 compliance and has been PCI compliant since 2014. Cloudflare’s Web Application Firewall (WAF), Cloudflare Access, Content Delivery Network (CDN), and Time Service are PCI compliant solutions. Cloudflare is audited annually by a third-party Qualified Security Assessor (QSA).
BSI Qualification: Cloudflare has been recognized by the German government’s Federal Office for Information Security as a qualified provider of DDoS mitigation services.
We think one of the most impactful ways we can respect people’s privacy is by not collecting or processing unnecessary personal data in the first place. We not only build our own network with this principle in mind, but we also believe in empowering individuals and entities of all sizes with technological tools to easily build privacy-respecting applications and minimize the amount of personal information transiting the Internet.
One such tool is our 1.1.1.1 public DNS resolver, the Internet’s fastest, privacy-first public DNS resolver. When we launched our 1.1.1.1 resolver, we committed that we would not retain any personal data about requests made using it. And because we baked anonymization best practices into the 1.1.1.1 resolver when we built it, we were able to demonstrate that we didn’t have any personal data to sell when we asked independent accountants to conduct a privacy examination of the 1.1.1.1 resolver. While we haven’t made changes to how the product works since then, if we ever do so in the future, we’ll go back and commission another examination to demonstrate that when someone uses our public resolver, we can’t tell who is visiting any given website.
In addition to our 1.1.1.1 resolver, we’ve built a number of other privacy-enhancing technologies, such as:
Cloudflare’s Web Analytics, which does not use any client-side state, such as cookies or localStorage, to collect usage metrics, and never ‘fingerprints’ individual users.
Supporting Oblivious DoH (ODoH), a proposed DNS standard — co-authored by engineers from Cloudflare, Apple, and Fastly — that separates IP addresses from DNS queries, so that no single entity can see both at the same time. In other words, ODoH means, for example, that no single entity can see that IP address 198.51.100.28 sent an access request to the website example.com.
Universal SSL, which we made available to all of our customers, paying and free. (SSL is the older name for the protocol now called Transport Layer Security, or TLS.) Supporting SSL means that we support encrypting the content of web pages, which had previously been sent as plain text over the Internet. It’s like sending your private, personal information in a locked box instead of on a postcard.
Cloudflare’s subscription-based business model has always been about offering an incredible suite of products that help make the Internet faster, more efficient, more secure, and more private for our users. Our business model has never been about selling users’ data or tracking individuals as they go about their digital lives. We don’t think people should have to trade their private information just to get access to Internet applications. We work every day to earn and maintain our users’ trust by respecting their right to privacy in their personal data as it transits our network, and by being transparent about how we handle and secure that data. You can find out more about the policies, privacy-enhancing technologies, and certifications that help us earn that trust by visiting the Cloudflare Trust Hub at www.cloudflare.com/trust-hub.
 The GDPR defines a “data controller” as the “natural or legal person (…) or other body which, alone or jointly with others, determines the purposes and means of the processing of personal data”; and a “data processor” as “a natural or legal person (…) which processes personal data on behalf of the controller.”
Monsignor Jeffrey Burrill was general secretary of the US Conference of Catholic Bishops (USCCB), effectively the highest-ranking priest in the US who is not a bishop, before records of Grindr usage obtained from data brokers were correlated with his apartment, place of work, vacation home, family members’ addresses, and more.
The publication that revealed Burrill’s private app usage, The Pillar, a newsletter covering the Catholic Church, did not say exactly where or how it obtained Burrill’s data. But it did say how it de-anonymized aggregated data to correlate Grindr app usage with a device that appears to be Burrill’s phone.
The Pillar says it obtained 24 months’ worth of “commercially available records of app signal data” covering portions of 2018, 2019, and 2020, which included records of Grindr usage and locations where the app was used. The publication zeroed in on addresses that Burrill was known to frequent and singled out a device identifier that appeared at those locations. Key locations included Burrill’s office at the USCCB, his USCCB-owned residence, and USCCB meetings and events in other cities where he was in attendance. The analysis also looked at other locations farther afield, including his family lake house, his family members’ residences, and an apartment in his Wisconsin hometown where he reportedly has lived.
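To see why a pseudonymous device identifier offers so little protection, here is a minimal sketch of the kind of correlation described above. It is not The Pillar's actual method, which was not published in detail. The record layout, the place labels, and the function name `devices_seen_at_all` are all hypothetical, and the data is entirely synthetic.

```python
from collections import defaultdict


def devices_seen_at_all(records, known_places):
    """Return the device IDs that appear at every place in known_places.

    records: iterable of (device_id, place) pairs from a hypothetical
    "anonymized" dataset. A place label stands in for a coarse
    latitude/longitude cell.
    """
    places_by_device = defaultdict(set)
    for device_id, place in records:
        places_by_device[device_id].add(place)
    wanted = set(known_places)
    # A device matches when the wanted places are a subset of where it was seen.
    return sorted(d for d, seen in places_by_device.items() if wanted <= seen)


# Synthetic records: pseudonymous IDs and made-up place labels.
records = [
    ("dev-a", "office"), ("dev-a", "cafe"),
    ("dev-b", "office"), ("dev-b", "residence"), ("dev-b", "lake-house"),
    ("dev-c", "residence"), ("dev-c", "gym"),
]

print(devices_seen_at_all(records, ["office", "residence", "lake-house"]))
# prints ['dev-b']
```

Many devices share any single location, but each additional known place shrinks the candidate set, and a handful is often enough to leave one identifier.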
Location data is not anonymous. It cannot be made anonymous. I hope stories like these will teach people that.
A Catholic priest was outed through commercially available surveillance data. Vice has a good analysis:
The news starkly demonstrates not only the inherent power of location data, but how the chance to wield that power has trickled down from corporations and intelligence agencies to essentially any sort of disgruntled, unscrupulous, or dangerous individual. A growing market of data brokers that collect and sell data from countless apps has made it so that anyone with a bit of cash and effort can figure out which phone in a so-called anonymized dataset belongs to a target, and abuse that information.
There is a whole industry devoted to re-identifying anonymized data. This was something that Snowden showed that the NSA could do. Now it’s available to everyone.
The challenges caused and entrenched by surveillance-based advertising include, but are not limited to:
privacy and data protection infringements
opaque business models
manipulation and discrimination at scale
fraud and other criminal activity
serious security risks
In the following chapters, we describe various aspects of these challenges and point out how today’s dominant model of online advertising is a threat to consumers, democratic societies, the media, and even to advertisers themselves. These issues are significant and serious enough that we believe that it is time to ban these detrimental practices.
A ban on surveillance-based practices should be complemented by stronger enforcement of existing legislation, including the General Data Protection Regulation, competition regulation, and the Unfair Commercial Practices Directive. However, enforcement currently consumes significant time and resources, and usually happens after the damage has already been done. Banning surveillance-based advertising in general will force structural changes to the advertising industry and alleviate a number of significant harms to consumers and to society at large.
A ban on surveillance-based advertising does not mean that one can no longer finance digital content using advertising. To illustrate this, we describe some possible ways forward for advertising-funded digital content, and point to alternative advertising technologies that may contribute to a safer and healthier digital economy for both consumers and businesses.
Treating data like it is property fails to recognize either the value that varieties of personal information serve or the abiding interest that individuals have in their personal information even if they choose to “sell” it. Data is not a commodity. It is information. Any system of information rights — whether patents, copyrights, and other intellectual property, or privacy rights — presents some tension with strong interest in the free flow of information that is reflected by the First Amendment. Our personal information is in demand precisely because it has value to others and to society across a myriad of uses.
From the conclusion:
Privacy legislation should empower individuals through more layered and meaningful transparency and individual rights to know, correct, and delete personal information in databases held by others. But relying entirely on individual control will not do enough to change a system that is failing individuals, and trying to reinforce control with a property interest is likely to fail society as well. Rather than trying to resolve whether personal information belongs to individuals or to the companies that collect it, a baseline federal privacy law should directly protect the abiding interest that individuals have in that information and also enable the social benefits that flow from sharing information.
Welcome to Data Privacy Day 2021! Last year at this time, I was writing about how Cloudflare builds privacy into everything we do, with little idea about how dramatically the world was going to change. The tragedy of the COVID-19 pandemic has reshaped the way we go about our daily lives. Our dependence on the Internet grew exponentially in 2020 as we started working from home, attending school from home, and participating in online weddings, concerts, parties, and more. So as we begin this new year, it’s impossible to think about data privacy in 2021 without thinking about how an always-on, always secure, always private Internet is more important than ever.
The pandemic wasn’t the only thing to dramatically shape data privacy conversations last year. We saw a flurry of new activity on data protection legislation around the globe, and a trend toward data localization in a variety of jurisdictions.
I don’t think I’m taking any risks when I say that 2021 looks to be another busy year in the world of privacy and data protection. Let me tell you a bit about what that looks like for us at Cloudflare. We’ll be spending a lot of time in 2021 helping our customers find the solutions they need to meet data protection obligations; enhancing our technical, organizational, and contractual measures to protect the privacy of personal data no matter where in the world it is processed; and continuing to develop privacy-enhancing technologies that can help everyone on the Internet.
Focus on International Data Transfers
One of the biggest stories in data protection in 2020 was the Court of Justice of the European Union’s decision in the “Schrems II” case (Case C-311/18, Data Protection Commissioner v Facebook Ireland and Maximillian Schrems) that invalidated the EU-U.S. Privacy Shield. The court’s interpretation of U.S. surveillance laws meant that data controllers transferring EU personal data to U.S. data processors now have an obligation to make sure additional safeguards are in place to provide the same level of data protection as the General Data Protection Regulation (“GDPR”).
The court decision was followed by draft guidance from the European Data Protection Board (EDPB) that created new expectations and challenges for transfers of EU personal data to processors outside the EU pursuant to the GDPR. In addition, the EU Commission issued new draft standard contractual clauses that further emphasized the need for data transfer impact assessments and due diligence to be completed prior to transferring EU personal data to processors outside the EU. Meanwhile, even before the EDPB and EU Commission weighed in, France’s data protection authority, the CNIL, challenged the use of a U.S. cloud service provider for the processing of certain health data.
This year, the EDPB is poised to issue its final guidance on international data transfers, the EU Commission is set to release a final version of new standard contractual clauses, and the new Biden administration in the United States has already appointed a deputy assistant secretary for services at the U.S. Department of Commerce who will focus on negotiations around a new EU-U.S. Privacy Shield or another data transfer mechanism.
However, the trend to regulate international data transfers isn’t confined to Europe. India’s Personal Data Protection Bill, likely to become law in 2021, would bar certain types of personal data from leaving India. And Brazil’s Lei Geral de Proteção de Dados (“LGPD”), which went into effect in 2020, contains requirements for contractual guarantees that need to be in place for personal data to be processed outside Brazil.
Meanwhile, we’re seeing more data protection regulation across the globe: The California Consumer Privacy Act (“CCPA”) was amended by a new ballot initiative last year. Countries like Japan, China, Singapore, Canada, and New Zealand, that already had data protection legislation in some form, proposed or enacted amendments to strengthen those protections. And even the United States is considering comprehensive Federal data privacy regulation.
In light of last year’s developments and those we expect to see in 2021, Cloudflare is thinking a lot about what it means to process personal data outside its home jurisdiction. One of the key messages to come out of Europe in the second half of 2020 was the idea that to be able to transfer EU personal data to the United States, data processors would have to provide additional safeguards to ensure GDPR-level protection for personal data, even in light of the application of U.S. surveillance laws. While we are eagerly awaiting the EDPB’s final guidance on the subject, we aren’t waiting to ensure that we have in place the necessary additional safeguards.
In fact, Cloudflare has long maintained policies to address concerns about access to personal data. We’ve done so because we believe it’s the right thing to do, and because the conflicts of law we are seeing today seemed inevitable. We feel so strongly about our ability to provide that level of protection for data processed in the U.S., that today we are publishing a paper, “Cloudflare’s Policies around Data Privacy and Law Enforcement Requests,” to describe how we address government and other legal requests for data.
Our paper describes our policies around data privacy and data requests, such as providing notice to our customers of any legal process requesting their data, and the measures we take to push back on any legal process requesting data where we believe that legal process creates a conflict of law. The paper also describes our public commitments about how we approach requests for data and public statements about things we have never done and, in CEO Matthew Prince’s words, that we “will fight like hell to never do”:
Cloudflare has never turned over our encryption or authentication keys or our customers’ encryption or authentication keys to anyone.
Cloudflare has never installed any law enforcement software or equipment anywhere on our network.
Cloudflare has never provided any law enforcement organization a feed of our customers’ content transiting our network.
Cloudflare has never modified customer content at the request of law enforcement or another third party.
In 2021, the Cloudflare team will continue to focus on these safeguards to protect all our customers’ personal data.
Addressing Data Localization Challenges
We also recognize that attention to international data transfers isn’t just a jurisdictional issue. Even if jurisdictions don’t require data localization by law, highly regulated industries like banking and healthcare may adopt best practice guidance asserting more requirements for data if it is to be processed outside a data subject’s home country.
With so much activity around data localization trends and international data transfers, companies will continue to struggle to understand regulatory requirements, as well as update products and business processes to meet those requirements and trends. So while we believe that Cloudflare can provide adequate protections for this data regardless of whether it is processed inside or outside its jurisdiction of origin, we also recognize that our customers are dealing with unique compliance challenges that we can help them face.
That means that this year we’ll also continue the work we started with our Cloudflare Data Localization Suite, which we announced during our Privacy & Compliance Week in December 2020. The Data Localization Suite is designed to help customers build local requirements into their global online operations. We help our customers ensure that their data stays as private as they want it to, and only goes where they want it to go in the following ways:
DDoS attacks are detected and mitigated at the data center closest to the end user.
Data centers inside the preferred region decrypt TLS and apply services like WAF, CDN, and Cloudflare Workers.
Keyless SSL and Geo Key Manager store private SSL keys in a user-specified region.
Edge Log Delivery securely transmits logs from the inspection point to the log storage location of your choice.
Doubling Down on Privacy-Enhancing Technologies
Cloudflare’s mission is to “Help Build a Better Internet,” and we’ve said repeatedly that a privacy-respecting Internet is a better Internet. We believe in empowering individuals and entities of all sizes with technological tools to reduce the amount of personal data that gets funnelled into the data ocean — regardless of whether someone lives in a country with laws protecting the privacy of their personal data. If we can build tools to help individuals share less personal data online, then that’s a win for privacy no matter what their country of residence.
For example, when Cloudflare launched the 1.1.1.1 public DNS resolver, the Internet’s fastest, privacy-first public DNS resolver, we committed to our public resolver users that we would not retain any personal data about requests made using our 1.1.1.1 resolver. And because we baked anonymization best practices into the 1.1.1.1 resolver when we built it, we were able to demonstrate that we didn’t have any personal data to sell when we asked independent accountants to conduct a privacy examination of the 1.1.1.1 resolver.
2021 will also see a continuation of a number of initiatives that we announced during Privacy and Compliance Week that are aimed at improving Internet protocols related to user privacy:
Fixing one of the last information leaks in HTTPS through Encrypted Client Hello (ECH), the evolution of Encrypted SNI.
Developing a superior protocol for password authentication, OPAQUE, that makes password breaches less likely to occur.
Making DNS even more private by supporting Oblivious DNS-over-HTTPS (ODoH).
Encrypted Client Hello (ECH)
Under the old TLS handshake, privacy-sensitive parameters were negotiated completely in the clear and available to network observers. One example is the Server Name Indication (SNI), used by the client to indicate to the server the website it wants to reach — this is not information that should be exposed to eavesdroppers. Previously, this problem was mitigated through the Encrypted SNI (ESNI) extension. While ESNI took a significant step forward, it is an incomplete solution; a major shortcoming is that it protects only SNI. The Encrypted Client Hello (ECH) extension aims to close this gap by enabling encryption of the entire ClientHello, thereby protecting all privacy-sensitive handshake parameters. These changes represent a significant upgrade to TLS, one that will help preserve end-user privacy as the protocol continues to evolve. As this work continues, Cloudflare is committed to doing its part, along with close collaborators in the standards process, to ensure this important upgrade for TLS reaches Internet-scale deployment.
Research has repeatedly shown that passwords are hard for users to manage, and they are also a challenge for servers: passwords are difficult to store securely, and they’re frequently leaked and subsequently brute-forced. As long as people still use passwords, we’d like to make the process as secure as possible. Current methods rely on the risky practice of handling plaintext passwords on the server side while checking their correctness. One potential alternative is to use OPAQUE, an asymmetric Password-Authenticated Key Exchange (aPAKE) protocol that allows secure password login without ever letting the server see the passwords.
With OPAQUE, instead of storing a traditional salted password hash, the server stores a secret envelope associated with the user that is “locked” by two pieces of information: the user’s password (known only by the user), and a random secret key (known only by the server). To log in, the client initiates a cryptographic exchange that reveals the envelope key only to the client (but not to the server). The server then sends this envelope to the user, who now can retrieve the encrypted keys. Once those keys are unlocked, they will serve as parameters for an Authenticated Key Exchange (AKE) protocol, which establishes a secret key for encrypting future communications.
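The "cryptographic exchange that reveals the envelope key only to the client" is the least intuitive step, so here is a toy sketch of the underlying idea, a blinded exponentiation. This is not the OPAQUE protocol and it is not secure: the modulus, the hashing, and every function name below are assumptions chosen so the example runs with only the standard library (Python 3.8 or later). Real implementations use a prime-order elliptic-curve group and a standardized construction.

```python
import hashlib
import secrets
from math import gcd

# Toy modulus (a Mersenne prime). Far too small and unstructured for real use.
P = 2**127 - 1


def hash_to_group(password: str) -> int:
    """Map a password to a nonzero element modulo P (toy construction)."""
    digest = hashlib.sha256(password.encode()).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def client_blind(password: str):
    """Client: hide the password element behind a random exponent r."""
    while True:
        r = secrets.randbelow(P - 1)
        if r > 1 and gcd(r, P - 1) == 1:  # r must be invertible mod P-1
            break
    return r, pow(hash_to_group(password), r, P)


def server_evaluate(blinded: int, server_key: int) -> int:
    """Server: apply its secret key without learning the password element."""
    return pow(blinded, server_key, P)


def client_finalize(evaluated: int, r: int) -> bytes:
    """Client: strip the blinding and derive the envelope key."""
    unblinded = pow(evaluated, pow(r, -1, P - 1), P)
    return hashlib.sha256(unblinded.to_bytes(16, "big")).digest()


server_key = secrets.randbelow(P - 2) + 1  # known only to the server
r, blinded = client_blind("correct horse")
envelope_key = client_finalize(server_evaluate(blinded, server_key), r)

# A second login with fresh blinding derives the same key.
r2, blinded2 = client_blind("correct horse")
print(envelope_key == client_finalize(server_evaluate(blinded2, server_key), r2))
```

The server only ever sees a randomly blinded value, and the client never sees the server's key, yet the derived envelope key depends on both the password and that key.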
Cloudflare has been pushing the development of OPAQUE forward, and has released a reference core OPAQUE implementation in Go and a demo TLS integration (with a running version you can try out). A TypeScript client implementation of OPAQUE is coming soon.
Oblivious DNS-over-HTTPS (ODoH)
Encryption is a powerful tool that protects the privacy of personal data. This is why Cloudflare has doubled down on its implementation of DNS over HTTPS (DoH). In the snail mail world, courts have long recognized a distinction between the level of privacy afforded to the contents of a letter vs. the addressing information on an envelope. But we’re not living in an age where the only thing someone can tell from the outside of the envelope are the “to” and “from” addresses and place of postage. The “digital envelopes” of DNS requests can contain much more information about a person than one might expect. Not only is there information about the sender and recipient addresses, but there is specific timestamp information about when requests were submitted, the domains and subdomains visited, and even how long someone stayed on a certain site. Encrypting those requests ensures that only the user and the resolver get that information, and that no one involved in the transit in between sees it. Given that our digital envelopes tell a much more robust story than the envelope in your physical mailbox, we think encrypting these envelopes is just as important as encrypting the messages they carry.
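To make the "digital envelope" concrete, the sketch below builds the DNS question that DoH carries inside an HTTPS request, using the GET encoding defined in RFC 8484. It only constructs the URL and sends nothing. The helper names are made up for this example, and the endpoint shown is Cloudflare's public DoH URL, which this post does not itself name.

```python
import base64
import struct


def build_dns_query(name: str, qtype: int = 1) -> bytes:
    """Encode a single-question DNS query in wire format (qtype 1 = A)."""
    # ID 0 (recommended for DoH cache friendliness), RD flag set, 1 question.
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # class 1 = IN
    return header + question


def doh_get_url(name: str, endpoint: str = "https://cloudflare-dns.com/dns-query") -> str:
    """Build the RFC 8484 GET URL: base64url of the query, padding removed."""
    encoded = base64.urlsafe_b64encode(build_dns_query(name)).rstrip(b"=")
    return f"{endpoint}?dns={encoded.decode('ascii')}"


print(doh_get_url("www.example.com"))
# prints https://cloudflare-dns.com/dns-query?dns=AAABAAABAAAAAAAAA3d3dwdleGFtcGxlA2NvbQAAAQAB
```

Because the whole URL travels inside TLS, an on-path observer sees a connection to the resolver but not the name being looked up. The resolver itself still sees both the client address and the query.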
However, there are more ways in which DNS privacy can be enhanced, and Cloudflare took another incremental step in December 2020 by announcing support for Oblivious DoH (ODoH). ODoH is a proposed DNS standard, co-authored by engineers from Cloudflare, Apple, and Fastly, that separates IP addresses from queries, so that no single entity can see both at the same time. ODoH requires a proxy as a key part of the communication path between client and resolver. Encryption ensures that the proxy does not know the contents of the DNS query (only where to send it), and that the resolver knows what the query is but not who originally requested it (only the proxy’s IP address). Barring collusion between the proxy and the resolver, the identity of the requester and the content of the request are unlinkable.
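The split of knowledge between proxy and resolver can be shown with a small simulation. This sketch models only the data flow: the `_toy_seal` function is a hash-based XOR stand-in for the public-key encryption that real ODoH uses (HPKE), and here the client and target simply share a key, which the real protocol does not do. All class names, function names, and addresses are hypothetical.

```python
import hashlib
import secrets
from dataclasses import dataclass, field


def _toy_seal(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR with a hash-derived keystream. Stand-in only, NOT real encryption."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


@dataclass
class Target:
    """The resolver: can read queries, but only ever sees the proxy's IP."""
    key: bytes
    log: list = field(default_factory=list)

    def handle(self, source_ip: str, nonce: bytes, sealed: bytes) -> bytes:
        query = _toy_seal(self.key, nonce, sealed).decode()
        self.log.append((source_ip, query))
        return _toy_seal(self.key, nonce + b"resp", f"answer-for:{query}".encode())


@dataclass
class Proxy:
    """The proxy: sees the client's IP, but only opaque bytes for the query."""
    ip: str
    target: Target
    log: list = field(default_factory=list)

    def forward(self, client_ip: str, nonce: bytes, sealed: bytes) -> bytes:
        self.log.append((client_ip, sealed))
        return self.target.handle(self.ip, nonce, sealed)


def client_lookup(client_ip: str, name: str, proxy: Proxy, target_key: bytes) -> str:
    nonce = secrets.token_bytes(12)
    sealed = _toy_seal(target_key, nonce, name.encode())
    sealed_answer = proxy.forward(client_ip, nonce, sealed)
    return _toy_seal(target_key, nonce + b"resp", sealed_answer).decode()


key = secrets.token_bytes(32)
target = Target(key)
proxy = Proxy("203.0.113.7", target)
print(client_lookup("198.51.100.28", "example.com", proxy, key))
print(target.log)  # the resolver's view: proxy IP plus the query name
```

After the lookup, the proxy's log holds the client address next to unreadable bytes, and the target's log holds the query next to the proxy's address. Linking the two requires both logs, which is the collusion case the text mentions.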
As with DoH, successful deployment requires partners. A key component of ODoH is a proxy that is disjoint from the target resolver. Cloudflare is working with several leading proxy partners — currently PCCW, SURF, and Equinix — who are equally committed to privacy, and hopes to see this list grow.
Even with all of these encryption measures, we also know that everything encrypted with today’s public key cryptography can likely be decrypted with tomorrow’s quantum computers. This makes deploying post-quantum cryptography a pressing privacy concern. We’re likely 10 to 15 years away from that development, but as our Head of Research Nick Sullivan described in his blog post in December, we’re not waiting for that future. We’ve been paying close attention to the National Institute of Standards and Technology (NIST)’s initiative to define post-quantum cryptography algorithms to replace RSA and ECC. Last year, Cloudflare and Google performed the TLS Post-Quantum Experiment, which involved implementing and supporting new key exchange mechanisms based on post-quantum cryptography for all Cloudflare customers for a period of a few months.
In addition, Cloudflare’s Research Team has been working with researchers from the University of Waterloo and Radboud University on a new protocol called KEMTLS. KEMTLS is designed to be fully post-quantum and relies only on public-key encryption. On the implementation side, Cloudflare has developed high-speed assembly versions of several of the NIST finalists (Kyber, Dilithium), as well as other relevant post-quantum algorithms (CSIDH, SIDH) in our CIRCL cryptography library written in Go. Cloudflare is endeavoring to use post-quantum cryptography for most internal services by the end of 2021, and plans to be among the first services to offer post-quantum cipher suites to customers as standards emerge.
Looking forward to 2021
If there’s anything 2020 taught us, it’s that our world can change almost overnight. One thing that doesn’t change, though, is that people will always want privacy for their personal data, and regulators will continue to define rules and requirements for what data protection should look like. And as these rules and requirements evolve, Cloudflare will be there every step of the way, developing innovative product and security solutions to protect data, and building privacy into everything we do.
Cloudflare is also celebrating Data Privacy Day on Cloudflare TV. Tune in for a full day of special programming.
No one who reads this blog regularly will be surprised:
A former employee of prominent home security company ADT has admitted that he hacked into the surveillance feeds of dozens of customer homes, doing so primarily to spy on naked women or to leer at unsuspecting couples while they had sex.
Authorities say that the IT technician “took note of which homes had attractive women, then repeatedly logged into these customers’ accounts in order to view their footage for sexual gratification.” He did this by adding his personal email address to customer accounts, which ultimately hooked him into “real-time access to the video feeds from their homes.”
We all know that our cell phones constantly give our location away to our mobile network operators; that’s how they work. A group of researchers has figured out a way to fix that. “Pretty Good Phone Privacy” (PGPP) protects both user identity and user location using the existing cellular networks. It protects users from fake cell phone towers (IMSI-catchers) and surveillance by cell providers.
It’s a clever system. The players are the user, a traditional mobile network operator (MNO) like AT&T or Verizon, and a new mobile virtual network operator (MVNO). MVNOs aren’t new. They’re intermediaries like Cricket and Boost.
Here’s how it works:
One-time setup: The user’s phone gets a new SIM from the MVNO. All MVNO SIMs are identical.
Monthly: The user pays their bill to the MVNO (credit card or otherwise) and the phone gets anonymous authentication tokens (using Chaum blind signatures) for each time slice (e.g., hour) in the coming month.
Ongoing: When the phone talks to a tower (run by the MNO), it sends a token for the current time slice. This is relayed to an MVNO backend server, which checks the Chaum blind signature of the token. If it’s valid, the MVNO tells the MNO that the user is authenticated, and the user receives a temporary random ID and an IP address. (Again, this is how MVNOs like Boost already work.)
On demand: The user uses the phone normally.
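The token issuance and the per-slice check in the steps above can be illustrated with textbook RSA blind signatures. This is a minimal sketch, not the PGPP implementation: the toy key size, the function names, and the use of SHA-256 to hash the token are all assumptions made for illustration, and a real system would need a full-size key and a vetted blind-signature scheme.

```python
import hashlib
import secrets
from math import gcd

# Toy RSA key for the MVNO. Assumption: two small Mersenne primes, chosen
# only so the example is short; this key size is not secure.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (Python 3.8+)

def hash_token(token: bytes) -> int:
    """Map a token to an integer below N."""
    return int.from_bytes(hashlib.sha256(token).digest(), "big") % N

def blind(token: bytes):
    """Phone: hide the token hash behind a random blinding factor r."""
    while True:
        r = secrets.randbelow(N - 2) + 2
        if gcd(r, N) == 1:
            break
    return (hash_token(token) * pow(r, E, N)) % N, r

def sign_blinded(blinded: int) -> int:
    """MVNO at billing time: sign without learning the token."""
    return pow(blinded, D, N)

def unblind(blind_sig: int, r: int) -> int:
    """Phone: remove the blinding factor to get a signature on the token."""
    return (blind_sig * pow(r, -1, N)) % N

def verify(token: bytes, sig: int) -> bool:
    """MVNO backend at attach time: check the presented token's signature."""
    return pow(sig, E, N) == hash_token(token)

# One token for one time slice.
token = secrets.token_bytes(16)
blinded, r = blind(token)
signature = unblind(sign_blinded(blinded), r)
print(verify(token, signature))
```

Because the MVNO only ever sees the blinded value when it signs, it cannot later link the token presented at a tower to the billing account that paid for it.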
The MNO doesn’t have to modify its system in any way. The PGPP MVNO implementation is in software. The user’s traffic is sent to the MVNO gateway and then out onto the Internet, potentially even using a VPN.
All connectivity is data connectivity in cell networks today. The user can choose to be data-only (e.g., use Signal for voice), or use the MVNO or a third party for VoIP service that will look just like normal telephony.
The group prototyped and tested everything with real phones in the lab. Their approach adds essentially zero latency, and doesn’t introduce any new bottlenecks, so it doesn’t have performance/scalability problems like most anonymity networks. The service could handle tens of millions of users on a single server, because it only has to do infrequent authentication, though for resilience you’d probably run more.
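A rough back-of-the-envelope check on the single-server claim, assuming every user authenticates once per time slice and using the one-hour slice given as an example above. The user count of ten million is an illustrative assumption, not a figure from the researchers.

```python
def auths_per_second(users: int, slice_seconds: int) -> float:
    """Average authentication rate if each user authenticates once per slice."""
    return users / slice_seconds

rate = auths_per_second(users=10_000_000, slice_seconds=3600)
print(f"{rate:.0f} authentications per second on average")
```

Under those assumptions the backend sees a few thousand signature checks per second on average, which is consistent with the point that infrequent authentication keeps the load small.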
User phone numbers
Other people’s phone numbers stored in address books
Profile pictures and
Status message including when a user was last online
Diagnostic data collected from app logs
Under the new terms, Facebook reserves the right to share collected data with its family of companies.
Where did the last month go? Were you able to catch all of the sessions in the Security, Identity, and Compliance track you hoped to see at AWS re:Invent? If you missed any, don’t worry—you can stream all the sessions released in 2020 via the AWS re:Invent website. Additionally, we’re starting 2021 with all new sessions that you can stream live January 12–15. Here are the new Security, Identity, and Compliance sessions—each session is offered at multiple times, so you can find the time that works best for your location and schedule.
Protecting sensitive data with Amazon Macie and Amazon GuardDuty – SEC210 Himanshu Verma, AWS Speaker
Tuesday, January 12 – 11:00 AM to 11:30 AM PST Tuesday, January 12 – 7:00 PM to 7:30 PM PST Wednesday, January 13 – 3:00 AM to 3:30 AM PST
As organizations manage growing volumes of data, identifying and protecting your sensitive data can become increasingly complex, expensive, and time-consuming. In this session, learn how Amazon Macie and Amazon GuardDuty together provide protection for your data stored in Amazon S3. Amazon Macie automates the discovery of sensitive data at scale and lowers the cost of protecting your data. Amazon GuardDuty continuously monitors and profiles S3 data access events and configurations to detect suspicious activities. Come learn about these security services and how to best use them for protecting data in your environment.
BBC: Driving security best practices in a decentralized organization – SEC211 Apurv Awasthi, AWS Speaker Andrew Carlson, Sr. Software Engineer – BBC
Tuesday, January 12 – 1:15 PM to 1:45 PM PST Tuesday, January 12 – 9:15 PM to 9:45 PM PST Wednesday, January 13 – 5:15 AM to 5:45 AM PST
In this session, Andrew Carlson, engineer at BBC, talks about BBC’s journey while adopting AWS Secrets Manager for lifecycle management of its arbitrary credentials such as database passwords, API keys, and third-party keys. He provides insight on BBC’s secrets management best practices and how the company drives these at enterprise scale in a decentralized environment that has a highly visible scope of impact.
Get ahead of the curve with DDoS Response Team escalations – SEC321 Fola Bolodeoku, AWS Speaker
Tuesday, January 12 – 3:30 PM to 4:00 PM PST Tuesday, January 12 – 11:30 PM to 12:00 AM PST Wednesday, January 13 – 7:30 AM to 8:00 AM PST
This session identifies tools and tricks that you can use to prepare for application security escalations, with lessons learned provided by the AWS DDoS Response Team. You learn how AWS customers have used different AWS offerings to protect their applications, including network access control lists, security groups, and AWS WAF. You also learn how to avoid common misconfigurations and mishaps observed by the DDoS Response Team, and you discover simple yet effective actions that you can take to better protect your applications’ availability and security controls.
Network security for serverless workloads – SEC322 Alex Tomic, AWS Speaker
Thursday, January 14 – 1:30 PM to 2:00 PM PST Thursday, January 14 – 9:30 PM to 10:00 PM PST Friday, January 15 – 5:30 AM to 6:00 AM PST
Are you building a serverless application using services like Amazon API Gateway, AWS Lambda, Amazon DynamoDB, Amazon Aurora, and Amazon SQS? Would you like to apply enterprise network security to these AWS services? This session covers how network security concepts like encryption, firewalls, and traffic monitoring can be applied to a well-architected AWS serverless architecture.
Building your cloud incident response program – SEC323 Freddy Kasprzykowski, AWS Speaker
Wednesday, January 13 – 9:00 AM to 9:30 AM PST Wednesday, January 13 – 5:00 PM to 5:30 PM PST Thursday, January 14 – 1:00 AM to 1:30 AM PST
You’ve configured your detection services and now you’ve received your first alert. This session provides patterns that help you understand what capabilities you need to build and run an effective incident response program in the cloud. It includes a review of some logs to see what they tell you and a discussion of tools to analyze those logs. You learn how to make sure that your team has the right access, how automation can help, and which incident response frameworks can guide you.
Wednesday, January 13 – 2:15 PM to 2:45 PM PST Wednesday, January 13 – 10:15 PM to 10:45 PM PST Thursday, January 14 – 6:15 AM to 6:45 AM PST
Amazon Cognito is a flexible user directory that can meet the needs of a number of customer identity management use cases. Web and mobile applications can integrate with Amazon Cognito in minutes to offer user authentication and get standard tokens to be used in token-based authorization scenarios. This session covers best practices that you can implement in your application to secure and protect tokens. You also learn about new Amazon Cognito features that give you more options to improve the security and availability of your application.
Event-driven data security using Amazon Macie – SEC325 Neha Joshi, AWS Speaker
Thursday, January 14 – 8:00 AM to 8:30 AM PST Thursday, January 14 – 4:00 PM to 4:30 PM PST Friday, January 15 – 12:00 AM to 12:30 AM PST
Amazon Macie sensitive data discovery jobs for Amazon S3 buckets help you discover sensitive data such as personally identifiable information (PII), financial information, account credentials, and workload-specific sensitive information. In this session, you learn about an automated approach to discover sensitive information whenever changes are made to the objects in your S3 buckets.
Thursday, January 14 – 10:15 AM to 10:45 AM PST Thursday, January 14 – 6:15 PM to 6:45 PM PST Friday, January 15 – 2:15 AM to 2:45 AM PST
In this session, learn about several instance containment and isolation techniques, ranging from simple and effective to more complex and powerful, that leverage native AWS networking services and account configuration techniques. If an incident happens, you may have questions like “How do we isolate the system while preserving all the valuable artifacts?” and “What options do we even have?”. These are valid questions, but there are more important ones to discuss amidst a (possible) incident. Join this session to learn highly effective instance containment techniques in a crawl-walk-run approach that also facilitates preservation and collection of valuable artifacts and intelligence.
Trusted connects for government workloads – SEC402 Brad Dispensa, AWS Speaker
Wednesday, January 13 – 11:15 AM to 11:45 AM PST Wednesday, January 13 – 7:15 PM to 7:45 PM PST Thursday, January 14 – 3:15 AM to 3:45 AM PST
Cloud adoption across the public sector is making it easier to provide government workforces with seamless access to applications and data. With this move to the cloud, we also need updated security guidance to ensure public-sector data remain secure. For example, the TIC (Trusted Internet Connections) initiative has been a requirement for US federal agencies for some time. The recent TIC-3 moves from prescriptive guidance to an outcomes-based model. This session walks you through how to leverage AWS features to better protect public-sector data using TIC-3 and the National Institute of Standards and Technology (NIST) Cybersecurity Framework (CSF). Also, learn how this might map into other geographies.
I look forward to seeing you in these sessions. Please see the re:Invent agenda for more details and to build your schedule.
The microphones on voice assistants are very sensitive, and can snoop on all sorts of data:
In Hey Alexa what did I just type? we show that when sitting up to half a meter away, a voice assistant can still hear the taps you make on your phone, even in presence of noise. Modern voice assistants have two to seven microphones, so they can do directional localisation, just as human ears do, but with greater sensitivity. We assess the risk and show that a lot more work is needed to understand the privacy implications of the always-on microphones that are increasingly infesting our work spaces and our homes.
Abstract: Voice assistants are now ubiquitous and listen in on our everyday lives. Ever since they became commercially available, privacy advocates worried that the data they collect can be abused: might private conversations be extracted by third parties? In this paper we show that privacy threats go beyond spoken conversations and include sensitive data typed on nearby smartphones. Using two different smartphones and a tablet we demonstrate that the attacker can extract PIN codes and text messages from recordings collected by a voice assistant located up to half a meter away. This shows that remote keyboard-inference attacks are not limited to physical keyboards but extend to virtual keyboards too. As our homes become full of always-on microphones, we need to work through the implications.
Gizmodo is reporting that schools in the US are buying equipment to unlock cell phones from companies like Cellebrite:
Gizmodo has reviewed similar accounting documents from eight school districts, seven of which are in Texas, showing that administrators paid as much as $11,582 for the controversial surveillance technology. Known as mobile device forensic tools (MDFTs), this type of tech is able to siphon text messages, photos, and application data from students’ devices. Together, the districts encompass hundreds of schools, potentially exposing hundreds of thousands of students to invasive cell phone searches.
Sophisticated spyware, sold by surveillance tech companies to Mexican government agencies, is ending up in the hands of drug cartels:
As many as 25 private companies — including the Israeli company NSO Group and the Italian firm Hacking Team — have sold surveillance software to Mexican federal and state police forces, but there is little or no regulation of the sector — and no way to control where the spyware ends up, said the officials.
Lots of details in the article. The cyberweapons arms business is immoral in many ways. This is just one of them.
Privacy matters. Privacy and Compliance are at the heart of Cloudflare’s products and solutions. We are committed to providing built-in data protection and privacy throughout our global network and for every product in our portfolio. This is why we have dedicated a whole week to highlight important aspects of how we are working to make sure privacy will stay at the core of all we do as a business.
In case you missed any of the blog posts this week addressing the topics of Privacy and Compliance, you’ll find a summary below.
Welcome to Privacy & Compliance Week: Reflecting Values at Cloudflare’s Core
We started the week with this introduction by Matthew Prince. The blog post summarizes the early decisions that the founding team made to make sure customer data is kept private, that we do not sell or rent this data to third parties, and why trust is the foundation of our business. > Read the full blog post.
Introducing the Cloudflare Data Localization Suite
Cloudflare’s network is private and compliant by design. Preserving end-user privacy is core to our mission of helping to build a better Internet; we’ve never sold personal data about customers or end-users of our network. We comply with laws like GDPR and maintain certifications such as ISO-27001. In a blog post by John Graham-Cumming, we announced the Data Localization Suite, which helps businesses get the performance and security benefits of Cloudflare’s global network while making it easy to set rules and controls at the edge about where their data is stored and protected. The Data Localization Suite is available now as an add-on for Enterprise customers. > Read the full blog post.
Privacy needs to be built into the Internet
John also reflected upon three phases of the evolution of the Internet: from its invention to the mid-1990s the race was on for expansion and connectivity. Then, as more devices and networks became interconnected, the focus shifted with the introduction of SSL in 1994 to a second phase where security became paramount. We’re now in the full swing of phase 3, where privacy is becoming more important than ever. > Read the full blog post.
Helping build the next generation of privacy-preserving protocols
The Internet is growing in terms of its capacity and the number of people using it and evolving in terms of its design and functionality. As a player in the Internet ecosystem, Cloudflare has a responsibility to help the Internet grow in a way that respects and provides value for its users. In this blog post, Nick Sullivan summarizes several announcements on improving Internet protocols with respect to something important to our customers and Internet users worldwide: privacy. These initiatives are focussed around: fixing one of the last information leaks in HTTPS through Encrypted Client Hello (ECH), which supersedes Encrypted SNI; making DNS even more private by supporting Oblivious DNS-over-HTTPS (ODoH); developing a superior protocol for password authentication, OPAQUE, that makes password breaches less likely to occur. > Read the full blog post.
OPAQUE: The Best Passwords Never Leave your Device
Passwords are a problem. They are a problem for reasons that are familiar to most readers. For us at Cloudflare, the problem lies much deeper and broader. Most readers will immediately acknowledge that passwords are hard to remember and manage, especially as password requirements grow increasingly complex. Luckily there are great software packages and browser add-ons to help manage passwords. Unfortunately, the greater underlying problem is beyond the reach of software to solve. Today’s deep-dive blog post by Tatiana Bradley, into OPAQUE, is one possible answer. OPAQUE is one among many examples of systems that enable a password to be useful without it ever leaving your possession. No one likes passwords, but as long as they’re in use, at least we can ensure they are never given away. > Read the full blog post.
Good-bye ESNI, hello ECH!
In this post Christopher Patton dives into Encrypted Client Hello (ECH), a new extension for TLS that promises to significantly enhance the privacy of this critical Internet protocol. Today, a number of privacy-sensitive parameters of the TLS connection are negotiated in the clear. This leaves a trove of metadata available to network observers, including the endpoints’ identities, how they use the connection, and so on. > Read the full blog post.
Improving DNS Privacy with Oblivious DoH in 1.1.1.1
Tanya Verma and Sudheesh Singanamalla wrote this blog post for our announcement of support for a new proposed DNS standard — co-authored by engineers from Cloudflare, Apple, and Fastly — that separates IP addresses from queries, so that no single entity can see both at the same time. Even better, we’ve made source code available, so anyone can try out ODoH, or run their own ODoH service! > Read the full blog post.
Deprecating the __cfduid cookie
Cloudflare never tracks end-users across sites or sells their personal data. However, we didn’t want there to be any questions about our cookie use, and we don’t want any customer to think they need a cookie banner because of what we do. Therefore we’ve announced that Cloudflare is deprecating the __cfduid cookie. Starting on 10 May 2021, we will stop adding a “Set-Cookie” header on all HTTP responses. The last __cfduid cookies will expire 30 days after that. So why did we use the __cfduid cookie before, and why can we remove it now? Read the full blog post by Sergi Isasi to find out.
Cloudflare’s privacy-first Web Analytics is now available for everyone
In September, we announced that we’re building a new, free Web Analytics product for the whole web. In this blog post by Jon Levine, we’re announcing that anyone can now sign up to use our new Web Analytics — even without changing your DNS settings. In other words, Cloudflare Web Analytics can now be deployed by adding an HTML snippet (in the same way many other popular web analytics tools are) making it easier than ever to use privacy-first tools to understand visitor behavior.
Announcing Workplace Records for Cloudflare for Teams
As businesses worldwide have shifted to remote work, many employees have been working from “home” — wherever that may be. Some employees have taken this opportunity to venture further from where they usually are, sometimes crossing state and national borders. Businesses worldwide pay employment taxes based on where their employees do work. For most businesses and in normal times, where employees do work has been relatively easy to determine: it’s where they come into the office. But 2020 has made everything more complicated, even taxes. In this blog post by Matthew Prince and Sam Rhea, we’re announcing the beta of a new feature for Cloudflare for Teams to help solve this problem: Workplace Records. Cloudflare for Teams uses Access and Gateway logs to provide the state and country from which employees are working. Workplace Records can be used to help finance, legal, and HR departments determine where payroll taxes are due and provide a record to defend those decisions.
Securing the post-quantum world
Quantum computing will change the face of Internet security forever — particularly in the realm of cryptography, which is the way communications and information are secured across channels like the Internet. Cryptography is critical to almost every aspect of modern life, from banking to cellular communications to connected refrigerators and systems that keep subways running on time. This ultra-powerful, highly sophisticated new generation of computing has the potential to unravel decades of work that have been put into developing the cryptographic algorithms and standards we use today. When will a quantum computer be built that is powerful enough to break all modern cryptography? By some estimates, it may take 10 to 15 years. This makes deploying post-quantum cryptography as soon as possible a pressing privacy concern. Cloudflare is taking steps to accelerate this transition. Read the full blog post by Nick Sullivan to find out more.
How to Build a Global Network that Complies with Local Law
Governments around the world have long had an interest in getting access to online records. Sometimes law enforcement is looking for evidence relevant to criminal investigations. Sometimes intelligence agencies are looking to learn more about what foreign governments or actors are doing. And online service providers of all kinds often serve as an access point for those electronic records.
For service providers like Cloudflare, though, those requests can be fraught. The work that law enforcement and other government authorities do is important. At the same time, the data that law enforcement and other government authorities are seeking does not belong to us. By using our services, our customers have put us in a position of trust over that data. Maintaining that trust is fundamental to our business and our values. Alissa Stark details in her blog post how Cloudflare works to ensure compliance with laws like GDPR, particularly in the face of legal orders that might put us in the difficult position of being required to violate them, a conflict that can require involving the courts.
Encrypting your WAF Payloads with Hybrid Public Key Encryption (HPKE)
The Cloudflare Web Application Firewall (WAF) blocks more than 72B malicious requests per day from reaching our customers’ applications. Typically, our users can easily confirm these requests were not legitimate by checking the URL, the query parameters, or other metadata that Cloudflare provides as part of the security event log in the dashboard. Request headers, however, may contain cookies, and POST payloads may contain username and password pairs submitted during a login attempt, among other sensitive data.
We recognize that providing clear visibility in any security event is a core feature of a firewall, as this allows users to better fine-tune their rules. To accomplish this, while ensuring end-user privacy, we built encrypted WAF matched payload logging. This feature will log only the specific component of the request the WAF has deemed malicious — and it is encrypted using a customer-provided key to ensure that no Cloudflare employee can examine the data. Michael Tremante goes over this in full detail, explaining how only application owners who also have access to the Cloudflare dashboard as Super Administrators will be able to configure encrypted matched payload logging.
Supporting Jurisdictional Restrictions for Durable Objects
Durable Objects, currently in limited beta, already make it easy for customers to manage state on Cloudflare Workers without worrying about provisioning infrastructure. Greg McKeon announces in this blog post the upcoming launch of Jurisdictional Restrictions for Durable Objects, which ensure that a Durable Object only stores and processes data in a given geographical region. Jurisdictional Restrictions make it easy for developers to build serverless, stateful applications that not only comply with today’s regulations but can handle new and updated policies as new regulations are added. Head over to the blog post to read more and also request an invite to the beta.
I want my Cloudflare TV
We have also had a full week of CloudflareTV segments focussed on privacy and compliance and you can get the full list and more details on our dedicated Privacy Week page.
As always, we welcome your feedback and comments and we stay committed to putting the privacy and safety of your data at the core of everything we do.
In September, we announced that we’re building a new, free Web Analytics product for the whole web. Today, I’m excited to announce that anyone can now sign up to use our new Web Analytics — even without changing your DNS settings. In other words, Cloudflare Web Analytics can now be deployed by adding an HTML snippet (in the same way many other popular web analytics tools are) making it easier than ever to use privacy-first tools to understand visitor behavior.
Why does the web need another analytics service?
Popular analytics vendors have business models driven by ad revenue. Using them implies a bargain: they track visitor behavior and create buyer profiles to retarget your visitors with ads; in exchange, you get free analytics.
At Cloudflare, our mission is to help build a better Internet, and part of that is to deliver essential web analytics to everyone with a website, without compromising user privacy. For free. We’ve never been interested in tracking users or selling advertising. We don’t want to know what you do on the Internet — it’s not our business.
Our customers have long relied on Cloudflare’s Analytics because we’re accurate, fast, and privacy-first. In September we released a big upgrade to analytics for our existing customers that made them even more flexible.
However, we know that there are many folks who can’t use our analytics, simply because they’re not able to onboard to use the rest of Cloudflare for Infrastructure — specifically, they’re not able to change their DNS servers. Today, we’re bringing the power of our analytics to the whole web. By adding a simple HTML snippet to your website, you can start measuring your web traffic — similar to other popular analytics vendors.
What can I do with Cloudflare Web Analytics?
We’ve worked hard to make our analytics as powerful and flexible as possible — while still being fast and easy to use.
When measuring analytics about your website, the most common questions are “how much traffic did I get?” and “how many people visited?” We answer this by measuring page views (the total number of times a page was loaded) and visits (the number of times someone landed on a page from another website).
With Cloudflare Web Analytics, it’s easy to switch between measuring page views or visits. Within each view, you can see top pages, countries, device types and referrers.
My favorite thing is the ability to add global filters, and to quickly drill into the most important data with actions like “zoom” and “group by”. Say you publish a new blog post, and you want to see the top sites that send you traffic right after you email your subscribers about it. It’s easy to zoom into the time period when you sent the email, and use “group by” to see the top pages. Then you can add a filter to just that page, and finally view top referrers for that page. It’s magic!
Best of all, our analytics is free. We don’t have limits based on the amount of traffic you can send it. Thanks to our ABR technology, we can serve accurate analytics for websites that get anywhere from one to one billion requests per day.
How does the new Web Analytics work?
The new Web Analytics works like most other measurement tools: by tracking visitors on the client. We’ve long had client-side measuring tools with Browser Insights, but these were only available to orange-cloud users (i.e. Cloudflare customers).
How do I sign up?
We’ve worked hard making our onboarding as simple as possible.
First, enter the name of your website. It’s important to use the domain name that your analytics will be served on — we use this to filter out any unwanted “spam” analytics reports.
(At this time, you can only add analytics from one website to each Cloudflare account. In the coming weeks we’ll add support for multiple analytics properties per account.)
Next, you’ll see a script tag that you can copy onto your website. We recommend adding this just before the closing </body> tag on the pages you want to measure.
And that’s it! After you release your website and start getting visits, you’ll be able to see them in analytics.
What does privacy-first mean?
Being privacy-first means we don’t track individual users for the purposes of serving analytics. We don’t use any client-side state (like cookies or localStorage) for analytics purposes. Cloudflare also doesn’t track users over time via their IP address, User Agent string, or any other immutable attributes for the purposes of displaying analytics — we consider “fingerprinting” even more intrusive than cookies, because users have no way to opt out.
The concept of a “visit” is key to this approach. Rather than count unique IP addresses, which would require storing state about what each visitor does, we can simply count the number of page views that come from a different site. This provides a perfectly usable metric that doesn’t compromise on privacy.
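A minimal sketch of this counting rule, assuming each page load is reported as a (page URL, referrer URL) pair. The record format, the function name, and the choice to count a load with no referrer as a visit are assumptions for illustration, not Cloudflare's implementation.

```python
from urllib.parse import urlsplit

def count_traffic(page_loads):
    """Return (page_views, visits) without keeping any per-visitor state."""
    page_views = 0
    visits = 0
    for page_url, referrer in page_loads:
        page_views += 1
        page_host = urlsplit(page_url).hostname
        ref_host = urlsplit(referrer).hostname if referrer else None
        # A visit is a page view that did not come from the same site.
        # Assumption: a load with no referrer also counts as a visit.
        if ref_host != page_host:
            visits += 1
    return page_views, visits

loads = [
    ("https://example.com/", "https://news.example.org/item"),  # arrived from another site
    ("https://example.com/blog", "https://example.com/"),       # internal navigation
    ("https://example.com/", ""),                               # no referrer
]
print(count_traffic(loads))
```

Nothing in the loop identifies a visitor: each record is classified on its own and then discarded, so no cookie, IP address, or fingerprint is needed.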
This is just the start for our privacy-first Analytics. We’re excited to integrate more closely with the rest of Cloudflare, and give customers even more detailed stats about performance and security (not just traffic). We’re also hoping to make our analytics even more powerful as a standalone product by building support for alerts, real-time updates, and more.
Please let us know if you have any questions or feedback, and happy measuring!
Cloudflare is deprecating the __cfduid cookie. Starting on 10 May 2021, we will stop adding a “Set-Cookie” header on all HTTP responses. The last __cfduid cookies will expire 30 days after that.
We never used the __cfduid cookie for any purpose other than providing critical performance and security services on behalf of our customers. Although, we must admit, calling it something with “uid” in it really made it sound like it was some sort of user ID. It wasn’t. Cloudflare never tracks end users across sites or sells their personal data. However, we didn’t want there to be any questions about our cookie use, and we don’t want any customer to think they need a cookie banner because of what we do.
So why did we use the __cfduid cookie before, and why can we remove it now?
The primary use of the cookie is for detecting bots on the web. Malicious bots may disrupt a service that has been explicitly requested by an end user (through DDoS attacks) or compromise the security of a user’s account (e.g. through brute force password cracking or credential stuffing, among others). We use many signals to build machine learning models that can detect automated bot traffic. The presence and age of the __cfduid cookie was just one signal in our models. So for our customers who benefit from our bot management products, the __cfduid cookie is a tool that allows them to provide a service explicitly requested by the end user.
The value of the __cfduid cookie is derived from a one-way MD5 hash of the visitor’s IP address, date/time, user agent, hostname, and referring website, which means we can’t tie a cookie to a specific person. Still, as a privacy-first company, we thought: Can we find a better way to detect bots that doesn’t rely on collecting end user IP addresses?
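A minimal sketch of deriving an opaque value from those five fields with MD5. The field order, the "|" separator, the function name, and the sample inputs are all assumptions; the post does not say how the inputs were combined.

```python
import hashlib

def cookie_value(ip, timestamp, user_agent, hostname, referrer):
    """Hash the request attributes into a fixed-length hex string."""
    # Hypothetical encoding: join the fields with "|" and hash the UTF-8 bytes.
    material = "|".join([ip, timestamp, user_agent, hostname, referrer])
    return hashlib.md5(material.encode("utf-8")).hexdigest()

value = cookie_value(
    "192.0.2.1",                  # documentation-range IP address
    "2020-12-10T12:00:00Z",
    "ExampleBrowser/1.0",
    "example.com",
    "https://referrer.example/",
)
print(len(value), value)
```

The output is a 32-character digest that changes whenever any input changes, and the inputs themselves are not stored in the cookie.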
For the past few weeks, we’ve been experimenting to see if it’s possible to run our bot detection algorithms without using this cookie. We’ve learned that it will be possible for us to transition away from using this cookie to detect bots. We’re giving notice of deprecation now to give our customers time to transition, while our bot management team works to ensure there’s no decline in quality of our bot detection algorithms after removing this cookie. (Note that some Bot Management customers will still require the use of a different cookie after April 1.)
While this is a small change, we’re excited about any opportunity to make the web simpler, faster, and more private.
Most communication on the modern Internet is encrypted to ensure that its content is intelligible only to the endpoints, i.e., client and server. Encryption, however, requires a key and so the endpoints must agree on an encryption key without revealing the key to would-be attackers. The most widely used cryptographic protocol for this task, called key exchange, is the Transport Layer Security (TLS) handshake.
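As a toy illustration of how two endpoints can agree on a key over a channel that attackers can read, here is finite-field Diffie-Hellman in a few lines. This is a sketch only: the small modulus, the generator, and the function names are assumptions chosen for readability, it omits the authentication a real handshake needs, and it is not the key exchange code of any TLS implementation.

```python
import hashlib
import secrets

P = 2**127 - 1  # a Mersenne prime; far too small for real-world use
G = 3           # assumed generator, for illustration only

def keypair():
    """Pick a random private exponent and compute the public value to send."""
    private = secrets.randbelow(P - 3) + 2  # uniform in [2, P - 2]
    public = pow(G, private, P)
    return private, public

def shared_key(private, peer_public):
    """Combine our private value with the peer's public value into a key."""
    secret = pow(peer_public, private, P)
    return hashlib.sha256(secret.to_bytes(16, "big")).digest()

client_private, client_public = keypair()
server_private, server_public = keypair()

# Only the public values cross the network; both sides derive the same key.
client_key = shared_key(client_private, server_public)
server_key = shared_key(server_private, client_public)
print(client_key == server_key)
```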
In this post we’ll dive into Encrypted Client Hello (ECH), a new extension for TLS that promises to significantly enhance the privacy of this critical Internet protocol. Today, a number of privacy-sensitive parameters of the TLS connection are negotiated in the clear. This leaves a trove of metadata available to network observers, including the endpoints’ identities, how they use the connection, and so on.
ECH encrypts the full handshake so that this metadata is kept secret. Crucially, this closes a long-standing privacy leak by protecting the Server Name Indication (SNI) from eavesdroppers on the network. Encrypting the SNI is important because it is the clearest signal of which server a given client is communicating with. However, and perhaps more significantly, ECH also lays the groundwork for adding future security features and performance enhancements to TLS while minimizing their impact on the privacy of end users.
ECH is the product of close collaboration, facilitated by the IETF, between academics and tech industry leaders, including Cloudflare, our friends at Fastly and Mozilla (both of which are affiliations of co-authors of the standard), and many others. This feature represents a significant upgrade to the TLS protocol, one that builds on bleeding edge technologies, like DNS-over-HTTPS, that are only now coming into their own. As such, the protocol is not yet ready for Internet-scale deployment. This article is intended as a signpost on the road to full handshake encryption.
The story of TLS is the story of the Internet. As our reliance on the Internet has grown, so the protocol has evolved to address ever-changing operational requirements, use cases, and threat models. The client and server don’t just exchange a key: they negotiate a wide variety of features and parameters: the exact method of key exchange; the encryption algorithm; who is authenticated and how; which application layer protocol to use after the handshake; and much, much more. All of these parameters impact the security properties of the communication channel in one way or another.
SNI is a prime example of a parameter that impacts the channel’s security. The SNI extension is used by the client to indicate to the server the website it wants to reach. This is essential for the modern Internet, as it’s common nowadays for many origin servers to sit behind a single TLS operator. In this setting, the operator uses the SNI to determine who will authenticate the connection: without it, there would be no way of knowing which TLS certificate to present to the client. The problem is that SNI leaks to the network the identity of the origin server the client wants to connect to, potentially allowing eavesdroppers to infer a lot of information about their communication. (Of course, there are other ways for a network observer to identify the origin — the origin’s IP address, for example. But co-locating with other origins on the same IP address makes it much harder to use this metric to determine the origin than it is to simply inspect the SNI.)
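To make the role of SNI concrete, here is a minimal sketch of how an operator hosting several origins might choose a certificate. The hostnames, the `CERTIFICATES` table, and the `select_certificate` helper are hypothetical and stand in for whatever a real TLS server does internally.

```python
# Hypothetical mapping kept by a TLS operator that fronts several origins.
CERTIFICATES = {
    "shop.example": "cert-for-shop.example",
    "blog.example": "cert-for-blog.example",
}

def select_certificate(sni):
    """Pick the certificate to present based on the SNI sent by the client."""
    # Hostnames are case-insensitive, so normalise before the lookup.
    return CERTIFICATES.get(sni.lower())

print(select_certificate("shop.example"))     # cert-for-shop.example
print(select_certificate("unknown.example"))  # None
```

Because this lookup has to happen before the server can authenticate itself, the name has to reach the server early in the handshake, which is exactly what makes it visible on the wire.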
Although protecting SNI is the impetus for ECH, it is by no means the only privacy-sensitive handshake parameter that the client and server negotiate. Another is the ALPN extension, which is used to decide which application-layer protocol to use once the TLS connection is established. The client sends the list of applications it supports — whether it’s HTTPS, email, instant messaging, or the myriad other applications that use TLS for transport security — and the server selects one from this list, and sends its selection to the client. By doing so, the client and server leak to the network a clear signal of their capabilities and what the connection might be used for.
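The ALPN negotiation described above can be sketched in a few lines. The selection policy shown here (the server walks its own preference list) is an assumption for the example, and `select_alpn` is a hypothetical helper rather than an API of any TLS library.

```python
def select_alpn(client_offers, server_preferences):
    """Return the first server-preferred protocol that the client also offered."""
    offered = set(client_offers)
    for protocol in server_preferences:
        if protocol in offered:
            return protocol
    return None  # no overlap; a real server would typically fail the handshake

client_offers = ["h2", "http/1.1"]
print(select_alpn(client_offers, ["h2", "http/1.1"]))  # h2
print(select_alpn(client_offers, ["imap"]))            # None
```

Anything in `client_offers` that travels unencrypted tells an observer what the client is capable of, independently of which protocol ends up being selected.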
Some features are so privacy-sensitive that their inclusion in the handshake is a non-starter. One idea that has been floated is to replace the key exchange at the heart of TLS with password-authenticated key-exchange (PAKE). This would allow password-based authentication to be used alongside (or in lieu of) certificate-based authentication, making TLS more robust and suitable for a wider range of applications. The privacy issue here is analogous to SNI: servers typically associate a unique identifier to each client (e.g., a username or email address) that is used to retrieve the client’s credentials; and the client must, somehow, convey this identity to the server during the course of the handshake. If sent in the clear, then this personally identifiable information would be easily accessible to any network observer.
A necessary ingredient for addressing all of these privacy leaks is handshake encryption, i.e., the encryption of handshake messages in addition to application data. Sounds simple enough, but this solution presents another problem: how do the client and server pick an encryption key if, after all, the handshake is itself a means of exchanging a key? Some parameters must be sent in the clear, of course, so the goal of ECH is to encrypt all handshake parameters except those that are essential to completing the key exchange.
In order to understand ECH and the design decisions underpinning it, it helps to understand a little bit about the history of handshake encryption in TLS.
Handshake encryption in TLS
TLS had no handshake encryption at all prior to the latest version, TLS 1.3. In the wake of the Snowden revelations in 2013, the IETF community began to consider ways of countering the threat that mass surveillance posed to the open Internet. When the process of standardizing TLS 1.3 began in 2014, one of its design goals was to encrypt as much of the handshake as possible. Unfortunately, the final standard falls short of full handshake encryption, and several parameters, including SNI, are still sent in the clear. Let’s take a closer look to see why.
The TLS 1.3 protocol flow is illustrated in Figure 1. Handshake encryption begins as soon as the client and server compute a fresh shared secret. To do this, the client sends a key share in its ClientHello message, and the server responds in its ServerHello with its own key share. Having exchanged these shares, the client and server can derive a shared secret. Each subsequent handshake message is encrypted using the handshake traffic key derived from the shared secret. Application data is encrypted using a different key, called the application traffic key, which is also derived from the shared secret. These derived keys have different security properties: to emphasize this, they are illustrated with different colors.
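The exchange of key shares and the derivation of separate traffic keys can be illustrated with a toy example. This is a sketch under loud assumptions: it uses a small finite-field Diffie-Hellman group and a single HMAC call with made-up labels, whereas TLS 1.3 uses groups such as X25519 and a more involved HKDF-based key schedule.

```python
import hashlib
import hmac
import secrets

# Toy Diffie-Hellman parameters, for illustration only.
P = 2**127 - 1  # a Mersenne prime
G = 3

def make_key_share():
    """Return (private exponent, public key share)."""
    private = secrets.randbelow(P - 3) + 2  # value in [2, P - 2]
    return private, pow(G, private, P)

def derive_key(shared_secret, label):
    """Derive a labelled key from the shared secret (stand-in for the real key schedule)."""
    secret_bytes = shared_secret.to_bytes(16, "big")
    return hmac.new(secret_bytes, label, hashlib.sha256).digest()

client_private, client_share = make_key_share()  # share travels in the ClientHello
server_private, server_share = make_key_share()  # share travels in the ServerHello

# Each side combines its own private value with the peer's share.
client_secret = pow(server_share, client_private, P)
server_secret = pow(client_share, server_private, P)

handshake_key = derive_key(client_secret, b"handshake traffic")
application_key = derive_key(client_secret, b"application traffic")
print(client_secret == server_secret, handshake_key != application_key)
```

The point of the sketch is the ordering: nothing can be encrypted until both shares have crossed the network, so whatever rides in the ClientHello itself is necessarily sent in the clear.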
The first handshake message that is encrypted is the server’s EncryptedExtensions. The purpose of this message is to protect the server’s sensitive handshake parameters, including the server’s ALPN extension, which contains the application selected from the client’s ALPN list. Key-exchange parameters are sent unencrypted in the ClientHello and ServerHello.
All of the client’s handshake parameters, sensitive or not, are sent in the ClientHello. Looking at Figure 1, you might be able to think of ways of reworking the handshake so that some of them can be encrypted, perhaps at the cost of additional latency (i.e., more round trips over the network). However, extensions like SNI create a kind of “chicken-and-egg” problem.
The client doesn’t encrypt anything until it has verified the server’s identity (this is the job of the Certificate and CertificateVerify messages) and the server has confirmed that it knows the shared secret (the job of the Finished message). These measures ensure the key exchange is authenticated, thereby preventing monster-in-the-middle (MITM) attacks in which the adversary impersonates the server to the client in a way that allows it to decrypt messages sent by the client. Because SNI is needed by the server to select the certificate, it needs to be transmitted before the key exchange is authenticated.
In general, ensuring confidentiality of handshake parameters used for authentication is only possible if the client and server already share an encryption key. But where might this key come from?
Full handshake encryption in the early days of TLS 1.3. Interestingly, full handshake encryption was once proposed as a core feature of TLS 1.3. In early versions of the protocol (draft-10, circa 2015), the server would offer the client a long-lived public key during the handshake, which the client would use for encryption in subsequent handshakes. (This design came from a protocol called OPTLS, which in turn was borrowed from the original QUIC proposal.) Called “0-RTT”, the primary purpose of this mode was to allow the client to begin sending application data prior to completing a handshake. In addition, it would have allowed the client to encrypt its first flight of handshake messages following the ClientHello, including its own EncryptedExtensions, which might have been used to protect the client’s sensitive handshake parameters.
Ultimately this feature was not included in the final standard (RFC 8446, published in 2018), mainly because its usefulness was outweighed by its added complexity. In particular, it does nothing to protect the initial handshake in which the client learns the server’s public key. Parameters that are required for server authentication of the initial handshake, like SNI, would still be transmitted in the clear.
Nevertheless, this scheme is notable as the forerunner of other handshake encryption mechanisms, like ECH, that use public key encryption to protect sensitive ClientHello parameters. The main problem these mechanisms must solve is key distribution.
Before ECH there was (and is!) ESNI
The immediate predecessor of ECH was the Encrypted SNI (ESNI) extension. As its name implies, the goal of ESNI was to provide confidentiality of the SNI. To do so, the client would encrypt its SNI extension under the server’s public key and send the ciphertext to the server. The server would attempt to decrypt the ciphertext using the secret key corresponding to its public key. If decryption were to succeed, then the server would proceed with the connection using the decrypted SNI. Otherwise, it would simply abort the handshake. The high-level flow of this simple protocol is illustrated in Figure 2.
For key distribution, ESNI relied on another critical protocol: Domain Name System (DNS). In order to use ESNI to connect to a website, the client would piggy-back on its standard A/AAAA queries a request for a TXT record with the ESNI public key. For example, to get the key for crypto.dance, the client would request the TXT record of _esni.crypto.dance:
The base64-encoded blob contains an ESNI public key and related parameters such as the encryption algorithm.
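For readers who want to see what such a lookup might look like from code, the sketch below builds (but does not send) a DNS-over-HTTPS request for the TXT record. It assumes a resolver that offers a JSON DoH interface at the `DOH_ENDPOINT` URL shown; `build_esni_txt_query` is a hypothetical helper, and whether the record is still published is not something this example checks.

```python
from urllib.parse import urlencode
from urllib.request import Request

DOH_ENDPOINT = "https://cloudflare-dns.com/dns-query"  # assumed resolver for this sketch

def build_esni_txt_query(hostname):
    """Build (but do not send) a DoH JSON request for the _esni TXT record."""
    params = urlencode({"name": f"_esni.{hostname}", "type": "TXT"})
    return Request(
        f"{DOH_ENDPOINT}?{params}",
        headers={"Accept": "application/dns-json"},
    )

request = build_esni_txt_query("crypto.dance")
print(request.full_url)
# https://cloudflare-dns.com/dns-query?name=_esni.crypto.dance&type=TXT
```

Passing the request to `urllib.request.urlopen` would perform the actual query over HTTPS, so the name being looked up is hidden from observers on the local network.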
But what’s the point of encrypting SNI if we’re just going to leak the server name to network observers via a plaintext DNS query? Deploying ESNI this way became feasible with the introduction of DNS-over-HTTPS (DoH), which enables encryption of DNS queries to resolvers that provide the DoH service (1.1.1.1 is an example of such a service). Another crucial feature of DoH is that it provides an authenticated channel for transmitting the ESNI public key from the DoH server to the client. This prevents cache-poisoning attacks that originate from the client’s local network: in the absence of DoH, a local attacker could prevent the client from offering the ESNI extension by returning an empty TXT record, or coerce the client into using ESNI with a key it controls.
While ESNI took a significant step forward, it falls short of our goal of achieving full handshake encryption. Apart from being incomplete — it only protects SNI — it is vulnerable to a handful of sophisticated attacks, which, while hard to pull off, point to theoretical weaknesses in the protocol’s design that need to be addressed.
ESNI was deployed by Cloudflare and enabled by Firefox, on an opt-in basis, in 2018, an experience that laid bare some of the challenges with relying on DNS for key distribution. Cloudflare rotates its ESNI key every hour in order to minimize the collateral damage in case a key ever gets compromised. DNS artifacts are sometimes cached for much longer, the result of which is that there is a decent chance of a client having a stale public key. While Cloudflare’s ESNI service tolerates this to a degree, every key must eventually expire. The question that the ESNI protocol left open is how the client should proceed if decryption fails and it can’t access the current public key, via DNS or otherwise.
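The tension between hourly rotation and long-lived DNS caches can be sketched as follows. The hourly period comes from the text; the size of the grace window (`GRACE_PERIODS`) and the helper names are assumptions for illustration, not a description of Cloudflare's service.

```python
ROTATION_SECONDS = 3600  # hourly rotation, as described in the text
GRACE_PERIODS = 2        # assumed number of older keys the server still accepts

def key_epoch(unix_time):
    """Index of the rotation period that contains unix_time."""
    return unix_time // ROTATION_SECONDS

def server_accepts(client_key_epoch, now):
    """True if the client's cached key is current or within the grace window."""
    age = key_epoch(now) - client_key_epoch
    return 0 <= age <= GRACE_PERIODS

now = 1_600_000_000
fresh = key_epoch(now)
print(server_accepts(fresh, now))      # True
print(server_accepts(fresh - 1, now))  # True: one period stale, still tolerated
print(server_accepts(fresh - 5, now))  # False: cached longer than the grace window
```

However generous the grace window is, some cached key eventually falls outside it, and that is the case ESNI gave the client no way to recover from.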
Another problem with relying on DNS for key distribution is that several endpoints might be authoritative for the same origin server, but have different capabilities. For example, a request for the A record of “example.com” might return one of two different IP addresses, each operated by a different CDN. The TXT record for “_esni.example.com” would contain the public key for one of these CDNs, but certainly not both. The DNS protocol does not provide a way of atomically tying together resource records that correspond to the same endpoint. In particular, it’s possible for a client to inadvertently offer the ESNI extension to an endpoint that doesn’t support it, causing the handshake to fail. Fixing this problem requires changes to the DNS protocol. (More on this below.)
The future of ESNI. In the next section, we’ll describe the ECH specification and how it addresses the shortcomings of ESNI. Despite its limitations, however, the practical privacy benefit that ESNI provides is significant. Cloudflare intends to continue its support for ESNI until ECH is production-ready.
The ins and outs of ECH
The goal of ECH is to encrypt the entire ClientHello, thereby closing the gap left in TLS 1.3 and ESNI by protecting all privacy-sensitive handshake-parameters. Similar to ESNI, the protocol uses a public key, distributed via DNS and obtained using DoH, for encryption during the client’s first flight. But ECH has improvements to key distribution that make the protocol more robust to DNS cache inconsistencies. Whereas the ESNI server aborts the connection if decryption fails, the ECH server attempts to complete the handshake and supply the client with a public key it can use to retry the connection.
But how can the server complete the handshake if it’s unable to decrypt the ClientHello? As illustrated in Figure 3, the ECH protocol actually involves two ClientHello messages: the ClientHelloOuter, which is sent in the clear, as usual; and the ClientHelloInner, which is encrypted and sent as an extension of the ClientHelloOuter. The server completes the handshake with just one of these ClientHellos: if decryption succeeds, then it proceeds with the ClientHelloInner; otherwise, it proceeds with the ClientHelloOuter.
The ClientHelloInner is composed of the handshake parameters the client wants to use for the connection. This includes sensitive values, like the SNI of the origin server it wants to reach (called the backend server in ECH parlance), the ALPN list, and so on. The ClientHelloOuter, while also a fully-fledged ClientHello message, is not used for the intended connection. Instead, the handshake is completed by the ECH service provider itself (called the client-facing server), signaling to the client that its intended destination couldn’t be reached due to decryption failure. In this case, the service provider also sends along the correct ECH public key with which the client can retry the handshake, thereby “correcting” the client’s configuration. (This mechanism is similar to how the server distributed its public key for 0-RTT mode in the early days of TLS 1.3.)
At a minimum, both ClientHellos must contain the handshake parameters that are required for a server-authenticated key-exchange. In particular, while the ClientHelloInner contains the real SNI, the ClientHelloOuter also contains an SNI value, which the client expects to verify in case of ECH decryption failure (i.e., the client-facing server). If the connection is established using the ClientHelloOuter, then the client is expected to immediately abort the connection and retry the handshake with the public key provided by the server. It’s not necessary that the client specify an ALPN list in the ClientHelloOuter, nor any other extension used to guide post-handshake behavior. All of these parameters are encapsulated by the encrypted ClientHelloInner.
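The inner/outer decision on the server side can be modelled in a few lines. This sketch deliberately mocks the cryptography: "decryption" succeeds only when the client used the server's current key identifier, and all names (`ClientHello`, `CURRENT_CONFIG_ID`, `PUBLIC_NAME`, `server_handle`) are hypothetical simplifications rather than structures from the ECH specification.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ClientHello:
    sni: str
    alpn: tuple = ()
    ech_config_id: Optional[int] = None            # which public key the client used
    sealed_inner: Optional["ClientHello"] = None   # stand-in for the encrypted inner hello

CURRENT_CONFIG_ID = 7        # hypothetical identifier of the server's current ECH key
PUBLIC_NAME = "cdn.example"  # hypothetical name of the client-facing server

def try_decrypt(outer):
    """Mock decryption: succeeds only if the client used the current key."""
    if outer.ech_config_id == CURRENT_CONFIG_ID:
        return outer.sealed_inner
    return None

def server_handle(outer):
    inner = try_decrypt(outer)
    if inner is not None:
        return {"used": "inner", "sni": inner.sni, "retry_config_id": None}
    # Decryption failed: complete the handshake as the client-facing server
    # and hand back the current key so the client can retry.
    return {"used": "outer", "sni": outer.sni, "retry_config_id": CURRENT_CONFIG_ID}

def make_outer(config_id):
    inner = ClientHello(sni="secret-origin.example", alpn=("h2",))
    return ClientHello(sni=PUBLIC_NAME, ech_config_id=config_id, sealed_inner=inner)

print(server_handle(make_outer(7)))  # handshake proceeds with the inner hello
print(server_handle(make_outer(3)))  # stale key: outer hello plus a retry hint
```

In the second case the client would abort the connection it just established and start over with the key identified by `retry_config_id`, instead of failing outright as it would have under ESNI.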
This design resolves — quite elegantly, I think — most of the challenges for securely deploying handshake encryption encountered by earlier mechanisms. Importantly, the design of ECH was not conceived in a vacuum. The protocol reflects the diverse perspectives of the IETF community, and its development dovetails with other IETF standards that are crucial to the success of ECH.
The first is an important new DNS feature known as the HTTPS resource record type. At a high level, this record type is intended to allow multiple HTTPS endpoints that are authoritative for the same domain name to advertise different capabilities for TLS. This makes it possible to rely on DNS for key distribution, resolving one of the deployment challenges uncovered by the initial ESNI deployment. For a deep dive into this new record type and what it means for the Internet more broadly, check out Alessandro Ghedini’s recent blog post on the subject.
The second is the CFRG’s Hybrid Public Key Encryption (HPKE) standard, which specifies an extensible framework for building public key encryption schemes suitable for a wide variety of applications. In particular, ECH delegates all of the details of its handshake encryption mechanism to HPKE, resulting in a much simpler and easier-to-analyze specification. (Incidentally, HPKE is also one of the main ingredients of Oblivious DNS-over-HTTPS.)
The road ahead
The current ECH specification is the culmination of a multi-year collaboration. At this point, the overall design of the protocol is fairly stable. In fact, the next draft of the specification will be the first to be targeted for interop testing among implementations. Still, there remain a number of details that need to be sorted out. Let’s end this post with a brief overview of the road ahead.
Resistance to traffic analysis
Ultimately, the goal of ECH is to ensure that TLS connections made to different origin servers behind the same ECH service provider are indistinguishable from one another. In other words, when you connect to an origin behind, say, Cloudflare, no one on the network between you and Cloudflare should be able to discern which origin you reached, or which privacy-sensitive handshake-parameters you and the origin negotiated. Apart from an immediate privacy boost, this property, if achieved, paves the way for the deployment of new features for TLS without compromising privacy.
Encrypting the ClientHello is an important step towards achieving this goal, but we need to do a bit more. An important attack vector we haven’t discussed yet is traffic analysis. This refers to the collection and analysis of properties of the communication channel that betray part of the ciphertext’s contents, but without cracking the underlying encryption scheme. For example, the length of the encrypted ClientHello might leak enough information about the SNI for the adversary to make an educated guess as to its value (this risk is especially high for domain names that are either particularly short or particularly long). It is therefore crucial that the length of each ciphertext is independent of the values of privacy-sensitive parameters. The current ECH specification provides some mitigations, but their coverage is incomplete. Thus, improving ECH’s resistance to traffic analysis is an important direction for future work.
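One generic way to decouple ciphertext length from the server name is padding. The sketch below pads names to a multiple of a fixed block size; the block size and the zero-byte scheme are assumptions for illustration and are not taken from the ECH specification.

```python
PAD_BLOCK = 32  # assumed padding granularity for this sketch

def pad_name(name):
    """Pad the encoded name with zero bytes up to the next multiple of PAD_BLOCK."""
    raw = name.encode("ascii")
    padded_len = -(-len(raw) // PAD_BLOCK) * PAD_BLOCK  # ceiling division
    return raw + b"\x00" * (padded_len - len(raw))

def unpad_name(padded):
    """Strip the zero padding (a zero byte cannot appear inside a hostname)."""
    return padded.rstrip(b"\x00").decode("ascii")

for name in ["a.io", "example.com", "a-rather-long-hostname.example"]:
    print(name, len(name), len(pad_name(name)))  # padded length is 32 for all three
```

Names longer than one block still fall into a larger bucket, so padding of this kind narrows what length reveals without eliminating it entirely.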
The spectre of ossification
An important open question for ECH is the impact it will have on network operations.
One of the lessons learned from the deployment of TLS 1.3 is that upgrading a core Internet protocol can trigger unexpected network behavior. Cloudflare was one of the first major TLS operators to deploy TLS 1.3 at scale; when browsers like Firefox and Chrome began to enable it on an experimental basis, they observed a significantly higher rate of connection failures compared to TLS 1.2. The root cause of these failures was network ossification, i.e., the tendency of middleboxes (network appliances between clients and servers that monitor and sometimes intercept traffic) to run software that expects traffic to look and behave a certain way. Changing the protocol before middlebox vendors had the chance to update their software led to middleboxes trying to parse packets they didn’t recognize, triggering software bugs that, in some instances, caused connections to be dropped completely.
This problem was so widespread that, instead of waiting for network operators to update their software, the design of TLS 1.3 was altered in order to mitigate the impact of network ossification. The ingenious solution was to make TLS 1.3 “look like” another protocol that middleboxes are known to tolerate. Specifically, the wire format and even the contents of handshake messages were made to resemble TLS 1.2. These two protocols aren’t identical, of course (a curious network observer can still distinguish between them), but they look and behave similarly enough to ensure that the majority of existing middleboxes don’t treat them differently. Empirically, it was found that this strategy reduced the connection failure rate enough to make deployment of TLS 1.3 viable.
Once again, ECH represents a significant upgrade for TLS for which the spectre of network ossification looms large. The ClientHello contains parameters, like SNI, that have existed in the handshake for a long time, and we don’t yet know what the impact will be of encrypting them. In anticipation of the deployment issues ossification might cause, the ECH protocol has been designed to look as much like a standard TLS 1.3 handshake as possible. The most notable difference is the ECH extension itself: if middleboxes ignore it — as they should, if they are compliant with the TLS 1.3 standard — then the rest of the handshake will look and behave very much as usual.
It remains to be seen whether this strategy will be enough to ensure the wide-scale deployment of ECH. If so, it is notable that this new feature will help to mitigate the impact of future TLS upgrades on network operations. Encrypting the full handshake reduces the risk of ossification since it means that there are fewer visible protocol features for software to ossify on. We believe this will be good for the health of the Internet overall.
The old TLS handshake is (unintentionally) leaky. Operational requirements of both the client and server have led to privacy-sensitive parameters, like SNI, being negotiated completely in the clear and available to network observers. The ECH extension aims to close this gap by enabling encryption of the full handshake. This represents a significant upgrade to TLS, one that will help preserve end-user privacy as the protocol continues to evolve.
The ECH standard is a work-in-progress. As this work continues, Cloudflare is committed to doing its part to ensure this important upgrade for TLS reaches Internet-scale deployment.
The first phase of the Internet lasted until the early 1990s. During that time it was created and debugged, and grew globally. Its growth was not hampered by concerns about data security or privacy. Until the 1990s the race was for connectivity.
Connectivity meant that people could get online and use the Internet wherever they were. Because the “inter” in Internet implied interoperability, the network was able to grow rapidly using a variety of technologies. Think dialup modems using ordinary phone lines, cable modems sending the Internet over coax originally designed for television, Ethernet, and, later, fibre optic connections and WiFi.
By the 1990s, the Internet was being used widely and for uses far beyond its academic origins. Early web pioneers, like Netscape, realized that the potential for e-commerce was gigantic but would be held back if people couldn’t have confidence in the security of online transactions.
Thus, with the introduction of SSL in 1994, the Internet moved to a second phase where security became paramount. Securing the web, and the Internet more generally, helped create the dotcom rush and the secure, online world we live in today. But this security was misunderstood by some as providing guarantees about privacy which it did not.
People feel safe going online to shop, read the news, look up ailments and search for a life partner because cryptography prevents an eavesdropper from seeing what they are doing, and provides a guarantee that a website is who it claims to be. But it does not provide any privacy guarantee. The website you are visiting knows, at the very least, the IP address of your Internet connection.
And even with encryption, a well-placed eavesdropper can learn at least the names of websites you are visiting because that information leaks from protocols that weren’t designed to preserve privacy.
People who aim to remain anonymous on the Internet therefore turn to technologies like Tor or VPNs. But remaining anonymous from a website you shop from or an airline’s online booking site doesn’t make any sense. In those instances, the company you are dealing with will know who you are because you tell them your home address, name, passport number etc. You want them to know.
That makes privacy a nuanced thing: you want to remain anonymous to an eavesdropper but make sure a retailer knows where you live.
The connectivity phase of the Internet made it possible for you to connect to a computer anywhere in the world just as easily as one in your own city. The security phase of the Internet solved the problem of giving you confidence to hand over information to an airline or a retailer. Combining these two phases resulted in an Internet you can trust to transmit your data, but little control over where that data ultimately ended up.
A French citizen could just as easily buy goods from a Spanish website as from a North American one. In both cases, the retailer would know the French name and address where the purchases were to be delivered. This creates a conundrum for a privacy-conscious citizen. The Internet created an amazing global platform for commerce, news and information (how easy it is for the French citizen to stay in contact with family in Cote d’Ivoire and even read the local news there from afar).
And while the French citizen was shopping, an eavesdropper (such as an ISP, a coffee shop owner or an intelligence agency) could tell which website they were visiting.
And the Internet also meant that your and my information is dispersed across the world. And different countries have different rules about how that data is to be stored and shared. And countries and regions have data sharing agreements to allow cross-border transfer of private information about citizens.
Concerns about eavesdropping and where data ends up have created the world we are living in today where privacy concerns are coming to the forefront, especially in Europe but in many other countries as well.
In addition, the economics and flexibility of SaaS and cloud applications meant that it made sense to actually transfer data to a limited number of large data centers (which are sometimes confusingly called regions) where data from people all over the world can be processed. And, by and large, that was the world of the Internet: universal connectivity, widespread security, and data sharing through cross-border agreements.
This apparent utopia got snowed on by the leaking of secret documents describing the relationship between the US NSA (and its Five Eyes partners) and large Internet companies, and revealing that intelligence agencies were scooping up data from choke points on the Internet. Those revelations brought to the public’s attention the fact that their data could, in some cases, be accessed by foreign intelligence agencies.
Quite quickly those large data centers in far flung countries looked like a bad idea, and governments and citizens started to demand control of data. This is the third phase of the Internet. Privacy joins universal connectivity and security as core.
But what is control over data or privacy? Different governments have different ideas and different requirements, which can differ for different data sets. Some countries are convinced that the only way to control data is to keep it inside their countries, where they believe they can control who gets access to it. Other countries believe that they can address the risks by putting restrictions in place to prevent certain governments or companies from getting access to data. And the regulatory challenges are only getting more complicated.
This will be an enormous challenge for companies that have built a business on aggregating citizens’ information in order to target advertising, but it is also a challenge for anyone offering an Internet service. Just as companies have had to face the scourge of DDoS attacks and hacking, and have had to stay up to date with the latest in encryption technology, they will fundamentally have to store and process their customers’ data in different countries in different ways.
The European Union, in particular, has pushed a comprehensive approach to data privacy. Although the EU has had data protection principles in place since 1995, the implementation of the EU’s General Data Protection Regulation (GDPR) in 2018 has generated a new era of privacy online. GDPR imposes limitations on how the personal data of EU residents can be collected, stored, deleted, modified and otherwise processed.
Among the GDPR’s requirements are provisions on how EU personal data should be protected if that personal data leaves the EU. Although the US and the EU worked together to develop a set of voluntary commitments to make it easier for companies to transfer data between the two jurisdictions, that framework, the Privacy Shield, was invalidated this past summer. As a result, companies are grappling with how they can transfer data outside the EU, consistent with GDPR requirements. Recommendations recently issued by the European Data Protection Board (EDPB), which require data exporters to assess the law in third countries, determine whether that law adequately protects privacy, and if necessary, obtain guarantees of additional safeguards from data importers, have only added to companies’ concerns.
This anxiety over whether there are controls over data adequate to address the concerns of European regulators has prompted many of our customers to explore whether it is possible to prevent data subject to the GDPR from leaving the EU at all.
Gone are the days when all the world’s data could be processed in a massive data center regardless of its provenance.
One reaction to this change could be a retreat into every country building its own online email services, HR systems, e-commerce providers, and more. This would be a massive wasted effort. There are economies of scale if the same service can be used by Germans, Peruvians, Indonesians, Australians…
The answer to this privacy challenge is the same as the answer to the connectivity and security phases of the Internet: build it! We need to build a privacy-respecting Internet and give companies the tools to easily build privacy-respecting applications.
This week we’ll be talking about new tools from Cloudflare that make building privacy-respecting applications easy by allowing companies to situate their users’ data in the countries and regions of their choosing. And we’ll be talking about new protocols that build privacy into the very structure of the Internet. We’ll update on the latest quantum-resistant algorithms that help keep private data private today and into the far future.
We’ll show how it’s possible to run a massive DNS resolver service like 1.1.1.1 and preserve users’ privacy through a clever new protocol. We’ll look at how to make passwords that can’t be leaked. And we’ll give everyone the power to get web analytics without tracking people.
Welcome to Phase 3 of the Internet: always on, always secure, always private.