Tag Archives: Policy & Legal

Giving users choice with Cloudflare’s new Content Signals Policy

2025-09-24 Will Allen

Post Syndicated from Will Allen original https://blog.cloudflare.com/content-signals-policy/

If we want to keep the web open and thriving, we need more tools to express how content creators want their data to be used while allowing open access. Today the tradeoff is too limited. Either website operators keep their content open to the web and risk people using it for unwanted purposes, or they move their content behind logins and limit their audience.

To address the concerns our customers have today about how their content is being used by crawlers and data scrapers, we are launching the Content Signals Policy. This policy is a new addition to robots.txt that allows you to express your preferences for how your content can be used after it has been accessed.

What `robots.txt` does, and does not, do today

Robots.txt is a plain text file hosted on your domain that implements the Robots Exclusion Protocol. It allows you to specify which crawlers and bots can access which parts of your site. Many crawlers and some bots obey robots.txt files, but not all do.

For example, if you wanted to allow all crawlers to access every part of your site, you could host a robots.txt file that has the following:

User-agent: * 
Allow: /

A user-agent is how your browser, or a bot, identifies itself to the resource it is accessing. In this case, the asterisk tells visitors that any user agent, on any device or browser, can access the content. The / in the Allow field tells the visitor that they can access any part of the site as well.

The robots.txt file can also include commentary by adding characters after the # symbol. Bots and machines will ignore these comments, but it is one way to leave more human-readable notes to someone reviewing the file. Here is one example:

#    .__________________________.
#    | .___________________. |==|
#    | | ................. | |  |
#    | | ::[ Dear robot ]: | |  |
#    | | ::::[ be nice ]:: | |  |
#    | | ::::::::::::::::: | |  |
#    | | ::::::::::::::::: | |  |
#    | | ::::::::::::::::: | |  |
#    | | ::::::::::::::::: | | ,|
#    | !___________________! |(c|
#    !_______________________!__!
#   /                            \
#  /  [][][][][][][][][][][][][]  \
# /  [][][][][][][][][][][][][][]  \
#(  [][][][][____________][][][][]  )
# \ ------------------------------ /
#  \______________________________/

Website owners can make robots.txt more specific by listing certain user-agents (for example, to permit only certain bot or browser user-agents) and by stating which parts of a site they are or are not allowed to crawl. The example below tells bots to skip crawling the archives path.

User-agent: * 
Disallow: /archives/

And the example here gets more specific, telling Google’s bot to skip crawling the archives path.

User-agent: Googlebot 
Disallow: /archives/

This allows you to specify which crawlers are allowed and what parts of your site they can access. It does not, however, let them know what they are able to do with your content after accessing it. As many have realized, there needs to be a standard, machine-readable way to signal the rules of your road for how your data can be used even after it has been accessed.

That is what the Content Signals Policy allows you to express: your preferences for what a crawler can, and cannot do with your content.
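To make the access rules above concrete, here is a minimal sketch of how a well-behaved crawler might evaluate them, using Python's standard `urllib.robotparser` module. The `example.com` URLs, the `ExampleBot` name, and the `may_crawl` helper are assumptions made for illustration; the rules combine the two Disallow/Allow examples shown earlier in this section.

```python
from urllib.robotparser import RobotFileParser

# The Googlebot-specific rule and the catch-all rule from the examples above.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /archives/

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(user_agent: str, url: str) -> bool:
    """Return True if the parsed robots.txt lets this user agent fetch the URL."""
    return parser.can_fetch(user_agent, url)


# Googlebot is told to skip the archives path but may crawl everything else.
print(may_crawl("Googlebot", "https://example.com/archives/2020/"))
print(may_crawl("Googlebot", "https://example.com/blog/"))
# A hypothetical bot with no dedicated group falls back to the "*" group.
print(may_crawl("ExampleBot", "https://example.com/archives/2020/"))
```

Note that this only answers the access question. Nothing in these rules says what a crawler may do with a page once it has been fetched.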

Why are we launching the Content Signals Policy now?

There are companies that scrape vast troves of data from the Internet every day. There is a real cost to website operators to serve these data scrapers, in particular when they receive no compensation in return; we are experiencing a classic free-rider problem. This is only going to get worse: we expect bot traffic to exceed human traffic on the Internet by the end of 2029, and by 2031, we anticipate that bot activity alone will surpass the sum of current Internet traffic.

The de facto defaults of the Internet permitted this. The norm had been that your data would be ingested, but then you, the creator of that content, would get something in return: either referral traffic that you could monetize, or at a minimum some sort of attribution that cited you as the author. Think of the linkback in the early days of blogging, which was a way to give credit to the original creator of the work. No money changed hands, but that attribution drove future discovery and had intrinsic value. This norm has been embedded in many permissive licenses such as MIT and Creative Commons, each of which requires attribution back to the original creator.

That world has changed; that scraped content is now sometimes used to economically compete against the original creator. It’s left many with an impossible choice: do you lock down access to your content and data, or accept the reality of fewer referrals and minimal attribution? If the only recourse is the former, the open transmission of ideas on the web is harmed and newer entrants to the AI ecosystem are put at an unfair disadvantage for their efforts to train new models.

The Cloudflare Content Signals Policy

The Content Signals Policy integrates into website operators’ robots.txt files. It is human-readable text following the # symbol to designate it as a comment. This policy defines three content signals – search, ai-input, and ai-train – and their relevance to crawlers.

A website operator can then optionally express their preferences via machine-readable content signals.

# As a condition of accessing this website, you agree to abide by the following content signals:

# (a)  If a content-signal = yes, you may collect content for the corresponding use.
# (b)  If a content-signal = no, you may not collect content for the corresponding use.
# (c)  If the website operator does not include a content signal for a corresponding use, the website operator neither grants nor restricts permission via content signal with respect to the corresponding use.

# The content signals and their meanings are: 

# search: building a search index and providing search results (e.g., returning hyperlinks and short excerpts from your website's contents).  Search does not include providing AI-generated search summaries.
# ai-input: inputting content into one or more AI models (e.g., retrieval augmented generation, grounding, or other real-time taking of content for generative AI search answers). 
# ai-train: training or fine-tuning AI models.

# ANY RESTRICTIONS EXPRESSED VIA CONTENT SIGNALS ARE EXPRESS RESERVATIONS OF RIGHTS UNDER ARTICLE 4 OF THE EUROPEAN UNION DIRECTIVE 2019/790 ON COPYRIGHT AND RELATED RIGHTS IN THE DIGITAL SINGLE MARKET.

There are three parts to this text:

The first paragraph explains to companies how to interpret any given content signal. “Yes” means go, “no” means stop, and the absence of a signal conveys no meaning. That final, neutral option is important: it lets website operators express a preference with respect to one content signal without requiring them to do so for another.
The second paragraph defines the content signals vocabulary. We kept the signals simple to make it easy for anyone accessing content to abide by them.
The final paragraph reminds those automating access to data that these content signals might carry legal weight in various jurisdictions.

A website operator can then announce their specific preferences in machine-readable text using comma-delimited, ‘yes’ or ‘no’ syntax. If a website operator wants to allow search, disallow training, and express no preference regarding ai-input, they could include the following in their robots.txt:

User-Agent: *
Content-Signal: search=yes, ai-train=no 
Allow: /

If a website operator leaves the content signal for ai-input blank like in the above example, it does not mean they have no preference regarding that use; it just means they have not used this part of their robots.txt file to express it.
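As a rough illustration of how a crawler operator might read these signals, the sketch below parses `Content-Signal` lines per user-agent group. The grammar it accepts is an assumption inferred only from the examples in this post (comma-delimited `name=yes|no` pairs inside a `User-Agent` group), and `parse_content_signals` and `signal_for` are hypothetical helper names, not part of any published library.

```python
from typing import Dict, List, Optional

KNOWN_SIGNALS = ("search", "ai-input", "ai-train")

SAMPLE = """\
User-Agent: *
Content-Signal: search=yes, ai-train=no
Allow: /
"""


def parse_content_signals(robots_txt: str) -> Dict[str, Dict[str, bool]]:
    """Map each user-agent group to the content signals it declares."""
    signals: Dict[str, Dict[str, bool]] = {}
    current_agents: List[str] = []
    in_rules = False  # becomes True once the current group has a rule line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # comments carry no signals
        if not line or ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                current_agents = []
                in_rules = False
            current_agents.append(value.lower())
            continue
        in_rules = True
        if field != "content-signal":
            continue
        for item in value.split(","):
            name, sep, setting = item.partition("=")
            name, setting = name.strip().lower(), setting.strip().lower()
            if sep and name in KNOWN_SIGNALS and setting in ("yes", "no"):
                for agent in current_agents:
                    signals.setdefault(agent, {})[name] = setting == "yes"
    return signals


def signal_for(signals: Dict[str, Dict[str, bool]], agent: str, use: str) -> Optional[bool]:
    """True or False when a signal is present; None when none was expressed."""
    group = signals.get(agent.lower(), signals.get("*", {}))
    return group.get(use)


parsed = parse_content_signals(SAMPLE)
print(parsed)
print(signal_for(parsed, "ExampleBot", "ai-input"))
```

Returning `None` for a missing signal, rather than defaulting to yes or no, mirrors the policy text: an absent signal neither grants nor restricts permission.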

How to add content signals to your website

If you already know how to configure your robots.txt file, deploying content signals is as simple as adding the Content Signals Policy above and then defining your preferences via a content signal.

We want to make adopting content signals simple. Cloudflare customers have already turned on our managed robots.txt feature for over 3.8 million domains. By doing so, they have chosen to instruct companies that they do not want the content on those domains to be used for AI training. For these customers, we will update the robots.txt file that we already serve on their behalf to include the Content Signals Policy and the following signals:

Content-Signal: search=yes, ai-train=no

We will not serve an “ai-input” signal for our managed robots.txt customers. We don’t know their preference with respect to that signal, and we don’t want to guess.
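The same "no guess" behavior can be sketched from the publishing side. In the hypothetical helper below, a preference of `None` means the operator has not decided, so that signal is simply left out of the generated line. The function name and the dictionary shape are assumptions for illustration, not a Cloudflare API.

```python
from typing import Dict, Optional


def build_content_signal_line(prefs: Dict[str, Optional[bool]]) -> str:
    """Render a Content-Signal line, omitting signals with no stated preference."""
    parts = [
        f"{name}={'yes' if allowed else 'no'}"
        for name, allowed in prefs.items()
        if allowed is not None  # undecided signals are omitted, not guessed
    ]
    return "Content-Signal: " + ", ".join(parts) if parts else ""


# Allow search, disallow training, and leave ai-input unexpressed.
print(build_content_signal_line({"search": True, "ai-input": None, "ai-train": False}))
```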

Starting today, we will also serve the commented, human-readable Content Signals Policy for any free customer zone that does not have an existing robots.txt file. In practice, that means a request to robots.txt on that domain would return the comments that define what content signals are. These comments are ignored by crawlers. Importantly, it will not include any Allow or Disallow directives, nor will it serve any actual content signals. The users are the ones to choose and express their actual preferences if and when they are ready to do so. Customers with an existing robots.txt file will see no change.

Zones on a free plan can turn off the Content Signals Policy in the Security Settings section of the Cloudflare dashboard, as well as via the Overview section.

To create your own content signals, just copy and paste the text that we help you generate at ContentSignals.org into your robots.txt file, or immediately deploy via the Deploy to Cloudflare button. You can alternatively turn on our managed robots.txt feature if you would like to express your preference to disallow training.

It’s important to remember that content signals express preferences; they are not technical countermeasures against scraping. Some companies might simply ignore them. If you are a website publisher seeking to control what others do with your content, we think it is best to combine your content signals with WAF rules and Bot Management.

While these Cloudflare features aim to make content signals easier to use, we want to encourage adoption by anyone, anywhere. In order to promote this practice, we are releasing this policy under a CC0 License, which allows anyone to implement and use it freely.

What’s next

Our customers are fully in the driver’s seat for what crawlers they want to allow and what they’d like to block. Some want to write for the superintelligence, others want more control: we think they should be the ones to decide.

Content signals allow anyone to express how they want their content to be used after it has been accessed. Enabling the ability to express preferences was overdue.

We know there’s more work to do. Signaling the rules of the road only works if others recognize those rules. That’s why we’ll continue to work in standards bodies to develop and standardize solutions that meet the needs of our customers and are accepted by the broader Internet community.

We hope you’ll join us in these efforts: the open web is worth fighting for.

To build a better Internet in the age of AI, we need responsible AI bot principles. Here’s our proposal.

2025-09-24 Leah Romm

Post Syndicated from Leah Romm original https://blog.cloudflare.com/building-a-better-internet-with-responsible-ai-bot-principles/

Cloudflare has a unique vantage point: we see not only how changes in technology shape the Internet, but also how new technologies can unintentionally impact different stakeholders. Take, for instance, the increasing reliance by everyday Internet users on AI-powered chatbots and search summaries. On the one hand, end users are getting information faster than ever before. On the other hand, web publishers, who have historically relied on human eyeballs to their website to support their businesses, are seeing a dramatic decrease in those eyeballs, which can reduce their ability to create original high-quality content. This cycle will ultimately hurt end users and AI companies (whose success relies on fresh, high-quality content to train models and provide services) alike.

We are indisputably at a point in time when the Internet needs clear “rules of the road” for AI bot behavior (a note on terminology: throughout this blog we refer to AI bots and crawlers interchangeably). We have had ongoing cross-functional conversations, both internally and with stakeholders and partners across the world, and it’s clear to us that the Internet at large needs key groups — publishers and content creators, bot operators, and Internet infrastructure and cybersecurity companies — to reach a consensus on certain principles that AI bots should follow.

Of course, agreeing on what exactly those principles are will take time and require continued discussion and collaboration, and a policy framework can’t perfectly capture every technical concern. Nevertheless, we think it’s important to start a conversation that we hope others will join. After all, a rough draft is better than a blank page.

That is why we are proposing the following responsible AI bot principles as starting points:

Public disclosure: Companies should publicly disclose information about their AI bots;
Self-identification: AI bots should truthfully self-identify, eventually replacing less reliable methods, like user agent and IP address verification, with cryptographic verification;
Declared single purpose: AI bots should have one distinct purpose and declare it;
Respect preferences: AI bots should respect and comply with preferences expressed by website operators where proportionate and technically feasible;
Act with good intent: AI bots must not flood sites with excessive traffic or engage in deceptive behavior.

Each principle is discussed in greater detail below. These principles focus on AI bots because of the impact generative AI is having on the Internet, but we have already seen these practices in action with other types of (non-AI) bots as well. We believe these principles will help move the Internet in a better direction. That said, we acknowledge that they are a starting point for this conversation, which requires input from other stakeholders. The Internet has always been a collaborative place for innovation, and these principles should be seen as equally dynamic and evolving.

Why Cloudflare is encouraging this conversation

Since declaring July 1st Content Independence Day, Cloudflare has strived to play a balanced and effective role in safeguarding the future of the Internet in the age of generative AI. We have enabled customers to charge AI crawlers for access or block them with one click, published and enforced our verified bots policy and developed the Web Bot Auth proposal, and unapologetically called out and stopped bad behavior.

While we have recently focused our attention on AI crawlers, Cloudflare has long been a leader in the bot management space, helping our customers protect their websites from unwanted (and even malicious) traffic. We also want to make sure that anyone, whether they’re our customer or not, can see which AI bots are abiding by all, some, or none of these best practices.

But we aren’t ignorant to the fact that companies operating crawlers are also adapting to a new Internet landscape — and we genuinely believe that most players in this space want to do the right thing, while continuing to innovate and propel the Internet in an exciting direction. Our hope is that we can use our expertise and unique vantage point on the Internet to help bring seemingly incompatible parties together and find a path forward — continuing our mission of helping to build a better Internet for everyone.

Responsible AI bot principles

The following principles are a launchpad for a larger conversation, and we recognize that there is work to be done to address many nuanced perspectives. We envision these principles applying to AI bots but understand that technical complexity may require flexibility. Ultimately, our goal is to emphasize transparency, accountability, and respect for content access and use preferences. If these principles fall short of that — or fail to consider other important priorities — we want to know.

Principle #1: Public disclosure

Companies should publicly disclose information about their AI bots. The following information should be publicly available and easy to find:

Identity: information that helps external parties identify a bot, e.g., user agent, relevant IP address(es), and/or individual cryptographic identification (more on this below, in Principle #2: Self-identification).
Operator: the legal entity responsible for the AI bot, including a point of contact (e.g., for reporting abuse);
Purpose: for which purpose the accessed data will be used, i.e., search, AI-input, or training (more on this below, in Principle #3: Declared Single Purpose).

OpenAI is an example of a leading AI company that clearly discloses their bots, complete with detailed explanations of each bot’s purpose. The benefits of this disclosure are apparent in the subsequent principles. It helps website operators validate that a given request is in fact coming from OpenAI and what its purpose is (e.g., search indexing or AI model training). This, in turn, enables website operators to control access to and use of their content through preference expression mechanisms, like robots.txt files.

Principle #2: Self-identification

AI bots should truthfully self-identify. Not only should information about bots be disclosed in a publicly accessible location, this information should also be clearly communicated by bots themselves, e.g., through an HTTP request that conveys the bot’s official user agent and comes from an IP address that the bot claims to send traffic from. Admittedly, this current approach is flawed, as we discuss in more detail below. But until cryptographic verification is more widely adopted, we think relying on user agent and IP verification is better than nothing.

OpenAI’s GPTBot is an example of this principle in action. OpenAI publicly shares the expected full user-agent string for this bot and includes it in its requests. OpenAI also explains this bot’s purpose (“used to make [OpenAI’s] generative AI foundation models more useful and safe” and “to crawl content that may be used in training [their] generative AI foundation models”). And we have observed this bot sending traffic from IP addresses reported by OpenAI. Because site operators see GPTBot’s user agent and IP addresses matching what is publicly disclosed and expected, and they know information about the bot is publicly documented, they can confidently recognize the bot. This enables them to make informed decisions about whether they want to allow traffic from it.

Unfortunately, not all bots uphold this principle, making it difficult for website owners to know exactly which bot operators respect their crawl preferences, much less enforce them. For example, Anthropic publishes only its user agent; absent other verifiable information, it’s unclear which requests are truly from Anthropic. And xAI’s bot, grok, does not self-identify at all, making it impossible for website operators to block it. Anthropic’s and xAI’s lack of identification undermines trust between them and website owners, yet this could be fixed with minimal effort on their parts.
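The user-agent-plus-IP check described above can be sketched with Python's standard `ipaddress` module. Everything specific here is an assumption for illustration: `ExampleBot` is an imaginary bot, the address ranges are reserved documentation ranges rather than any operator's real list, and `looks_like_published_bot` is a hypothetical helper.

```python
import ipaddress

# Hypothetical values standing in for what a bot operator publishes.
PUBLISHED_USER_AGENT_TOKEN = "ExampleBot"
PUBLISHED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),   # documentation-only IPv4 range
    ipaddress.ip_network("2001:db8::/32"),  # documentation-only IPv6 range
]


def looks_like_published_bot(user_agent: str, remote_ip: str) -> bool:
    """Check that a request matches both the published user agent and IP ranges."""
    if PUBLISHED_USER_AGENT_TOKEN.lower() not in user_agent.lower():
        return False
    address = ipaddress.ip_address(remote_ip)
    # Membership is False when the address and network IP versions differ.
    return any(address in network for network in PUBLISHED_RANGES)


ua = "Mozilla/5.0 (compatible; ExampleBot/1.0)"
print(looks_like_published_bot(ua, "192.0.2.10"))   # claimed UA, published IP
print(looks_like_published_bot(ua, "203.0.113.5"))  # claimed UA, unlisted IP
```

A user agent on its own is only a claim. Pairing it with a published IP range is what lets a site operator reject a request that merely copies the string, although this remains weaker than cryptographic verification.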

A note on cryptographic verification and the future of Principle #2

Truthful declaration of user agent and dedicated IP lists have historically been a functional way to verify bots. But in today’s rapidly-evolving bot climate, bots are increasingly vulnerable to being spoofed by bad actors. These bad actors, in turn, ignore robots.txt, which communicates allow/disallow preferences only on a user agent basis (so, a bad bot could spoof a permitted user agent and circumvent that domain’s preferences).

Ultimately, every AI bot should be cryptographically verified using an accepted standard. This would protect them against spoofing and ensure website operators have the accurate and reliable information they need to properly evaluate access by AI bots. At this time, we believe that Web Bot Auth is sufficient proof of compliance with Principle #2. We recognize that this standard is still in development, and, as a result, this principle may evolve accordingly.

Web Bot Auth uses cryptography to verify bot traffic; cryptographic signatures in HTTP messages are used as verification that a given request came from an automated bot. Our implementation relies on proposed IETF directory and protocol drafts. Initial reception of Web Bot Auth has been very positive, and we expect even more adoption. For example, a little over a month ago, Vercel announced that its bot verification now supports Web Bot Auth. And OpenAI’s ChatGPT agent now signs its requests using Web Bot Auth, in addition to using the HTTP Message Signatures standard.

We envision a future where cryptographic authentication becomes the norm, as we believe this will further strengthen the trustworthiness of bots.

Principle #3: Declared single purpose

AI bots should have one distinct purpose and declare it. Today, some bots self-identify their purpose as Training, Search, or User Action (i.e., accessing a web page in response to a user’s query).

However, these purposes are sometimes combined without clear distinction. For example, content accessed for search purposes might also be used to train the AI model powering the search engine. When a bot’s purpose is unclear, website operators face a difficult decision: block it and risk undermining search engine optimization (SEO), or allow it and risk content being used in unwanted ways.

When operators deploy bots with distinct purposes, website owners are able to make clear decisions over who can access their content. What those purposes should be is up for debate, but we think the following breakdown is a starting point based on bot activity we see. We recognize this is an evolving space and changes may be required as innovation continues:

Search: building a search index and providing search results (e.g., returning hyperlinks and short excerpts from your website’s contents). Search does not include providing AI-generated search summaries;
AI-input: inputting content into one or more AI models, e.g., retrieval-augmented generation (RAG), grounding, or other real-time taking of content for generative AI search answers; and
Training: training or fine-tuning AI models.

Relatedly, bots should not combine purposes in a way that prevents web operators from deliberately and effectively deciding whether to allow crawling.

Let’s consider two AI bots, OAI-SearchBot and Googlebot, from the perspective of Vinny, a website operator trying to make a living on the Internet. OAI-SearchBot has a single purpose: linking to and surfacing websites in ChatGPT’s search features. If Vinny takes OpenAI at face value (which we think it makes sense to do), he can trust that OAI-SearchBot does not crawl his content for training OpenAI’s generative AI models; rather, a separate bot (GPTBot, as discussed in Principle #2: Self-identification) does. Vinny can decide how he wants his content used by OpenAI, e.g., permitting its use for search but not for AI training, and feel confident that his choices are respected because OAI-SearchBot only crawls for search purposes, while GPTBot is not granted access to the content in the first place (and therefore cannot use it).

On the other hand, while Googlebot scrapes content for traditional search-indexing (not model training), it also uses that content for inference purposes, such as for AI Overviews and AI Mode. Why is this a problem for Vinny? While he almost certainly wants his content appearing in search results, which drive the human eyeballs that fund his site, Vinny is forced to also accept that his content will appear in Google’s AI-generated summaries. If eyeballs are satisfied by the summary then they never visit Vinny’s website, which leads to “zero-click” searches and undermines Vinny’s ability to financially benefit from his content.

This is a vicious cycle: creating high-quality content, which typically leads to higher search rankings, now inadvertently also reduces the chances an eyeball will visit the site because that same valuable content is surfaced in an AI Overview (if it is even referenced as a source in the summary). To prevent this, Vinny must either opt out of search completely or use snippet controls (which risks degrading how his content appears in search results). This is because the only available signal to opt out of AI, disallowing Google-Extended, is limited to training and does not apply to AI Overviews, which are attached to search. Whether by accident or by design, this setup forces an impossible choice onto website owners.

Finally, the prominent technical argument in favor of combining multiple purposes — that this reduces the crawler operator’s costs — needs to be debunked. To reason by analogy: it’s like arguing that placing one call to order two pizzas is cheaper than placing two calls to order two pizzas. In reality, the cost of the two pizzas (both of which take time and effort to make) remains the same. The extra phone call may be annoying, but its costs are negligible.

Similarly, whether one bot request is made for two purposes (e.g., search indexing and AI model training) or a separate bot request is made for each of two purposes, the costs basically remain the same. For the crawler, the cost of compute is the same because the content still needs to be processed for each purpose. And the cost of two connections (i.e., for two requests) is virtually the same as one. We know this because Cloudflare runs one of the largest networks in the world, handling on average 84 million requests per second, so we understand the cost of requests at Internet scale. (As an aside, while additional crawls incur costs on website operators, they have the ability to choose whether the crawl is worth the cost, especially when bots have a single purpose.)

Principle #4: Respect preferences

AI bots should respect and comply with preferences expressed by website operators where proportionate and technically feasible. There are multiple options for expressing preferences. Prominent examples include the longstanding and familiar robots.txt, as well as newly emerging HTTP headers.

Given the widespread use of robots.txt files, bots should make a good faith attempt to fetch a robots.txt file first, in accordance with RFC 9309, and abide by both the access and use preferences specified therein. AI bot operators should also stay up to date on how those preferences evolve as a result of a draft vocabulary currently under development by an IETF working group. The goal of the proposed vocabulary is to improve granularity in robots.txt files, so that website operators are empowered to control how their assets are used.
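A minimal sketch of the "fetch robots.txt first" behavior follows, under stated assumptions. The `fetch` callable is injected so the example runs without a network, and it is assumed to follow redirects itself; `fetch_if_allowed` and `ExampleBot` are hypothetical names. The status handling follows RFC 9309's guidance that an unavailable robots.txt (4xx) permits access, while an unreachable one (5xx) should be treated as a complete disallow. The sketch covers access rules only, not use preferences.

```python
from typing import Callable, Optional, Tuple
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

# fetch(url) -> (status_code, body_text); supplied by the caller.
Fetch = Callable[[str], Tuple[int, str]]


def fetch_if_allowed(url: str, user_agent: str, fetch: Fetch) -> Optional[str]:
    """Fetch a page only after consulting the site's robots.txt."""
    parts = urlsplit(url)
    robots_url = f"{parts.scheme}://{parts.netloc}/robots.txt"
    status, body = fetch(robots_url)
    if 200 <= status < 300:
        parser = RobotFileParser()
        parser.parse(body.splitlines())
        if not parser.can_fetch(user_agent, url):
            return None  # the operator disallowed this path for this agent
    elif 400 <= status < 500:
        pass  # robots.txt unavailable: access is permitted
    else:
        return None  # 5xx or unexpected status: assume complete disallow
    page_status, page_body = fetch(url)
    return page_body if page_status == 200 else None


# A tiny in-memory "site" standing in for real HTTP responses.
SITE = {
    "https://example.com/robots.txt": (200, "User-agent: *\nDisallow: /archives/\n"),
    "https://example.com/blog/": (200, "hello"),
    "https://example.com/archives/1": (200, "old post"),
}


def fake_fetch(url: str) -> Tuple[int, str]:
    return SITE.get(url, (404, ""))


print(fetch_if_allowed("https://example.com/blog/", "ExampleBot", fake_fetch))
print(fetch_if_allowed("https://example.com/archives/1", "ExampleBot", fake_fetch))
```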

At the same time, new industry standards under discussion may involve the attachment of machine-readable preferences to different formats, such as individual files. AI bot operators should eventually be prepared to comply with these standards, too. One idea currently being explored is a way for site owners to list preferences via HTTP headers, which offer a server-level method of declaring how content should be used.

Principle #5: Act with good intent

AI bots must not flood sites with excessive traffic or engage in deceptive behavior. AI bot behavior should be benign or helpful to website operators and their users. It is also incumbent on companies that operate AI bots to monitor their networks and resources for breaches and patch vulnerabilities. Jeopardizing a website’s security or performance or engaging in harmful tactics is unacceptable.

Nor is it appropriate to appear to comply with the principles, only to secretly circumvent them. Reaffirming a long-standing principle of acceptable bot behavior, AI bots must never engage in stealth crawling or use other stealth tactics to try and dodge detection, such as modifying their user agent, changing their source ASNs to hide their crawling activity, or ignoring robots.txt files. Doing so would undermine the preceding four principles, hurting website operators and worsening the Internet for all.

The road ahead: multi-stakeholder efforts to bring these principles to life

As we continue working on these principles and soliciting feedback, we strive to find a balance: we want the wishes of content creators respected while still encouraging AI innovation. It’s a privilege to sit at the intersection of these important interests and to play a crucial role in developing an agreeable path forward.

We are continuing to engage with right holders, AI companies, policy-makers, and regulators to shape global industry standards and regulatory frameworks accordingly. We believe that the influx of generative AI use need not threaten the Internet’s place as an open source of quality content. Protecting its integrity requires agreement on workable technical standards that reflect the interests of web publishers, content creators, and AI companies alike.

The whole ecosystem must continue to come together and collaborate towards a better Internet that truly works for everyone. Cloudflare advocates for neutral forums where all affected parties can discuss the impact of AI developments on the Internet. One such example is the IETF, which has current work focused on some of the technical aspects being considered. Those efforts attempt to address some, but not all, of the issues in an area that deserves holistic consideration. We believe the principles we have proposed are a step in the right direction — but we hope others will join this complex and important conversation, so that norms and behavior on the Internet can successfully adapt to this exciting new technological age.

The White House AI Action Plan: a new chapter in U.S. AI policy

2025-07-25 Zaid Zaid

Post Syndicated from Zaid Zaid original https://blog.cloudflare.com/the-white-house-ai-action-plan-a-new-chapter-in-u-s-ai-policy/

On July 23, 2025, the White House unveiled its AI Action Plan (Plan), a significant policy document outlining the current administration’s priorities and deliverables in Artificial Intelligence. This plan emerged after the White House received over 10,000 public comments in response to a February 2025 Request for Information (RFI). Cloudflare’s comments urged the White House to foster conditions for U.S. leadership in AI and support open-source AI, among other recommendations.

There is a lot packed into the three-pillar, 28-page Plan.

Pillar I: Accelerate AI Innovation. Focuses on removing regulations, enabling AI adoption and development, and ensuring the availability of open-source and open-weight AI models.
Pillar II: Build American AI Infrastructure. Prioritizes the construction of high-security data centers, bolstering critical infrastructure cybersecurity, and promoting Secure-by-Design AI technologies.
Pillar III: Lead in International AI Diplomacy and Security. Centers on providing America’s allies and partners with access to AI, as well as strengthening AI compute export control enforcement.

Each of these pillars outlines policy recommendations for various federal agencies to advance the plan’s overarching goals. There’s much that the Plan gets right. Below we cover a few parts of the Plan that we think are particularly important.

Encouraging U.S. technology leadership

The Plan takes the position that the U.S. is in a global race to achieve AI dominance, and that it is a national priority for U.S. technology companies to be the gold standard for AI globally. Through the Plan, President Trump commits his Administration to support American workers, technology, and energy to achieve that objective.

We share the view that governments have a helpful role to play in shaping rules and regulations that will enable private-sector innovation to flourish. For Cloudflare’s network to continue to operate globally, we need the U.S. government to shape and influence the right regulatory conditions. They should balance national and economic security concerns, promote consensus industry-led international standards, and support interoperable regulatory regimes.

Far too often in recent years, we’ve observed policy developments that have unnecessarily increased restrictions on U.S. technology providers and have made it challenging to operate. Protectionist mandates, including data sovereignty requirements, customer data retention policies, and various supervisory and government access requirements, do little to improve security or innovation and have unintended consequences. Protectionism increases costs for businesses, limits access to world-class technologies, and increases cybersecurity risk.

Implementing policies that guarantee access to global, distributed edge-compute networks and the freedom to choose the best technology for users’ needs will help ensure the right conditions to enable AI to flourish.

The AI ecosystem needed to spur innovation and development

The Plan endorses open-source and open-weight AI models to spur innovation and to benefit commercial and government adoption. The Plan recommends ensuring access to computing resources to increase capability in the startup and academic worlds.

Cloudflare shares the view that open-source AI models play a crucial role in driving innovation. As recognized in the Plan, these models offer companies flexibility, freeing them from dependence on closed providers and enabling the use of AI with sensitive data where exporting to closed models might not be possible. That’s why Cloudflare includes access to more than fifty open-source models as part of our Workers AI model catalog.

However, access to open-source models alone is not enough to harness AI’s potential. A complete ecosystem is needed to build and deploy the AI applications and tools that will usher in the new age imagined by the Plan. Cloudflare’s global network, with our GPU-powered inference, can play an essential role. Having a distributed network like ours which allows AI inference at the edge is critical for fast, efficient AI development and for building the next generation of AI applications.

Open ecosystems are deeply embedded in Cloudflare’s DNA. Our developer platform democratizes access, providing powerful tools for anyone to build and deploy applications. We offer global network infrastructure that removes complexities and reduces barriers. This lets AI developers innovate freely, using many different AI models, without relying on gatekeepers. Our commitment to making these tools easy to use mirrors the Plan’s call to foster innovation and support U.S. AI leadership by enabling developers to use open-source AI models to build, deploy, and scale new AI applications globally.

Enhancing cybersecurity with AI

The Plan stresses the importance of cybersecurity for AI in several ways. There are two we want to highlight.

First, it endorses the use of AI technologies for the cybersecurity of critical infrastructure. AI-assisted cyber-defense tools are force multipliers for network defenders, and will be absolutely necessary for all organizations, particularly critical infrastructure, to protect against cyber threats.

Cloudflare’s network uses predictive AI and machine learning to block 247 billion cyberattacks daily. Under the theory of Defensive AI, Cloudflare uses information to constantly improve the effectiveness of our security solutions. With AI Labyrinth, we’ve even created a new tool that uses AI to trap AI. It is a new, next generation honeypot and cybersecurity defensive tool that leverages AI to confuse crawlers and bots that ignore “no crawl” directives. Instead of blocking these bots, AI Labyrinth directs bots into an endless maze of convincing, AI-generated pages.

Second, to address potential vulnerabilities in AI technologies, the Plan tasks the U.S. government with ensuring that they are secure-by-design.

To secure AI, Cloudflare has been active in shaping the cybersecurity and risk management of AI technologies. We have supported and provided feedback to the U.S. National Institute of Standards and Technology’s efforts to develop a Cybersecurity Profile for Artificial Intelligence. This is critically important and builds on our Secure-by-Design commitment.

We look forward to working with the Administration on the proposed AI information sharing and analysis center and the proposed vulnerability information exchange.

Cloudflare stands ready to accelerate AI adoption in government

The Plan envisions the federal government playing a key role in accelerating AI adoption. Cloudflare can help. As the Plan notes, integrating AI can significantly enhance public service, making government more efficient and effective. Most, if not all, federal agencies now have Chief AI Officers, indicating a clear commitment to this technological shift. The government can further its efforts by fostering information sharing between government agencies, promoting best practices, and training its workforce to maximize AI’s efficiency gains.

Cloudflare can be a key partner in this journey. Our platform provides the secure, reliable, and scalable infrastructure necessary for federal agencies to deploy AI applications with full-stack AI building blocks. Cloudflare is FedRAMP Moderate authorized, and we are committed to FedRAMP High. By leveraging Cloudflare’s global network, federal agencies can ensure their AI initiatives are resilient and accessible, driving greater public benefit.

The need to balance the export of AI with export controls

To lead on AI internationally, the Plan outlines a dual strategy, presenting two approaches in tension with each other: aggressive AI export to allies and partners, and stringent restrictions on exporting AI compute and semiconductors. On one hand, the Plan emphasizes that providing the full U.S. AI technology stack is crucial to prevent allies from turning to rivals. This aims to solidify a global AI alliance and ensure the enduring diffusion of American technology.

Conversely, the plan calls for strengthening export control enforcement and plugging loopholes to prevent export of sensitive technologies. The administration seeks to use export controls — restrictions on what goods a company can export — to deny foreign adversaries access to certain resources for both geostrategic competition and national security concerns. The challenge arises because overly stringent export controls, while aiming to deny access to adversaries, may inadvertently make it harder to export AI even to allies.

This dual approach highlights a critical tightrope walk. Cloudflare, along with many other industry players, will be watching closely to see how the administration balances these competing goals. Providing individuals across the world with access to resources that enable them to innovate and build applications close to their end users aligns with our mission to help build a better, more connected Internet. Having a globally distributed network like ours also enables U.S. AI companies to deploy their services globally. Although we appreciate the need for restricting access to sensitive compute resources, overly broad or imprecise controls could inadvertently stifle innovation and impede the open exchange of ideas crucial for AI development. The implementation of export controls must be meticulously balanced to target adversaries effectively without unwittingly hindering the very innovation and secure global digital ecosystem it seeks to protect.

A reassuring aspect of the Plan is its clear recognition of the private sector’s indispensable role. The document repeatedly emphasizes the need for collaboration with industry and consultation with leading technology companies across various recommended policy actions. For instance, it specifically calls for establishing programs within the Department of Commerce to gather proposals from industry consortia for AI export packages. Furthermore, for strengthening AI compute export control enforcement, it advises exploring new measures “in collaboration with industry.” This commitment to partnership is essential to navigate the complexities of AI development and deployment. This collaboration with industry will ensure that policies are technically feasible, globally effective, and avoid unforeseen negative impacts on the digital economy and cybersecurity.

Shaping the future of AI together

The Plan represents a critical moment for U.S. AI leadership, and Cloudflare stands ready to partner in shaping the future of this critical technology. We applaud the Plan’s focus on accelerating AI development, building robust infrastructure, and leading global diplomacy. The Internet’s global nature means that achieving these goals requires a delicate balance, particularly as the business model for the AI-powered web rapidly evolves. Cloudflare champions an approach that fosters innovation while upholding an open, secure, and interoperable Internet. By prioritizing consensus-driven standards and ensuring that regulations do not inadvertently create barriers to a globally distributed AI infrastructure, we help ensure continued U.S. technological leadership and a sustainable, beneficial AI ecosystem.

Russian Internet users are unable to access the open Internet

2025-06-27 Michael Tremante

Post Syndicated from Michael Tremante original https://blog.cloudflare.com/russian-internet-users-are-unable-to-access-the-open-internet/

Since June 9, 2025, Internet users located in Russia and connecting to web services protected by Cloudflare have been throttled by Russian Internet Service Providers (ISPs).

As the throttling is being applied by local ISPs, the action is outside of Cloudflare’s control and we are unable, at this time, to restore reliable, high-performance access to Cloudflare products and protected websites for Russian users in a lawful manner.

Internal data analysis suggests that the throttling allows Internet users to load only the first 16 KB of any web asset, rendering most web navigation impossible.

Cloudflare has not received any formal outreach or communication from Russian government entities about the motivation for such an action. Unfortunately, the actions are consistent with longstanding Russian efforts to isolate the Internet within its borders and reduce reliance on Western technology by replacing it with domestic alternatives. Indeed, Russian President Vladimir Putin recently publicly threatened to throttle US tech companies operating inside Russia.

External reports corroborate our analysis, and further suggest that a number of other service providers are also affected by throttling or other disruptive actions in Russia, including at least Hetzner, DigitalOcean, and OVH.

The impact

Cloudflare is seeing disruptions across connections initiated from inside Russia, even when the connection reaches our servers outside of Russia. Consistent with public reporting on Russia’s practices, this suggests that the disruption is happening inside Russian ISPs, close to users.

Russian Internet Service Providers (ISPs) confirmed to be implementing these disruptive actions include, but are not limited to, Rostelecom, Megafon, Vimpelcom, MTS, and MGTS.

Based on our observations, Russian ISPs are using several throttling and blocking mechanisms affecting sites protected by Cloudflare, including injected packets to halt the connection and blocking packets so the connection times out. A new tactic that began on June 9 limits the amount of content served to 16 KB, which renders many websites barely usable.

The throttling affects all connection methods and protocols, including HTTP/1.1 and HTTP/2 on TCP and TLS, as well as HTTP/3 on QUIC.

The view from Cloudflare data

Traffic trends

Cloudflare Radar exists to share insights and bring transparency to Internet trends. The high rate of connectivity errors to all our data centers has resulted in an overall decrease in traffic served to Russian users. The reduction in traffic can be observed on Cloudflare Radar:

Client-side reports via Network Error Logging

Some customers elect to enable W3C-defined Network Error Logging (NEL), a feature that embeds error-reporting instructions inside the headers of web content that users request. The instructions tell web browsers what errors to report, and how to do so. Below is a view of NEL reports that show an increase of TCP connections being ‘reset’ prematurely (as explained in our tampering and Radar resets blogs). Separately, the large growth in h3.protocol.error shows that QUIC connections have been greatly affected:
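As an aside on the mechanism itself (separate from the report data discussed here), the sketch below shows roughly what NEL instructions in response headers can look like. It assumes the `NEL` and `Report-To` header fields from the W3C Network Error Logging and Reporting drafts; the endpoint URL and the group name `nel-example` are hypothetical, and this is not Cloudflare's actual configuration.

```python
import json


def nel_headers(report_url: str, group: str = "nel-example",
                max_age: int = 604800) -> dict[str, str]:
    """Build illustrative NEL response headers.

    report_url and group are placeholders chosen for this sketch.
    """
    # Report-To names a reporting group and the endpoint(s) that receive reports.
    report_to = {
        "group": group,
        "max_age": max_age,
        "endpoints": [{"url": report_url}],
    }
    # NEL tells the browser which reporting group to use for network errors.
    nel = {"report_to": group, "max_age": max_age}
    return {"Report-To": json.dumps(report_to), "NEL": json.dumps(nel)}


if __name__ == "__main__":
    for name, value in nel_headers("https://reports.example/nel").items():
        print(f"{name}: {value}")
```

The two headers are linked only by the group name, which is why the sketch builds both from the same `group` argument.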

Corroboration of throttling using internal data

The effects of the throttling can also be observed in our internal tooling. The chart below shows packet loss to our Russian data centers, each data center represented by a different line. The Y-axis is the proportion of packet loss:

High packet loss is a strong signal but does not on its own indicate throttling, since there might be other explanations. For example, our servers may be trying to resend packets multiple times during some other mass failure that hinders, but does not completely halt, communication.

However, we have two additional pieces of information to work with. The first consists of public reports that “throttling” in this case means blocking all connections after 16 KB of data has been transmitted, which takes 10 to 14 packets (depending on the underlying technology). Second, we have our recently deployed “Resets and Timeouts” data that captures anomalous behaviour in TCP when it occurs within the first 10 packets. Since 10 packets can contain 16 KB of data, some connections that are blocked around 16 KB will be visible at the “Post PSH” stage in the Radar data. In TCP, the ‘PSH’ message means Cloudflare got the initial request and data transfer has begun. If the connection is blocked at this stage, then many of the sent packets will be lost.
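The packet arithmetic above can be checked with a short sketch. The per-packet payload sizes below are assumptions chosen for illustration (a 1500-byte MTU over IPv4 and a 1280-byte MTU over IPv6, each minus IP and TCP headers without options); they are not figures from the public reports, and real paths vary.

```python
import math


def packets_needed(total_bytes: int, payload_per_packet: int) -> int:
    """Number of full or partial packets required to carry total_bytes."""
    return math.ceil(total_bytes / payload_per_packet)


LIMIT_BYTES = 16 * 1024  # the 16 KB threshold described in the text

# Assumed payload sizes (hypothetical, for illustration only).
ASSUMED_PAYLOADS = {
    "TCP over IPv4, 1500-byte MTU": 1500 - 20 - 20,  # 1460 bytes
    "TCP over IPv6, 1280-byte MTU": 1280 - 40 - 20,  # 1220 bytes
}

if __name__ == "__main__":
    for label, payload in ASSUMED_PAYLOADS.items():
        print(f"{label}: {packets_needed(LIMIT_BYTES, payload)} packets")
```

Under these assumptions the threshold is reached after 12 and 14 packets respectively, which falls inside the 10 to 14 range quoted above.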

The graph below uses Radar’s Data Explorer to focus on just the Post-PSH stage, where there is a dip followed by an immediate and proportionally large increase before June 11. This pattern corresponds closely with the loss data seen above:

If you run Internet sites for Russian users

If you are using Cloudflare to protect your sites, unfortunately, at this time, Cloudflare does not have the ability to restore Internet connectivity for Russia-based users. We advise you to reach out to Russian entities and ask them to lift the throttling measures that have been put in place.

If you are a Cloudflare enterprise customer, please reach out to your account team for further assistance.

Access to a free and open Internet is critical for individual rights and economic development. We condemn any attempt to prevent Russian citizens from accessing it.

Vulnerability transparency: strengthening security through responsible disclosure

2025-05-16 Sri Pulla

Post Syndicated from Sri Pulla original https://blog.cloudflare.com/vulnerability-transparency-strengthening-security-through-responsible/

In an era where digital threats evolve faster than ever, cybersecurity isn’t just a back-office concern — it’s a critical business priority. At Cloudflare, we understand the responsibility that comes with operating in a connected world. As part of our ongoing commitment to security and transparency, Cloudflare is proud to have joined the United States Cybersecurity and Infrastructure Security Agency’s (CISA) “Secure by Design” pledge in May 2024.

By signing this pledge, Cloudflare joins a growing coalition of companies committed to strengthening the resilience of the digital ecosystem. This isn’t just symbolic — it’s a concrete step in aligning with cybersecurity best practices and our commitment to protect our customers, partners, and data.

A central goal in CISA’s Secure by Design pledge is promoting transparency in vulnerability reporting. This initiative underscores the importance of proactive security practices and emphasizes transparency in vulnerability management — values that are deeply embedded in Cloudflare’s Product Security program. We believe that openness around vulnerabilities is foundational to earning and maintaining the trust of our customers, partners, and the broader security community.

Why transparency in vulnerability reporting matters

Transparency in vulnerability reporting is essential for building trust between companies and customers. In 2008, Linus Torvalds noted that disclosure is inherently tied to resolution: “So as far as I’m concerned, disclosing is the fixing of the bug”, emphasizing that resolution must start with visibility. While this mindset might apply well to open-source projects and communities familiar with code and patches, it doesn’t scale easily to non-expert users and enterprise users who require structured, validated, and clearly communicated disclosures regarding a vulnerability’s impact. Today’s threat landscape demands not only rapid remediation of vulnerabilities but also clear disclosure of their nature, impact and resolution. This builds trust with the customer and contributes to the broader collective understanding of common vulnerability classes and emerging systemic flaws.

What is a CVE?

Common Vulnerabilities and Exposures (CVE) is a catalog of publicly disclosed vulnerabilities and exposures. Each CVE includes a unique identifier, summary, associated metadata like the Common Weakness Enumeration (CWE) and Common Platform Enumeration (CPE), and a severity score that can range from None to Critical.
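As a rough illustration of the None-to-Critical range, the sketch below maps a numeric score to a label. It assumes the severity score follows the CVSS v3.x qualitative rating scale (0.0 is None, 0.1 to 3.9 Low, 4.0 to 6.9 Medium, 7.0 to 8.9 High, 9.0 to 10.0 Critical); the text above does not name the scoring system, so treat that as an assumption.

```python
def severity_rating(score: float) -> str:
    """Map a 0.0-10.0 score to a qualitative label (CVSS v3.x scale assumed)."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("score must be between 0.0 and 10.0")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


if __name__ == "__main__":
    for sample in (0.0, 3.9, 5.5, 8.9, 9.8):
        print(sample, severity_rating(sample))
```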

The format of a CVE ID consists of a fixed prefix, the year of the disclosure, and an arbitrary sequence number, as in CVE-2017-0144. Memorable names such as “EternalBlue” (CVE-2017-0144) are often associated with high-profile exploits to enhance recall.
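The ID format can be expressed as a small parser. This is a minimal sketch that assumes a four-digit year and a sequence number of four or more digits; the function name `parse_cve_id` is made up for this example.

```python
import re

# Fixed prefix, four-digit year, then a sequence number of at least four digits.
CVE_ID = re.compile(r"CVE-(\d{4})-(\d{4,})")


def parse_cve_id(text: str):
    """Return (year, sequence) for a well-formed CVE ID, or None otherwise."""
    match = CVE_ID.fullmatch(text.strip().upper())
    if match is None:
        return None
    # The sequence is kept as a string so leading zeros are preserved.
    return int(match.group(1)), match.group(2)


if __name__ == "__main__":
    print(parse_cve_id("CVE-2017-0144"))
    print(parse_cve_id("not-a-cve"))
```

Keeping the sequence number as a string avoids turning 0144 into 144, which would no longer match the published identifier.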

What is a CNA?

As an authorized CVE Numbering Authority (CNA), Cloudflare can assign CVE identifiers for vulnerabilities discovered within our products and ecosystems. Cloudflare has been actively involved with MITRE’s CVE program since Cloudflare’s founding in 2009. As a CNA, Cloudflare assumes the responsibility to manage disclosure timelines, ensuring they are accurate, complete, and valuable to the broader industry.

Cloudflare CVE issuance process

Cloudflare issues CVEs for vulnerabilities discovered internally and through our Bug Bounty program when they affect open source software and/or our distributed closed source products.

The findings are triaged based on real-world exploitability and impact. Vulnerabilities without a plausible exploitation path, in addition to findings related to test repositories or exposed credentials like API keys, typically do not qualify for CVE issuance.

We recognize that CVE issuance involves nuance, particularly for sophisticated security issues in a complex codebase (for example, the Linux kernel). Issuance relies on impact to users and the likelihood of the exploit, which depends on the complexity of executing an attack. The growing number of CVEs issued industry-wide reflects a broader effort to balance theoretical vulnerabilities against real-world risk.

In scenarios where Cloudflare was impacted by a vulnerability, but the root cause was within another CNA’s scope of products, Cloudflare will not assign the CVE. Instead, Cloudflare may choose other mediums of disclosure, like blog posts.

How does Cloudflare disclose a CVE?

Our disclosure process begins with internal evaluation of severity and scope, and any potential privacy or compliance impacts. When necessary, we engage our Legal and Security Incident Response Teams (SIRT). For vulnerabilities reported to Cloudflare by external entities via our Bug Bounty program, our standard disclosure timeline is within 90 days. This timeline allows us to ensure proper remediation, thorough testing, and responsible coordination with affected parties. While we are committed to transparent disclosure, we believe addressing and validating fixes before public release is essential to protect users and uphold system security. For open source projects, we also issue security advisories on the relevant GitHub repositories. Additionally, we encourage external researchers to publish/blog about their findings after issues are remediated. Full details and process of Cloudflare’s external researcher/entity disclosure policy can be found via our Bug Bounty program policy page.

Outcomes

To date, Cloudflare has issued and disclosed multiple CVEs. Because of the security platforms and products that Cloudflare builds, vulnerabilities have primarily been in the areas of denial of service, local privilege escalation, logical flaws, and improper input validation. Cloudflare also believes in collaboration and open-sources some of our software stack, so CVEs in these repositories are also promptly disclosed.

Cloudflare disclosures can be found here. Below are some of the most notable vulnerabilities disclosed by Cloudflare:

CVE-2024-1765: quiche: Memory Exhaustion Attack using post-handshake CRYPTO frames

Cloudflare quiche (through version 0.19.1/0.20.0) was affected by an unlimited resource allocation vulnerability causing rapid increase of memory usage of the system running a quiche server or client.

A remote attacker could take advantage of this vulnerability by repeatedly sending an unlimited number of 1-RTT CRYPTO frames after previously completing the QUIC handshake.

Exploitation was possible for the duration of the connection, which could be extended by the attacker.

quiche 0.19.2 and 0.20.1 are the earliest versions containing the fix for this issue.
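For operators checking a deployed version against the two fixed releases, a comparison along these lines may help. It is a sketch that assumes plain dotted numeric version strings and treats every release before 0.19.2, plus 0.20.0, as affected, following the "through version 0.19.1/0.20.0" wording above; `quiche_has_fix` is a hypothetical helper, not part of quiche.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version such as '0.19.2' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def quiche_has_fix(version: str) -> bool:
    """True if the version is 0.19.2 or later on the 0.19 line, or 0.20.1 and up."""
    v = parse_version(version)
    on_fixed_019_line = (0, 19, 2) <= v < (0, 20, 0)
    on_fixed_020_or_later = v >= (0, 20, 1)
    return on_fixed_019_line or on_fixed_020_or_later


if __name__ == "__main__":
    for candidate in ("0.19.1", "0.19.2", "0.20.0", "0.20.1"):
        print(candidate, quiche_has_fix(candidate))
```

Two ranges are needed because the fix landed on two release lines; a single "greater than or equal to 0.19.2" check would wrongly accept 0.20.0.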

CVE-2024-0212: Cloudflare WordPress plugin enables information disclosure of Cloudflare API (for low-privilege users)

The Cloudflare WordPress plugin was found to be vulnerable to improper authentication. The vulnerability enables attackers with a lower privileged account to access data from the Cloudflare API.

The issue has been fixed in version 4.12.3 and later of the plugin.

CVE-2023-2754: Plaintext transmission of DNS requests in Windows 1.1.1.1 WARP client

The Cloudflare WARP client for Windows assigns loopback IPv4 addresses for the DNS servers, since WARP acts as a local DNS server that performs DNS queries securely. However, if a user is connected to WARP over an IPv6-capable network, the WARP client did not assign loopback IPv6 addresses but rather Unique Local Addresses, which under certain conditions could point towards unknown devices in the same local network, enabling an attacker to view DNS queries made by the device.
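The distinction between a loopback address and a Unique Local Address can be illustrated with Python's standard `ipaddress` module. The sample addresses below are hypothetical and are not the ones the WARP client assigns; the sketch only shows why the two address classes behave differently.

```python
import ipaddress

# Unique Local Addresses occupy fc00::/7 and can be routed within a local network.
UNIQUE_LOCAL = ipaddress.ip_network("fc00::/7")


def classify_dns_server(address: str) -> str:
    """Classify a configured DNS server address as loopback, unique-local, or other."""
    ip = ipaddress.ip_address(address)
    if ip.is_loopback:
        return "loopback"  # traffic never leaves the device
    if ip.version == 6 and ip in UNIQUE_LOCAL:
        return "unique-local"  # may reach another host on the same network
    return "other"


if __name__ == "__main__":
    for sample in ("127.0.2.2", "::1", "fd01:db8:1111::2", "2001:db8::1"):
        print(sample, classify_dns_server(sample))
```

A resolver bound to a loopback address can only be reached on the device itself, whereas queries sent to a unique-local address are ordinary network traffic that another device could answer or observe.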

This issue was patched in version 2023.7.160.0 of the WARP client (Windows).

CVE-2025-0651: Improper privilege management allows file manipulations

An improper privilege management vulnerability in Cloudflare WARP for Windows allowed file manipulation by low-privilege users. Specifically, a user with limited system permissions could create symbolic links within the C:\ProgramData\Cloudflare\warp-diag-partials directory. When the “Reset all settings” feature was triggered, the WARP service, running with SYSTEM-level privileges, followed these symlinks and could delete files outside the intended directory, potentially including files owned by the SYSTEM user.
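The general class of bug, a privileged cleanup routine following links planted by a less privileged user, can be sketched generically. The code below is not Cloudflare's fix and is POSIX-oriented: it does not handle Windows junctions or race conditions between the check and the delete, and all function and file names are hypothetical.

```python
import os
import tempfile


def resolves_inside(base_dir: str, path: str) -> bool:
    """Return True if path, with symlinks resolved, is located under base_dir."""
    base = os.path.realpath(base_dir)
    target = os.path.realpath(path)
    return os.path.commonpath([base, target]) == base


def safe_cleanup(base_dir: str) -> list[str]:
    """Delete regular files directly inside base_dir without following links."""
    removed = []
    for entry in os.scandir(base_dir):
        if entry.is_symlink():
            continue  # never act on a link that another user may have planted
        if not entry.is_file(follow_symlinks=False):
            continue  # skip directories and special files
        if resolves_inside(base_dir, entry.path):
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)


def demo() -> tuple[list[str], bool]:
    """Plant a symlink to a file outside the directory and run the cleanup."""
    with tempfile.TemporaryDirectory() as root:
        diag = os.path.join(root, "diag")
        os.mkdir(diag)
        outside = os.path.join(root, "important.txt")
        for path in (outside, os.path.join(diag, "a.txt")):
            with open(path, "w") as handle:
                handle.write("x")
        os.symlink(outside, os.path.join(diag, "link.txt"))
        removed = safe_cleanup(diag)
        return removed, os.path.exists(outside)


if __name__ == "__main__":
    print(demo())
```

In the demo, only the regular file inside the directory is removed, and the file the symlink points to survives.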

This vulnerability affected versions of WARP prior to 2024.12.492.0.

CVE-2025-23419: TLS client authentication can be bypassed due to ticket resumption (disclosed Cloudflare impact via blog post)

Cloudflare’s mutual TLS implementation caused a vulnerability in the session resumption handling. The underlying issue originated from BoringSSL’s process to resume TLS sessions. BoringSSL stored client certificates, which were reused from the original session (without revalidating the full certificate chain) and the original handshake’s verification status was not re-validated.

While Cloudflare was impacted by the vulnerability, the root cause was within NGINX’s implementation, making F5 the appropriate CNA to assign the CVE. This is an example of the alternate mediums of disclosure that Cloudflare sometimes opts for. This issue was fixed as per guidance from the respective CVE; please see our blog post for more details.

Conclusion

Irrespective of the industry, if your organization builds software, we encourage you to familiarize yourself with CISA’s “Secure by Design” principles and create a plan to implement them in your company. The CISA Secure by Design pledge is built around seven security goals, prioritizing the security of customers, and challenges organizations to think differently about security.

As we continue to enhance our security posture, Cloudflare remains committed to enhancing our internal practices, investing in tooling and automation, and sharing knowledge with the community. CVE transparency is not a one-time initiative; it’s a sustained effort rooted in openness, discipline, and technical excellence. By embedding these values in how we design, build, and secure our products, we aim to meet and exceed expectations set out in the CISA pledge and make the Internet more secure, faster, and more reliable!

For more updates on our CISA progress, review our related blog posts. Cloudflare has delivered five of the seven CISA Secure by Design pledge goals, and we aim to complete the remainder of the pledge goals in May 2025.

Cloudflare’s 2024 Transparency Reports – now live with new data and a new format

2025-02-28 Abby Vollmer

Post Syndicated from Abby Vollmer original https://blog.cloudflare.com/cloudflare-2024-transparency-reports-now-live-with-new-data-and-a-new-format/

Cloudflare’s 2024 Transparency Reports are now live, with new topics, new data points, and a new format. For over 10 years, Cloudflare has published transparency reports twice a year in order to provide information to our customers, policymakers, and the public about how we handle legal requests and abuse reports relating to the websites using our services. Such transparency reporting is now recognized as a best practice among companies offering online services, and has even been written into law with the European Union’s Digital Services Act (DSA).

While Cloudflare has been publishing transparency reports for a long time, this year we chose to revamp the report in light of new reporting obligations under the DSA, and our goal of making our reports both comprehensive and easy to understand. Before you dive into the reports, learn more about Cloudflare’s longstanding commitment to transparency reporting and the key updates we made in this year’s reports.

Cloudflare’s approach to transparency reporting

Cloudflare started issuing transparency reports early on, because we have long believed that transparency is essential to earning trust. In addition to sharing data about the number and nature of requests we receive, our transparency reports have provided a forum for Cloudflare to articulate the principles we apply in approaching legal requests for customer information and how we handle abuse.

Grounded in Cloudflare’s principles, our transparency reports have necessarily evolved over time as the scale and complexity of our services have grown. While our initial reports were focused on governmental requests for customer information, our reports have expanded to cover a broader set of issues, including civil requests for customer information, legal requests to limit or terminate services, and our process for handling reports of abuse on websites using our services.

The EU’s Digital Services Act

A key driver of this year’s updates was the transparency reporting obligations in the EU’s Digital Services Act (DSA). As we have written about previously, the DSA replaced a 20-year-old law called the e-Commerce Directive, providing an important framework for addressing the legal responsibilities of online service providers.

While the DSA addresses a number of topics, an important one is transparency. The DSA sets different transparency reporting obligations for different services, establishing baseline reporting requirements for all intermediary services, more detailed reporting for hosting services, and the most extensive reporting for online platforms like social media sites and search engines. Most of Cloudflare’s services are pass-through (intermediary) services related to security and performance with limited transparency reporting requirements under the DSA, while our hosting services have some additional requirements related to our abuse-related actions.

The DSA transparency obligations align with Cloudflare’s longstanding practices and company principles toward transparency. Because Cloudflare has always strived to provide meaningful transparency into its approach to these issues, we are well positioned to comply with the specific reporting obligations set forth in the DSA. That said, while we believe that our existing reports already satisfied much of the DSA, we identified changes we wanted to make to match specific types of data or formatting called for under the DSA.

New data and a new format

Our 2024 Transparency Reports include more information than ever before, all in a new format that we believe will make the information easier to understand.

Prompted by the DSA’s requirements and the continued expansion of services we offer, the 2024 reports include new information, such as additional categories of hosted content abuse, automated steps Cloudflare has taken to mitigate phishing and technical abuse, the mean time to take action on different types of abuse reports, and information about additional types of requests for customer information that we have received. You’ll find a machine-readable version of the data alongside our transparency reports, consistent with DSA requirements. We also introduced “additional context” boxes to call out trends or notable developments during the reporting period.

To try to make all of this information as digestible as possible, we divided our transparency report into two parts. Our report on Legal Requests for Information addresses the law enforcement, government, and civil requests for customer information that Cloudflare receives in the United States and around the world. Our report on Abuse Processes addresses Cloudflare’s processes for handling reports of abuse on websites using our services and our response to legal requests to terminate or restrict access to our users.

Because we divided the report into two parts, you’ll find our ‘warrant canaries’ on the transparency report landing page of our Trust Hub and no longer in the reports themselves. The warrant canary statements about things we have never done as a company are an essential part of our commitment to transparency in how we handle both customers’ information in response to legal requests and abuse reports. All of our warrant canaries remain intact, meaning we still haven’t done any of these things.

We’ll continue to publish transparency reports twice a year, available on the Transparency page of our website as well as through an RSS feed. Our approach to these reports will continue to evolve in order to provide meaningful transparency in line with our company principles, product portfolio growth, and in line with the new regulatory environment.

Cloudflare meets new Global Cross-Border Privacy standards

2025-01-28 Rory Malone

Post Syndicated from Rory Malone original https://blog.cloudflare.com/cloudflare-cbpr-a-global-privacy-first/

Cloudflare proudly leads the way with our approach to data privacy and the protection of personal information, and we’ve been an ardent supporter of the need for the free flow of data across jurisdictional borders. So today, on Data Privacy Day (also known internationally as Data Protection Day), we’re happy to announce that we’re adding our fourth and fifth privacy validations, and this time, they are global firsts! Cloudflare is the first organisation to announce that we have been successfully audited against the brand new Global Cross-Border Privacy Rules (Global CBPRs) for data controllers and the Global Privacy Recognition for Processors (Global PRP). These validations demonstrate our support and adherence to global standards that provide for privacy-respecting data flows across jurisdictions. Organizations that have been successfully audited will be formally certified when the certifications officially launch, which we expect to happen later in 2025.

Our participation in the Global CBPRs and Global PRP joins our roster of privacy validations: we were one of the first cybersecurity organizations to certify to the international privacy standard ISO 27701:2019 when it was published, and in 2022 we also certified to the cloud privacy certification, ISO 27018:2019. In 2023, we added our third privacy validation when, following a review by an independent monitoring body in the European Union (EU), we were declared adherent to the first official GDPR code of conduct, the EU Cloud Code of Conduct.

Why this matters to Cloudflare customers

Taking these privacy certifications together, Cloudflare demonstrates that we are meeting key official privacy validations in 39 jurisdictions around the world, from Australia and Austria to Sweden and the United States. An additional four jurisdictions (United Kingdom, Bermuda, Mauritius, and the Dubai International Financial Centre) are also in the process of joining and recognising the Global CBPR certifications. That’s important for Cloudflare customers as it provides reassurance that the privacy practices we have built are recognised by governments around the world.

What is the Global CBPR System?

In the last three years, governments across the world have been busy preparing two brand-new international privacy standards. A major milestone was achieved on April 30, 2024, when the Global CBPR System was established. The CBPRs are a voluntary, enforceable, international, accountability-based system that facilitates privacy-respecting data flows among members’ economies. They provide a baseline level of privacy protection for consumers through a set of rules on how to handle people’s personal information. This facilitates the free flow of data by upholding consumer privacy across participating members, despite each jurisdiction having its own individual data protection laws.

The CBPR System was developed by the Global CBPR Forum, an intergovernmental forum between the governments of Australia, Canada, Japan, Republic of Korea, Mexico, Philippines, Singapore, Chinese Taipei, and the United States. The United Kingdom is also an associate member of the CBPR Forum, as are Bermuda, Mauritius, and the Dubai IFC, signifying their intent to join as full members in the future.

Over the last year, we have been busy preparing for the launch of the Global CBPR System. On May 1, 2024 — the very first day after the establishment of the system — Cloudflare applied to join. And we have now achieved the major milestone of successfully completing audits against the requirements, meaning we expect to be the first organization in the world to be newly certified to the Global CBPR system, as well as the related Global Privacy Recognition for Processors, when companies can officially be certified, which is expected later in 2025.

What the Global CBPR System covers

The Global CBPR System contains a detailed list of fifty requirements that organizations must meet in order to be certified under the scheme. The requirements derive from the nine Global CBPR Privacy Principles, which are consistent with the core principles of the Organisation for Economic Co-operation and Development (OECD) Guidelines on the Protection of Privacy and Trans-Border Flows of Personal Data. The fifty requirements cover how organizations should collect, manage, and safeguard personal information in their custody. Organizations must meet every one of the fifty requirements in order to be Global CBPR certified. The nine principles underlying the requirements are:

Preventing Harm	Notice	Collection Limitation
Uses of Personal Information	Choice	Integrity of Personal Information
Security Safeguards	Access and Correction	Accountability

The nine Global CBPR Privacy Principles

The Global CBPR certification covers the handling of personal information controlled by the organization, such as the personal details of customers, employees, and job applicants. For Cloudflare, this also includes network information — our observations about how our global cloud platform handles server, network, or traffic data generated by Cloudflare in the course of providing our services.

The related Global Privacy Recognition for Processors (PRP) certification covers the handling of personal information processed by the organization on behalf of a different organization, usually their customer. The eighteen requirements of the PRP relate to the two privacy principles most relevant when processing this information on behalf of another organization: Security Safeguards and Accountability. For Cloudflare, this covers the processing of data pursuant to the Data Processing Addendum we sign with all of our customers, chiefly, the Customer Content flowing across our network and the Customer Logs generated by those data flows. Organizations must meet every one of the eighteen requirements in order to be Global PRP certified.

A deeper dive into some of the requirements of the Global CBPRs

As noted, the key requirements of the Global CBPRs and the Global PRP cover the well-known data protection principles of notice, choice, collection limitation (data minimization), the right of data subject access and correction, providing adequate security, preventing harm, integrity of personal information, accountability, and uses of personal information. There are dozens of requirements that cover these principles, so we’ll just touch on a few of them here.

Let’s first look at the principle of notice. One of the more obvious requirements from the CBPRs is question 1:

Do you provide clear and easily accessible statements about your practices and policies that govern the personal information described above (a privacy statement)?

Being transparent about the collection and use of personal information is a key principle of privacy and data protection, and transparency is one of Cloudflare’s core commitments. Documenting our practices and policies in regard to how we use personal information allows individuals to decide if they want to provide their information, and that’s why it’s best practice for the privacy notice to be available and visible at the time the information is being collected. Indeed, this concept of providing notice is clear from Article 13 of the EU’s GDPR. Cloudflare meets this CBPR requirement by providing a clear and accessible privacy notice visible from the footer of each page on our website. We also provide a link to the notice when we collect personal data such as through a form on a webpage.

In terms of how we use personal information, question 8 asks:

Do you limit the use of the personal information you collect (whether directly or through the use of third parties acting on your behalf) as identified in your privacy statement?

It has long been a commitment of Cloudflare’s that we only use the personal information we collect for the purposes of providing the services we offer. Our business is built on providing customers with the tools to protect their network applications and to make them faster, more secure, more reliable, and more private. In our Privacy Policy, we commit that we will “only share or otherwise disclose your personal information as necessary to provide our Services or as otherwise described in this Policy, except in cases where we first provide you with notice and the opportunity to consent.” And we maintain internal documentation (in keeping with the CBPR’s accountability principle) to document the data we are processing and the purposes for which we process it.

Another key set of requirements in both the Global CBPRs and the Global PRP has to do with security safeguards. CBPR requirement question 27 asks:

Describe the physical, technical and administrative safeguards you have implemented to protect personal information against risks such as loss or unauthorized access, destruction, use, modification or disclosure of information or other misuses?

The similar requirement in the Global PRP is question 2:

Describe the physical, technical and administrative safeguards that implement your organization’s information security policy.

Cloudflare has implemented an information security program in accordance with the ISO/IEC 27000 family of standards. Details of Cloudflare’s security program are documented in Annex 2 (“Technical and Organizational Security Measures”) of Cloudflare’s Customer Data Processing Addendum, including the physical, technical and administrative safeguards implemented to protect personal information.

Related to the Accountability principle, question 46 asks:

Do you have mechanisms in place with personal information processors, agents, contractors, or other service providers pertaining to personal information they process on your behalf, to ensure that your obligations to the individual will be met?

When we have vendors who handle any of our, or our customers’, personal information, we require them to sign a Data Processing Addendum with us. This ensures the commitments we make to our customers in our customer agreements in turn flow through to our vendors, including the security requirements — holding them, and us, accountable.

More information

We are excited about the launch of the Global CBPR certifications, expected later in 2025, and we are proud that on this Data Privacy Day, we can yet again demonstrate our commitment to universally held principles for protecting the privacy of personal data.

You can find more about the Global CBPR System, the Global PRP, download a full copy of the requirements, and keep up to date with related news at globalcbpr.org.

For the latest information about our certifications, please visit our Trust Hub. Customers can also find out how to download a copy of Cloudflare’s certifications and reports from the Cloudflare dashboard.

Demonstrating reduction of vulnerability classes: a key step in CISA’s “Secure by Design” pledge

2025-01-14 Sri Pulla

Post Syndicated from Sri Pulla original https://blog.cloudflare.com/cisa-pledge-commitment-reducing-vulnerability/

In today’s rapidly evolving digital landscape, securing software systems has never been more critical. Cyber threats continue to exploit systemic vulnerabilities in widely used technologies, leading to widespread damage and disruption. Against this backdrop, the United States Cybersecurity and Infrastructure Security Agency (CISA) helped shape best practices for the technology industry with its Secure by Design pledge. Cloudflare signed this pledge on May 8, 2024, reinforcing our commitment to creating resilient systems where security is not just a feature, but a foundational principle.

We’re excited to share an update aligned with one of CISA’s goals in the pledge: To reduce entire classes of vulnerabilities. This goal aligns with the Cloudflare Product Security program’s initiatives to continuously automate proactive detection and vigorously prevent vulnerabilities at scale.

Cloudflare’s commitment to the CISA pledge reflects our dedication to transparency and accountability to our customers. This blog post outlines why we prioritized certain vulnerability classes, the steps we took to further eliminate vulnerabilities, and the measurable outcomes of our work.

The core philosophy that continues: prevent, not patch

Cloudflare’s core security philosophy is to prevent security vulnerabilities from entering production environments. One of the goals for Cloudflare’s Product Security team is to champion this philosophy and ensure secure-by-design approaches are part of product and platform development. Over the last six months, the Product Security team aggressively added both new and customized rulesets aimed at completely eliminating secrets and injection code vulnerabilities. These efforts have enhanced detection precision, reducing false positives, while enabling the proactive detection and blocking of these two vulnerability classes. Cloudflare’s security practice to block vulnerabilities before they are introduced into code at merge or code changes serves to maintain a high security posture and aligns with CISA’s pledge around proactive security measures.

Injection vulnerabilities are a critical vulnerability class, irrespective of the product or platform. They occur when code and data are improperly mixed because clear boundaries are missing, typically as a result of inadequate validation, unsafe functions, and/or improper sanitization. Injection vulnerabilities are considered high impact because they can compromise the confidentiality, integrity, and availability of the systems involved. Some of the ways Cloudflare continuously detects and prevents these risks are security reviews, secure code scanning, and vulnerability testing. Additionally, ongoing efforts to improve precision serve to reduce false positives and to aggressively detect and block these vulnerabilities at the source if engineers accidentally introduce them into code.
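To make the "code and data improperly mixed" point concrete, here is a minimal sketch using Python’s standard sqlite3 module. It is an illustration only, not Cloudflare’s code: the users table and both function names are hypothetical. The first function builds SQL by string concatenation, so attacker-supplied text becomes part of the query; the second uses a parameterized query, which keeps the input as data.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn, name):
    # Vulnerable: the input is spliced into the SQL text, so quotes in
    # `name` can change the structure of the query itself.
    query = "SELECT id, name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Parameterized: the driver passes `name` separately from the SQL text,
    # so it is only ever compared as a value.
    query = "SELECT id, name FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    payload = "x' OR '1'='1"
    print("unsafe:", find_user_unsafe(conn, payload))
    print("safe:  ", find_user_safe(conn, payload))
```

With the payload shown, the concatenated query matches every row, while the parameterized query matches none because no user has that literal name.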

Secrets in code is another high-impact vulnerability class, as it presents significant risk of confidential information leaks, potentially leading to unauthorized access and insider threat challenges. In 2023, Cloudflare prioritized tuning our security tools and systems to further improve the detection and reduction of secrets within code. Through audits and usage-pattern analysis across all Cloudflare repositories, we further decreased the probability of these vulnerabilities being reintroduced into new code by writing and enabling enhanced secrets detection rules.
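A secrets detection rule can be as simple as a pattern matched against each line of source. The sketch below is a hypothetical, deliberately small rule set written for illustration; it is not Cloudflare’s ruleset, and the rule names and patterns are assumptions. Production scanners typically use many more patterns plus checks such as entropy analysis to keep false positives down.

```python
import re

# Hypothetical rules for illustration; real rulesets are far larger.
SECRET_PATTERNS = {
    # AWS access key IDs are 20 characters and commonly start with "AKIA".
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # PEM private key header.
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
    # A credential-like name assigned a quoted literal of 8+ characters.
    "hardcoded_credential": re.compile(
        r"\b(?:api_key|secret|token|password)\b\s*[:=]\s*['\"][^'\"]{8,}['\"]",
        re.IGNORECASE,
    ),
}


def scan_text(text):
    """Return a list of (line_number, rule_name) for every rule match."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, rule_name))
    return findings


if __name__ == "__main__":
    sample = (
        'key = "AKIAIOSFODNN7EXAMPLE"\n'
        'password = os.environ["DB_PASSWORD"]\n'
        'api_key = "abcd1234efgh5678"\n'
    )
    for lineno, rule_name in scan_text(sample):
        print(f"line {lineno}: {rule_name}")
```

Note that the second sample line is not flagged: reading a credential from the environment does not assign a literal, which is the pattern the rule targets.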

Cloudflare is committed to elimination of these vulnerability classes regardless of their criticality. By addressing these vulnerabilities at their source, Cloudflare has significantly reduced the attack surface and the potential for exploitation in production environments. This approach established secure defaults by enabling developers to rely on frameworks and tools that inherently separate data or secrets from code, minimizing the need for reactive fixes. Additionally, resolving these vulnerabilities at the code level “future-proofs” applications, ensuring they remain resilient as the threat landscape evolves.

Cloudflare’s techniques for addressing these vulnerabilities

To address both injection and embedded secrets vulnerabilities, Cloudflare focused on building secure defaults, leveraging automation, and empowering developers. To establish secure default configurations, Cloudflare uses frameworks designed to inherently separate data from code. We also increased reliance on secure storage systems and secret management tools, integrating them seamlessly into the development pipeline.
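As a rough illustration of separating secrets from code, the sketch below reads a credential from the process environment at runtime instead of embedding it in source. The environment here stands in for whatever secret management tool injects the value; the helper name, the exception class, and the API_TOKEN variable are all hypothetical.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret was not provided to the process."""


def load_secret(name, environ=os.environ):
    # Fail loudly rather than fall back to a hardcoded default, so a
    # misconfigured deployment cannot silently run with a baked-in value.
    value = environ.get(name)
    if not value:
        raise MissingSecretError(f"secret {name!r} is not set")
    return value


if __name__ == "__main__":
    try:
        token = load_secret("API_TOKEN")
        print("token loaded, length", len(token))
    except MissingSecretError as exc:
        print("configuration error:", exc)
```

Passing the mapping as a parameter keeps the function easy to test without touching the real environment.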

Continuous automation played a critical role in our strategy. Static analysis tools integrated with our DevOps processes were enhanced with customized rule sets to block issues based on observed patterns and trends. Additionally, along with security scans running on every pull and merge request, the software quality assurance measures of “build break” and “stop the code” were enforced. This prevented risks from entering production when true positive vulnerabilities were detected across all Cloudflare development activities, irrespective of criticality and impacted product. This proactive approach has further reduced the likelihood of these vulnerabilities reaching production environments.

Developer enablement was another key pillar. Priority was placed on bolstering existing continuous education and training for engineering teams by providing additional guidance and best practices on preventing security vulnerabilities, and leveraging our centralized secrets platform in an automated way. Embedding these principles into daily workflows has fostered a culture of shared responsibility for security across the organization.

The role of custom rulesets and “build break”

To operationalize the more aggressive detection and blocking capabilities, Cloudflare’s Product Security team wrote new detection rulesets for its static application security testing (SAST) tool integrated in CI/CD workflows and hardened the security criteria for code releases to production. Using the SAST tooling with both default and custom rulesets allows the security team to perform comprehensive scans for secure code, secrets, and software supply chain vulnerabilities, virtually eliminating injection vulnerabilities and secrets from source code. It also enables the security team to identify and address issues early while systematically enforcing security policies.

Cloudflare’s expansion of the security tool suite played a critical role in the company’s secure product strategy. Initially, rules were enabled in “monitoring only” mode to understand trends and potential false positives. The rules were then fine-tuned to adjust priorities without disrupting development workflows. Leveraging internal threat models, the team wrote custom rules tailored to Cloudflare’s infrastructure. Every pull request (PR) and merge request (MR) was scanned against these specific rule sets, including those targeting injection and secrets. The fine-tuned rules, optimized for high precision, were then activated in blocking mode, which breaks the build when a violation is detected. This process provides vulnerability remediation at the PR/MR stage.
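The monitoring-then-blocking rollout described above can be sketched as a small CI gate. This is a hypothetical model, not the SAST tool’s actual interface: the Mode enum, the Finding type, and the rule identifiers are assumptions made for illustration. Findings from rules still in monitoring mode are reported as warnings, while any finding from a rule promoted to blocking mode produces a non-zero exit code, which is what breaks the build.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    MONITOR = "monitor"  # report only, used while tuning a rule
    BLOCK = "block"      # fail the build on any finding


@dataclass(frozen=True)
class Finding:
    rule_id: str
    path: str
    line: int


def split_findings(findings, rule_modes):
    """Partition findings into (blocking, monitored) lists.

    Rules without an explicit mode default to MONITOR, so a newly added
    rule cannot break builds before it has been tuned.
    """
    blocking, monitored = [], []
    for finding in findings:
        mode = rule_modes.get(finding.rule_id, Mode.MONITOR)
        (blocking if mode is Mode.BLOCK else monitored).append(finding)
    return blocking, monitored


def ci_exit_code(findings, rule_modes):
    """Print a report and return the exit code a CI step would use."""
    blocking, monitored = split_findings(findings, rule_modes)
    for f in monitored:
        print(f"warning: {f.rule_id} at {f.path}:{f.line}")
    for f in blocking:
        print(f"error: {f.rule_id} at {f.path}:{f.line}")
    return 1 if blocking else 0


if __name__ == "__main__":
    modes = {"secrets.hardcoded": Mode.BLOCK, "injection.sql": Mode.MONITOR}
    found = [
        Finding("injection.sql", "app.py", 10),
        Finding("secrets.hardcoded", "config.py", 3),
    ]
    raise SystemExit(ci_exit_code(found, modes))
```

Promoting a rule from monitoring to blocking is then a one-line change to the mode mapping, made once its false positive rate is acceptable.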

Hardening these security checks directly into the CI/CD pipeline enforces a proactive security assurance strategy in the development lifecycle. This approach ensures vulnerabilities are detected and addressed early in the development process before reaching production. The detection and blocking of these issues early reduces remediation efforts, minimizes risk, and strengthens the overall security of our products and systems.

Outcomes

Cloudflare continues to follow a culture of transparency, as it provides increased visibility into the root cause of an issue and consequently allows us to improve the process and product at scale. These efforts have yielded tangible results and continue to strengthen the security posture of all Cloudflare products.

In the second half of 2024, the team aggressively added new rulesets that helped detect and remove new secrets introduced into code repositories. This led to a 79% reduction of secrets in code over the previous quarter, underscoring Cloudflare’s commitment to safeguarding the company’s codebase and protecting sensitive information. Following a similar approach, the team also introduced new rulesets in blocking mode, irrespective of the criticality level for all injection vulnerabilities. These improvements led to an additional 44% reduction of potential SQL injection and code injection vulnerabilities.

While security tools may produce false positives, customized rulesets with high-confidence true positives remain a key step in order to methodically evaluate and address the findings. These reductions reflect the effectiveness of proactive security measures in reducing entire vulnerability classes at scale.

Future plans

Cloudflare will continue to mature the current practices and enforce secure-by-design principles. Some other security practices we will continue to mature include: providing secure frameworks, threat modeling at scale, integration of automated security tooling in every stage of the software development lifecycle (SDLC), and ongoing role-based developer training on leading-edge security standards. All of these strategies help reduce, or eliminate, entire classes of vulnerabilities.

Conclusion

Irrespective of the industry, if your organization builds software, we encourage you to familiarize yourself with CISA’s ‘Secure by Design’ principles and create a plan to implement them in your company. The commitment is built around seven security goals, prioritizing the security of customers.

The CISA Secure by Design pledge challenges organizations to think differently about security. By addressing vulnerabilities at their source, Cloudflare has demonstrated measurable progress in reducing systemic risks.

Cloudflare’s continued focus on addressing vulnerability classes through prevention mechanisms outlined above serves as a critical foundation. These efforts ensure the security of Cloudflare systems, employees, and customers. Cloudflare is invested in continuous innovation and building a safe digital world.

You can also find more updates on our blog as we build our roadmap to meet all seven CISA Secure by Design pledge goals by May 2025, such as our post about reaching Goal #5 of the pledge.

As a cybersecurity company, Cloudflare considers product security an integral part of its DNA. We strongly believe in CISA’s principles issued in the Secure by Design pledge, and will continue to uphold these principles in the work we do.
