Content Independence Day, one year on: building the business model for the agentic Internet

Post Syndicated from Arielle Weiss original https://blog.cloudflare.com/agentic-internet-bot-report/

One year ago, we declared Content Independence Day. At the time, we could see what many in the industry were beginning to sense: the fundamental economics of the Internet were shifting. AI adoption was accelerating, publishers were experiencing rapid declines in referral traffic, and AI companies were crawling the web at unprecedented scale, often without clearly declaring intent, and almost always without compensation.

We changed the defaults. For all new domains on Cloudflare, AI training crawlers would be blocked by default unless domain owners chose otherwise. We didn’t do this to wall off the web. We did it because we believed a healthier ecosystem required transparency, control, scarcity, and ultimately, a market where high-quality content could be valued and exchanged fairly.

A year later, that market has emerged. But the transformation of the Internet has happened even faster than we anticipated. In this report, we share key data points that illustrate how quickly the business model of the Internet has shifted – and what this new content market means for publishers and site owners.

Part I: The Internet has changed – faster than anyone expected

The vertical adoption curve

AI is not just another technology cycle. It is a platform shift happening at more than 2x the speed that smartphones were adopted. In just 3.5 years, over 30% of humanity — 2.5 billion active users — has adopted regular use of generative AI. The adoption curve isn’t merely steep: it’s going vertical.


The decline of the open web

Never before have we seen such a rapid change in how humans interact with information, perform work, and spend time online.


The way people use the Internet is changing dramatically. Today, for every hour spent online searching for information, only 15 minutes is spent on the open web. Traditional search behavior is collapsing as users shift to AI-driven discovery and consumption. Instead of visiting multiple sites to source and compare information, users simply type a prompt and receive a nearly instantaneous, consolidated answer.

The agentic Internet is here

This year, agent traffic crossed a historic threshold for the first time: more than 50% of traffic on the Internet is now non-human. This shift has staggering implications for publishers, content owners, and the future of the open web.


Crawlers have changed their purpose

When looking at the crawlers Cloudflare identifies by purpose, the composition of crawler traffic tells the story clearly:

  • 52% of crawler requests are now for AI training as of June 2026, up from 22% in Spring 2025.

  • Mixed-use crawlers (those blending search, agent use, and training) represent over 36% of activity.

  • Pure search crawling now represents a small and declining share of overall crawler activity, despite remaining critical for publisher visibility.


As AI training becomes a primary driver of crawler activity, the ability to distinguish between discovery and training becomes increasingly important. Mixed-use crawlers blur that distinction, putting content owners in a difficult position: either remain discoverable in the agentic era and give away their most valuable content without compensation, or protect that content and risk losing discoverability.

The old business model is gone

For decades, the economic model of the open web was straightforward. Content creators exchanged access to their content for visibility in search engines, which returned referral traffic. That traffic became the primary mechanism through which publishers, creators, and businesses generated economic value.

But today, that exchange is breaking down. Content is still being crawled, indexed, and used — but increasingly without corresponding traffic being returned to the source. As AI systems answer questions, compare products, conduct research, and complete tasks directly, information across the open web is increasingly becoming part of AI training and retrieval systems. The existential question this raises is simple: if content is consumed without audiences ever visiting the source, how do content creators sustain themselves?


The implications are industry-agnostic

The earliest industries to feel the impact were news organizations and media companies. Today, similar dynamics are impacting businesses across retail, software, IT, and finance. Some of the most heavily crawled categories have seen human traffic decline as much as 40% in less than one year.


Many publishers are now preparing for what they call “Google Zero” — a world where little to no traffic comes from search referrals.

The implications extend to essentially every industry. Any organization that publishes proprietary information on the Internet will need to understand how to operate in an agentic era. This dynamic matters not just to content owners, but to all of us. The Internet is a critical part of the global economy and one of the world’s most important public resources for surfacing information. Ensuring it remains healthy and sustainable is essential for all.

Part II: The market has emerged

What we built

When we launched Content Independence Day, we committed to three things:

  1. Transparency and control for site owners, enabling them to define how their content is accessed and monetized.

  2. Tools that create scarcity, shifting the balance of power back to content owners.

  3. A marketplace where content creators and AI companies of all sizes can discover, license, and determine the value of content more efficiently.

One year later, a market for monetized content is here, and the conditions for a dynamic marketplace are forming.

Transparency and control created scarcity

Historically, publishers have had limited visibility into how AI companies accessed and used their content. As referral traffic declined, that lack of visibility became an economic problem, prompting publishers to seek new ways to capture value.

Cloudflare’s attribution, business intelligence, and enforcement tools gave publishers visibility into AI consumption at the network level — an enforcement mechanism far more effective than voluntary standards like robots.txt. For the first time, publishers could determine how their content was accessed and monetized. That control created scarcity, and drove a supply-and-demand content economy.
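To illustrate why robots.txt counts as a voluntary standard, here is a minimal sketch using Python's standard-library parser. The rules are read and applied by the crawler itself, so they only bind crawlers that choose to check them. The bot names and URL are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one training bot is disallowed, everyone else allowed.
ROBOTS_TXT = """
User-agent: ExampleTrainingBot
Disallow: /

User-agent: *
Allow: /
""".strip()


def polite_crawler_may_fetch(user_agent: str, url: str) -> bool:
    """What a well-behaved crawler would conclude before fetching."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


url = "https://example.com/article"
training_ok = polite_crawler_may_fetch("ExampleTrainingBot", url)
search_ok = polite_crawler_may_fetch("ExampleSearchBot", url)
```

Nothing in this check runs on the publisher's side. A crawler that never calls it can still send the request, which is the gap that network-level enforcement is meant to close.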

Scarcity created leverage

Publishers that exercised control over access successfully created scarcity, giving them negotiating leverage that led to better deals. For the first time, publishers gained operator-level attribution data — evidence of how often LLMs attempted to access their content, which competitive LLMs were crawling, what their most in-demand URLs were, and what their crawl-to-referral ratios looked like. This reduced information asymmetry in licensing discussions and enabled publishers to negotiate from a position of knowledge.
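As a rough sketch of the crawl-to-referral ratio mentioned above, the calculation can be as simple as dividing crawl requests by referred visits per operator. The field names and numbers below are illustrative assumptions, not Cloudflare's schema or measured data.

```python
from dataclasses import dataclass


@dataclass
class OperatorStats:
    """Hypothetical per-operator counts over one reporting window."""
    operator: str
    crawl_requests: int   # pages fetched by the operator's crawlers
    referral_visits: int  # human visits the operator referred back


def crawl_to_referral_ratio(stats: OperatorStats) -> float:
    """Crawls per referred visit; infinity when nothing is referred back."""
    if stats.referral_visits == 0:
        return float("inf")
    return stats.crawl_requests / stats.referral_visits


# Illustrative numbers only.
samples = [
    OperatorStats("search-engine-a", crawl_requests=12_000, referral_visits=1_000),
    OperatorStats("answer-engine-b", crawl_requests=50_000, referral_visits=25),
]
ratios = {s.operator: crawl_to_referral_ratio(s) for s in samples}
```

A higher ratio means an operator takes more pages for each visitor it sends back, which is the kind of evidence a publisher could bring to a licensing discussion.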

Leverage is changing the balance of power

This leverage has empowered our customers. As they have gained greater visibility into how AI systems access and use their content, they’ve become better equipped to understand the implications for their businesses and more confidently articulate the value of the information, brand, and audiences they have built.

As the balance of power between content owners and AI companies begins to change, a licensing economy is emerging: 

  • More than 50 publisher-AI agreements have been signed since 2023.

  • Major AI companies now actively license content, increasingly recognizing the value of differentiated and premium content.

  • Collective licensing models continue to emerge and scale.

  • Large publishers are securing meaningful licensing agreements, demonstrating that content has real economic value within the AI ecosystem.

The conversation is no longer whether content should be compensated. The conversation now is how.

The market is maturing, but inefficiencies remain

Early licensing agreements proved demand exists, but licensing today remains largely bespoke and unlikely to fully replace lost referral, advertising, and affiliate revenue. As a result, publishers are increasingly optimizing for AI consumption alongside traditional human discovery while exploring new monetization pathways.

Supply and demand remain difficult to match efficiently, and while there’s an understanding that not all content carries the same value, content valuation is still unresolved.

The Google convergence problem

No discussion of this market is complete without addressing Google’s unique role. Google remains the dominant gateway to online discovery, accounting for approximately 88% of referral traffic. But increasingly, Google is helping users consume content directly within Google-owned AI experiences.


Discovery and consumption serve fundamentally different purposes. Search drives users to content, while AI-powered experiences increasingly summarize and reuse it without requiring users to visit the source. Website owners view these activities differently because one generates traffic, while the other increasingly substitutes for it.

These differences become especially important when site owners are deciding who should be allowed to access their content and for what purpose. Most leading AI companies separate discovery crawlers from training crawlers, making it relatively simple for publishers to enable content access for one purpose or the other. Google does not. Today, Google has access to about 2x more information than leading AI companies because Google leverages a mixed-use bot that makes it difficult for customers to participate in Google’s search ecosystem without also participating in Google’s AI ecosystem. 

Unlike other AI providers, Google’s mixed-use crawler also limits transparency for site owners. Because discovery and AI access are combined into a single crawler, publishers cannot tell why Google is accessing their content or distinguish between traffic used for search and traffic used for AI experiences. They also lose the visibility and evidence that comes from being able to allow or block these activities independently at the network level.

This dynamic has accelerated demand for greater transparency and control, as well as new monetization models to better serve both content owners and AI companies of all sizes.

Part III: A unique view of the ecosystem

Cloudflare sits at the intersection of the emerging agentic economy.

More than 20% of the web sits behind Cloudflare’s network. Of the world’s most-visited websites, 36% rely on our network, and more than 40% of the Fortune 500 are Cloudflare customers. Nearly 80% of leading AI companies use Cloudflare, alongside thousands of developers and emerging AI companies.

This unique position gives us visibility into both sides of the market. We see the content owners creating content, the AI companies consuming it, and the signals increasingly connecting them. That perspective has given us a unique view into how the market has evolved over the past year, and what it now requires.

Part IV: Lessons from an emerging market

As publishers and AI companies adapt to a new agentic economy, Cloudflare has gained a clearer understanding of what the ecosystem now needs.

Transparency must become the standard

Content owners increasingly need visibility and control over who is accessing their content, how it is being used, and for what purpose. AI companies increasingly recognize that transparency builds trust and reduces friction with publishers. Visibility and enforcement are no longer security concerns alone — they have become business requirements that directly influence licensing negotiations and commercial decision making.

To help make transparency the standard, Cloudflare is continuing to invest in enhanced attribution, measurement, and publisher controls that give content owners greater visibility into and control over how their content is accessed and used.

As the industry shifts toward greater transparency, we believe that verifiable bot self-identification and declarations of crawl intent are fundamental to a sustainable ecosystem. Today, more than one-third of crawler activity on our network still comes from mixed-use bots that make it impossible for content owners to distinguish crawl intent. We are actively engaging with the ecosystem and investing in tooling to help drive that number to zero by this time next year.

Better AI requires better signals

Over the past year, it has become increasingly clear that AI companies need more than access to content. They need better ways to determine what to access, when to access it, and how frequently it has changed. Indiscriminate crawling wastes compute for AI companies and creates unnecessary bandwidth burden for publishers, reducing efficiency across the ecosystem.

We believe better answers require better intelligence. We are investing in real-time freshness signals with richer trust, quality, and relevance to help AI companies discover differentiated information while reducing unnecessary crawling across the web.

Markets need better discovery before better pricing

We believe better discovery must precede better pricing. In order for the market to mature, publishers and AI companies need better information about one another. We are investing in richer market intelligence, content signaling, and capabilities that improve discovery between both sides of the ecosystem, laying the foundation for more scalable market mechanisms over time.

Part V: Building the infrastructure for the agentic Internet

One year ago, Content Independence Day introduced a simple idea: content owners should have greater control over how AI companies access and use their information.

Over the past twelve months, that control helped give rise to a market. Transparency created scarcity. Scarcity created leverage. Leverage accelerated licensing. What was once a theoretical discussion about the future of AI and content has become an active market, with publishers, AI companies, and technology providers all adapting to a new set of economic realities.

The market is now entering a new phase that demands new infrastructure. As the Internet becomes increasingly agentic, the underlying systems that support it must evolve to handle permissions, licensing, and commercial transactions at scale. Content owners and AI companies need more efficient ways to connect and exchange value. We believe these capabilities will converge into programmable, scalable mechanisms for content discovery and monetization – reducing friction while unlocking richer forms of value exchange.

Cloudflare’s role is to build the infrastructure and business intelligence, and contribute to the standards that allow the market to determine value more efficiently and help publishers and AI companies participate in a healthier, more dynamic content economy.

The Internet has always evolved. This evolution is faster and more consequential than most. But with the right infrastructure, the right incentives, and a commitment to transparency, we believe the agentic Internet can become more sustainable, more efficient, and better for everyone.

Methodology:
The data in this report is compiled from Cloudflare Radar and the Cloudflare Investor Day 2026 Presentation.

Cloudflare Radar is a hub showcasing global Internet traffic, attack, and technology trends and insights. Powered by data from Cloudflare’s global network, Radar was created to help anyone understand what is happening on the Internet from a security, performance and usage perspective.

Cloudflare’s unique understanding of the Internet comes from its global network — one of the world’s largest, spanning 330+ cities in 100+ countries — and aggregated and anonymized data from Cloudflare’s 1.1.1.1 public DNS Resolver, widely used as a fast and private way to browse the Internet. More than 20% of the web sits behind Cloudflare’s network.

Making AI search smarter

Post Syndicated from Matthew Conroy original https://blog.cloudflare.com/making-ai-search-smarter/

Search drives most experiences on the web. It’s how we get things done, and how nearly everything on the web gets found: the creators, the merchants, the answer to whatever you just typed into a box. For nearly 30 years, that discovery journey ran on a simple bargain: let a search engine crawl your content, and it sends you visitors. You turned those visitors into a business through ads, subscriptions, or just the audience itself. Being discoverable and getting paid were the same thing. A year ago, on the first Content Independence Day, we drew a line to defend that bargain in the AI era. But a line in the sand was only a first step. Since then, AI search has become even more prevalent in consumers’ lives, and more than 50% of online traffic is now non-human. The threat is no longer a handful of training crawlers you can block; it’s search itself being rebuilt around AI answers.

Today’s answer engines read your page and hand the user a summary, so the visit — and the revenue that depended on it — isn’t needed. We see it firsthand, and independent research backs it up: a 2025 Pew Research Center study found that when Google shows an AI summary, users clicked on a traditional search result link just 8% of the time (about half as often as when there’s no summary) and clicked a link inside the summary only 1% of the time. That leaves our customers in a bind: opt out of AI and be hard to find, or opt in and deliver significant value to users while seeing increasingly little in return. Our customers want to be found and compensated for the value they provide, and right now they’re forced to choose.

Today, we’ve announced new bot options to help our customers better control who can access their site and what they can do with it. But blocking was only step one: saying “no” protects content without rebuilding the business models that sustain it. So, it’s time to start building the new economic model of the Internet, starting with search.

Rebuilding the bargain

Transparency and control are the foundation, but more is needed. In 2025, we laid out our foundation via a set of responsible AI bot principles: bots should be transparent about who they are and what they’re for, respect site owners’ choices, and act in good faith. Our tools hold bots to that bar. But enforcing good bot behavior doesn’t make AI search any better for the people relying on it, and it doesn’t send a dollar back to the creator whose work made the answer possible. We can do more than help the web say “no”; we can help rebuild what it says “yes” to.

So today, we’re announcing two initiatives that move from defense to offense and start putting both halves of that old bargain back together.

Make AI search smarter: By using the signals we see across our global network, like what’s fresh, what’s high quality, and what’s actually changed, we can help answer engines surface the most relevant content and reduce unwanted crawling. People searching get better answers, while costs are reduced for both AI companies and site owners if webpages are only recrawled when they’ve changed.

Pay creators for the value they provide: When your work is used to answer someone’s question, you should be rewarded instead of just being scraped for free. And you should be able to see what’s being used and what people are asking. This should be a real revenue stream, and an incentive to keep producing original content worth finding.

Making search smarter

Today we’re launching a research program to make AI search smarter and stop our customers footing the bill for crawls that produce nothing new.

More than 20% of the web sits behind Cloudflare’s network, which gives us a unique perspective. We can tell which pages have genuinely changed and which ones people and agents are flocking to. Through this program, we will explore using signals our customers have chosen to share about the freshness of their content, and we will combine those with our own insight into traffic flows, both human and bot. For answer engines, that’s a roadmap to high-quality content. For our customers, it provides a view of what users are actually asking, and how their content shows up in AI results. The aim is to measure two things: how much these signals help answer engines to surface fresher, higher-quality content, and how much unnecessary crawling they cut out.

That second benefit, cutting unnecessary crawling, is bigger than it sounds. Cloudflare data suggests that more than 50% of crawl traffic from good bots goes to re-fetching pages that haven’t changed — and that number is likely to climb as crawl volumes do. A signal that just says “nothing’s changed here” lets a crawler skip the trip. That saves the answer engine compute. More importantly, it saves site owners from serving and paying for requests they never needed to. 
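One long-standing way to express "nothing's changed here" on the web is the HTTP conditional request: the crawler sends back the validator (ETag) it saw last time, and the origin answers 304 Not Modified with no body if the page is the same. The post does not say which mechanism the research program will use, so the sketch below is only an analogy. It simulates the origin in memory, and the choice of hash for the validator is an assumption.

```python
import hashlib
from typing import Optional, Tuple


def etag_for(body: bytes) -> str:
    """Derive a validator from the page body (hypothetical choice of hash)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def serve(body: bytes, if_none_match: Optional[str]) -> Tuple[int, bytes, str]:
    """Toy origin: return (status, payload, etag) for one request."""
    current = etag_for(body)
    if if_none_match == current:
        return 304, b"", current  # unchanged: no body is sent
    return 200, body, current     # new or changed: full body is sent


page = b"<h1>Hello</h1>"
status1, payload1, tag = serve(page, None)               # first crawl
status2, payload2, _ = serve(page, tag)                  # recrawl, page unchanged
status3, payload3, _ = serve(b"<h1>Hello v2</h1>", tag)  # recrawl after an edit
```

The second request still reaches the origin but costs almost no bandwidth. A signal delivered before the crawl would go one step further and let the crawler skip the request entirely.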

The program is neutral by design: our goal is to make it work for every answer engine willing to play fair. It’s limited to search. We aren’t sharing any content, and nothing is used to train foundation models. We intend to publish what we learn, including the benefits to site owners such as better content discoverability and reduced server strain. We plan to make the capability broadly available later this year and reduce unnecessary crawling across our network.

From Pay Per Crawl to Pay Per Use

Last year we launched Pay Per Crawl so publishers could charge AI companies for crawling their content. It was a real start, but crawling is a crude measure of value. A single page might be crawled once and then cited in thousands of answers, or crawled over and over and never used at all. Creators want to be paid fairly for the value they provide.

So we’re starting to shape Pay Per Crawl into Pay Per Use. We’re running experiments with top AI companies, like Ceramic.ai and You.com, and the arrangement is straightforward: organizations can bring their payment models and easily scale them to content owners across the Cloudflare network.

Ceramic has built what it calls a “pay-per-query” model, so publishers who opt in can be paid when their content appears in Ceramic’s search results. This means payment is designed to follow the value the work delivers rather than the number of times a crawler happens to fetch it.
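The difference between paying per crawl and paying per use can be made concrete with a little arithmetic. The prices and page counts below are invented for illustration and do not reflect any actual Pay Per Crawl, Ceramic, or You.com rates.

```python
from decimal import Decimal

# Hypothetical prices, chosen only to make the arithmetic easy to follow.
PRICE_PER_CRAWL = Decimal("0.01")
PRICE_PER_USE = Decimal("0.001")


def per_crawl_payout(crawls: int) -> Decimal:
    """Payment that follows how often a crawler fetched the page."""
    return crawls * PRICE_PER_CRAWL


def per_use_payout(uses: int) -> Decimal:
    """Payment that follows how often the page appeared in an answer."""
    return uses * PRICE_PER_USE


# Page A: crawled once, then cited in 5,000 answers.
page_a = {"per_crawl": per_crawl_payout(1), "per_use": per_use_payout(5000)}

# Page B: crawled 500 times, never used in an answer.
page_b = {"per_crawl": per_crawl_payout(500), "per_use": per_use_payout(0)}
```

Under per-crawl pricing, the unused Page B earns far more than the heavily cited Page A. Under per-use pricing the ranking flips, which is the sense in which payment follows the value the work delivers.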

“To scale the future of AI search, we need a partner with massive reach and a shared commitment to transparency and fair compensation,” says Anna Patterson, founder and CEO of Ceramic.ai. “Cloudflare allows us to easily and programmatically scale our operations. By bringing our pay-per-query model to their network, we ensure millions of content owners can seamlessly opt in to be compensated every single time their content appears in our search results.”

In addition to compensation, content owners participating in the Cloudflare/Ceramic program will unlock new reporting to help with answer engine optimization (AEO). Customers can finally see the top queries leading to their content appearing in search results, the specific webpage and snippet, their average search result ranking position, and more. This is the first of many products we’ll be launching to help our customers with discoverability.

This is just one emerging approach. Another comes from You.com: agents can pay on demand for a specific piece of premium content they need, without any upfront commitment. New payment models from AI providers are being tested (e.g., Pay per Query and Pay per Result), and we have the infrastructure to support them all.

We want to be honest that this is an experiment. There’s a lot to learn, including exactly how this holds up at the scale of the Internet. We’ll work that out with our partners and our customers as we go, and share what we learn. But the goal is clear: AI search companies get fresher, better-grounded answers, and the customers whose work makes the answers possible get paid when they help. Cloudflare’s job in all of this is to provide the infrastructure layer that makes this market flourish. 

We think this is a more natural fit for where the economics of search are heading. The old, human web optimized search to save time — providing excerpts, ten blue links, and a click. The agentic Internet is different: an agent can read fast and search continuously. Search is becoming something an agent does dozens of times to answer a single question, closer to a utility than a destination. In that world, the unit that matters isn’t the crawl or the click. It’s the outcome. Pricing the outcome, and paying the people who made it possible, is how the web continues to thrive.

The headline we want to earn

A year ago on Content Independence Day, the headline was a default ‘no’: AI can’t crawl without compensation. This year, our focus is on giving our users more products and controls to say ‘yes’ and bring more benefits with it.

Today’s announcements are just the beginning. Cloudflare’s research project is designed to see if our signals produce better results with less crawling. Pay Per Use is a promising direction we’ll experiment with alongside partners who believe that content creators deserve fair compensation for their work. This is how the last 30 years of the web got built too: somebody runs the pilot that turns “the model is broken” into “here’s the new model,” one experiment at a time. We believe there’s value for our customers in being discoverable in this new agentic era, and in optimizing their content for maximum discovery. But they should be able to do this without giving away their most valuable creative assets for free.

The web is changing, and the business models it’s relied on are changing with it. The old Internet was open, neutral, and worth contributing to. We have a rare chance to keep it that way, and to build the business models that fund it in the future. Smarter answers for humans and agents asking the questions. A fair deal for the people whose skill, creativity, and commitment make the answers worthwhile. That’s how we pursue Cloudflare’s mission: to help build a better Internet.

Happy Content Independence Day!

Building on the open, agent-ready web? If you are interested in learning more about the Ceramic and You.com programs, please fill out this form. If you’re building an answer engine and want to crawl smarter, we’d love to hear from you too: [email protected].


Your site, your rules: new AI traffic options for all customers

Post Syndicated from Jin-Hee Lee original https://blog.cloudflare.com/content-independence-day-ai-options/

One year ago, we declared the first Content Independence Day, and we gave website owners the means to take back control of their content. The deal between crawlers and website owners that had held up for 30 years (we crawl you, and you get referrals) no longer held. AI was taking everything and sending back nothing, presenting an existential threat to website owners. And so we launched a one-click “Block AI Bots” option, along with a Pay Per Crawl marketplace.

A lot has changed in a year. Last July, conversations around “AI bots” centered around blocking AI training without compensation, pointing to the win–lose deal where content was used for model training with no value driven back to the website owner. But a desire for more nuance has emerged: Content owners still want to be able to protect their content, and they should be compensated for the original content that they work hard to create, curate, and share. We also know that locking down content isn’t a one-size-fits-all solution; website owners want more options than resorting to “block all automation, every time.”

If you run a small site, the problem isn’t just that someone could train models on your content — it’s that nobody can find you in the first place. So you have to make a Faustian bargain: either show up in search and let AI train on you, or risk losing discoverability. This unfairly advantages incumbent search providers if they use the same bots for both search and training; and this unfair advantage incentivizes new players to be evasive as they try to close the competitive gap.

Now, AI can be anything

Today, AI can be in anything. Google search has changed from being sorted by AI to being a full answer engine that answers your question directly on the results page. And Google is not unique in this position — this is the direction in which “search” is moving.

We could debate the cutoff for what qualifies as “AI” today, just to find that the standard changes tomorrow. So, instead of defining a bot primarily as “AI” or not, our updated approach to classification will ask deeper questions about bot or agent behavior: What are they doing on my site? What are they storing? And how will they reshare my content?

A pragmatic taxonomy

To address these questions, we need a more nuanced view — a pragmatic taxonomy that aligns with the AI use cases our customers care about. So we are opening the discussion beyond AI training alone and focusing on three AI use cases that we want all customers to be able to manage:

  • Search: any behavior that collects or indexes your content, so it can answer questions about it later. The key is that Search is proactively building a database of your site to later respond to queries with. Site owners should expect to get referral traffic or other equitable compensation as a result.

  • Agent: automated behavior that is acting, usually in real time, on a person’s behalf, to get something done right now. This includes chat fetch bots (e.g., ChatGPT-User) and browser-use agents (e.g., Gemini or Claude driving Chrome). The key is that it visits your web application in order to complete a job, and often there’s a human waiting on the other end.

  • Training: a crawler taking your content to train or fine-tune a model. The key is that your data is permanently absorbed into the underlying architecture of the AI to improve its capabilities.

Many popular crawlers on the web fall into one of the classifications above; some fall into multiple. We classify plenty of other behaviors beyond the three above — including ads verification, feed fetching, and agentic transactions (more on this below). But we believe it should be simple for all website owners to manage access for these three AI-centered use cases. We believe that bot operators should separate their crawlers because that creates more transparency for website owners: allowing them to better understand why a given crawler is visiting them, as well as to better manage the access they extend to that crawler. If a company runs automation that builds Search indexes, acts as an Agent, and collects data to Train their models, then we strongly encourage that company to separate the automation into three separate crawlers.

We want a classification system that is scalable and representative of the world of automated traffic as it evolves. Tracking a bot’s purposes is nothing new, but our new taxonomy involves a few updates that better represent the state of bot traffic today. Most notably, we want to recognize that bots that have multiple purposes should be tracked with all purposes, not just one of them.

New options to manage AI traffic

We want to provide more options for managing different kinds of AI traffic, to all website owners on the Cloudflare network.

The managed preset to “Block AI bots” that we’ve announced in the past included single-purpose bots that crawled data for model training, as shown below: 


Screenshot of the existing setting to manage AI bot traffic on July 1, 2025.

But not all AI use is the same, and we want our customers to have the controls they need. So, we’re launching the ability to manage AI traffic based on three major use cases: Search, Agent, and Training crawlers. With these new options, our customers can more finely tune how they manage AI bot traffic — including customers on our Free tier.


Screenshot of the new options to manage AI bot traffic on July 1, 2026.

Setting new defaults

On September 15, 2026, we’ll be setting new defaults for each of these three classifications. For all new domains onboarding to Cloudflare, the categories of Training and Agent will be blocked by default on the pages that display ads, while Search will remain allowed by default. 

An ad is a signal that a website owner meant for a person to land there and see it — something monetizable that fuels the business. So, on those pages, we treat human attention as the end goal, and keep away the bots that may prevent this attention (i.e., Training and Agent bots). On the other hand, Search is the behavior that most naturally funnels back visitors, and we believe it’s in the interest of most site owners to allow this.

Another change that will apply on September 15 is that multi-purpose crawlers (specifically those that combine Search with Training) will be allowed/blocked according to all of their behaviors, in line with our call for transparency for website owners. Since the defaults will be enforced by the most restrictive applicable rules, multi-purpose crawlers such as Googlebot, Applebot, and BingBot will be blocked by customers who have selected to block Training (either through the new options to manage AI traffic, or through the legacy Block AI bots service).
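The "most restrictive applicable rule" behavior described above can be sketched in a few lines. This is only an illustration under stated assumptions, not Cloudflare's implementation: the `decide` function, the lowercase behavior names, and the `NEW_DOMAIN_DEFAULTS` mapping are hypothetical, and the sketch ignores the condition that the defaults apply on pages that display ads.

```python
# Hypothetical sketch: names and data shapes are assumptions, not a Cloudflare API.

# Defaults the post describes for new domains (on pages that display ads).
NEW_DOMAIN_DEFAULTS = {"search": "allow", "agent": "block", "training": "block"}


def decide(bot_behaviors, settings):
    """Return "block" if any of the bot's behaviors is blocked, else "allow".

    Behaviors without an explicit setting do not affect the result.
    """
    for behavior in bot_behaviors:
        if settings.get(behavior) == "block":
            return "block"
    return "allow"


# A single-purpose search crawler is allowed under the defaults.
print(decide({"search"}, NEW_DOMAIN_DEFAULTS))
# A crawler that combines Search with Training is blocked.
print(decide({"search", "training"}, NEW_DOMAIN_DEFAULTS))
```

Under this reading, a crawler only passes if every one of its declared purposes is allowed, which is why separating crawlers by purpose lets an operator keep its Search access.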

Of course, customer choice is paramount: if a website owner wants to opt out of these new default configurations, they can easily mark this in their Security settings any time leading up to September 15, which will confirm that they want no changes on Training crawlers that also crawl for Search purposes. We’ll also continue to notify customers of the upcoming change to defaults as we approach September 15 to ensure that customers who want to choose settings different from the defaults have the opportunity to do so.

BotBase: a new visibility plane for Enterprise customers

We’re also excited to launch a major visibility update as a new feature of Enterprise Bot Management. As Cloudflare’s directory of tracked bots has grown, so has the desire to manage these bots in sensible groupings and to understand more detail about a particular bot. 

Introducing BotBase. BotBase is our new database tracking all known bots, including Verified bots and agents. This database provides a comprehensive, searchable view of our entire directory of bots, directly on the Cloudflare dashboard. We’re tackling visibility first, but, later this year, we’ll expand BotBase to provide a direct control center for known automated content on your website.

With this new view, Enterprise Bot Management customers can see the full catalogue of all Verified bots/agents and where they are classified in this updated taxonomy — a view we’ve never shown dynamically on the Cloudflare dashboard before. Customers who want to precisely target a specific bot can also easily filter for all traffic from this bot, plus copy the detection ID to use in Security rules. All of this is now live within a dedicated page, which can be accessed through the Bot Management configuration card.

As we built BotBase, we wanted to account for all of the pieces of information that would allow us to build scalable, powerful insights from bot to bot. One of these pieces is a cornerstone for our updated taxonomy, which is based on what a bot may do on your site — its behavior. We separate these classifications as shared below, and each bot is classified with one or more of these behaviors.

Bot classification: Behaviors and uses

  • Search*: Crawling to scan your site to help it appear in search engine results
  • Agent*: User-directed agents visiting a page on behalf of a human
  • Training*: Crawling to train or fine-tune models
  • Transact: Checkout actions on behalf of users
  • Data Collection: Includes price scraping, competitive intelligence gathering, and third-party analytics
  • Security Testing: Includes vulnerability scanning and penetration testing
  • SEO: SEO crawling, site auditing, accessibility checks
  • Ads Verification: Ad placement verification, ad fraud detection
  • Social / Link Preview: Link previews for social platforms and messaging apps
  • Feed Fetching: Includes RSS readers, podcast aggregators, and news feed bots
  • Monitoring & Operations: Includes uptime monitoring, webhooks, and health checks

Rows marked with * indicate the new configurable options that are available to all customers.

How does a crawler use my content?

Another piece of information we’ve heard is important to our customers is a bot’s content use — what a bot may keep and reshare after it has crawled your content. To address this, we are building capabilities for Bot Management customers to select and block based on the “content use.” This setting can be set to one of three levels, from least to most permissive:

  • immediate — interact, but store and reuse nothing

  • reference (default) — index, excerpt, and link back

  • full — summarize and reproduce

These values can be combined with bot classifications to express nuanced rules, such as “allow all bots that are used for Search, SEO, and Ads Verification, but only up to the reference use level.” This allows website owners to make decisions in sensible groupings rather than manage individual bot-by-bot rules.
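As a rough sketch of how such a combined rule could be evaluated, the example below orders the three content-use levels and checks a bot against an allowed set of classifications and a maximum use level. Everything here is illustrative: `ContentUse`, `is_allowed`, and the behavior strings are hypothetical names, and requiring every behavior of a multi-purpose bot to be in the allowed set is an assumption borrowed from the "most restrictive rule" default described earlier.

```python
from enum import IntEnum


class ContentUse(IntEnum):
    """Content-use levels, ordered from least to most permissive."""
    IMMEDIATE = 1   # interact, but store and reuse nothing
    REFERENCE = 2   # index, excerpt, and link back
    FULL = 3        # summarize and reproduce


def is_allowed(bot_behaviors, bot_use, allowed_behaviors, max_use):
    """Allow a bot only if all its behaviors are allowed and its use level fits.

    Assumption: a bot with no declared behaviors is not allowed.
    """
    behaviors = set(bot_behaviors)
    if not behaviors:
        return False
    return behaviors <= set(allowed_behaviors) and bot_use <= max_use


# The rule quoted above: Search, SEO, and Ads Verification, up to "reference".
ALLOWED = {"search", "seo", "ads_verification"}

print(is_allowed({"search"}, ContentUse.REFERENCE, ALLOWED, ContentUse.REFERENCE))
print(is_allowed({"search"}, ContentUse.FULL, ALLOWED, ContentUse.REFERENCE))
print(is_allowed({"search", "training"}, ContentUse.IMMEDIATE, ALLOWED, ContentUse.REFERENCE))
```

Using an ordered enum keeps the "only up to this level" comparison to a single `<=`, which is the main design point of the sketch.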

To further support this, starting today, we’re testing a new signal, use, that extends Content Signals and lives in your robots.txt. This extends the three fields of the first version of Content Signals with a fourth, optional field that expresses the same preference as above:

  • use=immediate

  • use=reference

  • use=full

As with all other items listed in the robots.txt file, the values of content use signal a website owner’s preference, rather than issuing blocks directly. We’re now adding support for this extension: all customers who have already enabled managed robots.txt — which prepends the preference to robots.txt that crawling for search is okay, but that crawling for training is not — will now have the additional preference of use=reference added to their robots.txt.

# Cloudflare Managed content with original Content Signals

User-agent: *
Content-Signal: search=yes,ai-train=no
Allow: /

The contents of Cloudflare managed robots.txt with the original Content Signals values.

# Cloudflare Managed content with the new content-use signal

User-agent: *
Content-Signal: search=yes,ai-train=no,use=reference
Allow: /

The contents of Cloudflare managed robots.txt with the added parameter.
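For crawler operators who want to read this preference, a minimal parser for a single Content-Signal line might look like the sketch below. The function name `parse_content_signal` is hypothetical, and the parsing rules (comma-separated `key=value` pairs, case-insensitive directive name) are assumptions drawn from the two examples above rather than from a full specification.

```python
def parse_content_signal(line):
    """Parse one "Content-Signal: k=v,k=v" robots.txt line into a dict.

    Assumption: pairs are comma-separated and each pair uses "=".
    """
    name, sep, value = line.partition(":")
    if not sep or name.strip().lower() != "content-signal":
        raise ValueError("not a Content-Signal line")
    signals = {}
    for item in value.split(","):
        key, eq, val = item.strip().partition("=")
        if eq:  # skip malformed items that have no "="
            signals[key.strip().lower()] = val.strip().lower()
    return signals


line = "Content-Signal: search=yes,ai-train=no,use=reference"
print(parse_content_signal(line))
```

A crawler that honors the signal could then compare `signals.get("use")` against what it intends to do with the content before storing or reproducing anything.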

We’re also starting to track content uses for every bot in BotBase, and when we discover a bot abusing these signals, it will lose the “Verified” status, resulting in it no longer being allowed. Today, bots that reproduce in full cannot have the Verified status.

What does it mean for a bot to be Verified?

Speaking of “Verified,” the definition of Verified is being updated to reflect the upcoming changes to default allow and block baselines. Previously, all Verified bots were allowed by default, which was reflected in our basic Bot Fight Mode offering to block unwanted automated traffic and in our rule templates for Enterprise Bot Management customers.

Starting today, we’re adjusting this to add nuance: non-verified bots are still default blocked, but we are no longer viewing Verified as “default allowed.” Now, the Verified label makes a bot allowable with its relevant category, meaning the allowed category (e.g., allowing Search) will determine what is allowed to access a website.

To balance this change, we’re opening up the process of becoming a Verified bot, and making it more transparent, too. To “Verify” a bot, a bot operator needs to show two things: that they represent themselves honestly, and that they don’t abuse the access that honesty earns. And to make this easier on bot operators, we’re currently building management tools for bot operators to better ensure they are accurately represented by Cloudflare’s classification system (to be announced in the near future).


A preview screenshot of the upcoming platform built directly for bot operators who are part of or want to be a part of BotBase, the next generation of the Cloudflare Bots Directory.

Experimenting with transitive trust

One more piece: The bot (or agent) at your door increasingly isn’t run by the company that built it. A platform like Cloudflare’s Developer Platform runs automations for thousands of different operators at once, ranging from enterprises to a developer you’ve never heard of. You might trust Stripe, but you don’t necessarily trust everyone who wired Stripe’s tools into a weekend project.

We call the case of (site owner → bot owning company → end user) a matter of transitive trust. To carry it, we’re proposing to use the existing Forwarded header, defined in RFC 7239, which rides along with the request and allows “proxy components to disclose information lost in the proxying process.”

This is similar to what X-Forwarded-For does for IP addresses, or X-Forwarded-Host does to preserve the original Host header. So when a website owner says, “Allow this operator,” that preference will hold, whether the operator comes to you directly or through three layers of intermediaries that are trusted. More details can be found in our documentation, with a brief example to show the format below.

Forwarded: for="openai"

Adding the extension with content-use discussed above, the header addition would look something like the below, specifying how the operator says they will use the content they access:

Forwarded: for="openai";use="reference"
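On the receiving side, a site could read these values with a small parser. The sketch below is a simplified illustration, not a complete RFC 7239 implementation: `parse_forwarded` is a hypothetical helper, `use` is the extension parameter proposed in this post rather than one RFC 7239 defines, and the naive splitting assumes quoted values never contain commas or semicolons.

```python
def parse_forwarded(value):
    """Parse a Forwarded header value into a list of dicts, one per hop.

    Simplification: splits on "," and ";" without handling those
    characters inside quoted strings.
    """
    hops = []
    for element in value.split(","):
        pairs = {}
        for pair in element.split(";"):
            name, sep, raw = pair.strip().partition("=")
            if not sep:
                continue  # ignore empty or malformed pairs
            # Parameter names are case-insensitive; strip surrounding quotes.
            pairs[name.strip().lower()] = raw.strip().strip('"')
        if pairs:
            hops.append(pairs)
    return hops


print(parse_forwarded('for="openai";use="reference"'))
```

Each comma-separated element becomes one entry in the returned list, so a request that passed through several intermediaries can be inspected hop by hop.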

This also lines up the incentive model we want to foster. Losing trusted status across the more than 20% of web domains that sit behind Cloudflare is a deterrent with teeth. Trust becomes something you can carry with you, and something you can lose.

However, as bot traffic blends with human traffic, it’s possible that this system of transitive trust doesn’t carry beyond the users who can afford to be identifiable. The measures we are proposing today help to convey trust, but they won’t fit the entire web for all time. Small sources of traffic need privacy, and companies that want to preserve their own privacy commitments should be able to explore fair building blocks for the future of an agentic Internet, such as private rate limiting.

Set your terms today

These are small changes that move in the same direction: site owners get more control over who uses their content, and how. We believe the new defaults we discussed today and will soon implement are ones that encourage transparency and are more reflective of where the world is going.

Of course, the ebbs and flows of the web will continue shifting under us, and we’ll keep adjusting with it. But the direction won’t change, because it’s the one Cloudflare started with: a web ecosystem built around trust. Where the people who make things can decide how they’re used — and one where being honest about what you do earns you more access, not less.

These new options to manage AI traffic are live now, and can be configured by all existing customers in their zone Settings. Not on Cloudflare yet? Start for free to set the traffic controls that you want today.

Happy Content Independence Day.


Magnifica Humanitas, or For the Skin of an Artificial Intelligence

Post Syndicated from Йовко Ламбрев original https://www.toest.bg/magnifica-humanitas-ili-za-kozhata-na-edin-izkustven-intelekt/


In the first days of June I read Anthropic's call for a coordinated slowdown in the pace of artificial intelligence (AI) development, on the grounds that the risk of humanity losing control over these technologies was becoming ever more real. In my head, the feverish rhythm of the summer hit “Po-poleka” (“Take It Slower”) by Stefan Valdobrev and Obichaynite Zapodozreni (The Usual Suspects) started to play, and above all this part:

Greater. More beautiful. A bit more like that.
Fatter. Wider. Packaged. Ready. (…)
Higher. Further up. More of a lot.
Richer. Bigger… Better newer.

You can guess why…

Anthropic is an American AI company that develops one of the best-known series of large language models, Claude. According to the company, AI will very soon begin to improve itself without needing humans to help it (as is the case today).

This is not the first time Anthropic has issued warnings about one risk or another. And it fits well with the public image the company cultivates, because one of the key reasons cited for several OpenAI co-founders leaving the research lab in 2021 to build a separate startup is the pursuit of greater safety in AI development. A touching moral commitment, no doubt about it.


As a rule, Anthropic likes to drop media bombshells with declarations and predictions on the edge of common sense. Its leading role in this segment gives it grounds to “leak” information from the kitchen with the legitimacy of someone who presumably knows more than everyone else. That, in turn, is an excellent way to keep the public and investors interested.

Incidentally, this declaration about slowing development was made on June 4. And on June 9, in outright defiance of its own call, Anthropic announced its latest, even grander large language model, Claude Fable 5, which only a few days later, on June 12, was halted by the US government on national security grounds. If that isn't a double dose of irony…

Claude Fable 5 was in fact announced as a “safeguarded version” of Anthropic's most powerful model (for now), Claude Mythos 5, which according to its creators' marketing descriptions has exceptional capabilities for discovering software vulnerabilities and can also be used for complex analyses in biology, medicine, and healthcare. Because of its enormous potential for malicious use (for cyberattacks, for example), Mythos 5 can be used only through a special trusted-access program under US government oversight, called Glasswing. Fable 5, meanwhile, is effectively a version of Mythos 5 for the mass consumer and for business, with built-in safety filters meant to recognize and block potentially dangerous queries and malicious use.

On June 11, two days after the model's release, Amazon CEO Andy Jassy reportedly voiced concern to the White House about possible ways of circumventing the safeguards of Anthropic's newest creation. For completeness, it is important to note that Amazon is an investor in Anthropic (with about $13 billion so far and a commitment for another $20 billion), but is an even bigger investor in OpenAI (with an agreed framework of $50 billion). Sources cited by several media outlets confirm that Andy Jassy called Treasury Secretary Scott Bessent directly about the alleged vulnerabilities, which played a significant role in “spooking” the administration.

By the morning of the following day, the panic in the White House had reached the highest levels. In several conversations between the administration and Anthropic, the company denied that there was a real problem and insisted the matter was a “misunderstanding”, while the White House claimed that Amazon's findings had been reviewed and confirmed by the US National Security Agency. Anthropic was then pressed to voluntarily stop offering Fable 5. The company refused to do so without receiving more information and time to analyze it.

In the end, dissatisfied with the course of the talks, the White House invoked export control and national security regulations and ordered Anthropic to cut off access to Fable 5 and Mythos 5 for all foreign nationals, both inside and outside the country, including its own employees who are not American. And all this because of a hypothetical vulnerability that is not specifically described either in the government's order or anywhere else.

Ordering that a technology be unavailable to foreigners in today's global world is not a restriction; it is pulling the plug.

Because in practice it cannot be implemented. That is why Anthropic decided to cut off access for everyone: it is simply easier.

A curious fact is that the Anthropic drama in the US also prompted a tongue-in-cheek advertising campaign by the French AI developer Mistral AI, which social networks picked up and amplified so eagerly over the past two weeks that much of the public began to wonder whether the model Le Chaton Fat (“the fat kitten”) was real or a joke.

A joke announcement that Europe is banning the “Fat Kitten” model because it is too big. About an hour later, the announcement was taken down.

Still, two weeks later, on June 26, the US government eased the restrictions on Mythos 5, allowing Anthropic to provide access to it to about a hundred American organizations, including large corporations and government agencies. Organizations approved to use Mythos 5 could now let their employees with foreign citizenship use the model. This, however, returned the situation to where it stood before Fable 5's release, because Mythos 5 was already available on these terms to organizations within the Glasswing program anyway.

Meanwhile, it also emerged that just a few days before access to Fable 5 was cut off, the US government had asked Anthropic to remove South Korea's largest mobile operator, SK Telecom, from the Glasswing program over suspected ties to China. There is no indication that the company complied. In this vein, whether Amazon played the role of the useful idiot, giving the Trump administration a pretext to get worked up over the issue, is a question whose answer we will probably never learn.

Ultimately, on June 30 the restrictions were lifted and Anthropic announced that Fable 5 would be available again from July 1. But the saga raised broader questions about the overall direction of American AI policy and, more specifically, about how far the Trump administration will go in trying to control future new models. On that same June 26, when the government eased the restrictions on Mythos 5, OpenAI announced that it was postponing the launch of GPT 5.6 at the government's insistence.

If, before this whole story, “safety” in AI development meant a model refusing to help malicious users, it can now mean a kind of passport check and a global shutdown over a vulnerability that no one names. If we look around, we will see that many of us work in companies full of capable people who are not American. I do not have an American passport either, so nothing stops me from speaking plainly: restricting access to a tool based on place of birth is neither strategy nor legality, but pure customs control.


The security issue is a delicate one. First, because the two sides in this dispute claim different things that they have no intention of proving, and one can only speculate about the specifics. In any case, though, such drastic decisions should not be made without a thorough and transparent technical discussion, in line with established standards and good practices, in which the case is picked apart in detail. Otherwise, any technology can be stopped by a political decision. And that is not managing security; it is exercising power.

And second… most security specialists have long held that the guardrails built into AI are something like speed bumps on the road: they may slow someone down, but they are no guarantee of security when facing a malicious adversary.

Alas, there are no entirely good heroes in this scandal.

But what does this turn of events mean for Europe and for everyone else who, now and in the future, will not hold an American passport?

We in the EU overdo the desperate mantra about our dependence on the American tech giants. The dependence exists, but it was established through inertia, not need, and it can be minimized. Because European alternatives exist. There are even Bulgarian ones, some of them exceptional. There are remarkable German and French technologies, including in security, as well as AI tools that are winning a solid customer base in competition with the biggest players. Without media bombshells and without crazed declarations and predictions.

The EU's drive toward greater digital sovereignty, which is strategically important for our common European security and economic development, will in fact give European tech startups additional advantages in the future. Even when they are a little slower or trail by a percentage point in various rankings. European business is pragmatic and knows from experience that not everyone needs to drive a Cadillac when a Škoda does the same job.

The best news is that the foundation of Europe's so-called technological sovereignty is the EU Open Source Strategy, together with a solid additional legislative framework, part of which are the proposals for:

It would have been wonderful if all this had started one or even two decades earlier, but… at least now we no longer need to convince each other how important and urgent it is.

Europe is also orienting itself correctly and quickly when it comes to distancing itself from American technology companies of dubious reputation, such as Palantir, which abruptly lost its contract with the German government in May and with the French government in June of this year. What is more, France terminated a contract it had only just renewed (in December 2025).

Coincidentally or not, the CEOs of three of the big AI companies, Dario Amodei (Anthropic), Sam Altman (OpenAI), and Arthur Mensch (Mistral AI), were among those invited to a working lunch at the Élysée Palace the day before the G7 meeting in Paris, at which AI regulation was discussed. There is unconfirmed media speculation that ideas for a strategic Anthropic presence on EU territory were also discussed after that lunch.

Unfortunately, the important conversation about what AI means for the human being and for humanity continues to be neglected. On the other hand, just a few days before the tech-world dramas described above, Pope Leo XIV put the topic back on the agenda by presenting, in the Synod Hall at the Vatican, his first encyclical, titled “Magnifica Humanitas: On the Protection of Human Dignity in the Age of Artificial Intelligence”. The event was also attended by Chris Olah, one of Anthropic's founders, which is a symptom of how unusual the times we live in are. How often do we see an institution that traditionally thinks on a scale of centuries comment on an industry that plans on a horizon of quarters?

Rather than a slowdown or regulation, however, Pope Leo XIV calls for ethical governance of AI, insisting on a renewed focus on justice, human dignity, and a “new humanism” in the digital age.

His main concern is not that AI is evil, but that even a moral AI is useless “if that morality is determined by a few”. He points out that the driving forces behind the technology are private transnational companies whose capacity exceeds that of many governments, and he wants this technology to serve humanity before it turns people into cogs in an efficiency machine. We do not need to be religious to admit that this is a clearer description of the problem of power than most of what the tech industry says about itself.

We are called to reflect on the great “construction sites” of our age and to ask ourselves: what are we building? Since technological development is rapidly transforming languages, relationships, institutions, and forms of power, we believers must and can choose which projects to work on and in what way, so as to protect and value the greatness of humanity that has been given to us as a gift. This is a choice not only for our future but also for our present, since artificial intelligence and other emerging technologies are already part of our daily lives.

Pope Leo XIV, Magnifica Humanitas, 2026

The future is not something AI does to people. It is whatever we humans, one way or another, choose to build. The problem is that the good version of that future is not guaranteed to us by default.

Chris Olah of Anthropic, from that same podium beside the pope, acknowledges that AI companies operate “within incentives that can conflict with doing the right thing”. That is why they cannot be a model for how AI should be used. This is a responsibility no one should shrug off. No one can single-handedly influence AI globally, but everyone can influence their own workflow, their own product, a specific company policy, or their team's potential. And it is always preferable for this to be a conscious choice and an accepted responsibility.

Passive criticism brings no benefit. It is showy and easy to call for a pause, a moratorium, or a slowdown. Or to reject the use of AI altogether. It probably brings some sense of a clean conscience, but it is an escape from reality that changes nothing, because the funding, the incentives, and the geopolitics are already in motion. We cannot govern something we refuse to touch. And as the pope writes in the encyclical, technology takes on the characteristics of the people who design, fund, regulate, and use it. We must choose whether to be among those people or to stand aside and let others mess everything up.

Papa Johns Surveillance-Based Advertising

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2026/07/papa-johns-surveillance-based-advertising.html

Papa Johns is spying on people’s buying activities to predict when they are low on food:

The pizza chain recently tapped NBCUniversal, Instacart and the dentsu-owned media agency Carat for help reaching consumers when they’re low on groceries—and thus more likely to be swayed by a mouth-watering ad. The idea is to reach hungry consumers by “knowing what is in their fridge without being too creepy,” said Carrie Drinkwater, chief investment officer at Carat.

To achieve that goal, NBCU and Instacart created a custom audience of shoppers who regularly purchase grocery staples on Instacart, such as eggs, milk, meat and produce. Based on that data, Papa Johns can determine which days of the week certain consumers are likely to run out of groceries and serve them an ad on NBCU streaming content accordingly. The brand served custom creatives to consumers based on their food preferences—such as whether they buy meat regularly—with QR codes and calls to action such as, “Light on groceries?” or “Empty fridge?”

Back in 2012, we learned (from Target and its campaign that detects when someone is pregnant) that the trick is to hide the knowledge in other, wrong, information. So the way for Papa Johns to not be “too creepy” is to deliberately get it wrong sometimes.

But still, ugh.

Discover More with Zabbix Marketplace

Post Syndicated from Michael Kammer original https://blog.zabbix.com/discover-more-with-zabbix-marketplace/33191/

What if extending Zabbix was as easy as browsing an app store? Zabbix Marketplace is a new, centralized hub built to help users quickly discover integrations, extensions, templates, and observability solutions.

Zabbix users will soon be able to access a growing catalog of ready-to-use solutions created by the global Zabbix community and our official technology partners, accelerating deployment and simplifying complex monitoring challenges.

At the same time, Marketplace will help solution creators reach a wider audience by making both commercial subscription-based offerings and free open source solutions easier to publish and promote, while giving Zabbix users faster access to tools that solve real operational challenges.

Think of Marketplace as a one-stop shop for integrations, extensions, templates, dashboards, visualization components, automation tools, and incident response enhancements that extend the capabilities of Zabbix across cloud infrastructure, applications, IoT, enterprise environments, and third-party systems. It will focus on several key extension categories, including:

  • Widgets. Custom visualization components and dashboard enhancements that provide new ways to display metrics, alerts, and operational data.
  • UI Modules. Frontend extensions that add functionality, simplify workflows, and tailor the Zabbix interface for specific use cases and industries.
  • Webhooks. Ready-to-use integrations for notifications, ticketing systems, messaging platforms, and incident management tools.
  • Plugins and integrations. Extensions that connect Zabbix with external platforms, cloud services, infrastructure tools, DevOps pipelines, and observability ecosystems.

How will Zabbix Marketplace work?

Our goal is to create a trusted, reliable ecosystem where developers, technology partners, system integrators, and community contributors can publish solutions that help organizations customize and expand their Zabbix environments faster and with less effort.

Designed with a clean and intuitive interface inspired by the familiar Zabbix documentation and frontend experience, Marketplace will be immediately accessible to both experienced users and Zabbix newcomers. Meanwhile, a rich visual browsing experience with screenshots, previews, detailed descriptions, and installation guidance will help users quickly evaluate solutions.

Security, transparency, and usability are also important priorities for Marketplace. We’re working toward a submission and review process that helps us guarantee that extensions will meet quality and compatibility standards while remaining simple for contributors to publish and maintain.

While the exact commercial model is still being finalized, the integrations portfolio will remain as is, and we are actively exploring revenue-sharing approaches that support sustainable development, ongoing maintenance, and long-term support for high-quality integrations and extensions.

Stay tuned – more details as well as previews and early announcements are coming soon! And if you’ve already got an idea for a Zabbix-based solution, we’re definitely ready to hear about it. Fill out this form to register your interest and be among the first companies featured on Zabbix Marketplace!


Unmasking the crawls with Attribution Business Insights

Post Syndicated from Jin-Hee Lee original https://blog.cloudflare.com/attribution-business-insights/

Original content is the lifeblood of conversations and curiosities. Imagine a world without it: we could find a thousand ways to regurgitate the same material that’s already been created, but we would witness the decline of fresh ideas and arguments.

Website owners fuel the ecosystem of ideas, news, and interesting tidbits, but they face the increasingly complex challenge of managing traffic to their websites and being paid for their content. While some bot traffic is clearly malicious, it isn’t always obvious when a particular AI crawler is helping or harming your business. To answer this, site owners need granular, reliable data to differentiate between traffic that provides value, and traffic that strains resources while eroding the foundation of their business model: actual humans consuming their content. 

At Cloudflare, we hold a core belief: website owners have the right to control access to their content. We want to help website owners maintain their high-quality content and regulate AI traffic.

To provide much-needed clarity and help website owners take control, we’re excited to announce the new Attribution Business Insights dashboard — designed with business decision-makers and publishers in mind.

The new economics of the Internet

For decades, the business model of the Internet relied on a straightforward, unspoken agreement: website owners allowed search engines to crawl their content and, in return, search engines sent readers back to their pages. This symbiotic relationship, where traditional search engines operated with a balanced “crawl-to-referral” ratio, generated the pageviews needed to sustain advertising, affiliate revenue, and subscriptions. Search index crawlers would scan your content a couple of times for each referral sent, so making your website available to crawlers had a clear pipeline to additional revenue. We can think of this as the SEO (Search Engine Optimization) era.

Today, the explosive rise of AI crawlers and agents has broken this contract, plunging the digital publishing industry into an unprecedented crisis. The Internet is risking a transition into a “zero-click” ecosystem where AI chatbots scrape original content to synthesize instant answers — completely bypassing the original sources. We’ve already seen a marked shift from the SEO-only world into an AEO (Answer Engine Optimization) world, and now conversations around GEO (Generative Engine Optimization) are taking center stage.

The imbalance of this new reality is made clear by the crawl-to-referral ratios we see across the Internet today. While traditional search engines had a more balanced ratio of crawls to legitimate visitors referred, major AI crawlers operate on a drastically different, extractive scale. Bots from leading AI companies have been observed with a range of crawl-to-referral ratios: we noted ratios of 118:1 up to nearly 50,000:1 around the time of our Content Independence Day in 2025. In other words, an AI crawler might have crawled your premium content tens of thousands of times just to send back a single visitor. This ratio is fundamentally unfair.
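To make the arithmetic concrete, here is a minimal sketch of how a crawl-to-referral ratio could be computed from two counters. The function names and the sample counts are illustrative assumptions, not a description of how Cloudflare calculates the figure.

```python
from typing import Optional


def crawl_to_referral_ratio(crawls: int, referrals: int) -> Optional[float]:
    """Return crawls per referral, or None when no referrals were observed."""
    if crawls < 0 or referrals < 0:
        raise ValueError("counts must be non-negative")
    if referrals == 0:
        return None  # ratio is undefined: the crawler sent no visitors back
    return crawls / referrals


def format_ratio(ratio: Optional[float]) -> str:
    """Render a ratio in the 'N:1' form used in this post."""
    if ratio is None:
        return "no referrals"
    return f"{ratio:,.0f}:1"


# Hypothetical counts chosen to match the low end of the range quoted above.
print(format_ratio(crawl_to_referral_ratio(11_800, 100)))  # 118:1
```

Returning None for zero referrals keeps "never referred anyone" distinct from a very large ratio, which matters when comparing crawlers.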

For publishers, this creates a double hit: first, they’re losing out on the crucial referral traffic, ad impressions, and direct audience relationships that fund content creation and journalism. Second, they’re forced to bear the rising infrastructure costs of hosting and serving content to automated bots that offer no commercial value in return. The era in which it makes sense to allow all crawlers in the hopes of being discovered is over.

Introducing Attribution Business Insights

We want website owners to have the facts — the cold, hard numbers to understand which bots are helping their business and which bots are harming it. We also want to make this analysis easier than ever, which is why we’ve designed Attribution Business Insights to cut the noise, focusing on the details that our customers have told us are most important. 

Today, the Attribution Business Insights dashboard is available to all Cloudflare Bot Management customers. The new dashboard is designed to deliver a targeted view of bot traffic flowing to your website; unlike traditional analytics tools that may require extensive manual filtering, this dashboard provides you with key insights right away.

We set out to answer the most pressing questions for site owners today: How should you think about AI traffic on your websites? What is the value of different audiences — including humans, non-AI bots, and AI bots? And most importantly, what is your data being used for? 


The new Attribution Business Insights dashboard view, which includes insights about bot traffic overall, a site-wide crawl-to-referral ratio, and the distribution of AI bot traffic vs. organic traffic. 

To answer these questions, the dashboard displays a powerful array of data and insights:

  • Bot traffic to content pages: View your overall bot vs. human traffic, as well as the volume of all bots successfully accessing content.

  • Crawl-to-referral ratios: See your site-wide crawl-to-referral ratio over a window of 24 hours, seven days, or 30 days. You can also see crawl-to-referral ratios per bot operator (per company that owns one or more bots).

  • Top bots breakdown: A list of top bots by volume, including their country of origin, bandwidth they take up on your website, and whether you’re currently blocking or allowing them.

  • Updated classification based on crawler behavior: We go beyond a generic label of “AI Crawler” by classifying crawlers with our updated taxonomy, whether it’s Training (i.e., training the next version of an LLM chatbot), Search (i.e., refreshing databases for Retrieval-Augmented Generation), or Agent (i.e., used in agentic interaction to return answers to an end user).  
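As a rough illustration of the per-operator view described in the list above, the sketch below groups hand-written records by operator and derives a ratio for each. The record shape, the event kinds, and the operator names are hypothetical; the dashboard's real data model is not documented in this post.

```python
from collections import Counter

# Hypothetical, hand-written records: (operator, event kind).
EVENTS = [
    ("Company1", "crawl"), ("Company1", "crawl"), ("Company1", "crawl"),
    ("Company1", "crawl"), ("Company1", "referral"),
    ("Company4", "crawl"), ("Company4", "referral"),
]


def per_operator_ratio(events):
    """Return {operator: crawls per referral}, or None if no referrals."""
    crawls, referrals = Counter(), Counter()
    for operator, kind in events:
        if kind == "crawl":
            crawls[operator] += 1
        elif kind == "referral":
            referrals[operator] += 1
    operators = set(crawls) | set(referrals)
    return {
        op: (crawls[op] / referrals[op]) if referrals[op] else None
        for op in sorted(operators)
    }


print(per_operator_ratio(EVENTS))  # {'Company1': 4.0, 'Company4': 1.0}
```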

From data to business strategy

You shouldn’t have to be a security expert to understand how AI crawlers affect your business. If website owners want to spend just a few minutes ingesting the high-level insights, they can walk away with a clear temperature check of the effectiveness of their content security policy.

For those who want to do a little more digging to understand how AI companies are making use of their content — or collect information to guide how they want their relationships with AI companies to develop — we show a more granular view organized by bot operator.


Breakdown of bot activity on a website, with important details for each bot such as type, crawl-to-referral ratio, and current action. 

By having a consolidated view of companies seeking to access content on your website, you can develop a better baseline of crawler activity. We want this data to equip our customers to step into any business conversation with the facts on their side. Tell Company1 that their crawl volume is twenty times that of Company4’s, and that Company4 is already compensating you for content. Revisit the way that Company2 licenses your content based on their recent activity. This new dashboard propels business conversations to move forward. 

How does this new layer of visibility tie into the existing tools you have to protect your website from abuse? In line with other features of Bot Management, the action step still happens in Security rules. To avoid adding noise to the control plane, Attribution Business Insights is intended to be a hub for thoughtful, filtered analytics, rather than another place to take action. This dashboard serves as a central source of information, allowing you to investigate before taking action in the same rule engine that governs other abuse mitigations. We also want to be loud and clear about inviting business decision-makers into this dashboard, acknowledging that conversations around AI traffic involve a wider set of stakeholders than security-specialized users alone.

What’s next

The Attribution Business Insights dashboard is the next critical step in providing website owners with the transparency and control they need to manage evolving AI bot threats, and more broadly, shape the new dynamics of the Internet. We’re already investigating the next iteration with close publishing partners to create a visibility plane that covers security from the perspective of the website owner with valuable, original content to share. 

A sneak preview below includes a new view to dissect crawler activity per-article to reveal the appetite that AI companies have for different pieces of content, different campaigns, and so on.


Breakdown of most popular articles, according to traffic volume. Shows key metrics such as AI bot traffic vs. other bot traffic vs. human traffic, both direct and from a referral.  

Visibility is the first piece, and there’s more to come to empower website owners to take control of their content in this new age. We encourage all customers of Cloudflare Bot Management — especially those driving business conversations — to access this today for a fresh take on analytics. 

Ship infrastructure faster with CloudFormation and CDK pre-deployment validation on every stack operation

Post Syndicated from Idriss Laouali Abdou original https://aws.amazon.com/blogs/devops/ship-infrastructure-faster-with-cloudformation-and-cdk-pre-deployment-validation-on-every-stack-operation/

AWS CloudFormation helps you model and provision cloud infrastructure as code using JSON or YAML templates, or through tools like the AWS Cloud Development Kit (CDK). Whether you create stacks directly, use change sets for preview, or deploy through CI/CD pipelines and AI agents, fast feedback on template errors is critical to development velocity.

Previously, CloudFormation introduced pre-deployment validation during change set creation, catching property syntax errors, resource name conflicts, and S3 bucket emptiness constraints before execution.

Today, we are announcing that pre-deployment validation now runs automatically on every CreateStack and UpdateStack operation, so every deployment path benefits from pre-deployment checks with no configuration required. We are also introducing three new validation checks (Service Quotas limit exceeded, AWS Config Recorder conflicts, and ECR repository delete readiness), a new DisableValidation parameter for operation-level control, and the cdk validate command that leverages CloudFormation pre-deployment validation as part of the CDK developer experience.

In this blog post, we will walk you through how these capabilities work in practice. You will learn how to:

  • Catch property syntax errors and resource name conflicts on CreateStack and UpdateStack before any resources are provisioned
  • Review new WARN-mode validations (service quotas, Config Recorder, ECR delete readiness) during change set creation
  • Use cdk validate to get a validation report with construct-level source tracing
  • Control validation behavior with the DisableValidation parameter when you need to skip checks

Key Capabilities

  • Pre-deployment validation on all stack operations: Property syntax validation and resource name conflict detection (Resource Already Exists) now run in hard-fail mode on CreateStack and UpdateStack, in addition to CreateChangeSet. Errors are caught before any resources are provisioned.
  • Three new validation types: Service Quota validation, AWS Config Recorder conflict detection, and ECR Repository delete readiness checks are now available as warnings during change set creation.
  • CDK validate command: The cdk validate command leverages CloudFormation pre-deployment validation and provides a report with construct-level source tracing that maps errors back to your CDK code.
  • DisableValidation parameter: Operation-level control to skip pre-deployment validation when you need to prioritize deployment speed or bypass a known issue.

How It Works

Understanding Validation Modes

CloudFormation pre-deployment validation operates in two modes that determine how validation failures are handled:

  • FAIL mode stops the stack operation when validation detects errors, ensuring problematic templates cannot proceed to deployment. This applies to property syntax errors and resource name conflicts on CreateStack, UpdateStack, and CreateChangeSet operations.
  • WARN mode allows the operation to proceed despite validation findings, providing warnings that you can review and address before execution. This applies to service quota limits, AWS Config Recorder conflicts, and ECR repository delete readiness checks on CreateChangeSet.

What happens when validation fails:

  • CreateStack: Operation stops before any resources are provisioned.
  • UpdateStack: Operation stops, stack remains in its current state with no resources modified.
  • CreateChangeSet: Change set is not executable. Change set status shows FAILED.
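The difference between the two modes can be sketched as a small decision function. This is an assumption-laden illustration rather than CloudFormation's own logic: the field names (ValidationStatus, ValidationFailureMode) follow the DescribeEvents output shown in this post, and the helper names are made up.

```python
def is_failed(event):
    """True if the event records a validation check that did not pass."""
    return event.get("ValidationStatus") == "FAILED"


def blocks_operation(events):
    """True if any failed check ran in FAIL mode, which stops the operation."""
    return any(
        is_failed(e) and e.get("ValidationFailureMode") == "FAIL"
        for e in events
    )


def warnings(events):
    """Failed checks in WARN mode: reported, but the operation proceeds."""
    return [
        e for e in events
        if is_failed(e) and e.get("ValidationFailureMode") == "WARN"
    ]


# Hypothetical events, reduced to the two fields this sketch inspects.
quota_warning = {"ValidationStatus": "FAILED", "ValidationFailureMode": "WARN"}
property_error = {"ValidationStatus": "FAILED", "ValidationFailureMode": "FAIL"}

print(blocks_operation([quota_warning]))                  # False
print(blocks_operation([quota_warning, property_error]))  # True
```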

The following scenarios demonstrate how pre-deployment validation works across different stack operations.

Scenario 1: Property Validation on CreateStack

CloudFormation evaluates each resource property definition before provisioning begins. The following template contains several common resource property errors:

Template (dashboard-stack.yaml)

AWSTemplateFormatVersion: "2010-09-09"
Description: Dashboard stack with property validation errors

Resources:
  Dashboard04:
    Type: "AWS::CloudWatch::Dashboard"
    Properties:
      DashboardName: "MyDashboard"

  LogStream08:
    Type: "AWS::Logs::LogStream"
    Properties:
      LogGroupName: "/aws/my-app"
      LogStreamName:                        # Expected string, found JSONArray
        - "stream-1"
        - "stream-2"

  MetricFilter03:
    Type: "AWS::Logs::MetricFilter"
    Properties:
      LogGroupName: "/aws/my-app"
      SomeUnsupportedProperty: "value"      # Unsupported property
      MetricTransformations:
        - MetricName: "ErrorCount"
          MetricNamespace: "MyApp"
          MetricValue: "1"

Step 1: Create Stack

aws cloudformation create-stack \
    --stack-name "dashboard-stack" \
    --template-body file://dashboard-stack.yaml

The command returns the stack ARN and operation begins. Pre-deployment validation runs automatically before any resources are provisioned.

Step 2: Check Validation Results

Use the describe-events API to review validation results:

aws cloudformation describe-events \
    --stack-name "dashboard-stack"

The stack creation stopped before any resources were provisioned. Each validation error includes the logical resource ID, resource type, and a precise status reason describing the property issue.

Example output:

{
    "OperationEvents": [
        {
            "EventId": "ed0f6cc4-3f85-4ad9-abc3-1f9aad2ab931",
            "StackId": "arn:aws:cloudformation:us-west-1:1234:stack/dashboard-stack/6877f3c0-73e6-11f1-a1e1-02ff57e5af93",
            "OperationId": "68790530-73e6-11f1-a1e1-02ff57e5af93",
            "OperationType": "CREATE_STACK",
            "EventType": "VALIDATION_ERROR",
            "LogicalResourceId": "MetricFilter03",
            "PhysicalResourceId": "",
            "ResourceType": "AWS::Logs::MetricFilter",
            "Timestamp": "2026-06-29T18:14:49.255000+00:00",
            "ValidationFailureMode": "FAIL",
            "ValidationName": "PROPERTY_VALIDATION",
            "ValidationStatus": "FAILED",
            "ValidationStatusReason": "Unsupported property [SomeUnsupportedProperty]",
            "ValidationPath": "/Resources/MetricFilter03/Properties/SomeUnsupportedProperty"
        },
        {
            "EventId": "4f9f12ce-498c-4d79-af31-730238b85139",
            "StackId": "arn:aws:cloudformation:us-west-1:1234:stack/dashboard-stack/6877f3c0-73e6-11f1-a1e1-02ff57e5af93",
            "OperationId": "68790530-73e6-11f1-a1e1-02ff57e5af93",
            "OperationType": "CREATE_STACK",
            "EventType": "VALIDATION_ERROR",
            "LogicalResourceId": "LogStream08",
            "PhysicalResourceId": "",
            "ResourceType": "AWS::Logs::LogStream",
            "Timestamp": "2026-06-29T18:14:49.255000+00:00",
            "ValidationFailureMode": "FAIL",
            "ValidationName": "PROPERTY_VALIDATION",
            "ValidationStatus": "FAILED",
            "ValidationStatusReason": "Property [LogStreamName] expected type: String, found: JSONArray",
            "ValidationPath": "/Resources/LogStream08/Properties/LogStreamName"
        }
    ]
}
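For scripts that consume this output, a short parser can turn the events into one line per failing resource. The sketch below is a minimal example that assumes the JSON shape shown above; the SAMPLE payload is a trimmed copy of that output and the summarize helper is a hypothetical name.

```python
import json

# Trimmed copy of the output above, keeping only the fields used below.
SAMPLE = """
{
  "OperationEvents": [
    {
      "EventType": "VALIDATION_ERROR",
      "LogicalResourceId": "MetricFilter03",
      "ResourceType": "AWS::Logs::MetricFilter",
      "ValidationStatusReason": "Unsupported property [SomeUnsupportedProperty]"
    },
    {
      "EventType": "VALIDATION_ERROR",
      "LogicalResourceId": "LogStream08",
      "ResourceType": "AWS::Logs::LogStream",
      "ValidationStatusReason": "Property [LogStreamName] expected type: String, found: JSONArray"
    }
  ]
}
"""


def summarize(payload):
    """Return one 'LogicalId (Type): reason' line per validation error."""
    document = json.loads(payload)
    lines = []
    for event in document.get("OperationEvents", []):
        if event.get("EventType") != "VALIDATION_ERROR":
            continue  # skip non-validation events
        lines.append(
            f'{event["LogicalResourceId"]} ({event["ResourceType"]}): '
            f'{event["ValidationStatusReason"]}'
        )
    return lines


for line in summarize(SAMPLE):
    print(line)
```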

Console Experience

In the CloudFormation console, navigate to your stack’s Events tab and click the operation ID (or the link in the banner or status reason column) to open the Operation view page. The page opens directly on the Deployment validations tab, which shows the validation results table:

  • LogStream08 (AWS::Logs::LogStream) – FAIL: Property [LogStreamName] expected string, found: JSONArray
  • MetricFilter03 (AWS::Logs::MetricFilter) – FAIL: Unsupported property [SomeUnsupportedProperty]

Figure 1: Deployment validations tab showing property validation failures on CreateStack

Figure 2: Deployment validations tab showing property validation failures on CreateStack

Scenario 2: Resource Name Conflict on UpdateStack

Resource name conflict detection (Resource Already Exists, or RAE) identifies when your template specifies a resource name that already exists in your account. This validation now runs on CreateStack and UpdateStack operations in addition to CreateChangeSet.

Template (update-bucket.yaml)

AWSTemplateFormatVersion: "2010-09-09"
Description: Update stack adding a bucket with a conflicting name

Resources:
  ExistingFunction:
    Type: "AWS::Lambda::Function"
    Properties:
      FunctionName: "my-existing-function"
      Runtime: "python3.12"
      Handler: "index.handler"
      Role: !Sub "arn:aws:iam::${AWS::AccountId}:role/lambda-role"
      Code:
        ZipFile: |
          def handler(event, context):
              return {"statusCode": 200}

  ConflictingBucket:
    Type: "AWS::S3::Bucket"
    Properties:
      BucketName: "production-data-bucket"   # Already exists in the account

Update Stack

aws cloudformation update-stack \
    --stack-name "my-app-stack" \
    --template-body file://update-bucket.yaml

Validation output (via describe-events):

{
    "OperationEvents": [
        ...
        {
            "EventId": "bde0f986-3b47-48d8-91bc-f384195f842a",
            "StackId": "arn:aws:cloudformation:us-west-1:1234:stack/my-app-stack-blog-test/164ff580-73e5-11f1-ab70-026546ec19e3",
            "OperationId": "65e641d0-73e5-11f1-abdb-06073274cc09",
            "OperationType": "UPDATE_STACK",
            "EventType": "VALIDATION_ERROR",
            "LogicalResourceId": "ConflictingBucket",
            "PhysicalResourceId": "",
            "ResourceType": "AWS::S3::Bucket",
            "Timestamp": "2026-06-29T18:07:36.139000+00:00",
            "ValidationFailureMode": "FAIL",
            "ValidationName": "NAME_CONFLICT_VALIDATION",
            "ValidationStatus": "FAILED",
            "ValidationStatusReason": "Resource of type 'AWS::S3::Bucket' with identifier 'production-data-bucket-blog-test-208004920468' already exists.",
            "ValidationPath": "/Resources/ConflictingBucket"
        },
        ...
    ]
}

The update stops before any resources are modified. You can either rename the resource in your template or remove the existing resource that causes the conflict.

Figure 3 - Resource Name Conflict on UpdateStack Event tab

The deployment validation view below provides more detail about the error, including the status reason and the path to the resource.

Figure 3 - Resource Name Conflict on UpdateStack Deployment validation

Figure 3: Resource Name Conflict on UpdateStack

Scenario 3: Service Quota Warning on CreateChangeSet

Service Quota validation is one of three new warning-mode validations available during change set creation. It checks whether creating or updating resources would exceed your AWS service quotas.

Create Change Set

aws cloudformation create-change-set \
    --stack-name "vpc-stack" \
    --change-set-name "add-subnets" \
    --template-body file://vpc-with-many-subnets.yaml

Validation output:

{
    "EventId": "3ba6f27b-4d3c-4e73-bac2-8d8cbf71a6d3",
    "StackId": "arn:aws:cloudformation:us-west-1:1234:stack/vpc-quota-test/492a84d0-73ee-11f1-a714-02e5a60ac85d",
    "OperationId": "b55e1e8f-28d1-4cbe-a6c3-59c92707e180",
    "OperationType": "CREATE_CHANGESET",
    "EventType": "VALIDATION_ERROR",
    "LogicalResourceId": "VPC1",
    "PhysicalResourceId": "",
    "ResourceType": "AWS::EC2::VPC",
    "Timestamp": "2026-06-29T19:11:12.727000+00:00",
    "ValidationFailureMode": "WARN",
    "ValidationName": "SERVICE_QUOTA_VALIDATION",
    "ValidationStatus": "FAILED",
    "ValidationStatusReason": "Service quota will be exceeded: AWS::EC2::VPC current usage 1/5, creating 6 would exceed limit",
    "ValidationPath": "/Resources/VPC1"
}

Because this validation operates in WARN mode, the change set is created successfully. You can review the warning, request a quota increase through the Service Quotas console, and then proceed with execution. The two other new warning validations (AWS Config Recorder conflict detection and ECR Repository delete readiness) follow the same pattern.

Figure 4: Service Quota Warning on CreateChangeSet

Scenario 4: CDK Validate Experience

The cdk validate command provides a unified validation experience that combines multiple validation sources into a single report. Under the hood, cdk validate synthesizes your CDK app, creates a change set to invoke server-side pre-deployment validation, collects the results via DescribeEvents, and produces a report that maps errors back to your CDK source code with construct-level tracing.

Each error traces back to the specific construct and source file location in your CDK code, not just the CloudFormation logical resource ID. This construct-level tracing is what makes cdk validate uniquely valuable: you see the exact line in your code that needs to change.

Scenario 5: Controlling Validation with DisableValidation

Pre-deployment validation is enabled by default on all stack operations. If you need to skip validation for a specific operation, use the DisableValidation parameter.

When to disable validation:

  • When you have already validated your template through other means (cdk validate, cfn-lint, CI/CD checks)
  • When you need to minimize operation latency for time-sensitive deployments

CLI usage:

# Skip validation on create-stack
aws cloudformation create-stack \
    --stack-name "my-stack" \
    --template-body file://template.yaml \
    --disable-validation

# Skip validation on update-stack
aws cloudformation update-stack \
    --stack-name "my-stack" \
    --template-body file://template.yaml \
    --disable-validation

Important: Disabling validation means common errors will not be caught until resource provisioning is attempted. Use this option only when you understand the trade-off between deployment speed and early error detection.

AI Agents and Automated Workflows

Pre-deployment validation gives AI agents and automation tools the fast feedback loop they need to self-correct. When an agent provisions infrastructure and the template has an error, validation returns a structured error in seconds rather than waiting minutes for a full provision-and-rollback cycle to complete. The agent can parse the error, fix the template, and retry immediately.

With cdk validate, agents get construct-level source tracing that maps errors directly to the line of CDK code that needs to change, enabling fully automated fix-and-retry loops without human intervention.
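The fix-and-retry loop described above can be sketched in a few lines. This is a simplified illustration, not an AWS-provided agent: validate and fix are injected callables, and the stand-ins below only simulate validation feedback instead of calling CloudFormation.

```python
from typing import Callable, List


def fix_and_retry(
    template: str,
    validate: Callable[[str], List[str]],
    fix: Callable[[str, List[str]], str],
    max_attempts: int = 3,
) -> str:
    """Return a template that validates cleanly, or raise after max_attempts."""
    for _ in range(max_attempts):
        errors = validate(template)
        if not errors:
            return template
        template = fix(template, errors)  # let the agent repair the template
    raise RuntimeError(f"template still failing after {max_attempts} attempts")


# Hypothetical stand-ins for the validation call and the agent's repair step.
def fake_validate(template: str) -> List[str]:
    if "SomeUnsupportedProperty" in template:
        return ["Unsupported property [SomeUnsupportedProperty]"]
    return []


def fake_fix(template: str, errors: List[str]) -> str:
    kept = [
        line for line in template.splitlines()
        if "SomeUnsupportedProperty" not in line
    ]
    return "\n".join(kept)


broken = "LogGroupName: /aws/my-app\nSomeUnsupportedProperty: value"
print(fix_and_retry(broken, fake_validate, fake_fix))  # LogGroupName: /aws/my-app
```

Capping the number of attempts keeps an agent from looping indefinitely on an error it cannot repair.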

To get started with the agent experience, install the CloudFormation agent skill from the AWS Agent Toolkit. This skill gives AI agents the ability to create stacks, validate templates, and iterate on errors using pre-deployment validation feedback.

Getting Started

Pre-deployment validation runs automatically on all CreateStack, UpdateStack, and CreateChangeSet operations with no configuration required. To start benefiting:

  • Create or update a stack as you normally would. Validation runs automatically.
  • Review validation results using the DescribeEvents API, the CloudFormation Console Events tab (click the operation ID, then the Deployment validations tab), or the cdk validate command.
  • Fix identified issues in your template and retry the operation.
  • Optionally disable validation using --disable-validation for specific operations.

Required IAM permissions for validation checks

Validation on CreateStack and UpdateStack (property syntax validation and resource name conflict detection) requires no additional IAM permissions beyond what is needed for the stack operation itself. For the new validation checks available during change set creation, your IAM role needs the following additional permissions:

Service Quota Check:

  • cloudwatch:GetMetricData
  • lambda:GetAccountSettings
  • servicequotas:GetServiceQuota
  • ec2:DescribeSecurityGroups
  • iam:GetAccountSummary

Config Recorder Check:

  • config:ListConfigurationRecorders

S3 Bucket Empty Check:

  • s3:ListBucketV2

ECR Repository Delete Readiness Check:

  • ecr:ListImages

If these permissions are not granted, the corresponding validation checks will be skipped without blocking the operation.
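As one way to grant these permissions, the sketch below assembles an IAM policy document from the lists above. The action names are copied verbatim from this post and should be checked against the IAM reference before use. The statement IDs, the build_policy helper, and the wildcard Resource are assumptions made for illustration, so scope resources more tightly where your environment allows.

```python
import json

# Action names copied from the lists above; verify them against the
# IAM reference before attaching this policy to a role.
VALIDATION_ACTIONS = {
    "ServiceQuotaCheck": [
        "cloudwatch:GetMetricData",
        "lambda:GetAccountSettings",
        "servicequotas:GetServiceQuota",
        "ec2:DescribeSecurityGroups",
        "iam:GetAccountSummary",
    ],
    "ConfigRecorderCheck": ["config:ListConfigurationRecorders"],
    "S3BucketEmptyCheck": ["s3:ListBucketV2"],
    "EcrDeleteReadinessCheck": ["ecr:ListImages"],
}


def build_policy(actions_by_check):
    """Build one Allow statement per validation check."""
    statements = [
        {
            "Sid": sid,  # hypothetical statement IDs
            "Effect": "Allow",
            "Action": sorted(actions),
            "Resource": "*",  # assumption: narrow this where possible
        }
        for sid, actions in actions_by_check.items()
    ]
    return {"Version": "2012-10-17", "Statement": statements}


print(json.dumps(build_policy(VALIDATION_ACTIONS), indent=2))
```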

For CDK users:

# Run unified validation before deploying 
cdk validate

Best Practices

  • Use cdk validate as your primary pre-deployment check. It leverages CloudFormation pre-deployment validation in a single command, giving you comprehensive coverage before any deployment is attempted.
  • Place CreateChangeSet as the first pipeline stage. For pipelines that use change sets, this ensures pre-deployment validation fires at the pipeline entry point. CDK Pipelines integrates this by default.
  • Let validation run by default. The few seconds of validation time pay for themselves by preventing full provision-and-rollback cycles that take minutes or longer.
  • Use DisableValidation intentionally. Reserve it for cases where you have already validated through other means or need to bypass a known false positive. Do not disable validation globally.
  • Integrate validation into PR/CI workflows. Run cdk validate or cfn-lint as part of your pull request checks to catch errors before code is merged, preventing invalid templates from reaching deployment pipelines.
  • Monitor validation warnings. WARN-mode validations (service quota, Config Recorder, ECR delete readiness) indicate potential issues that may cause failures at execution time. Address them proactively.

Conclusion

Pre-deployment validation on all stack operations represents a significant step forward in CloudFormation’s shift-left validation strategy. By catching common deployment errors in seconds before any resources are provisioned, this capability eliminates unnecessary rollback cycles and accelerates development workflows across the board.

Combined with the cdk validate command, which provides a unified validation experience with construct-level tracing, and the DisableValidation parameter for operation-level control, teams now have a complete toolkit for managing the trade-off between validation coverage and deployment speed. AI agents and automated pipelines benefit from structured, machine-readable feedback that enables immediate self-correction, turning what were once multi-minute debugging cycles into second-level iteration loops.

Pre-deployment validation is available in all AWS Regions where CloudFormation is supported. No configuration or opt-in is required. To learn more, visit the Validate stack deployments User Guide.

Blog Authors Bio:

Idriss Laouali Abdou

Idriss is a Sr. Product Manager Technical on the AWS Infrastructure-as-Code team based in Seattle. He focuses on improving developer productivity through AWS CloudFormation and StackSets Infrastructure provisioning experiences. Outside of work, you can find him creating educational content for thousands of students, cooking, or dancing.

Olivia Biswas

Olivia is a Software Development Manager on the AWS Infrastructure-as-Code team based in Seattle, where she leads developer productivity initiatives through CloudFormation. During her tenure at Amazon, she has built several customer-obsessed software solutions within Alexa and Buy With Prime. Outside of work, she is a globe trotter who enjoys baking, dancing, reading, and watching documentaries.

Subha Velayutham

Subha is a Senior Software Engineer on the AWS Infrastructure-as-Code team, where she builds features to improve developer productivity. Outside of work, she enjoys reading, traveling, and experimenting with new creative hobbies.

Accelerate your infrastructure deployments by up to 4x with AWS CloudFormation Express mode

Post Syndicated from Channy Yun (윤석찬) original https://aws.amazon.com/blogs/aws/accelerate-your-infrastructure-deployments-by-up-to-4x-with-aws-cloudformation-express-mode/

Today, we’re announcing AWS CloudFormation Express mode, a new deployment mode that accelerates deployments for developers and AI tools iterating on infrastructure. Express mode accelerates deployments by completing when CloudFormation confirms resource configuration is applied, rather than waiting for extended stabilization checks. This reduces deployment time by up to 4 times for iterative development workflows and production scenarios.

How it works
Every CloudFormation deployment performs stabilization checks after resource configuration is applied. These checks serve an important purpose when you need to confirm resources can serve traffic before shifting load.

However, many workflows do not require full stabilization to proceed. Express mode benefits two primary use cases: iterative development workflows and production scenarios where you are comfortable with eventual stabilization. These use cases include iterating on infrastructure configurations during development, testing individual components of your application, and AI-assisted infrastructure development that benefits from sub-minute feedback loops.

With Express mode, CloudFormation completes deployments when resource configuration is applied, without waiting for stabilization checks. Resources continue becoming operational in the background. CloudFormation automatically retries dependent resources that encounter transient failures during provisioning within the same stack, without requiring any customer intervention. This built-in resilience handles timing issues between resources as they stabilize. Express mode changes when the deployment completes, not how resources are provisioned.

For example, when I create an Amazon Simple Queue Service (SQS) queue with a dead letter queue (DLQ), Standard mode takes 64 seconds, but Express mode completes in up to 10 seconds. In the case of deleting an AWS Lambda function with network interface attachment, Standard mode takes 20–30 minutes, but Express mode completes in up to 10 seconds based on my benchmarking test.

Get started with CloudFormation Express mode
When you create a CloudFormation stack in the AWS Management Console, choose Enable for Express mode under Stack deployment options.

You can also use the AWS Command Line Interface (AWS CLI), AWS SDKs, IaC tools like the AWS Cloud Development Kit (CDK), and AI tools such as Kiro.

Activate Express mode by setting the --deployment-config parameter to EXPRESS when creating, updating, or deleting stacks. No template changes are required. Express mode disables rollback by default for the fastest iteration experience. To re-enable rollback, set disableRollback to false in the deployment-config for production environments, or implement monitoring/cleanup mechanisms for failed deployments.

aws cloudformation create-stack \
    --stack-name my-app \
    --template-body file://template.yaml \
    --deployment-config '{"mode": "EXPRESS", "disableRollback": true}'

For example, use Express mode when you build infrastructure incrementally, adding resources one at a time. Ensure your IAM role templates follow the principle of least privilege.

# Iteration 1: Deploy IAM role
aws cloudformation create-stack \
--stack-name my-microservice \
--template-body file://iteration1-iam.yaml \
--deployment-config '{"mode": "EXPRESS"}' \
--capabilities CAPABILITY_IAM \
--role-arn arn:aws:iam::123456789012:role/CloudFormationDeployRole

# Iteration 2: Add Lambda function
aws cloudformation update-stack \
--stack-name my-microservice \
--template-body file://iteration2-lambda.yaml \
--deployment-config '{"mode": "EXPRESS"}' \
--capabilities CAPABILITY_IAM \
--role-arn arn:aws:iam::123456789012:role/CloudFormationDeployRole

# Iteration 3: Add SQS queue and event source mapping
aws cloudformation update-stack \
--stack-name my-microservice \
--template-body file://iteration3-sqs.yaml \
--deployment-config '{"mode": "EXPRESS"}' \
--capabilities CAPABILITY_IAM \
--role-arn arn:aws:iam::123456789012:role/CloudFormationDeployRole
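The three iterations above follow one pattern: the first template creates the stack, and every later template updates it. A small sketch of that pattern, assuming the same stack name, role ARN, and template file names as the example (the iteration_commands helper itself is hypothetical):

```python
import shlex

ROLE_ARN = "arn:aws:iam::123456789012:role/CloudFormationDeployRole"


def iteration_commands(stack_name: str, templates: list) -> list:
    """Return one AWS CLI argument vector per template, in deployment order."""
    commands = []
    for index, template in enumerate(templates):
        # Only the first iteration creates the stack; the rest update it.
        action = "create-stack" if index == 0 else "update-stack"
        commands.append([
            "aws", "cloudformation", action,
            "--stack-name", stack_name,
            "--template-body", f"file://{template}",
            "--deployment-config", '{"mode": "EXPRESS"}',
            "--capabilities", "CAPABILITY_IAM",
            "--role-arn", ROLE_ARN,
        ])
    return commands


plan = iteration_commands(
    "my-microservice",
    ["iteration1-iam.yaml", "iteration2-lambda.yaml", "iteration3-sqs.yaml"],
)
for argv in plan:
    print(shlex.join(argv))
```

The sketch only prints the commands. A real script would also need to wait for each stack operation to finish before starting the next one, because CloudFormation rejects an update while another operation on the same stack is in progress.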

For AWS CDK, activate Express mode with the cdk deploy --express command when you deploy your CDK stack. This command retrieves your generated CloudFormation template and deploys it through CloudFormation Express mode, which provisions your resources as part of a CloudFormation stack.

Express mode works with all existing CloudFormation templates and supports all CloudFormation features including change sets and nested stacks. When you enable Express mode on a parent stack, all nested stacks also use Express mode. If you need resources to be fully operational before proceeding with traffic or testing, continue using the default deployment behavior, which performs stabilization checks before completing.

Now available
AWS CloudFormation Express mode is available today in all AWS commercial Regions at no additional cost. For Regional availability and a future roadmap, visit the AWS Capabilities by Region. If you want to call APIs, search documentation, check Regional availability, and troubleshoot this new feature, try using the AWS MCP Server and plugins with your preferred AI tool. To learn more, visit the CloudFormation documentation.

Start accelerating your deployments today, and send feedback to AWS re:Post for AWS CloudFormation or through your usual AWS Support contacts.

Channy

Amazon EC2 C9g and C9gd instances powered by AWS Graviton5 processors are now available

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/amazon-ec2-c9g-and-c9gd-instances-powered-by-aws-graviton5-processors-are-now-available/

When you run compute-intensive workloads like real-time analytics, batch processing, video encoding, scientific modeling, or CPU-based machine learning inference, every percentage point of performance matters. You need instances that deliver higher throughput per vCPU, faster memory access, and more network bandwidth, all while keeping your costs in check.

Today I am happy to announce the general availability of Amazon Elastic Compute Cloud (Amazon EC2) C9g and C9gd instances, powered by AWS Graviton5 processors. C9g instances are compute-optimized and deliver up to 25% higher performance per vCPU compared to previous-generation C8g instances. They feature the fastest memory of any processor instance in the cloud, with DDR5 8800MT/s DIMMs, 5x more L3 cache, and up to 3x higher packet-processing performance compared to Graviton4-based instances. The faster memory and larger caches mean your workloads spend less time waiting on data, translating into higher throughput for in-memory analytics, faster agentic loops, and more responsive real-time applications.

C9g instances are ideal for batch jobs, video encoding pipelines, or distributed analytics that can use Amazon Elastic Block Store (Amazon EBS) for storage. They are also a natural fit for agentic AI workloads, where concurrent environments and CPU-bound reasoning steps benefit from Graviton5’s higher core count and larger caches. As AI shifts from answering questions to taking actions, running code, and orchestrating multi-step tasks, the demand for CPU compute is growing, and C9g instances are built for this shift.

Some workloads also need fast local storage alongside that compute power. Choose C9gd when your application benefits from high-speed, low-latency local NVMe SSD storage, for example scratch space during HPC simulations, temporary caches for ML inference, or local buffers for ad-serving engines.

Graviton5-based instances with NVMe instance store volumes also support detailed performance statistics. These provide high-resolution I/O metrics, including latency histograms broken down by I/O size, at up to 1-second granularity, and are accessible through Amazon CloudWatch or nvme-cli at no additional cost.

C9g and C9gd instances at a glance
C9g and C9gd instances are available in 11 sizes ranging from medium to 48xlarge, plus a bare metal option. They offer up to 15% higher network bandwidth and 20% higher EBS bandwidth on average across sizes compared to the previous generation, with the largest 48xlarge size delivering up to 100 Gbps of network bandwidth and up to 72 Gbps of EBS bandwidth, a 2x increase.

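| C9g | vCPUs | Memory (GiB) | Network Bandwidth (Gbps) | EBS Bandwidth (Gbps) |
| --- | --- | --- | --- | --- |
| medium | 1 | 2 | Up to 15 | Up to 12 |
| large | 2 | 4 | Up to 15 | Up to 12 |
| xlarge | 4 | 8 | Up to 15 | Up to 12 |
| 2xlarge | 8 | 16 | Up to 17 | Up to 12 |
| 4xlarge | 16 | 32 | Up to 17 | Up to 12 |
| 8xlarge | 32 | 64 | 17 | 12 |
| 12xlarge | 48 | 96 | 25 | 18 |
| 16xlarge | 64 | 128 | 34 | 24 |
| 24xlarge | 96 | 192 | 50 | 36 |
| 48xlarge | 192 | 384 | 100 | 72 |
| metal-48xl | 192 | 384 | 100 | 72 |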
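As an illustration of how the table can be used for sizing, the sketch below picks the smallest virtual size that satisfies a vCPU and memory requirement. The data is copied from the C9g table above (the bare metal option is left out); the smallest_fit helper is hypothetical and ignores bandwidth, pricing, and Regional availability.

```python
from typing import Optional

# (size, vCPUs, memory in GiB), copied from the C9g table above.
C9G_SIZES = [
    ("medium", 1, 2), ("large", 2, 4), ("xlarge", 4, 8), ("2xlarge", 8, 16),
    ("4xlarge", 16, 32), ("8xlarge", 32, 64), ("12xlarge", 48, 96),
    ("16xlarge", 64, 128), ("24xlarge", 96, 192), ("48xlarge", 192, 384),
]


def smallest_fit(vcpus_needed: int, memory_gib_needed: int) -> Optional[str]:
    """Return the first (smallest) size that satisfies both requirements."""
    for size, vcpus, memory_gib in C9G_SIZES:
        if vcpus >= vcpus_needed and memory_gib >= memory_gib_needed:
            return size
    return None  # nothing in the table is large enough


print(smallest_fit(6, 10))
print(smallest_fit(4, 24))
```

Every size in the table provides 2 GiB of memory per vCPU, so a memory-heavy requirement, as in the second call, pushes the choice to a larger size than the vCPU count alone would.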

C9gd instances add local NVMe SSD storage with up to 30% higher storage performance compared to previous-generation local storage instances.

| C9gd | vCPUs | Memory (GiB) | Instance Storage (GB) | Network Bandwidth (Gbps) | EBS Bandwidth (Gbps) |
| --- | --- | --- | --- | --- | --- |
| medium | 1 | 2 | 1 x 59 | Up to 15 | Up to 12 |
| large | 2 | 4 | 1 x 118 | Up to 15 | Up to 12 |
| xlarge | 4 | 8 | 1 x 237 | Up to 15 | Up to 12 |
| 2xlarge | 8 | 16 | 1 x 474 | Up to 17 | Up to 12 |
| 4xlarge | 16 | 32 | 1 x 950 | Up to 17 | Up to 12 |
| 8xlarge | 32 | 64 | 1 x 1900 | 17 | 12 |
| 12xlarge | 48 | 96 | 3 x 950 | 25 | 18 |
| 16xlarge | 64 | 128 | 1 x 3800 | 34 | 24 |
| 24xlarge | 96 | 192 | 3 x 1900 | 50 | 36 |
| 48xlarge | 192 | 384 | 3 x 3800 | 100 | 72 |
| metal-48xl | 192 | 384 | 3 x 3800 | 100 | 72 |
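The Instance Storage column lists the number of NVMe disks and the size of each. A short sketch that turns those layouts into total local capacity per size, using only the values from the C9gd table above (the helper name is hypothetical):

```python
# Instance storage layouts copied from the C9gd table above ("disks x GB each").
C9GD_STORAGE = {
    "medium": "1 x 59", "large": "1 x 118", "xlarge": "1 x 237",
    "2xlarge": "1 x 474", "4xlarge": "1 x 950", "8xlarge": "1 x 1900",
    "12xlarge": "3 x 950", "16xlarge": "1 x 3800", "24xlarge": "3 x 1900",
    "48xlarge": "3 x 3800",
}


def total_storage_gb(layout: str) -> int:
    """Multiply the disk count by the per-disk size in a 'N x GB' layout."""
    disks, gb_each = layout.split(" x ")
    return int(disks) * int(gb_each)


for size, layout in C9GD_STORAGE.items():
    print(f"{size:>9}: {layout:>8} = {total_storage_gb(layout):>5} GB")
```

Whether a given size exposes one disk or three matters for applications that stripe data across volumes, so the layout is worth checking in addition to the total.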

Both families are well-suited for high-performance computing (HPC), batch processing, gaming, video encoding, scientific modeling, distributed analytics, CPU-based machine learning inference, and ad serving.

Here are some additional capabilities:

  • Instance Bandwidth Configuration (IBC) lets you adjust the allocation of bandwidth between Amazon EBS and Amazon VPC networking by up to 25%, helping you optimize performance for workloads with specific bandwidth requirements such as databases and caching.
  • ENA Express support for enhanced networking.
  • Up to 128 EBS volumes can be attached to virtual instances.
  • Support for Savings Plans, On-Demand, Spot Instances, Dedicated Instances, and Dedicated Hosts.

Nitro Isolation Engine
C9g and C9gd instances are the first compute-optimized Amazon EC2 instances to feature the AWS Nitro Isolation Engine, a new capability of the AWS Nitro System. The Nitro Isolation Engine is a purpose-built component of the Nitro Hypervisor, implemented in Rust, that enforces isolation between virtual machines. It mediates all access to VM memory, CPU register state, and I/O devices through a minimal set of APIs.

To learn more about the Nitro Isolation Engine, visit the blog post. For details on the formal verification results, including scope and assumptions, see our technical white paper.

Now available
Amazon EC2 C9g and C9gd instances are now available in US East (Ohio, N. Virginia), US West (Oregon), and Europe (Frankfurt). Additional Regions will follow.

You can launch C9g and C9gd instances today using the AWS Management Console, AWS Command Line Interface (AWS CLI), or AWS SDKs. For pricing information, visit the Amazon EC2 Pricing page.

To learn more, visit the Amazon EC2 C9g and C9gd instances page and send feedback to AWS re:Post for EC2 or through your usual AWS Support contacts.

— seb

Automate public TLS certificate issuance with ACME support in AWS Certificate Manager

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/automate-public-tls-certificate-issuance-with-acme-support-in-aws-certificate-manager/

If you manage TLS certificates for your applications, you know the challenge: certificates expire, and when they do, your customers see errors or your service goes down. As certificate validity periods get shorter (the Certification Authority/Browser Forum (CA/Browser Forum) has mandated a reduced maximum validity of 100 days starting March 2027, and of 47 days by 2029), manual renewal processes become untenable. You need automation.
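To see why shorter lifetimes make manual renewal untenable, consider how many certificates a single hostname needs per year under the validity limits quoted above. This is a back-of-the-envelope sketch; the two-thirds renewal point is an assumption chosen for illustration, not an ACM setting.

```python
import math


def min_issuances_per_year(validity_days: int, renew_at_fraction: float = 1.0) -> int:
    """Certificates needed to cover 365 days when each one is replaced
    after renew_at_fraction of its lifetime has elapsed."""
    effective_days = validity_days * renew_at_fraction
    return math.ceil(365 / effective_days)


for validity in (100, 47):
    at_expiry = min_issuances_per_year(validity)
    # Assumption: renew once two thirds of the lifetime has passed.
    with_margin = min_issuances_per_year(validity, renew_at_fraction=2 / 3)
    print(f"{validity}-day certificates: {at_expiry} per year at expiry, "
          f"{with_margin} per year when renewing at two thirds of lifetime")
```

Multiplied across hundreds of hostnames, the difference between four and twelve issuances per name each year is the workload that automation has to absorb.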

Automatic Certificate Management Environment (ACME) is an open protocol for requesting, renewing, and revoking TLS certificates without human intervention. It’s the same protocol behind Let’s Encrypt, and it’s supported by dozens of clients across every platform.

Today we’re announcing ACME support for public certificates in AWS Certificate Manager (ACM). ACM now provides a fully managed ACME server endpoint that works with any ACMEv2-compatible client, such as Certbot, cert-manager for Kubernetes, acme.sh, or any other client you already use. You can issue public TLS certificates from Amazon Trust Services through the standard ACME protocol.

Before today, if you wanted automated certificate management using the ACME protocol, you relied on external certificate authorities alongside ACM, which fragmented visibility. Some certificates lived in ACM, while others were managed externally with no central dashboard. PKI administrators had limited ability to control who could request certificates or which domains were allowed.

With ACME support in ACM, you can now set up one or more managed ACME endpoints that allow you to centrally manage and monitor ACME certificate usage across your organization.

As a PKI administrator, you get centralized controls that go beyond basic certificate issuance. You can bind IAM roles to ACME accounts for fine-grained access control over which domains each client can request. You can define domain scopes at the endpoint level to enforce organization-wide policies. And you get centralized monitoring and visibility in the same place: AWS CloudTrail logs every certificate request for auditability, Amazon CloudWatch tracks operational metrics, and ACM sends expiry notifications when certificates are approaching renewal. Using ACM, your PKI team can search all certificates, whether issued through the ACM console, an API call, or ACME.

How it works
To get started, you first set up a dedicated ACME endpoint, configure authorization controls using External Account Binding (EAB), validate which domains the endpoint can issue certificates for, and point your existing ACME clients to the new endpoint.

The domain validation step is important: it separates who can set up certificate issuance from who can request certificates. The PKI administrator validates domains once at the endpoint level, using DNS credentials that stay with the admin. Application owners who need certificates never touch DNS. They register with an EAB credential, and the endpoint enforces which domains and scopes they’re allowed to request. This means you can distribute certificate automation broadly across your organization without distributing DNS keys along with it.

I start this demo from the ACME certificates page in the AWS Certificate Manager console.

ACME Console

Although I already have a few endpoints and certificates in this account, I walk you through creating a new one from scratch. First, I select Create ACME endpoint.

ACME - Create endpoint 1

I give my endpoint a name. The Endpoint type is Public. ACME clients will connect over the public internet. The Certificate type is Public. The certificate will be issued by Amazon Trust Services and trusted by browsers and operating systems by default. For the certificate key type, I keep the default ECDSA P-256. RSA 2048 and ECDSA P-384 are also available if your clients require them.

ACME - Create endpoint 2

Scrolling down, I configure the domain. I enter my domain name and select the domain scope. The scope controls exactly what certificate patterns your ACME clients are allowed to request for this domain. If I check only Exact domain, clients can only request certificates for that specific domain name. Adding Subdomains allows certificates for any subdomain (for example, api.example.com or dev.example.com). Adding Wildcards allows wildcard certificates (*.example.com). By leaving a scope unchecked, you prevent any client using this endpoint from requesting that type of certificate, even if their ACME request is otherwise valid. For a production endpoint, you might enable only Exact domain and Subdomains while leaving Wildcards unchecked to enforce a stricter security posture.
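The three scope checkboxes can be pictured as a small policy check applied to every requested name. The sketch below is only a mental model: the Scope flag, the is_allowed function, and the exact matching rules (for example, treating only a wildcard directly under the configured domain as in scope) are assumptions for illustration and not ACM's implementation.

```python
from enum import Flag, auto


class Scope(Flag):
    EXACT = auto()       # "Exact domain" checkbox
    SUBDOMAINS = auto()  # "Subdomains" checkbox
    WILDCARDS = auto()   # "Wildcards" checkbox


def is_allowed(requested: str, domain: str, scopes: Scope) -> bool:
    """Return True when the requested name falls under an enabled scope."""
    requested = requested.lower().rstrip(".")
    domain = domain.lower().rstrip(".")
    if requested == domain:
        return Scope.EXACT in scopes
    if requested.startswith("*."):
        # Assumption: only a wildcard directly under the domain is considered.
        return requested[2:] == domain and Scope.WILDCARDS in scopes
    if requested.endswith("." + domain):
        return Scope.SUBDOMAINS in scopes
    return False  # unrelated name, for example evil-example.com


production = Scope.EXACT | Scope.SUBDOMAINS  # wildcards left unchecked
for name in ("example.com", "api.example.com", "*.example.com", "evil-example.com"):
    print(name, is_allowed(name, "example.com", production))
```

With the production scopes from the paragraph above, the exact domain and its subdomains pass while the wildcard request is refused, even though the request itself is well formed.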

I also select my Amazon Route 53 hosted zone from the drop down menu. ACM then automatically creates the DNS CNAME records needed for domain validation, so I don’t have to do it manually. When my domain is hosted outside of Route 53, I manually create the provided CNAME record at my DNS provider instead. This is a meaningful difference from typical ACME setups where each client handles its own domain verification independently.

These centralized controls give PKI administrators a single place to authenticate domains, restrict which certificate types (ECDSA or RSA) clients can request, and further limit wildcard issuance. Having these governance capabilities built in means you don’t need to purchase a separate certificate lifecycle management product or invest in building a custom policy layer yourself, both of which come at significant cost and operational overhead.

I select Create ACME endpoint.

ACME - DNS configuration

After a few seconds, the endpoint is created. The console shows a Setup progress tracker with the next steps. My domain shows a “Validating” status. The validation method is DNS validation, where ACM verifies that you control the domain by checking for a specific CNAME record. Because I selected my Route 53 hosted zone during creation, I select Create records in Route 53 to let ACM handle the DNS validation automatically.

ACME - DNS success

The validation completes in a few seconds and the status changes to Success.

ACME - External Account Binding 1

Now I need to create External Account Binding (EAB) credentials. An EAB credential is a key identifier and HMAC key pair that lets your ACME client register an account with the ACME server. Once registered, the client generates its own asymmetric key pair, which is then used to authenticate all subsequent certificate requests. On the endpoint details page, I select the External account binding tab, then select Create EAB. I give the credential a name and optionally set an expiration time, ideally no longer than needed to complete client registration.
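ACME clients handle this step for you, but it can help to see what the key identifier and HMAC key are used for. In the ACME specification (RFC 8555, section 7.3.4), the client proves possession of the EAB credential by wrapping its account public key in a JSON Web Signature that is authenticated with the HMAC key. The sketch below shows that construction with the Python standard library. The HS256 algorithm, the base64url-encoded key, the account URL, and the placeholder JWK are all assumptions for illustration; check what your client and endpoint actually use.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWS requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Decode unpadded base64url text."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def eab_jws(account_jwk: dict, eab_kid: str, eab_hmac_key: str, new_account_url: str) -> dict:
    """Build the externalAccountBinding object for an ACME newAccount request."""
    header = {"alg": "HS256", "kid": eab_kid, "url": new_account_url}
    protected = b64url(json.dumps(header).encode("utf-8"))
    payload = b64url(json.dumps(account_jwk).encode("utf-8"))
    signing_input = f"{protected}.{payload}".encode("ascii")
    mac = hmac.new(b64url_decode(eab_hmac_key), signing_input, hashlib.sha256).digest()
    return {"protected": protected, "payload": payload, "signature": b64url(mac)}


# Hypothetical values: a real client uses its own account key and the
# Key ID and HMAC Key shown once in the console.
demo_hmac_key = b64url(b"demo-secret-not-a-real-key")
demo_jwk = {"kty": "EC", "crv": "P-256", "x": "placeholder-x", "y": "placeholder-y"}
binding = eab_jws(demo_jwk, "demo-key-id", demo_hmac_key,
                  "https://acme.example.test/new-account")
print(json.dumps(binding, indent=2))
```

Because the HMAC key is only needed for this one registration request, a short expiration time on the credential limits how long a leaked value remains useful.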

ACME - External Account Binding 2

ACME - end of configuration - show key

After I select Create EAB credential, the console shows the Key ID and HMAC Key. I note these values because I need them to configure my ACME client. The setup progress now shows four green checkmarks.

ACME - end of configuration - success

I’m ready to request a certificate. On the endpoint details page, I expand the CLI reference section. The console provides ready-to-use command examples for both Certbot and acme.sh. I copy the Certbot command and run it inside a container using the certbot/certbot image.

certbot certonly --standalone --non-interactive --agree-tos \
    --email <EMAIL> \
    --server https://acm-acme-enroll.us-east-1.api.aws/<ENDPOINT_ID>/directory \
    --eab-kid <EAB_KID> \
    --eab-hmac-key <EAB_HMAC_KEY> \
    --issuance-timeout <ISSUANCE_TIMEOUT> \
    -d <DOMAIN>

I replace the placeholders with my endpoint URL, EAB credentials, and domain name. The --eab-kid and --eab-hmac-key arguments are how Certbot registers with your ACME endpoint using the External Account Binding credentials I generated earlier. Each ACME client has its own syntax for this step, so check your client’s documentation for the exact flags.

Certbot contacts the ACME endpoint and returns a valid certificate signed by Amazon Trust Services.

Certbot to obtain a certificate through ACME

I use openssl to view the certificate before installing it.

openssl to view the certificate

The certificate is now visible in the ACM console under the ACME certificates tab, alongside any certificates issued through the console or API.

Certificate view in the ACME console

Availability and pricing
ACME support in AWS Certificate Manager is available today in all commercial AWS Regions and will be available in AWS GovCloud (US), the China Regions, and the AWS European Sovereign Cloud partitions at a later date.

Pricing is per domain included in each certificate at the time of issuance, with a different price for fully qualified domain names and wildcards. Volume tiers are calculated based on total domain occurrences across all certificates issued per month in your AWS account. For details, see the ACM pricing page.

To get started, visit the ACM section of the AWS Management Console or read the documentation.

— seb

Creative Commons founders’ fireside chat (Creative Commons blog)

Post Syndicated from jzb original https://lwn.net/Articles/1080518/

Dee Harris has published a summary of the recent “fireside chat” featuring Creative Commons founders Hal Abelson, Lawrence (Larry) Lessig, Molly Van Houweling, and Glenn Otis Brown. The chat was to mark the 25th anniversary of Creative Commons and included a look back at its history as well as a look at the landscape today:

Twenty-five years ago, a small group of people made a bet. They believed that if you gave creators a simple set of tools and licenses in language that a lawyer, a machine, and a human could all read, millions of people might choose to share their work with the world instead of locking it down.

The video of the chat is available on YouTube.
