If you’re here on the Cloudflare blog, chances are you already understand AI pretty well. But step outside our circle, and you’ll find a surprising number of people who still don’t know what it really is — or why it matters.
We wanted to come up with a way to make AI intuitive, something you can actually see and touch to get what’s going on. Hands on, not just hand-wavy.
The idea we landed on is simple: nothing comes into the world fully formed. Like us, and like the Internet, AI didn’t show up fully formed. So we asked ourselves: what if we told the story of AI as it learns and grows?
Episode by episode, we’d give it new capabilities, explain how those capabilities work, and explore how they change the way AI interacts with the world. Giving it a voice. Letting it see. Helping it learn. And maybe even letting it imagine the future.
So we made AI Avenue, a show where I (Craig) explore the fun, human, and sometimes surprising sides of AI… with a little help from my co-host Yorick, a robot hand with a knack for comic timing and the occasional eye-roll. Together, we travel, talk to incredible people, and get hands-on with AI to show it’s not just something to read about. It’s something you can touch, try, and enjoy.
The idea behind AI Avenue
We wanted to make something that would strip away the jargon and make AI approachable, friendly, and most importantly, fun.
In AI Avenue, we address people’s fears, show them the art of the possible, and highlight the positive human stories where AI is augmenting — not replacing — what people can do. And yes, we even let people touch AI themselves. Also yes, the previous paragraphs “intentionally included” a few em-dashes.
The result? A fast-paced, playful series that mixes demos, interviews, and real-world examples, all showing AI as something you can explore, question, and use in ways that matter to you.
You can sign up now to be notified when each episode drops and learn more about the journey ataiavenue.show.
Who we worked with
We had an absolute blast partnering with some of the most exciting players in the space:
Anthropic — on building safe, aligned AI models.
Engineered Arts — creators of the humanoid robot Ameca, who makes several appearances throughout the series.
ElevenLabs — powering lifelike voice synthesis.
HeyGen — creating realistic AI-generated video avatars and translations.
Roboflow — enabling computer vision projects with powerful image datasets and tools.
Be My Eyes — using AI and volunteers to make the world more accessible for people who are blind or have low vision.
Writer — bringing enterprise-grade generative AI into real-world workflows.
Episodes: One Ability at a Time
Across six episodes, we follow Yorick’s upgrades and occasional misadventures as he learns to talk, see, think, and even imagine the future.
Episode 1: Voice — We start in London where Yorick gets his voice and immediately starts chiming in on everything.
Episode 2: Vision — In San Francisco, Yorick tries computer vision for the first time. We watch someone go shopping for the first time.
Episode 3: Thinking — Hosting a live trivia stream online, Yorick begins confidently spouting answers that aren’t quite true. We head to New York City to meet someone whose life was saved by ChatGPT.
Episode 4: Learning — Yorick discovers generative AI and decides he can make the show himself, spawning multiple Craig clones and raising questions about ethics and creativity.
Episode 5: Doing — It turns out everyone we talk to just wants a robot to do their dishes. We dig into what “doing” means in AI and robotics and whether Yorick is on board.
Episode 6: Smell — In our finale, we explore the agentic AI future, quantum computing, and big sci-fi dreams, then hang out with a 9-year-old vibe coder because, well, the children are the future.
Get hands-on
Every episode is paired with developer tutorials so you can experiment with the same AI tools that we feature. No matter your skill level, you can tinker, build, and see for yourself what AI can do. We strongly believe the most important thing you can do right now is to touch AI, play with it. Now is the time.
Follow along the avenue
Yorick and I will be releasing each episode of AI Avenue as it’s ready, and we’d love to have you along for the ride.
Sign up to be notified when new episodes launch and explore more about the show at aiavenue.show.
We are witnessing in real time as AI fundamentally changes how people work across every industry. Customer support agents can respond to ten times the tickets. Software engineers are reviewers of AI generated code instead of spending hours pounding out boiler plate code. Salespeople can get back to focusing on building relationships instead of tedious follow up and administration.
This technology feels magical, and Cloudflare is committed to helping companies build world class AI-driven experiences for their employees and customers.
There is a but, however. Any time a brand new technology with such widespread appeal emerges, the technology often outpaces the tools in place to govern, secure and control the technology. We’re already starting to see stories of vibe coded apps leaking all their users’ details. LLM chats that were intended to only be shared between colleagues, are actually out on the web, being indexed by search engines for all the world to see. AI Agents are being given the keys to the application kingdom, enabling them to work autonomously across an organization — but without proper tracking and control. And then there’s the risk of a well-meaning employee uploading confidential company or customer data into an LLM, which then uses it to train future models.
Beyond internal data used for LLM training, content creators and media companies are also faced with a decision about how they want LLM scrapers and information retrieval bots to interact with their content. Cloudflare has found that it can be hundreds, or even thousands, of times harder to generate site traffic (and therefore ad revenue) from an AI response versus a search engine result.
We’re hearing more and more of these stories from CISOs, CIOs, Creators, and even CEOs. These leaders are faced with a difficult choice: clamping down on all AI usage and bots — or letting them run wild. There needs to be something in between. And for that to be a real option, the tools to manage and secure AI need to catch up to AI itself.
This week, that’s what Cloudflare is focused on. Welcome to AI Week! Over the coming week, we will focus on four core areas to help companies secure and deliver AI experiences safely and securely:
Securing AI environments and workflows: AI is incredibly powerful. The problem is, innovation is outpacing control — we want to change that. And as one of the few zero trust providers also building out AI infrastructure for the web, we’re uniquely positioned to be able to do so.
Protecting original content from misuse by AI: AI Companies are devouring organic content as quickly as it’s created… and creators aren’t seeing any benefit. We want to give content creators control over the content that they have worked so hard to develop.
Helping developers build world-class, secure, AI experiences: the possibilities for developers to create new applications on top of (or even building with) AI are endless. We want to allow developers to create AI driven applications that are as close to users as possible, with security controls built-in from day one.
Making Cloudflare better for you with AI: AI is changing the nature of interfaces. For example, finding and mitigating issues buried in thousands and millions of logs and events across website, employee, and email usage is something that used to be tedious — but now with AI, it can be made easy. We’re working day and night to integrate AI into Cloudflare itself to make things more efficient for ourselves and our customers.
Securing AI environments and workflows
As Artificial Intelligence innovation continues to accelerate at an unprecedented pace, the speed of its development is increasingly outpacing the implementation of robust security controls. This rapid advancement, while promising immense benefits, simultaneously introduces novel and complex security challenges that traditional measures are often ill-equipped to address. Organizations are finding themselves grappling with the inherent risks of adopting powerful AI tools without adequate safeguards, leading to vulnerabilities such as Shadow AI and the uncontrolled proliferation of AI models, making the development of specialized AI security paramount.
As we look around the zero trust space, none of the other providers are moving fast enough to keep up with AI’s pace of innovation. This is something we know a thing or two about — and after this week, if you’re worried about governing AI usage inside your organization, we will have you covered.
We will be announcing new and powerful controls to detect Shadow AI and control unauthorized AI usage. Additionally, we’ve built options for teams to establish the “paved path” of AI tooling in an organization to supercharge employee productivity without sacrificing security. Finally, we’ll be announcing new ways of protecting your own models from poisoning or attacks.
Protecting original content from AI
The explosion of Large Language Models (LLMs) has also created a new challenge for content creators: the unauthorized scraping and training of their valuable content. Cloudflare recognizes the critical need for creators to maintain control over their intellectual property. That’s why we’ve introduced Crawl Control, a groundbreaking initiative designed to empower content owners to manage how their content is accessed and used by AI models.
In the past two months, we’ve seen incredible progress with Crawl Control. We’ve significantly expanded the number of participating content providers, allowing more creators to leverage this innovative protection. We’ve also refined our detection mechanisms to more accurately identify AI crawlers and ensure that only authorized access occurs. Furthermore, we’ve streamlined the integration process, making it easier for new publishers to onboard and begin protecting their content within minutes. Our goal remains to provide content creators with the tools they need to thrive in the age of AI, ensuring they are compensated and acknowledged for the content they produce.
Helping you build world-class, secure, AI experiences
We believe that AI experiences should have security controls by default. This is why we are heavily investing in both our developer platform’s AI Gateway and the associated security controls for those products. This two pronged approach allows developers to iterate and test new ideas without the fear of painful or embarrassing security issues.
The Cloudflare AI Gateway allows developers to deploy AI-driven applications with unparalleled speed and efficiency, ensuring that these applications are as close to end-users as possible. This proximity minimizes latency and maximizes performance, delivering a seamless and responsive user experience that is critical in today’s fast-paced digital landscape.
This week, we’re announcing significant enhancements to the AI Gateway, further solidifying its position as the premier platform for AI application deployment. These improvements include advanced caching mechanisms that reduce redundant model calls, leading to faster response times and lower operational costs. We are also introducing expanded observability features, providing developers with deeper insights into their AI model’s performance and usage patterns, which will enable more effective debugging and optimization. Furthermore, new integrations with popular AI frameworks and services will simplify the development workflow, allowing developers to leverage the AI Gateway’s benefits with even greater ease. Our commitment is to provide developers with the tools to innovate and deliver cutting-edge AI experiences to their users.
Making Cloudflare better with AI
We’re integrating AI across our entire product suite to enhance the Cloudflare experience itself. From intelligent threat detection that adapts to emerging attack patterns, to AI-powered optimizations that fine-tune network performance, our goal is to leverage AI to make our platform more intuitive, efficient, and secure. We envision a future where Cloudflare’s products proactively anticipate user needs, automate complex tasks, and deliver unparalleled insights, all powered by seamlessly embedded AI. This commitment to internal AI integration ensures that as the digital landscape evolves, Cloudflare remains at the forefront of innovation, continuously delivering superior value to our users.
We cannot wait to share these updates and announcements with you. Follow our AI Week hub page for all the latest releases from our blog and CloudflareTV.
Think of the Web as a digital territory with its own social contract. In 2014, Tim Berners-Lee called for a “Magna Carta for the Web” to restore the balance of power between individuals and institutions. This mirrors the original charter’s purpose: ensuring that those who occupy a territory have a meaningful stake in its governance.
Web 3.0—the distributed, decentralized Web of tomorrow—is finally poised to change the Internet’s dynamic by returning ownership to data creators. This will change many things about what’s often described as the “CIA triad” of digital security: confidentiality, integrity, and availability. Of those three features, data integrity will become of paramount importance.
When we have agency in digital spaces, we naturally maintain their integrity—protecting them from deterioration and shaping them with intention. But in territories controlled by distant platforms, where we’re merely temporary visitors, that connection frays. A disconnect emerges between those who benefit from data and those who bear the consequences of compromised integrity. Like homeowners who care deeply about maintaining the property they own, users in the Web 3.0 paradigm will become stewards of their personal digital spaces.
This will be critical in a world where AI agents don’t just answer our questions but act on our behalf. These agents may execute financial transactions, coordinate complex workflows, and autonomously operate critical infrastructure, making decisions that ripple through entire industries. As digital agents become more autonomous and interconnected, the question is no longer whether we will trust AI but what that trust is built upon. In the new age we’re entering, the foundation isn’t intelligence or efficiency—it’s integrity.
What Is Data Integrity?
In information systems, integrity is the guarantee that data will not be modified without authorization, and that all transformations are verifiable throughout the data’s life cycle. While availability ensures that systems are running and confidentiality prevents unauthorized access, integrity focuses on whether information is accurate, unaltered, and consistent across systems and over time.
It’s a new idea. The undo button, which prevents accidental data loss, is an integrity feature. So is the reboot process, which returns a computer to a known good state. Checksums are an integrity feature; so are verifications of network transmission. Without integrity, security measures can backfire. Encrypting corrupted data just locks in errors. Systems that score high marks for availability but spread misinformation just become amplifiers of risk.
All IT systems require some form of data integrity, but the need for it is especially pronounced in two areas today. First: Internet of Things devices interact directly with the physical world, so corrupted input or output can result in real-world harm. Second: AI systems are only as good as the integrity of the data they’re trained on, and the integrity of their decision-making processes. If that foundation is shaky, the results will be too.
Integrity manifests in four key areas. The first, input integrity, concerns the quality and authenticity of data entering a system. When this fails, consequences can be severe. In 2021, Facebook’s global outage was triggered by a single mistaken command—an input error missed by automated systems. Protecting input integrity requires robust authentication of data sources, cryptographic signing of sensor data, and diversity in input channels for cross-validation.
The second issue is processing integrity, which ensures that systems transform inputs into outputs correctly. In 2003, the U.S.-Canada blackout affected 55 million people when a control-room process failed to refresh properly, resulting in damages exceeding US $6 billion. Safeguarding processing integrity means formally verifying algorithms, cryptographically protecting models, and monitoring systems for anomalous behavior.
Storage integrity covers the correctness of information as it’s stored and communicated. In 2023, the Federal Aviation Administration was forced to halt all U.S. departing flights because of a corrupted database file. Addressing this risk requires cryptographic approaches that make any modification computationally infeasible without detection, distributed storage systems to prevent single points of failure, and rigorous backup procedures.
Finally, contextual integrity addresses the appropriate flow of information according to the norms of its larger context. It’s not enough for data to be accurate; it must also be used in ways that respect expectations and boundaries. For example, if a smart speaker listens in on casual family conversations and uses the data to build advertising profiles, that action would violate the expected boundaries of data collection. Preserving contextual integrity requires clear data-governance policies, principles that limit the use of data to its intended purposes, and mechanisms for enforcing information-flow constraints.
As AI systems increasingly make critical decisions with reduced human oversight, all these dimensions of integrity become critical.
The Need for Integrity in Web 3.0
As the digital landscape has shifted from Web 1.0 to Web 2.0 and now evolves toward Web 3.0, we’ve seen each era bring a different emphasis in the CIA triad of confidentiality, integrity, and availability.
Returning to our home metaphor: When simply having shelter is what matters most, availability takes priority—the house must exist and be functional. Once that foundation is secure, confidentiality becomes important—you need locks on your doors to keep others out. Only after these basics are established do you begin to consider integrity, to ensure that what’s inside the house remains trustworthy, unaltered, and consistent over time.
Web 1.0 of the 1990s prioritized making information available. Organizations digitized their content, putting it out there for anyone to access. In Web 2.0, the Web of today, platforms for e-commerce, social media, and cloud computing prioritize confidentiality, as personal data has become the Internet’s currency.
Somehow, integrity was largely lost along the way. In our current Web architecture, where control is centralized and removed from individual users, the concern for integrity has diminished. The massive social media platforms have created environments where no one feels responsible for the truthfulness or quality of what circulates.
Web 3.0 is poised to change this dynamic by returning ownership to the data owners. This is not speculative; it’s already emerging. For example, ActivityPub, the protocol behind decentralized social networks like Mastodon, combines content sharing with built-in attribution. Tim Berners-Lee’s Solid protocol restructures the Web around personal data pods with granular access controls.
These technologies prioritize integrity through cryptographic verification that proves authorship, decentralized architectures that eliminate vulnerable central authorities, machine-readable semantics that make meaning explicit—structured data formats that allow computers to understand participants and actions, such as “Alice performed surgery on Bob”—and transparent governance where rules are visible to all. As AI systems become more autonomous, communicating directly with one another via standardized protocols, these integrity controls will be essential for maintaining trust.
Why Data Integrity Matters in AI
For AI systems, integrity is crucial in four domains. The first is decision quality. With AI increasingly contributing to decision-making in health care, justice, and finance, the integrity of both data and models’ actions directly impact human welfare. Accountability is the second domain. Understanding the causes of failures requires reliable logging, audit trails, and system records.
The third domain is the security relationships between components. Many authentication systems rely on the integrity of identity information and cryptographic keys. If these elements are compromised, malicious agents could impersonate trusted systems, potentially creating cascading failures as AI agents interact and make decisions based on corrupted credentials.
Finally, integrity matters in our public definitions of safety. Governments worldwide are introducing rules for AI that focus on data accuracy, transparent algorithms, and verifiable claims about system behavior. Integrity provides the basis for meeting these legal obligations.
The importance of integrity only grows as AI systems are entrusted with more critical applications and operate with less human oversight. While people can sometimes detect integrity lapses, autonomous systems may not only miss warning signs—they may exponentially increase the severity of breaches. Without assurances of integrity, organizations will not trust AI systems for important tasks, and we won’t realize the full potential of AI.
How to Build AI Systems With Integrity
Imagine an AI system as a home we’re building together. The integrity of this home doesn’t rest on a single security feature but on the thoughtful integration of many elements: solid foundations, well-constructed walls, clear pathways between rooms, and shared agreements about how spaces will be used.
We begin by laying the cornerstone: cryptographic verification. Digital signatures ensure that data lineage is traceable, much like a title deed proves ownership. Decentralized identifiers act as digital passports, allowing components to prove identity independently. When the front door of our AI home recognizes visitors through their own keys rather than through a vulnerable central doorman, we create resilience in the architecture of trust.
Formal verification methods enable us to mathematically prove the structural integrity of critical components, ensuring that systems can withstand pressures placed upon them—especially in high-stakes domains where lives may depend on an AI’s decision.
Just as a well-designed home creates separate spaces, trustworthy AI systems are built with thoughtful compartmentalization. We don’t rely on a single barrier but rather layer them to limit how problems in one area might affect others. Just as a kitchen fire is contained by fire doors and independent smoke alarms, training data is separated from the AI’s inferences and output to limit the impact of any single failure or breach.
Throughout this AI home, we build transparency into the design: The equivalent of large windows that allow light into every corner is clear pathways from input to output. We install monitoring systems that continuously check for weaknesses, alerting us before small issues become catastrophic failures.
But a home isn’t just a physical structure, it’s also the agreements we make about how to live within it. Our governance frameworks act as these shared understandings. Before welcoming new residents, we provide them with certification standards. Just as landlords conduct credit checks, we conduct integrity assessments to evaluate newcomers. And we strive to be good neighbors, aligning our community agreements with broader societal expectations. Perhaps most important, we recognize that our AI home will shelter diverse individuals with varying needs. Our governance structures must reflect this diversity, bringing many stakeholders to the table. A truly trustworthy system cannot be designed only for its builders but must serve anyone authorized to eventually call it home.
That’s how we’ll create AI systems worthy of trust: not by blindly believing in their perfection but because we’ve intentionally designed them with integrity controls at every level.
A Challenge of Language
Unlike other properties of security, like “available” or “private,” we don’t have a common adjective form for “integrity.” This makes it hard to talk about it. It turns out that there is a word in English: “integrous.” The Oxford English Dictionary recorded the word used in the mid-1600s but now declares it obsolete.
We believe that the word needs to be revived. We need the ability to describe a system with integrity. We must be able to talk about integrous systems design.
The Road Ahead
Ensuring integrity in AI presents formidable challenges. As models grow larger and more complex, maintaining integrity without sacrificing performance becomes difficult. Integrity controls often require computational resources that can slow systems down—particularly challenging for real-time applications. Another concern is that emerging technologies like quantum computingthreaten current cryptographic protections. Additionally, the distributed nature of modern AI—which relies on vast ecosystems of libraries, frameworks, and services—presents a large attack surface.
Beyond technology, integrity depends heavily on social factors. Companies often prioritize speed to market over robust integrity controls. Development teams may lack specialized knowledge for implementing these controls, and may find it particularly difficult to integrate them into legacy systems. And while some governments have begun establishing regulations for aspects of AI, we need worldwide alignment on governance for AI integrity.
Addressing these challenges requires sustained research into verifying and enforcing integrity, as well as recovering from breaches. Priority areas include fault-tolerantalgorithms for distributed learning, verifiable computation on encrypted data, techniques that maintain integrity despite adversarial attacks, and standardized metrics for certification. We also need interfaces that clearly communicate integrity status to human overseers.
As AI systems become more powerful and pervasive, the stakes for integrity have never been higher. We are entering an era where machine-to-machine interactions and autonomous agents will operate with reduced human oversight and make decisions with profound impacts.
The good news is that the tools for building systems with integrity already exist. What’s needed is a shift in mind-set: from treating integrity as an afterthought to accepting that it’s the core organizing principle of AI security.
The next era of technology will be defined not by what AI can do, but by whether we can trust it to know or especially to do what’s right. Integrity—in all its dimensions—will determine the answer.
Sidebar: Examples of Integrity Failures
Ariane 5 Rocket (1996) Processing integrity failure
A 64-bit velocity calculation was converted to a 16-bit output, causing an error called overflow. The corrupted data triggered catastrophic course corrections that forced the US $370 million rocket to self-destruct.
NASA Mars Climate Orbiter (1999) Processing integrity failure
Lockheed Martin’s software calculated thrust in pound-seconds, while NASA’s navigation software expected newton-seconds. The failure caused the $328 million spacecraft to burn up in the Mars atmosphere.
Microsoft’s Tay Chatbot (2016) Processing integrity failure
Released on Twitter, Microsoft‘s AI chatbot was vulnerable to a “repeat after me” command, which meant it would echo any offensive content fed to it.
Boeing 737 MAX (2018) Input integrity failure
Faulty sensor data caused an automated flight-control system to repeatedly push the airplane’s nose down, leading to a fatal crash.
SolarWinds Supply-Chain Attack (2020) Storage integrity failure
Russian hackers compromised the process that SolarWinds used to package its software, injecting malicious code that was distributed to 18,000 customers, including nine federal agencies. The hack remained undetected for 14 months.
ChatGPT Data Leak (2023) Storage integrity failure
A bug in OpenAI’s ChatGPT mixed different users’ conversation histories. Users suddenly had other people’s chats appear in their interfaces with no way to prove the conversations weren’t theirs.
Midjourney Bias (2023) Contextual integrity failure
Users discovered that the AI image generator often produced biased images of people, such as showing white men as CEOs regardless of the prompt. The AI tool didn’t accurately reflect the context requested by the users.
Prompt Injection Attacks (2023–) Input integrity failure
Attackers embedded hidden prompts in emails, documents, and websites that hijacked AI assistants, causing them to treat malicious instructions as legitimate commands.
CrowdStrike Outage (2024) Processing integrity failure
A faulty software update from CrowdStrike caused 8.5 million Windows computers worldwide to crash—grounding flights, shutting down hospitals, and disrupting banks. The update, which contained a software logic error, hadn’t gone through full testing protocols.
Voice-Clone Scams (2024) Input and processing integrity failure
Scammers used AI-powered voice-cloning tools to mimic the voices of victims’ family members, tricking people into sending money. These scams succeeded because neither phone systems nor victims identified the AI-generated voice as fake.
This essay was written with Davi Ottenheimer, and originally appeared in IEEE Spectrum.
During Developer Week 2024, we introduced AI face cropping in private beta. This feature automatically crops images around detected faces, and marks the first release in our upcoming suite of AI image manipulation capabilities.
AI face cropping is now available in Images for everyone. To bring this feature to general availability, we moved our CPU-based prototype to a GPU-based implementation in Workers AI, enabling us to address a number of technical challenges, including memory leaks that could hamper large-scale use.
We developed face cropping with two particular use cases in mind:
Social media platforms and AI chatbots. We observed a lot of traffic from customers who use Images to turn unedited images of people into smaller profile pictures in neat, fixed shapes.
E-commerce platforms. The same product photo might appear in a grid of thumbnails on a gallery page, then again on an individual product page with a larger view. The following example illustrates how cropping can change the emphasis from the model’s shirt to their sunglasses.
When handling high volumes of media content, preparing images for production can be tedious. With Images, you don’t need to manually generate and store multiple versions of the same image. Instead, we serve copies of each image, each optimized to your specifications, while you continue to store only the original image.
Crop everything, everywhere, all at once
Cloudflare provides a library of parameters to manipulate how an image is served to the end user. For example, you can crop an image to a square by setting its width and height dimensions to 100×100.
By default, images are cropped toward the center coordinates of the original image. The gravity parameter can affect how an image gets cropped by changing its focal point. You can specify coordinates to use as the focal point of an image or allow Cloudflare to automatically determine a new focal point.
The gravity=auto option uses a saliency algorithm to pick the most optimal focal point of an image. Saliency detection identifies the parts of an image that are most visually important; the cropping operation is then applied toward this region of interest. Our algorithm analyzes images using visual cues such as color, luminance, and texture, but doesn’t consider context within an image. While this setting works well on images with inanimate objects like plants and skyscrapers, it doesn’t reliably account for subjects as contextually meaningful as people’s faces.
And yet, images of people comprise the majority of bandwidth usage for many applications, such as an AI chatbot platform that uses Images to serve over 45 million unique transformations each month. This presented an opportunity for us to improve how developers can optimize images of people.
AI face cropping can be performed by using the gravity=face option, which automatically detects which pixels represent the face (or faces) and uses this information to crop the image. You can also affect how closely the image is cropped toward the face; the zoom parameter controls the threshold for how much of the surrounding area around the face will be included in the image.
We carefully designed our model pipeline with privacy and confidentiality top of mind. This feature doesn’t support facial identification or recognition. In other words, when you optimize with Cloudflare, we’ll never know that two different images depict the same person, or identify the specific people in a given image. Instead, AI face cropping with Images is intentionally limited to face detection, or identifying the pixels that represent a human face.
From pixels to people
Our first step was to select an open-source model that met our requirements. Behind the scenes, our AI face cropping uses RetinaFace, a convolutional neural network model that classifies images with human faces.
A neural network is a type of machine learning process that loosely resembles how the human brain works. A basic neural network has three parts: an input layer, one or more hidden layers, and an output layer. Nodes in each layer form an interconnected network to transmit and process data, where each input node is connected to nodes in the next layer.
A fully connected layer passes data from one layer to the next.
Data enters through the input layer, where it is analyzed before being passed to the first hidden layer. All of the computation is done in the hidden layers, where a result is eventually delivered through the output layer.
A convolutional neural network (CNN) mirrors how humans look at things. When we look at other people, we start with abstract features, like the outline of their body, before we process specific features, like the color of their eyes or the shape of their lips.
Similarly, a CNN processes an image piece-by-piece before delivering the final result. Earlier layers look for abstract features like edges and colors and lines; subsequent layers become more complex and are each responsible for identifying the various features that comprise a human face. The last fully connected layer combines all categorized features to produce one final classification of the entire image. In other words, if an image contains all of the individual features that define a human face (e.g. eyes, nose), then the CNN concludes that the image contains a human face.
We needed a model that could determine whether an image depicts a person (image classification), as well as exactly where they are in the image (object detection). When selecting a model, some factors we considered were:
Performance on the WIDERFACE dataset. This is the state-of-the-art face detection benchmark dataset, which contains 32,203 images of 393,703 labeled faces with a high degree of variability in scale, pose, and occlusion.
Speed (in frames per second). Most of our image optimization requests occur on delivery (rather than before an image gets uploaded to storage), so we prioritized performance for end-user delivery.
Model size. Smaller model sizes run more efficiently.
Quality. The performance boost from smaller models often gets traded for the quality—the key is balancing speed with results.
Our initial test sample contained 500 images with varying factors like the number of faces in the image, face size, lighting, sharpness, and angle. We tested various models, including BlazeFast, R-CNN (and its successors Fast R-CNN and Faster R-CNN), RetinaFace, and YOLO (You Only Look Once).
Two-stage detectors like BlazeFast and R-CNN propose potential object locations in an image, then identify objects in those regions of interest. One-stage detectors like RetinaFace and YOLO predict object locations and classes in a single pass. In our research, we observed that two-stage detector methods provided higher accuracy, but performed too slowly to be practical for real traffic. On the other hand, one-stage detector methods were efficient and performant while still highly accurate.
Ultimately, we selected RetinaFace, which showed the highest precision of 99.4% and performed faster than other models with comparable values. We found that RetinaFace delivered strong results even with images containing multiple blurry faces:
Inference—the process of using training models to make decisions—can be computationally demanding, especially with very large images. To maintain efficiency, we set a maximum size limit of 1024×1024 pixels when sending images to the model.
We pass images within these dimensions directly to the model for analysis. But if either width or height dimension exceeds 1024 pixels, then we instead create an inference image to send to the model; this is a smaller copy that retains the same aspect ratio as the original image and does not exceed 1024 pixels in either dimension. For example, a 125×2000 image will be downscaled to 64×1024. Creating this resized, temporary version reduces the amount of data that the model needs to analyze, enabling faster processing.
The model draws all of the bounding boxes, or the regions within an image that define the detected faces. From there, we construct a new, outer bounding box that encompasses all of the individual boxes, calculating its top-left and bottom-right points based on the boxes that are closest to the top, left, bottom, and right edges of the image.
The top-left point uses the x coordinate from the left-most box and the y coordinate from the top-most box. Similarly, the bottom-right point uses the x coordinate from the right-most box and the y coordinate from the bottom-most box. These coordinates can be taken from the same bounding boxes; if a single box is closest to both the top and left edges, then we would use its top-left corner as the top-left point of the outer bounding box.
AI face cropping identifies regions that represent faces, then determines an outer bounding box and focal point based on the top-most, left-most, right-most, and bottom-most bounding boxes.
Once we define the outer bounding box, we use its center coordinates as the focal point when cropping the image. From our experiments, we found that this produced better and more balanced results for images with multiple faces compared to other methods, like establishing the new focal point around the largest detected face.
The cropped image area is calculated based on the dimensions of the outer bounding box (“d”) and a specified zoom level (“z”) in the formula (1 ÷ z) × d. The zoom parameter accepts floating points between 0 and 1, where we crop the image to the bounding box when zoom=1 and include more of the area around the box as zoom trends toward 0.
Consider an original image that is 2048×2048. First, we create an inference image that is 1024×1024 to meet our size limits for face detection. Second, we define the outer bounding box using the model’s predictions—we’ll use 100×500 for this example. At zoom=0.5, our formula generates a crop area that is twice as large as the bounding box, with new width (“w”) and height (“h”) dimensions of 200×1000:
We also apply a min function that chooses the smaller number between the input dimensions and the calculated dimensions, ensuring that the new width and height never exceed the dimensions of the image itself. In other words, if you try to zoom out too much, then we use the full width or height of the image instead of defining a crop area that will extend beyond the edge of the image. For example, at zoom=0.25, our formula yields an initial crop area of 400×2000. Here, since the calculated height (2000) is larger than the input height (1024), we use the input height to set the crop area to 400×1024.
Finally, we need to scale the crop area back to the size of the original image. This applies only when a smaller inference image is created.
We initially downscaled the original 2048×2048 image by a factor of 2 to create the 1024×1024 inference image. This means that we need to multiply the dimensions of the crop area—400×1024 in our latest example—by 2 to produce our final result: a cropped image that is 800×2048.
The architecture behind the earliest build
In the beta version, we rewrote the model using TensorFlow Rust to make it compatible with our existing Rust-based stack. All of the computations for inference—where the model classifies and locates human faces—were executed on CPUs within our network.
Initially, this worked well and we saw near-realtime results.
However, the underlying limitations of our implementation became apparent when we started receiving consistent alerts that our underlying Images service was nearing its limits for memory usage. The increased memory usage didn’t line up with any recent deployments around this time, but a hunch led us to discover that the face cropping compute time graph had an uptick that matched the uptick in memory usage. Further tracing confirmed that AI face cropping was at the root of the problem.
When a service runs out of memory, it terminates its processes to free up memory and prevent the system from crashing. Since CPU-based implementations share RAM with other processes, this can potentially cause errors for other image optimization operations. In response, we switched our memory allocator from glibc malloc to jemalloc. This allowed us to use less memory at runtime, saving about 20 TiB of RAM globally. We also started culling the number of face cropping requests to limit CPU usage.
At this point, AI face cropping was already limited to our own internal uses and a small number of beta customers. These steps only temporarily reduced our memory consumption. They weren’t sufficient for handling global traffic, so we looked toward a more scalable design for long-term use.
Doing more with less (memory)
With memory usage alerts looming in the distance, it became clear that we needed to move to a GPU-based approach.
Unlike with CPUs, a GPU-based implementation avoids contention with other processes because memory access is typically dedicated and managed more tightly. We partnered with the Workers AI team, who created a framework for internal teams to integrate payloads into their model catalog for GPU access.
Some Workers AI models have their own standalone containers; this isn’t practical for every model, as routing traffic to multiple containers can be expensive. When using a GPU through Workers AI, the data needs to travel over the network, which can introduce latency. This is where model size is especially relevant, as network transport overhead becomes more noticeable with larger models.
To address this, Workers AI wraps smaller models in a single container and utilizes a latency-sensitive routing algorithm to identify the best instance to serve each payload. This means that models can be offloaded when there is no traffic.
A scheduler is used to optimize how—and when—models in the same container interact with GPUs.
RetinaFace runs on 1 GB of VRAM on the smallest GPU; it’s small enough that it can be hot swapped at runtime alongside similarly sized models. If there is a call for the RetinaFace model, then the Python code will be loaded into the environment and executed.
As expected, we saw a significant drop in memory usage after we moved the feature to Workers AI. Now, each instance of our Images service consumes about 150 MiB of memory.
With this new approach, memory leaks pose less concern to the overall availability of our service. Workers AI executes models within containers, so they can be terminated and restarted as needed without impacting other processes. Since face cropping runs separately from our Images service, restarting it won’t halt our other image optimization operations.
Applying AI face cropping to our blog
As part of our beta launch, we updated the Cloudflare blog to apply AI face cropping on author images.
Authors can submit their own images, which appear as circular profile pictures in both the main blog feed and individual blog posts. By default, CSS centers images within their containers, making off-centered head positions more obvious. When two profile pictures include different amounts of negative space, this can also lead to a visual imbalance where authors’ faces appear at different scales:
AI face cropping makes posts with multiple authors appear more balanced.
In the example above, Austin’s original image is cropped tightly around his face. On the other hand, Taylor’s original image includes his torso and a larger margin of the background. As a result, Austin’s face appears larger and closer to the center than Taylor’s does. After we applied AI face cropping to profile pictures on the blog, their faces appear more similar in size, creating more balance and cohesion on their co-authored post.
A new era of image editing, now in Images
Many developers already use Images to build scalable media pipelines. Our goal is to accelerate image workflows by automating rote, manual tasks.
For the Images team, this is only the beginning. We plan to release new AI capabilities, including features like background removal and generative upscale. You can try AI face cropping for free by enabling transformations in the Images dashboard.
In this input integrity attack against an AI system, researchers were able to fool AIOps tools:
AIOps refers to the use of LLM-based agents to gather and analyze application telemetry, including system logs, performance metrics, traces, and alerts, to detect problems and then suggest or carry out corrective actions. The likes of Cisco have deployed AIops in a conversational interface that admins can use to prompt for information about system performance. Some AIOps tools can respond to such queries by automatically implementing fixes, or suggesting scripts that can address issues.
These agents, however, can be tricked by bogus analytics data into taking harmful remedial actions, including downgrading an installed package to a vulnerable version.
Abstract: AI for IT Operations (AIOps) is transforming how organizations manage complex software systems by automating anomaly detection, incident diagnosis, and remediation. Modern AIOps solutions increasingly rely on autonomous LLM-based agents to interpret telemetry data and take corrective actions with minimal human intervention, promising faster response times and operational cost savings.
In this work, we perform the first security analysis of AIOps solutions, showing that, once again, AI-driven automation comes with a profound security cost. We demonstrate that adversaries can manipulate system telemetry to mislead AIOps agents into taking actions that compromise the integrity of the infrastructure they manage. We introduce techniques to reliably inject telemetry data using error-inducing requests that influence agent behavior through a form of adversarial reward-hacking; plausible but incorrect system error interpretations that steer the agent’s decision-making. Our attack methodology, AIOpsDoom, is fully automated—combining reconnaissance, fuzzing, and LLM-driven adversarial input generation—and operates without any prior knowledge of the target system.
To counter this threat, we propose AIOpsShield, a defense mechanism that sanitizes telemetry data by exploiting its structured nature and the minimal role of user-generated content. Our experiments show that AIOpsShield reliably blocks telemetry-based attacks without affecting normal agent performance.
Ultimately, this work exposes AIOps as an emerging attack vector for system compromise and underscores the urgent need for security-aware AIOps design.
Researchers have managed to eavesdropon cell phone voice conversations by using radar to detect vibrations. It’s more a proof of concept than anything else. The radar detector is only ten feet away, the setup is stylized, and accuracy is poor. But it’s a start.
Here’s an interesting story about a failure being introduced by LLM-written code. Specifically, the LLM was doing some code refactoring, and when it moved a chunk of code from one file to another it changed a “break” to a “continue.” That turned an error logging statement into an infinite loop, which crashed the system.
This is an integrity failure. Specifically, it’s a failure of processing integrity. And while we can think of particular patches that alleviate this exact failure, the larger problem is much harder to solve.
There is a really great series of online events highlighting cool uses of AI in cybersecurity, titled Prompt||GTFO. Videos from the firstthreeevents are online. And here’s where to register to attend, or participate, in the fourth.
The government of China has accused Nvidia of inserting a backdoor into their H20 chips:
China’s cyber regulator on Thursday said it had held a meeting with Nvidia over what it called “serious security issues” with the company’s artificial intelligence chips. It said US AI experts had “revealed that Nvidia’s computing chips have location tracking and can remotely shut down the technology.”
OpenAI has just announced their latest open-weight models — and we are excited to share that we are working with them as a Day 0 launch partner to make these models available in Cloudflare’s Workers AI. Cloudflare developers can now access OpenAI’s first open model, leveraging these powerful new capabilities on our platform. The new models are available starting today at @cf/openai/gpt-oss-120b and @cf/openai/gpt-oss-20b.
Workers AI has always been a champion for open models and we’re thrilled to bring OpenAI’s new open models to our platform today. Developers who want transparency, customizability, and deployment flexibility can rely on Workers AI as a place to deliver AI services. Enterprises that need the ability to run open models to ensure complete data security and privacy can also deploy with Workers AI. We are excited to join OpenAI in fulfilling their mission of making the benefits of AI broadly accessible to builders of any size.
The technical model specs
The OpenAI models have been released in two sizes: a 120 billion parameter model and a 20 billion parameter model. Both of them are Mixture-of-Experts models – a popular architecture for recent model releases – that allow relevant experts to be called for a query instead of running through all the parameters of the model. Interestingly, these models run natively at an FP4 quantization, which means that they have a smaller GPU memory footprint than a 120 billion parameter model at FP16. Given the quantization and the MoE architecture, the new models are able to run faster and more efficiently than more traditional dense models of that size.
These models are text-only; however, they have reasoning capabilities, tool calling, and two new exciting features with Code Interpreter and Web Search (support coming soon). We’ve implemented Code Interpreter on top of Cloudflare Containers in a novel way that allows for stateful code execution (read on below).
The model on Workers AI
We’re landing these new models with a few tweaks: supporting the new Responses API format as well as the historical Chat Completions API format (coming soon). The Responses API format is recommended by OpenAI to interact with their models, and we’re excited to support that on Workers AI.
If you call the model through:
Workers Binding, it will accept/return Responses API – env.AI.run(“@cf/openai/gpt-oss-120b”)
REST API on /run endpoint, it will accept/return Responses API – https://api.cloudflare.com/client/v4/accounts/<account_id>/ai/run/@cf/openai/gpt-oss-120b
REST API on new /responses endpoint, it will accept/return Responses API – https://api.cloudflare.com/client/v4/accounts/<account_id>/ai/v1/responses
REST API for OpenAI Compatible endpoint, it will return Chat Completions (coming soon)– https://api.cloudflare.com/client/v4/accounts/<account_id>/ai/v1/chat/completions
Code Interpreter + Cloudflare Sandboxes = the perfect fit
To effectively answer user queries, Large Language Models (LLMs) often struggle with logical tasks such as mathematics or coding. Instead of attempting to reason through these problems, LLMs typically utilize a tool call to execute AI-generated code that solves these problems. OpenAI’s new models are specifically trained for stateful Python code execution and include a built-in feature called Code Interpreter, designed to address this challenge.
We’re particularly excited about this. Cloudflare not only has an inference platform (Workers AI), but we also have an ecosystem of compute and storage products that allow people to build full applications on top of our Developer Platform. This means that we are uniquely suited to support the model’s Code Interpreter capabilities, not only for one-time code execution, but for stateful code execution as well.
We’ve built support for Code Interpreter on top of Cloudflare’s Sandbox product that allows for a secure environment to run AI-generated code. The Sandbox SDK is built on our latest Containers product and Code Interpreter is the perfect use case to bring all these products together. When you use Code Interpreter, we spin up a Sandbox container scoped to your session that stays alive for 20 minutes, so the code can be edited for subsequent queries to the model. We’ve also pre-warmed Sandboxes for Code Interpreter to ensure the fastest start up times.
We’ll be publishing an example of how you can use the gpt-oss model on Workers AI and Sandboxes with the OpenAI SDK to make calls to Code Interpreter on our Developer Docs.
Give it a try!
We’re beyond excited for OpenAI’s new open models, and we hope you are too. Super grateful to our friends from vLLM and HuggingFace for supporting efficient model serving on launch day. Read up on the Developer Docs to learn more about the details and how to get started on building with these new models and capabilities.
We are observing stealth crawling behavior from Perplexity, an AI-powered answer engine. Although Perplexity initially crawls from their declared user agent, when they are presented with a network block, they appear to obscure their crawling identity in an attempt to circumvent the website’s preferences. We see continued evidence that Perplexity is repeatedly modifying their user agent and changing their source ASNs to hide their crawling activity, as well as ignoring — or sometimes failing to even fetch — robots.txtfiles.
The Internet as we have known it for the past three decades is rapidly changing, but one thing remains constant: it is built on trust. There are clear preferences that crawlers should be transparent, serve a clear purpose, perform a specific activity, and, most importantly, follow website directives and preferences. Based on Perplexity’s observed behavior, which is incompatible with those preferences, we have de-listed them as a verified bot and added heuristics to our managed rules that block this stealth crawling.
How we tested
We received complaints from customers who had both disallowed Perplexity crawling activity in their robots.txt files and also created WAF rules to specifically block both of Perplexity’s declared crawlers: PerplexityBot and Perplexity-User. These customers told us that Perplexity was still able to access their content even when they saw its bots successfully blocked. We confirmed that Perplexity’s crawlers were in fact being blocked on the specific pages in question, and then performed several targeted tests to confirm what exact behavior we could observe.
We created multiple brand-new domains, similar to testexample.com and secretexample.com. These domains were newly purchased and had not yet been indexed by any search engine nor made publicly accessible in any discoverable way. We implemented a robots.txt file with directives to stop any respectful bots from accessing any part of a website:
We conducted an experiment by querying Perplexity AI with questions about these domains, and discovered Perplexity was still providing detailed information regarding the exact content hosted on each of these restricted domains. This response was unexpected, as we had taken all necessary precautions to prevent this data from being retrievable by their crawlers.
Obfuscating behavior observed
Bypassing Robots.txt and undisclosed IPs/User Agents
Our multiple test domains explicitly prohibited all automated access by specifying in robots.txt and had specific WAF rules that blocked crawling from Perplexity’s public crawlers. We observed that Perplexity uses not only their declared user-agent, but also a generic browser intended to impersonate Google Chrome on macOS when their declared crawler was blocked.
Declared
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Perplexity-User/1.0; +https://perplexity.ai/perplexity-user)
20-25m daily requests
Stealth
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36
3-6m daily requests
Both their declared and undeclared crawlers were attempting to access the content for scraping contrary to the web crawling norms as outlined in RFC 9309.
This undeclared crawler utilized multiple IPs not listed in Perplexity’s official IP range, and would rotate through these IPs in response to the restrictive robots.txt policy and block from Cloudflare. In addition to rotating IPs, we observed requests coming from different ASNs in attempts to further evade website blocks. This activity was observed across tens of thousands of domains and millions of requests per day. We were able to fingerprint this crawler using a combination of machine learning and network signals.
An example:
Of note: when the stealth crawler was successfully blocked, we observed that Perplexity uses other data sources — including other websites — to try to create an answer. However, these answers were less specific and lacked details from the original content, reflecting the fact that the block had been successful.
How well-meaning bot operators respect website preferences
In contrast to the behavior described above, the Internet has expressed clear preferences on how good crawlers should behave. All well-intentioned crawlers acting in good faith should:
Be transparent. Identify themselves honestly, using a unique user-agent, a declared list of IP ranges or Web Bot Auth integration, and provide contact information if something goes wrong.
Be well-behaved netizens. Don’t flood sites with excessive traffic, scrape sensitive data, or use stealth tactics to try and dodge detection.
Serve a clear purpose. Whether it’s powering a voice assistant, checking product prices, or making a website more accessible, every bot has a reason to be there. The purpose should be clearly and precisely defined and easy for site owners to look up publicly.
Separate bots for separate activities. Perform each activity from a unique bot. This makes it easy for site owners to decide which activities they want to allow. Don’t force site owners to make an all-or-nothing decision.
Follow the rules. That means checking for and respecting website signals like robots.txt, staying within rate limits, and never bypassing security protections.
OpenAI is an example of a leading AI company that follows these best practices. They clearly outline their crawlers and give detailed explanations for each crawler’s purpose. They respect robots.txt and do not try to evade either a robots.txt directive or a network level block. And ChatGPT Agent is signing http requests using the newly proposed open standard Web Bot Auth.
When we ran the same test as outlined above with ChatGPT, we found that ChatGPT-User fetched the robots file and stopped crawling when it was disallowed. We did not observe follow-up crawls from any other user agents or third party bots. When we removed the disallow directive from the robots entry, but presented ChatGPT with a block page, they again stopped crawling, and we saw no additional crawl attempts from other user agents. Both of these demonstrate the appropriate response to website owner preferences.
How can you protect yourself?
All the undeclared crawling activity that we observed from Perplexity’s hidden User Agent was scored by our bot management system as a bot and was unable to pass managed challenges. Any bot management customer who has an existing block rule in place is already protected. Customers who don’t want to block traffic can set up rules to challenge requests, giving real humans an opportunity to proceed. Customers with existing challenge rules are already protected. Lastly, we added signature matches for the stealth crawler into our managed rule that blocks AI crawling activity. This rule is available to all customers, including our free customers.
What’s next?
We announced Content Independence Day almost one month ago, giving content creators and publishers more control over how their content is accessed. Today, over two and a half million websites have chosen to completely disallow AI training through our managed robots.txt feature or our managed rule blocking AI Crawlers. Every Cloudflare customer is now able to selectively decide which declared AI crawlers are able to access their content in accordance with their business objectives.
We expected a change in bot and crawler behavior based on these new features, and we expect that the techniques bot operators use to evade detection will continue to evolve. Once this post is live the behavior we saw will almost certainly change, and the methods we use to stop them will keep evolving as well.
Cloudflare is actively working with technical and policy experts around the world, like the IETF efforts to standardize extensions to robots.txt, to establish clear and measurable principles that well-meaning bot operators should abide by. We think this is an important next step in this quickly evolving space.
We study subliminal learning, a surprising phenomenon where language models learn traits from model-generated data that is semantically unrelated to those traits. For example, a “student” model learns to prefer owls when trained on sequences of numbers generated by a “teacher” model that prefers owls. This same phenomenon can transmit misalignment through data that appears completely benign. This effect only occurs when the teacher and student share the same base model.
Interesting security implications.
I am more convinced than ever that we need serious research into AI integrity if we are ever going to have trustworthy AI.
The collective thoughts of the interwebz
Manage Consent
To provide the best experiences, we use technologies like cookies to store and/or access device information. Consenting to these technologies will allow us to process data such as browsing behavior or unique IDs on this site. Not consenting or withdrawing consent, may adversely affect certain features and functions.
Functional
Always active
The technical storage or access is strictly necessary for the legitimate purpose of enabling the use of a specific service explicitly requested by the subscriber or user, or for the sole purpose of carrying out the transmission of a communication over an electronic communications network.
Preferences
The technical storage or access is necessary for the legitimate purpose of storing preferences that are not requested by the subscriber or user.
Statistics
The technical storage or access that is used exclusively for statistical purposes.The technical storage or access that is used exclusively for anonymous statistical purposes. Without a subpoena, voluntary compliance on the part of your Internet Service Provider, or additional records from a third party, information stored or retrieved for this purpose alone cannot usually be used to identify you.
Marketing
The technical storage or access is required to create user profiles to send advertising, or to track the user on a website or across several websites for similar marketing purposes.