Tag Archives: AI

Connect any React application to an MCP server in three lines of code

Post Syndicated from Dina Kozlov original https://blog.cloudflare.com/connect-any-react-application-to-an-mcp-server-in-three-lines-of-code/

You can deploy a remote Model Context Protocol (MCP) server on Cloudflare in just one-click. Don’t believe us? Click the button below.

This will get you started with a remote MCP server that supports the latest MCP standards and is the reason why thousands of remote MCP servers have been deployed on Cloudflare, including ones from companies like Atlassian, Linear, PayPal, and more

But deploying servers is only half of the equation — we also wanted to make it just as easy to build and deploy remote MCP clients that can connect to these servers to enable new AI-powered service integrations. That’s why we built use-mcp, a React library for connecting to remote MCP servers, and we’re excited to contribute it to the MCP ecosystem to enable more developers to build remote MCP clients.

Today, we’re open-sourcing two tools that make it easy to build and deploy MCP clients:

  1. use-mcp — A React library that connects to any remote MCP server in just 3 lines of code, with transport, authentication, and session management automatically handled. We’re excited to contribute this library to the MCP ecosystem to enable more developers to build remote MCP clients. 

  2. The AI Playground — Cloudflare’s AI chat interface platform that uses a number of LLM models to interact with remote MCP servers, with support for the latest MCP standard, which you can now deploy yourself. 

Whether you’re building an AI-powered chat bot, a support agent, or an internal company interface, you can leverage these tools to connect your AI agents and applications to external services via MCP. 

Ready to get started? Click on the button below to deploy your own instance of Cloudflare’s AI Playground to see it in action.

use-mcp: a React library for building remote MCP clients

use-mcp is a React library that abstracts away all the complexity of building MCP clients. Add the useMCP() hook into any React application to connect to remote MCP servers that users can interact with. 

Here’s all the code you need to add to connect to a remote MCP server: 

mport { useMcp } from 'use-mcp/react'
function MyComponent() {
  const { state, tools, callTool } = useMcp({
    url: 'https://mcp-server.example.com'
  })
  return <div>Your actual UI code</div>
}

Just specify the URL, and you’re instantly connected. 

Behind the scenes, use-mcp handles the transport protocols (both Streamable HTTP and Server-Sent Events), authentication flows, and session management. It also includes a number of features to help you build reliable, scalable, and production-ready MCP clients. 

Connection management 

Network reliability shouldn’t impact user experience. use-mcp manages connection retries and reconnections with a backoff schedule to ensure your client can recover the connection during a network issue and continue where it left off. The hook exposes real-time connection states (“connecting”, “ready”, “failed”), allowing you to build responsive UIs that keep users informed without requiring you to write any custom connection handling logic. 

const { state } = useMcp({ url: 'https://mcp-server.example.com' })

if (state === 'connecting') {
  return <div>Establishing connection...</div>
}
if (state === 'ready') {
  return <div>Connected and ready!</div>
}
if (state === 'failed') {
  return <div>Connection failed</div>
}

Authentication & authorization

Many MCP servers require some form of authentication in order to make tool calls. use-mcp supports OAuth 2.1 and handles the entire OAuth flow.  It redirects users to the login page, allows them to grant access, securely stores the access token returned by the OAuth provider, and uses it for all subsequent requests to the server. The library also provides methods for users to revoke access and clear stored credentials. This gives you a complete authentication system that allows you to securely connect to remote MCP servers, without writing any of the logic. 

const { clearStorage } = useMcp({ url: 'https://mcp-server.example.com' })

// Revoke access and clear stored credentials
const handleLogout = () => {
  clearStorage() // Removes all stored tokens, client info, and auth state
}

Dynamic tool discovery

When you connect to an MCP server, use-mcp fetches the tools it exposes. If the server adds new capabilities, your app will see them without any code changes. Each tool provides type-safe metadata about its required inputs and functionality, so your client can automatically validate user input and make the right tool calls.

Debugging & monitoring capabilities

To help you troubleshoot MCP integrations, use-mcp exposes a log array containing structured messages at debug, info, warn, and error levels, with timestamps for each one. You can enable detailed logging with the debug option to track tool calls, authentication flows, connection state changes, and errors. This real-time visibility makes it easier to diagnose issues during development and production. 

Future-proofed & backwards compatible

MCP is evolving rapidly, with recent updates to transport mechanisms and upcoming changes to authorization. use-mcp supports both Server-Sent Events (SSE) and the newer Streamable HTTP transport, automatically detecting and upgrading to newer protocols, when supported by the MCP server. 

As the MCP specification continues to evolve, we’ll keep the library updated with the latest standards, while maintaining backwards compatibility. We are also excited to contribute use-mcp to the MCP project, so it can grow with help from the wider community.

MCP Inspector, built with use-mcp

In use-mcp’s examples directory, you’ll see a minimal MCP Inspector that was built with the use-mcp hook. . Enter any MCP server URL to test connections, see available tools, and monitor interactions through the debug logs. It’s a great starting point for building your own MCP clients or something you can use to debug connections to your MCP server. 


Open-sourcing the AI Playground 

We initially built the AI Playground to give users a chat interface for testing different AI models supported by Workers AI. We then added MCP support, so it could be used as a remote MCP client to connect to and test MCP servers. Today, we’re open-sourcing the playground, giving you the complete chat interface with the MCP client built in, so you can deploy it yourself and customize it to fit your needs. 

The playground comes with built-in support for the latest MCP standards, including both Streamable HTTP and Server-Sent Events transport methods, OAuth authentication flows that allow users to sign-in and grant permissions, as well as support for bearer token authentication for direct MCP server connections.


How the AI Playground works

The AI Playground is built on Workers AI, giving you access to a full catalog of large language models (LLMs) running on Cloudflare’s network, combined with the Agents SDK and use-mcp library for MCP server connections.

The AI Playground uses the use-mcp library to manage connections to remote MCP servers. When the playground starts up, it initializes the MCP connection system with const{tools: mcpTools} = useMcp(), which provides access to all tools from connected servers. At first, this list is empty because it’s not connected to any MCP servers, but once a connection to a remote MCP server is established, the tools are automatically discovered and populated into the list. 

Once connected, the playground immediately has access to any tools that the MCP server exposes. The use-mcp library handles all the protocol communication and tool discovery, and maintains the connection state. If the MCP server requires authentication, the playground handles OAuth flows through a dedicated callback page that uses onMcpAuthorization from use-mcp to complete the authentication process.

When a user sends a chat message, the playground takes the mcpTools from the use-mcp hook and passes them directly to Workers AI, enabling the model to understand what capabilities are available and invoke them as needed. 

const stream = useChat({
  api: "/api/inference",
  body: {
    model: params.model,
    tools: mcpTools, // Tools from connected MCP servers
    max_tokens: params.max_tokens,
    system_message: params.system_message,
  },
})

Debugging and monitoring

To monitor and debug connections to MCP servers, we’ve added a Debug Log interface to the playground. This displays real-time information about the MCP server connections, including connection status, authentication state, and any connection errors. 

During the chat interactions, the debug interface will show the raw message exchanged between the playground and the MCP server, including the tool invocation and its result. This allows you to monitor the JSON payload being sent to the MCP server, the raw response returned, and track whether the tool call succeeded or failed. This is especially helpful for anyone building remote MCP servers, as it allows you to see how your tools are behaving when integrated with different language models. 

Contributing to the MCP ecosystem

One of the reasons why MCP has evolved so quickly is that it’s an open source project, powered by the community. We’re excited to contribute the use-mcp library to the MCP ecosystem to enable more developers to build remote MCP clients. 

If you’re looking for examples of MCP clients or MCP servers to get started with, check out the Cloudflare AI GitHub repository for working examples you can deploy and modify. This includes the complete AI Playground source code, a number of remote MCP servers that use different authentication & authorization providers, and the MCP Inspector

We’re also building the Cloudflare MCP servers in public and welcome contributions to help make them better. 

Whether you’re building your first MCP server, integrating MCP into an existing application, or contributing to the broader ecosystem, we’d love to hear from you. If you have any questions, feedback, or ideas for collaboration, you can reach us via email at [email protected].


Where AI Provides Value

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2025/06/where-ai-provides-value.html

If you’ve worried that AI might take your job, deprive you of your livelihood, or maybe even replace your role in society, it probably feels good to see the latest AI tools fail spectacularly. If AI recommends glue as a pizza topping, then you’re safe for another day.

But the fact remains that AI already has definite advantages over even the most skilled humans, and knowing where these advantages arise—and where they don’t—will be key to adapting to the AI-infused workforce.

AI will often not be as effective as a human doing the same job. It won’t always know more or be more accurate. And it definitely won’t always be fairer or more reliable. But it may still be used whenever it has an advantage over humans in one of four dimensions: speed, scale, scope and sophistication. Understanding these dimensions is the key to understanding AI-human replacement.

Speed

First, speed. There are tasks that humans are perfectly good at but are not nearly as fast as AI. One example is restoring or upscaling images: taking pixelated, noisy or blurry images and making a crisper and higher-resolution version. Humans are good at this; given the right digital tools and enough time, they can fill in fine details. But they are too slow to efficiently process large images or videos.

AI models can do the job blazingly fast, a capability with important industrial applications. AI-based software is used to enhance satellite and remote sensing data, to compress video files, to make video games run better with cheaper hardware and less energy, to help robots make the right movements, and to model turbulence to help build better internal combustion engines.

Real-time performance matters in these cases, and the speed of AI is necessary to enable them.

Scale

The second dimension of AI’s advantage over humans is scale. AI will increasingly be used in tasks that humans can do well in one place at a time, but that AI can do in millions of places simultaneously. A familiar example is ad targeting and personalization. Human marketers can collect data and predict what types of people will respond to certain advertisements. This capability is important commercially; advertising is a trillion-dollar market globally.

AI models can do this for every single product, TV show, website and internet user. This is how the modern ad-tech industry works. Real-time bidding markets price the display ads that appear alongside the websites you visit, and advertisers use AI models to decide when they want to pay that price—thousands of times per second.

Scope

Next, scope. AI can be advantageous when it does more things than any one person could, even when a human might do better at any one of those tasks. Generative AI systems such as ChatGPT can engage in conversation on any topic, write an essay espousing any position, create poetry in any style and language, write computer code in any programming language, and more. These models may not be superior to skilled humans at any one of these things, but no single human could outperform top-tier generative models across them all.

It’s the combination of these competencies that generates value. Employers often struggle to find people with talents in disciplines such as software development and data science who also have strong prior knowledge of the employer’s domain. Organizations are likely to continue to rely on human specialists to write the best code and the best persuasive text, but they will increasingly be satisfied with AI when they just need a passable version of either.

Sophistication

Finally, sophistication. AIs can consider more factors in their decisions than humans can, and this can endow them with superhuman performance on specialized tasks. Computers have long been used to keep track of a multiplicity of factors that compound and interact in ways more complex than a human could trace. The 1990s chess-playing computer systems such as Deep Blue succeeded by thinking a dozen or more moves ahead.

Modern AI systems use a radically different approach: Deep learning systems built from many-layered neural networks take account of complex interactions—often many billions—among many factors. Neural networks now power the best chess-playing models and most other AI systems.

Chess is not the only domain where eschewing conventional rules and formal logic in favor of highly sophisticated and inscrutable systems has generated progress. The stunning advance of AlphaFold2, the AI model of structural biology whose creators Demis Hassabis and John Jumper were recognized with the Nobel Prize in chemistry in 2024, is another example.

This breakthrough replaced traditional physics-based systems for predicting how sequences of amino acids would fold into three-dimensional shapes with a 93 million-parameter model, even though it doesn’t account for physical laws. That lack of real-world grounding is not desirable: No one likes the enigmatic nature of these AI systems, and scientists are eager to understand better how they work.

But the sophistication of AI is providing value to scientists, and its use across scientific fields has grown exponentially in recent years.

Context matters

Those are the four dimensions where AI can excel over humans. Accuracy still matters. You wouldn’t want to use an AI that makes graphics look glitchy or targets ads randomly—yet accuracy isn’t the differentiator. The AI doesn’t need superhuman accuracy. It’s enough for AI to be merely good and fast, or adequate and scalable. Increasing scope often comes with an accuracy penalty, because AI can generalize poorly to truly novel tasks. The 4 S’s are sometimes at odds. With a given amount of computing power, you generally have to trade off scale for sophistication.

Even more interestingly, when an AI takes over a human task, the task can change. Sometimes the AI is just doing things differently. Other times, AI starts doing different things. These changes bring new opportunities and new risks.

For example, high-frequency trading isn’t just computers trading stocks faster; it’s a fundamentally different kind of trading that enables entirely new strategies, tactics and associated risks. Likewise, AI has developed more sophisticated strategies for the games of chess and Go. And the scale of AI chatbots has changed the nature of propaganda by allowing artificial voices to overwhelm human speech.

It is this “phase shift,” when changes in degree may transform into changes in kind, where AI’s impacts to society are likely to be most keenly felt. All of this points to the places that AI can have a positive impact. When a system has a bottleneck related to speed, scale, scope or sophistication, or when one of these factors poses a real barrier to being able to accomplish a goal, it makes sense to think about how AI could help.

Equally, when speed, scale, scope and sophistication are not primary barriers, it makes less sense to use AI. This is why AI auto-suggest features for short communications such as text messages can feel so annoying. They offer little speed advantage and no benefit from sophistication, while sacrificing the sincerity of human communication.

Many deployments of customer service chatbots also fail this test, which may explain their unpopularity. Companies invest in them because of their scalability, and yet the bots often become a barrier to support rather than a speedy or sophisticated problem solver.

Where the advantage lies

Keep this in mind when you encounter a new application for AI or consider AI as a replacement for or an augmentation to a human process. Looking for bottlenecks in speed, scale, scope and sophistication provides a framework for understanding where AI provides value, and equally where the unique capabilities of the human species give us an enduring advantage.

This essay was written with Nathan E. Sanders, and originally appeared in The Conversation.

AI security strategies from Amazon and the CIA: Insights from AWS Summit Washington, DC

Post Syndicated from Danielle Ruderman original https://aws.amazon.com/blogs/security/ai-security-strategies-from-amazon-and-the-cia-insights-from-aws-summit-washington-dc/

Speakers during AWS Summit Washington, DC 2025 on June 10, 2025.

At this year’s AWS Summit in Washington, DC, I had the privilege of moderating a fireside chat with Steve Schmidt, Amazon’s Chief Security Officer, and Lakshmi Raman, the CIA’s Chief Artificial Intelligence Officer. Our discussion explored how AI is transforming cybersecurity, threat response, and innovation across the public and private sectors. The conversation highlighted several key themes: how organizations can leverage AI to improve security outcomes, the rise of agentic AI and its impact on security, the importance of maintaining human oversight in AI systems, workforce development strategies, and practical approaches to implementing AI securely in enterprise environments. Below are a few excerpts from our conversation.

On leveraging AI to improve security outcomes

Steve Schmidt: “We’ve applied AI internally at Amazon in a couple of places that led to some significant benefits, including in the application security review process. By training our large language models internally on prior security reviews that we’ve done, it has allowed us to apply the knowledge and learning that our more senior staff have embodied in the documents that the LLM was trained on and expose that to our more junior staff. It really raises the bar on the absolute level of security that we can offer.”

Lakshmi Raman: “In the cybersecurity realm, we’re thinking about how AI helps us in our accreditation and authorization process, helping us ensure that the process to get systems accredited is going as quickly as possible, because the industry is moving so fast. Another area that we’re applying AI and machine learning is triaging data. We have vast amounts of data that comes in at an exponential rate, so we need to be able to go through it quickly so that we can surface insights. You can imagine a cybersecurity analyst who traditionally has gone through network data manually in order to think about blocking suspicious IP addresses or connections. Now there’s an opportunity to do all of that really efficiently and let the security analysts make the decision.”

On the rise of agentic AI and its implications for security

Steve Schmidt: “The biggest change we’re seeing right now in AI is the rise of agentic AI. The reason agentic AI is particularly interesting is that it brings with it a set of challenges about ensuring the software is taking actions within the context of the person who’s asking it…Think about that in the context of a government organization, where you have sets of information that are restricted to certain populations, there are classification decisions, access control limitations, and reasons that you can access certain data that have to be present before you can do so. Agentic AI brings opportunities—you can take actions using software automatically—but also challenges: how do we make sure that the software is doing exactly the right thing every single time, and more importantly, that we can prove what it did to stakeholders and regulators?”

Lakshmi Raman: “AI agents definitely have an opportunity to transform enterprise automation. Leveraging them to do complex multi-step workflows—to do tool calling across a variety of databases and other foundational tools—has tremendous potential, with a human as a crucial step to review what’s going on.”

On the importance of maintaining human oversight with AI

Lakshmi Raman: “In my world, I spend a lot of time thinking about how AI is impacting the workforce. One of the areas we’re looking at is the intersection between AI and our people. AI is able to speed up the processing and do automation, but at the end of the day, it’s really about who is taking on the risk, or deciding the intents and making the decisions. Whatever the machine output happens to be, really it’s about the human who’s deciding the level of oversight, the risk to take, and even whether to intervene.”

Steve Schmidt: “One thing that many people don’t realize about AI systems is that they’re nondeterministic. What nondeterminism means is you can ask an AI model the same question 100 times, and you will not get the same answer every time. So, having a human who can make a judgment about what the AI comes up with is critically important. We look at it this way: if you’re just asking a question and getting an answer, that may be one set of scrutiny that you have to get assistance. But if you’re going to take an action, you’ve got to be really sure the AI is correct. There has to be that skilled person that Lakshmi spoke about, at the end of the AI use process saying, ‘Yes, this is the right thing to do at this point in time with this context.’”

On building an AI-savvy security workforce

Steve Schmidt: “There’s a real problem in our industry: we don’t have enough security people. We simply can’t hire enough people with the right skills to do this job. What we’ve we found is that AI allows us to do a lot of the heavy lifting for the security staff, using tooling that used to have to be done by humans. Our staff is actually materially happier with their jobs if we remove a lot of that grunt work from them, which is super important. You want to keep the employees you have, so you give them tooling that helps them get the job done more efficiently, and they enjoy their job.”

Lakshmi Raman: “We’re looking for people who can live between the intersection of technology and social intelligence, people who can understand how those two areas can potentially interact around human behavior and how to think about future activities. When we’re thinking about analysts, for example, we’re thinking about people who have critical thinking skills, who can demonstrate analytic rigor, who can think multiple steps ahead with incomplete information. We’re also looking for people who have digital acumen with an understanding of cloud and cyber and AI, so that we have those technical skills in house. And finally, people who are interested in lifelong learning and curiosity, because threats change over the years. We need people who understand and are willing to learn about that.”

On advice for security leaders as AI accelerates

Steve Schmidt: “When you’re looking at making a decision, ask the person who’s bringing the information to you: ‘Why can’t AI do this?’ And if they don’t have an answer, ask ‘When will it be able to and under what condition?’ Move it into the now, the probable, the possible, and make it real for all of your staff all the time. If they’re not intentionally making that decision, they’re missing an opportunity.”

Lakshmi Raman: “You’ve got to get training out there for your users. We think of it at three different levels. First is our general workforce—which might be the most important user base—people who are sitting side by side with our AI practitioners and can help describe the workflows that need automation. Then we think about it for our practitioners, so they are keeping up with the latest. And then finally, our senior executives, who can think about how they can transform their organization with AI and generate that buy-in from the top level.”

AI is not just changing what we can do, but how we work. As Steve and Lakshmi emphasized, the most successful AI implementations will be those that thoughtfully balance automation with human oversight, focusing on use cases that deliver tangible value while managing risks appropriately. For security professionals, understanding both the technical and human dimensions of AI will be critical as we navigate this changing space.

Danielle Ruderman

Danielle Ruderman

Danielle is a Senior Manager for the AWS Worldwide Security Specialist Organization, where she leads a team that enables global CISOs and security leaders to better secure their cloud environments. Danielle is passionate about improving security by building company security culture that starts with employee engagement.

Not Just for Oreos and Tailers AMD Helios Next-Gen AI Racks Go Double-Wide

Post Syndicated from Patrick Kennedy original https://www.servethehome.com/not-just-for-oreos-and-tailers-amd-helios-next-gen-ai-racks-go-double-wide/

The AMD Helios with the MI450 will scale to 31TB of memory and 1.4PB/s of memory bandwidth per rack. It is so big, it needs bigger racks

The post Not Just for Oreos and Tailers AMD Helios Next-Gen AI Racks Go Double-Wide appeared first on ServeTheHome.

AMD Vulcano 800G NIC Coming As AMD Outlines its UALink and UEC Scale Plans

Post Syndicated from Patrick Kennedy original https://www.servethehome.com/amd-vulcano-800g-nic-coming-as-amd-outlines-its-ualink-and-uec-scale-plans/

The AMD Vulcano is an 800G NIC for the PCIe Gen6 and UEC era coming in 2026 for AMD’s scale-out networking solution

The post AMD Vulcano 800G NIC Coming As AMD Outlines its UALink and UEC Scale Plans appeared first on ServeTheHome.

Google has New NVIDIA RTX Pro 6000 8-GPU Instances with AMD EPYC 9005 CPUs

Post Syndicated from Cliff Robinson original https://www.servethehome.com/google-has-new-nvidia-rtx-pro-6000-8-gpu-instances-with-amd-epyc-9005-cpus/

The new Google G4 instances house eight NVIDIA RTX Pro 6000 GPUs, combining for 768GB of memory, along with two AMD EPYC Turin CPUs

The post Google has New NVIDIA RTX Pro 6000 8-GPU Instances with AMD EPYC 9005 CPUs appeared first on ServeTheHome.

CXL AI and Liquid Cooled Gigabyte Servers at Computex 2025

Post Syndicated from Patrick Kennedy original https://www.servethehome.com/cxl-ai-and-liquid-cooled-gigabyte-servers-at-computex-2025/

We saw AI servers, CXL servers, liquid cooled servers for both AI and HPC compute, and multi-node servers at GIGABYTE’s Computex 2025 booth

The post CXL AI and Liquid Cooled Gigabyte Servers at Computex 2025 appeared first on ServeTheHome.

Hearing on the Federal Government and AI

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2025/06/hearing-on-the-federal-government-and-ai.html

On Thursday I testified before the House Committee on Oversight and Government Reform at a hearing titled “The Federal Government in the Age of Artificial Intelligence.”

The other speakers mostly talked about how cool AI was—and sometimes about how cool their own company was—but I was asked by the Democrats to specifically talk about DOGE and the risks of exfiltrating our data from government agencies and feeding it into AIs.

My written testimony is here. Video of the hearing is here.

Building an AI Agent that puts humans in the loop with Knock and Cloudflare’s Agents SDK

Post Syndicated from Chris Bell (Guest author) original https://blog.cloudflare.com/building-agents-at-knock-agents-sdk/

This is a guest post by Chris Bell, CTO of Knock

There’s a lot of talk right now about building AI agents, but not a lot out there about what it takes to make those agents truly useful.

An Agent is an autonomous system designed to make decisions and perform actions to achieve a specific goal or set of goals, without human input.

No matter how good your agent is at making decisions, you will need a person to provide guidance or input on the agent’s path towards its goal. After all, an agent that cannot interact or respond to the outside world and the systems that govern it will be limited in the problems it can solve.

That’s where the “human-in-the-loop” interaction pattern comes in. You’re bringing a human into the agent’s loop and requiring an input from that human before the agent can continue on its task.


In this blog post, we’ll use Knock and the Cloudflare Agents SDK to build an AI Agent for a virtual card issuing workflow that requires human approval when a new card is requested.

You can find the complete code for this example in the repository.

What is Knock?

Knock is messaging infrastructure you can use to send multi-channel messages across in-app, email, SMS, push, and Slack, without writing any integration code.

With Knock, you gain complete visibility into the messages being sent to your users while also handling reliable delivery, user notification preferences, and more.

You can use Knock to power human-in-the-loop flows for your agents using Knock’s Agent Toolkit, which is a set of tools that expose Knock’s APIs and messaging capabilities to your AI agents.

Using the Agent SDK as the foundation of our AI Agent

The Agents SDK provides an abstraction for building stateful, real-time agents on top of Durable Objects that are globally addressable and persist state using an embedded, zero-latency SQLite database.

Building an AI agent outside of using the Agents SDK and the Cloudflare platform means we need to consider WebSocket servers, state persistence, and how to scale our service horizontally. Because a Durable Object backs the Agents SDK, we receive these benefits for free, while having a globally addressable piece of compute with built-in storage, that’s completely serverless and scales to zero.

In the example, we’ll use these features to build an agent that users interact with in real-time via chat, and that can be paused and resumed as needed. The Agents SDK is the ideal platform for powering asynchronous agentic workflows, such as those required in human-in-the-loop interactions.

Setting up our Knock messaging workflow

Within Knock, we design our approval workflow using the visual workflow builder to create the cross-channel messaging logic. We then make the notification templates associated with each channel to which we want to send messages.

Knock will automatically apply the user’s preferences as part of the workflow execution, ensuring that your user’s notification settings are respected.


You can find an example workflow that we’ve already created for this demo in the repository. You can use this workflow template via the Knock CLI to import it into your account.

Building our chat UI

We’ve built the AI Agent as a chat interface on top of the AIChatAgent abstraction from Cloudflare’s Agents SDK (docs). The Agents SDK here takes care of the bulk of the complexity, and we’re left to implement our LLM calling code with our system prompt.

// src/index.ts

import { AIChatAgent } from "agents/ai-chat-agent";
import { openai } from "@ai-sdk/openai";
import { createDataStreamResponse, streamText } from "ai";

export class AIAgent extends AIChatAgent {
  async onChatMessage(onFinish) {
    return createDataStreamResponse({
      execute: async (dataStream) => {
        try {
          const stream = streamText({
            model: openai("gpt-4o-mini"),
            system: `You are a helpful assistant for a financial services company. You help customers with credit card issuing.`,
            messages: this.messages,
            onFinish,
            maxSteps: 5,
          });

          stream.mergeIntoDataStream(dataStream);
        } catch (error) {
          console.error(error);
        }
      },
    });
  }
}

On the client side, we’re using the useAgentChat hook from the agents/ai-react package to power the real-time user-to-agent chat.

We’ve modeled our agent as a chat per user, which we set up using the useAgent hook by specifying the name of the process as the userId.

// src/index.ts

import { useAgent } from "agents/react";
import { useAgentChat } from "agents/ai-react";

function Chat({ userId }: { userId: string }) {
  const agent = useAgent({ agent: "AIAgent", name: userId });
  const { messages, input, handleInputChange, handleSubmit, isLoading } = useAgentChat({ agent });
  // ... 
}

This means we have an agent process, and therefore a durable object, per-user. For our human-in-the-loop use case, this becomes important later on as we talk about resuming our deferred tool call.

Deferring the tool call to Knock

We give the agent our card issuing capability through exposing an issueCard tool. However, instead of writing the approval flow and cross-channel logic ourselves, we delegate it entirely to Knock by wrapping the issue card tool in our requireHumanInput method.

Now when the user asks to request a new card, we make a call out to Knock to initiate our card request, which will notify the appropriate admins in the organization to request an approval.


To set this up, we need to use Knock’s Agent Toolkit, which exposes methods to work with Knock in our AI agent and power cross-channel messaging.

import { createKnockToolkit } from "@knocklabs/agent-toolkit/ai-sdk";
import { tool } from "ai";
import { z } from "zod";

import { AIAgent } from "./index";
import { issueCard } from "./api";
import { BASE_URL } from "./constants";

async function initializeToolkit(agent: AIAgent) {
  const toolkit = await createKnockToolkit({ serviceToken: agent.env.KNOCK_SERVICE_TOKEN });

  const issueCardTool = tool({
    description: "Issue a new credit card to a customer.",
    parameters: z.object({
      customerId: z.string(),
    }),
    execute: async ({ customerId }) => {
      return await issueCard(customerId);
    },
  });

  const { issueCard } = toolkit.requireHumanInput(
    { issueCard: issueCardTool },
    {
      workflow: "approve-issued-card",
      actor: agent.name,
      recipients: ["admin_user_1"],
      metadata: {
        approve_url: `${BASE_URL}/card-issued/approve`,
        reject_url: `${BASE_URL}/card-issued/reject`,
      },
    }
  );
  
  return { toolkit, tools: { issueCard } };  
}

There’s a lot going on here, so let’s walk through the key parts:

  • We wrap our issueCard tool in the requireHumanInput method, exposed from the Knock Agent Toolkit

  • We want the messaging workflow to be invoked to be our approve-issued-card workflow

  • We pass the agent.name as the actor of the request, which translates to the user ID

  • We set the recipient of this workflow to be the user admin_user_1

  • We pass the approve and reject URLs so that they can be used in our message templates

  • The wrapped tool is then returned as issueCard

Under the hood, these options are passed to the Knock workflow trigger API to invoke a workflow per-recipient. The set of the recipients listed here could be dynamic, or go to a group of users through Knock’s subscriptions API.

We can then pass the wrapped issue card tool to our LLM call in the onChatMessage method on the agent so that the tool call can be called as part of the interaction with the agent.

export class AIAgent extends AIChatAgent {
  // ... other methods

  async onChatMessage(onFinish) {
    const { tools } = await initializeToolkit(this);

    return createDataStreamResponse({
      execute: async (dataStream) => {
        const stream = streamText({
          model: openai("gpt-4o-mini"),
          system: "You are a helpful assistant for a financial services company. You help customers with credit card issuing.",
          messages: this.messages,
          onFinish,
          tools,
          maxSteps: 5,
        });

        stream.mergeIntoDataStream(dataStream);
      },
    });
  }
}

Now when the agent calls the issueCardTool, we invoke Knock to send our approval notifications, deferring the tool call to issue the card until we receive an approval. Knock’s workflows take care of sending out the message to the set of recipient’s specified, generating and delivering messages according to each user’s preferences.

Using Knock workflows for our approval message makes it easy to build cross-channel messaging to reach the user according to their communication preferences. We can also leverage delays, throttles, batching, and conditions to orchestrate more complex messaging.

Handling the approval

Once the message has been sent to our approvers, the next step is to handle the approval coming back, bringing the human into the agent’s loop.

The approval request is asynchronous, meaning that the response can come at any point in the future. Fortunately, Knock takes care of the heavy lifting here for you, routing the event to the agent worker via a webhook that tracks the interaction with the underlying message. In our case, that’s a click to the “approve” or “reject” button.

First, we set up a message.interacted webhook handler within the Knock dashboard to forward the interactions to our worker, and ultimately to our agent process.


In our example here, we route the approval click back to the worker to handle, appending a Knock message ID to the end of the approve_url and reject_url to track engagement against the specific message sent. We do this via liquid inside of our message templates in Knock: {{ data.approve_url }}?messageId={{ current_message.id }} . One caveat here is that if this were a production application, we’re likely going to handle our approval click in a different application than this agent is running. We co-located it here for the purposes of this demo only.

Once the link is clicked, we have a handler in our worker to mark the message as interacted using Knock’s message interaction API, passing through the status as metadata so that it can be used later.

import Knock from '@knocklabs/node';
import { Hono } from "hono";

const app = new Hono();
const client = new Knock();

app.get("/card-issued/approve", async (c) => {
  const { messageId } = c.req.query();
  
  if (!messageId) return c.text("No message ID found", { status: 400 });

  await client.messages.markAsInteracted(messageId, {
    status: "approved",
  });

  return c.text("Approved");
});

The message interaction will flow from Knock to our worker via the webhook we set up, ensuring that the process is fully asynchronous. The payload of the webhook includes the full message, including metadata about the user that generated the original request, and keeps details about the request itself, which in our case contains the tool call.

import { getAgentByName, routeAgentRequest } from "agents";
import { Hono } from "hono";

const app = new Hono();

app.post("/incoming/knock/webhook", async (c) => {
  const body = await c.req.json();
  const env = c.env as Env;

  // Find the user ID from the tool call for the calling user
  const userId = body?.data?.actors[0];

  if (!userId) {
    return c.text("No user ID found", { status: 400 });
  }

  // Find the agent DO for the user
  const existingAgent = await getAgentByName(env.AIAgent, userId);

  if (existingAgent) {
    // Route the request to the agent DO to process
    const result = await existingAgent.handleIncomingWebhook(body);

    return c.json(result);
  } else {
    return c.text("Not found", { status: 404 });
  }
});

We leverage the agent’s ability to be addressed by a named identifier to route the request from the worker to the agent. In our case, that’s the userId. Because the agent is backed by a durable object, this process of going from incoming worker request to finding and resuming the agent is trivial.

Resuming the deferred tool call

We then use the context about the original tool call, passed through to Knock and round tripped back to the agent, to resume the tool execution and issue the card.

export class AIAgent extends AIChatAgent {
  // ... other methods

  async handleIncomingWebhook(body: any) {
    const { toolkit } = await initializeToolkit(this);

    const deferredToolCall = toolkit.handleMessageInteraction(body);

    if (!deferredToolCall) {
      return { error: "No deferred tool call given" };
    }

    // If we received an "approved" status then we know the call was approved 
    // so we can resume the deferred tool call execution
    if (result.interaction.status === "approved") {
      const toolCallResult = 
	      await toolkit.resumeToolExecution(result.toolCall);

      const { response } = await generateText({
        model: openai("gpt-4o-mini"),
        prompt: `You were asked to issue a card for a customer. The card is now approved. The result was: ${JSON.stringify(toolCallResult)}.`,
      });

      const message = responseToAssistantMessage(
        response.messages[0],
        result.toolCall,
        toolCallResult
      );

      // Save the message so that it's displayed to the user
      this.persistMessages([...this.messages, message]);
    }

    return { status: "success" };
  }
}

Again, there’s a lot going on here, so let’s step through the important parts:

  • We attempt to transform the body, which is the webhook payload from Knock, into a deferred tool call via the handleMessageInteraction method

  • If the metadata status we passed through to the interaction call earlier has an “approved” status then we resume the tool call via the resumeToolExecution method

  • Finally, we generate a message from the LLM and persist it, ensuring that the user is informed of the approved card

With this last piece in place, we can now request a new card be issued, have an approval request be dispatched from the agent, send the approval messages, and route those approvals back to our agent to be processed. The agent will asynchronously process our card issue request and the deferred tool call will be resumed for us, with very little code.

Protecting against duplicate approvals

One issue with the above implementation is that we’re prone to issuing multiple cards if someone clicks on the approve button more than once. To rectify this, we want to keep track of the tool calls being issued, and ensure that the call is processed at most once.

To power this we leverage the agent’s built-in state, which can be used to persist information without reaching for another persistence store like a database or Redis, although we could absolutely do so if we wished. We can track the tool calls by their ID and capture their current status, right inside the agent process.

type ToolCallStatus = "requested" | "approved" | "rejected";

export interface AgentState {
  toolCalls: Record<string, ToolCallStatus>;
}

class AIAgent extends AIChatAgent<Env, AgentState> {
  initialState: AgentState = {
    toolCalls: {},
  };
  
  setToolCallStatus(toolCallId: string, status: ToolCallStatus) {
    this.setState({
      ...this.state,
      toolCalls: { ...this.state.toolCalls, [toolCallId]: status },
    });
  } 
  // ... 
}

Here, we create the initial state for the tool calls as an empty object. We also add a quick setter helper method to make interactions easier.

Next up, we need to record the tool call being made. To do so, we can use the onAfterCallKnock option in the requireHumanInput helper to capture that the tool call has been requested for the user.

const { issueCard }  = toolkit.requireHumanInput(
  { issueCard: issueCardTool },
  {
    // Keep track of the tool call state once it's been sent to Knock
    onAfterCallKnock: async (toolCall) => 
      agent.setToolCallStatus(toolCall.id, "requested"),
    // ... as before
  }
);

Finally, we then need to check the state when we’re processing the incoming webhook, and mark the tool call as approved (some code omitted for brevity).

export class AIAgent extends AIChatAgent {
  async handleIncomingWebhook(body: any) {
    const { toolkit } = await initializeToolkit(this);
    const deferredToolCall = toolkit.handleMessageInteraction(body);
    const toolCallId = result.toolCall.id;

    // Make sure this is a tool call that can be processed
    if (this.state.toolCalls[toolCallId] !== "requested") {
      return { error: "Tool call is not requested" };
    }

    if (result.interaction.status === "approved") {
      const toolCallResult = await toolkit.resumeToolExecution(result.toolCall);
      this.setToolCallStatus(toolCallId, "approved");
      // ... rest as before
    }
  }
}

Conclusion

Using the Agents SDK and Knock, it’s easy to build advanced human-in-the-loop experiences that defer tool calls.

Knock’s workflow builder and notification engine gives you building blocks to create sophisticated cross-channel messaging for your agents. You can easily create escalation flows that send messages through SMS, push, email, or Slack that respect the notification preferences of your users. Knock also gives you complete visibility into the messages your users are receiving.

The Durable Object abstraction underneath the Agents SDK means that we get a globally addressable agent process that’s easy to yield and resume back to. The persistent storage in the Durable Object means we can retain the complete chat history per-user, and any other state that’s required in the agent process to resume the agent with (like our tool calls). Finally, the serverless nature of the underlying Durable Object means we’re able to horizontally scale to support a large number of users with no effort.

If you’re looking to build your own AI Agent chat experience with a multiplayer human-in-the-loop experience, you’ll find the complete code from this guide available in GitHub.

This is the NVIDIA MGX PCIe Switch Board with ConnectX-8 for 8x PCIe GPU Servers

Post Syndicated from Patrick Kennedy original https://www.servethehome.com/this-is-the-nvidia-mgx-pcie-switch-board-with-connectx-8-for-8x-pcie-gpu-servers/

This is the NVIDIA MGX PCIe Switch Board with ConnectX-8 for 8x PCIe GPU servers, marking the most significant architectural shift in a decade in this segment

The post This is the NVIDIA MGX PCIe Switch Board with ConnectX-8 for 8x PCIe GPU Servers appeared first on ServeTheHome.

Intel Xeon 6 Priority Cores as a Big NVIDIA GPU AI Server Feature

Post Syndicated from John Lee original https://www.servethehome.com/intel-xeon-6-priority-cores-as-a-big-nvidia-gpu-ai-server-feature/

The Intel Xeon 6 with priority cores wins big at NVIDIA but there is a lot more going on in the release than meets the eye

The post Intel Xeon 6 Priority Cores as a Big NVIDIA GPU AI Server Feature appeared first on ServeTheHome.

Qualcomm Discrete NPU Spotted at in Dell Pro Max Plus laptop at DTW

Post Syndicated from Will Taillac original https://www.servethehome.com/qualcomm-discrete-qualcomm-npu-spotted-at-in-dell-pro-max-plus-laptop-at-dtw/

At Dell Tech World 2025, the company showed off a new laptop class that had Qualcomm NPUs instead of a dGPU for heavier AI workloads

The post Qualcomm Discrete NPU Spotted at in Dell Pro Max Plus laptop at DTW appeared first on ServeTheHome.

AMD Takes Aim for Workstation AI Market With Radeon AI Pro R9700

Post Syndicated from Ryan Smith original https://www.servethehome.com/amd-takes-aim-for-workstation-ai-market-with-radeon-ai-pro-r9700/

In what has quickly become a busy week for workstation and server class video cards at this year’s Computex trade show, AMD has become the latest video card vendor to join the fray with the announcement of their flagship Radeon AI PRO R9700 video card. Based on AMD’s newest RDNA 4 GPU architecture, AMD is […]

The post AMD Takes Aim for Workstation AI Market With Radeon AI Pro R9700 appeared first on ServeTheHome.