Tag Archives: AI

Hi Claude, build an MCP server on Cloudflare Workers

Post Syndicated from Dina Kozlov original https://blog.cloudflare.com/model-context-protocol/

In late November 2024, Anthropic announced a new way to interact with AI, called Model Context Protocol (MCP). Today, we’re excited to show you how to use MCP in combination with Cloudflare to extend the capabilities of Claude to build applications, generate images and more. You’ll learn how to build an MCP server on Cloudflare to make any service accessible through an AI assistant like Claude with just a few lines of code using Cloudflare Workers. 

A quick primer on the Model Context Protocol (MCP)

MCP is an open standard that provides a universal way for LLMs to interact with services and applications. As the introduction on the MCP website puts it,

“Think of MCP like a USB-C port for AI applications. Just as USB-C provides a standardized way to connect your devices to various peripherals and accessories, MCP provides a standardized way to connect AI models to different data sources and tools.” 

From an architectural perspective, MCP consists of several components:

  • MCP hosts: Programs or tools (like Claude) where AI models operate and interact with different services

  • MCP clients: Client within an AI assistant that initiates requests and communicates with MCP servers to perform tasks or access resources

  • MCP servers: Lightweight programs that each expose the capabilities of a service

  • Local data sources: Files, databases, and services on your computer that MCP servers can securely access

  • Remote services: External Internet-connected systems that MCP servers can connect to through APIs

Imagine you ask Claude to send a message in a Slack channel. Before Claude can do this, Slack must communicate which tools are available. It does this by defining tools — such as “list channels”, “post messages”, and “reply to thread” — in the MCP server. Once the MCP client knows what tools it should invoke, it can complete the task. All you have to do is tell it what you need, and it will get it done. 
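To make the Slack example concrete, here is a minimal TypeScript sketch of the idea: a server advertises named tools with descriptions, and a client looks one up by name and invokes it. The `ToolDefinition` shape, the `listTools` and `callTool` helpers, and the stand-in Slack handlers are all hypothetical names chosen for illustration. This is not the MCP SDK and only loosely mirrors what the protocol specifies.

```typescript
// Hypothetical, simplified shape of a tool a server might advertise.
interface ToolDefinition {
  name: string;
  description: string;
  // Handler receives named string arguments and returns a text result.
  handler: (args: Record<string, string>) => string;
}

// Stand-in Slack tools; a real server would call the Slack API here.
const slackTools: ToolDefinition[] = [
  {
    name: "list_channels",
    description: "List the channels in the workspace",
    handler: () => ["#general", "#random"].join(", "),
  },
  {
    name: "post_message",
    description: "Post a message to a channel",
    handler: (args) => `posted "${args.text}" to ${args.channel}`,
  },
];

// What the client sees when it asks which tools exist.
function listTools(tools: ToolDefinition[]): { name: string; description: string }[] {
  return tools.map(({ name, description }) => ({ name, description }));
}

// Dispatch a call by tool name, failing loudly for unknown tools.
function callTool(
  tools: ToolDefinition[],
  name: string,
  args: Record<string, string>,
): string {
  const tool = tools.find((t) => t.name === name);
  if (!tool) throw new Error(`Unknown tool: ${name}`);
  return tool.handler(args);
}
```

In this sketch the assistant would first call `listTools(slackTools)` to learn what is available, then `callTool(slackTools, "post_message", { channel: "#general", text: "hi" })` to act on the user's request.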

Allowing AI to not just generate, but deploy applications for you

What makes MCP so powerful? As a quick example, by combining it with a platform like Cloudflare Workers, it allows Claude users to deploy a Cloudflare Worker in just one sentence, resulting in a site like this



But that’s just one example. Today, we’re excited to show you how you can build and deploy your own MCP server to allow your users to interact with your application directly from an LLM like Claude, and how you can do that just by writing a Cloudflare Worker.

Simplifying your MCP Server deployment with workers-mcp

The new workers-mcp tooling handles the translation between your code and the MCP standard, so that you don’t have to do the maintenance work to get it set up.

Once you create your Worker and install the MCP tooling, you’ll get a workers-mcp template set up for you. This boilerplate removes the overhead of configuring the MCP server yourself:

import { WorkerEntrypoint } from 'cloudflare:workers'
import { ProxyToSelf } from 'workers-mcp'
export default class MyWorker extends WorkerEntrypoint<Env> {
  /**
   * A warm, friendly greeting from your new Workers MCP server.
   * @param name {string} the name of the person we are greeting.
   * @return {string} the contents of our greeting.
   */
  sayHello(name: string) {
    return `Hello from an MCP Worker, ${name}!`
  }
  /**
   * @ignore
   **/
  async fetch(request: Request): Promise<Response> {
    return new ProxyToSelf(this).fetch(request)
  }
}

Let’s unpack what’s happening here. The template gives your Worker a direct link to MCP: the ProxyToSelf logic ensures that your Worker is wired up to respond as an MCP server, without any complex routing or schema definitions.

It also lets you define tools with JSDoc. You’ll notice that the `sayHello` method is annotated with JSDoc comments describing what it does, what arguments it takes, and what it returns. These comments aren’t just for human readers; they’re also used to generate documentation that your AI assistant (Claude) can understand.
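As a rough illustration of how JSDoc comments can become machine-readable tool documentation, the sketch below parses a comment written in the `@param name {type} text` style used above into a small description object. The `ParamDoc` and `ToolDoc` types and the `parseJsDoc` function are hypothetical; the text does not describe how workers-mcp actually does this, so treat it as one possible approach rather than the tool's implementation.

```typescript
// Hypothetical description of one documented parameter.
interface ParamDoc {
  name: string;
  type: string;
  description: string;
}

// Hypothetical tool description derived from a JSDoc comment.
interface ToolDoc {
  description: string;
  params: ParamDoc[];
}

// Parse a JSDoc block written in the "@param name {type} text" style.
function parseJsDoc(comment: string): ToolDoc {
  const lines = comment
    .split("\n")
    // Strip the leading "/**", " * ", and trailing "*/" decoration.
    .map((line) => line.replace(/^\s*\/?\*+\/?\s?/, "").trim())
    .filter((line) => line.length > 0);

  const params: ParamDoc[] = [];
  const descriptionLines: string[] = [];
  for (const line of lines) {
    const match = /^@param\s+(\w+)\s+\{(\w+)\}\s+(.*)$/.exec(line);
    if (match) {
      params.push({ name: match[1], type: match[2], description: match[3] });
    } else if (!line.startsWith("@")) {
      // Untagged lines form the free-text description; other tags are skipped.
      descriptionLines.push(line);
    }
  }
  return { description: descriptionLines.join(" "), params };
}
```

Fed the comment above `sayHello`, this yields the greeting sentence as the description and a single `name` parameter of type `string`, which is roughly the information an assistant needs in order to decide when and how to call the method.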

Adding image generation to Claude

When you build an MCP server using Workers, adding custom functionality to an LLM is easy. Instead of setting up the server infrastructure and defining request schemas, all you have to do is write the code. Above, all we did was generate a “hello world”, but now let’s power up Claude to generate an image, using Workers AI:

import { WorkerEntrypoint } from 'cloudflare:workers'
import { ProxyToSelf } from 'workers-mcp'

export default class ClaudeImagegen extends WorkerEntrypoint<Env> {
  /**
   * Generate an image using the flux-1-schnell model.
   * @param prompt {string} A text description of the image you want to generate.
   * @param steps {number} The number of diffusion steps; higher values can improve quality but take longer.
   */
  async generateImage(prompt: string, steps: number): Promise<Response> {
    const response = await this.env.AI.run('@cf/black-forest-labs/flux-1-schnell', {
      prompt,
      steps,
    });
    // The model returns the image as a base64 string; decode it to a binary string
    const binaryString = atob(response.image);
    // Convert each character of the binary string to its byte value
    const img = Uint8Array.from(binaryString, (m) => m.codePointAt(0)!);

    // Return the raw bytes as a JPEG response
    return new Response(img, {
      headers: {
        'Content-Type': 'image/jpeg',
      },
    });
  }
  /**
   * @ignore
   */
  async fetch(request: Request): Promise<Response> {
    return new ProxyToSelf(this).fetch(request)
  }
}

Once you update the code and redeploy the Worker, Claude will now be able to use the new image generation tool. All you have to say is: “Hey! Can you create an image of a lava lamp wall that lives in San Francisco?”


If you’re looking for some inspiration, here are a few examples of what you can build with MCP and Workers: 

  • Let Claude send follow-up emails on your behalf using Email Routing

  • Ask Claude to capture and share website previews via Browser Automation

  • Store and manage sessions, user data, or other persistent information with Durable Objects

  • Query and update data from your D1 database 

  • …or call any of your existing Workers directly!

Why use Workers for building your MCP server?

To build out an MCP server without access to Cloudflare’s tooling, you would have to: initialize an instance of the server, define your APIs by creating explicit schemas for every interaction, handle request routing, ensure that the responses are formatted correctly, write handlers for every action, configure how the server will communicate, and more… As shown above, we do all of this for you.

For reference, an implementation may look something like this:

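import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { ListResourcesRequestSchema, ReadResourceRequestSchema } from "@modelcontextprotocol/sdk/types.js";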

const server = new Server({ name: "example-server", version: "1.0.0" }, {
  capabilities: { resources: {} }
});

server.setRequestHandler(ListResourcesRequestSchema, async () => {
  return {
    resources: [{ uri: "file:///example.txt", name: "Example Resource" }]
  };
});

server.setRequestHandler(ReadResourceRequestSchema, async (request) => {
  if (request.params.uri === "file:///example.txt") {
    return {
      contents: [{
        uri: "file:///example.txt",
        mimeType: "text/plain",
        text: "This is the content of the example resource."
      }]
    };
  }
  throw new Error("Resource not found");
});

const transport = new StdioServerTransport();
await server.connect(transport);

While this works, it requires quite a bit of code just to get started. Not only do you need to be familiar with MCP, but you also need to complete a fair amount of setup work (e.g. defining schemas) for every action. Doing it through Workers removes all these barriers, allowing you to spin up an MCP server without the complexity.

We’re always looking for ways to simplify developer workflows, and we’re excited about this new standard to open up more possibilities for interacting with LLMs, and building agents.

If you’re interested in setting this up, check out this tutorial which walks you through these examples. We’re excited to see what you build. Be sure to share your MCP server creations with us on Discord, X, or Bluesky!

Bring multimodal real-time interaction to your AI applications with Cloudflare Calls

Post Syndicated from Will Allen original https://blog.cloudflare.com/bring-multimodal-real-time-interaction-to-your-ai-applications-with-cloudflare-calls/

OpenAI announced support for WebRTC in their Realtime API on December 17, 2024. Combining their Realtime API with Cloudflare Calls allows you to build experiences that weren’t possible just a few days earlier.

Previously, interactions with audio and video AIs were largely single-player: only one person could interact with the AI unless everyone was in the same physical room. Now, applications built using Cloudflare Calls and OpenAI’s Realtime API can support multiple users across the globe simultaneously seeing and interacting with a voice or video AI.

Have your AI join your video calls 

Here’s what this means in practice: you can now invite ChatGPT to your next video meeting.

We built this into our Orange Meets demo app to serve as an inspiration for what is possible, but the opportunities are much broader.

In the not-too-distant future, every company could have a ‘corporate AI’ they invite to their internal meetings that is secure, private, and has access to their company data. Imagine these sorts of real-time audio and video interactions with your company’s AI:

“Hey ChatGPT, do we have any open Jira tickets about this?”

“Hey Company AI, who are the competitors in the space doing Y?”

“AI, is XYZ a big customer? How much more did they spend with us vs last year?”

There are similar opportunities if your application is built for consumers: broadcasts and global livestreams can become much more interactive. The murder mystery game in the video above is just one example: you could build your own to play live with your friends in different cities.  

WebRTC vs. WebSockets

These interactive multimedia experiences are enabled by the industry adoption of WebRTC, which stands for Web Real-time Communication.

Many real-time product experiences have historically used WebSockets instead of WebRTC. WebSockets operate over a single, persistent TCP connection established between a client and server. This is useful for maintaining a data sync for text-based chat apps or maintaining the state of gameplay in your favorite video game. Cloudflare has extensive support for WebSockets across our network as well as in our AI Gateway.

If you were building a chat application prior to WebSockets, you would likely have your client-side app poll the server every n seconds to see if there are new messages to be displayed. WebSockets eliminated this need for polling. Instead, the client and the server establish a persistent, long-running connection to send and receive messages.

However, once you have multiple users across geographies simultaneously interacting with voice and video, small delays in the data sync can become unacceptable product experiences. Imagine building an app that does real-time translation of audio. With WebSockets, you would need to chunk the audio input, so each chunk contains 100–500 milliseconds of audio. That chunking size, along with the head-of-line blocking, becomes the latency floor for your ability to deliver a real-time multimodal experience to your users.
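The chunking described above can be sketched in a few lines of TypeScript. The `chunkAudio` function, the 16 kHz sample rate, and the 250 ms chunk size are assumptions picked for illustration, not values from any particular product. The point is only that a sender cannot ship a chunk until it has been fully recorded, so the chunk duration is added to the latency of every piece of audio.

```typescript
// Split mono PCM samples into fixed-duration chunks.
// The final chunk may be shorter than the others.
function chunkAudio(
  samples: Float32Array,
  sampleRateHz: number,
  chunkMs: number,
): Float32Array[] {
  const samplesPerChunk = Math.floor((sampleRateHz * chunkMs) / 1000);
  if (samplesPerChunk <= 0) {
    throw new Error("chunk duration too short for this sample rate");
  }
  const chunks: Float32Array[] = [];
  for (let start = 0; start < samples.length; start += samplesPerChunk) {
    // subarray creates a view on the same buffer, so no samples are copied.
    chunks.push(samples.subarray(start, start + samplesPerChunk));
  }
  return chunks;
}

// One second of silence at 16 kHz, cut into 250 ms chunks.
const oneSecondOfAudio = new Float32Array(16000);
const audioChunks = chunkAudio(oneSecondOfAudio, 16000, 250);
```

Here `audioChunks` holds four chunks of 4,000 samples each. Each one only becomes available to send 250 ms after its first sample was captured, before any network or processing time is counted.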

WebRTC solves this problem by having native support for audio and video tracks over UDP-based channels directly between users, eliminating the need for chunking. This lets you stream audio and video data to an AI model from multiple users and receive audio and video data back from the AI model in real-time. 

Realtime AI fanout using Cloudflare Calls

Historically, setting up the underlying infrastructure for WebRTC — servers for media routing, TURN relays, global availability — could be challenging.

Cloudflare Calls handles the entirety of this complexity for developers, allowing them to leverage WebRTC without needing to worry about servers, regions, or scaling. Cloudflare Calls works as a single mesh network that automatically connects each user to a server close to them. Calls can connect directly with other WebRTC-powered services such as OpenAI’s, letting you deliver the output with near-zero latency to hundreds or thousands of users.

Privacy and security also come standard: all video and audio traffic that passes through Cloudflare Calls is encrypted by default. In this particular demo, we take it a step further by creating a button that allows you to decide when to allow ChatGPT to listen and interact with the meeting participants, allowing you to be more granular and targeted in your privacy and security posture. 

How we connected Cloudflare Calls to OpenAI’s Realtime API 

Cloudflare Calls has three building blocks: Applications, Sessions, and Tracks:

“A Session in Cloudflare Calls correlates directly to a WebRTC PeerConnection. It represents the establishment of a communication channel between a client and the nearest Cloudflare data center, as determined by Cloudflare’s anycast routing … 

Within a Session, there can be one or more Tracks. … [which] align with the MediaStreamTrack concept, facilitating audio, video, or data transmission.”

To include ChatGPT in our video conferencing demo, we needed to add ChatGPT as a track in an ongoing session. To do this, we connected to the Realtime API in Orange Meets:

// Connect Cloudflare Calls sessions and tracks like a switchboard
async function connectHumanAndOpenAI(
	humanSessionId: string,
	openAiSessionId: string
) {
	const callsApiHeaders = {
		Authorization: `Bearer ${APP_TOKEN}`,
		'Content-Type': 'application/json',
	}
	// Pull OpenAI audio track to human's track
	await fetch(`${callsEndpoint}/sessions/${humanSessionId}/tracks/new`, {
		method: 'POST',
		headers: callsApiHeaders,
		body: JSON.stringify({
			tracks: [
				{
					location: 'remote',
					sessionId: openAiSessionId,
					trackName: 'ai-generated-voice',
					mid: '#user-mic',
				},
			],
		}),
	})
	// Pull human's audio track to OpenAI's track
	await fetch(`${callsEndpoint}/sessions/${openAiSessionId}/tracks/new`, {
		method: 'POST',
		headers: callsApiHeaders,
		body: JSON.stringify({
			tracks: [
				{
					location: 'remote',
					sessionId: humanSessionId,
					trackName: 'user-mic',
					mid: '#ai-generated-voice',
				},
			],
		}),
	})
}

This code sets up the bidirectional routing between the human’s session and ChatGPT, which allows the humans to hear ChatGPT and ChatGPT to hear the humans.

You can review all the code for this demo app on GitHub.

Get started today 

Give the Cloudflare Calls + OpenAI Realtime API demo a try for yourself and review how it was built via the source code on GitHub. Then get started today with Cloudflare Calls to bring real-time, interactive AI to your apps and services.

ASRock Rack 6U8X-EGS2 H200 NVIDIA HGX H200 AI Server Review

Post Syndicated from Patrick Kennedy original https://www.servethehome.com/asrock-rack-6u8x-egs2-h200-nvidia-hgx-h200-ai-server-intel-xeon-review/

We review the ASRock Rack 6U8X-EGS2 H200, an NVIDIA HGX H200 8-GPU design, to see how it performs and how GPU servers have evolved since 2015.

The post ASRock Rack 6U8X-EGS2 H200 NVIDIA HGX H200 AI Server Review appeared first on ServeTheHome.

Marvell Custom HBM Compute Architecture for Custom Hyper-Scale XPUs

Post Syndicated from Patrick Kennedy original https://www.servethehome.com/marvell-custom-hbm-compute-architecture-for-custom-hyper-scale-xpus/

The Marvell Custom HBM Compute Architecture uses cHBM for higher performance and density while offering lower power for hyper-scale XPUs.

The post Marvell Custom HBM Compute Architecture for Custom Hyper-Scale XPUs appeared first on ServeTheHome.

Robotcop: enforcing your robots.txt policies and stopping bots before they reach your website

Post Syndicated from Celso Martinho original https://blog.cloudflare.com/ai-audit-enforcing-robots-txt

Cloudflare’s AI Audit dashboard allows you to easily understand how AI companies and services access your content. AI Audit gives a summary of request counts broken out by bot, detailed path summaries for more granular insights, and the ability to filter by categories like AI Search or AI Crawler.

Today, we’re going one step further. You can now quickly see which AI services are honoring your robots.txt policies, which aren’t, and then programmatically enforce these policies. 

What is robots.txt?

Robots.txt is a plain text file hosted on your domain that implements the Robots Exclusion Protocol, a standard that has been around since 1994. This file tells crawlers like Google, Bing, and many others which parts of your site, if any, they are allowed to access. 

There are many reasons why site owners would want to define which portions of their websites crawlers are allowed to access: they might not want certain content available on search engines or social networks, they might trust one platform more than another, or they might simply want to reduce automated traffic to their servers.

With the advent of generative AI, AI services have started crawling the Internet to collect training data for their models. These models are often proprietary and commercial and are used to generate new content. Many content creators and publishers that want to exercise control over how their content is used have started using robots.txt to declare policies that cover these AI bots, in addition to the traditional search engines.

Here’s an abbreviated real-world example of the robots.txt policy from a top online news site:

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: Bytespider
Disallow: /

This policy declares that the news site doesn’t want ChatGPT, Anthropic AI, Google Gemini, or ByteDance’s Bytespider to crawl any of their content.

From voluntary compliance to enforcement

Compliance with the Robots Exclusion Protocol has historically been voluntary. 

That’s where our new feature comes in. We’ve extended AI Audit to give our customers both visibility into how AI service providers honor their robots.txt policies and the ability to enforce those policies at the network level in your WAF.

Your robots.txt file declares your policy, but now we can help you enforce it. You might even call it … your Robotcop.  

How it works

AI Audit takes the robots.txt files from your web properties, parses them, and then matches their rules against the AI bot traffic we see for the selected property. The summary table gives you an aggregated view of the number of requests and violations we see for every Bot across all paths. If you hover your mouse over the Robots.txt column, we will show you the defined policies for each Bot in the tooltip. You can also filter by violations from the top of the page. 
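The parse-and-match step can be illustrated with a small TypeScript sketch. It handles only `User-agent` and `Disallow` lines with simple prefix matching, and the `parseRobotsTxt` and `isViolation` names are hypothetical. This is a simplified reading of the Robots Exclusion Protocol meant to show the idea, not a description of how AI Audit is implemented.

```typescript
// Map from lowercased user-agent token to its Disallow path prefixes.
type RobotsPolicy = Map<string, string[]>;

// Parse only User-agent and Disallow lines; everything else is ignored.
function parseRobotsTxt(text: string): RobotsPolicy {
  const policy: RobotsPolicy = new Map();
  let currentAgents: string[] = [];
  let lastWasAgent = false;
  for (const rawLine of text.split("\n")) {
    const line = rawLine.split("#")[0].trim(); // drop comments
    const colon = line.indexOf(":");
    if (colon === -1) continue;
    const field = line.slice(0, colon).trim().toLowerCase();
    const value = line.slice(colon + 1).trim();
    if (field === "user-agent") {
      // Consecutive User-agent lines share one group of rules.
      if (!lastWasAgent) currentAgents = [];
      const agent = value.toLowerCase();
      currentAgents.push(agent);
      if (!policy.has(agent)) policy.set(agent, []);
      lastWasAgent = true;
    } else {
      if (field === "disallow" && value !== "") {
        for (const agent of currentAgents) policy.get(agent)!.push(value);
      }
      lastWasAgent = false;
    }
  }
  return policy;
}

// A request violates the policy if its path starts with a disallowed prefix
// for a named user-agent token found in the request's User-Agent header.
// The wildcard group "*" is skipped in this sketch for simplicity.
function isViolation(policy: RobotsPolicy, userAgent: string, path: string): boolean {
  const ua = userAgent.toLowerCase();
  let violated = false;
  policy.forEach((disallowed, agent) => {
    if (agent !== "*" && ua.includes(agent) && disallowed.some((p) => path.startsWith(p))) {
      violated = true;
    }
  });
  return violated;
}
```

With a policy that disallows `/` for `GPTBot`, a request whose User-Agent header contains `GPTBot` counts as a violation for any path, while a request from an unlisted crawler does not.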


In the “Most popular paths” section, whenever a path in your site gets traffic that has violated your policy, we flag it for visibility. Ideally, you wouldn’t see violations in the Robots.txt column — if you do see them, someone’s not complying.


But that’s not all… More importantly, AI Audit allows you to enforce your robots.txt policy at the network level. When you press the “Enforce robots.txt rules” button at the top of the summary table, we automatically translate the rules defined for AI Bots in your robots.txt into an advanced firewall rule, redirect you to the WAF configuration screen, and allow you to deploy the rule in our network.

This is how the robots.txt policy mentioned above looks after translation:


Once you deploy a WAF rule built from your robots.txt policies, you are no longer simply requesting that AI services respect your policy, you’re enforcing it.

Conclusion

With AI Audit, we are giving our customers even more visibility into how AI services access their content, helping them define their policies and then enforcing them at the network level.

This feature is live today for all Cloudflare customers. Simply log into the dashboard and navigate to your domain to begin auditing the bot traffic from AI services and enforcing your robots.txt directives.

Trust Issues in AI

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2024/12/trust-issues-in-ai.html

This essay was written with Nathan E. Sanders. It originally appeared as a response to Evgeny Morozov in Boston Review’s forum, “The AI We Deserve.”

For a technology that seems startling in its modernity, AI sure has a long history. Google Translate, OpenAI chatbots, and Meta AI image generators are built on decades of advancements in linguistics, signal processing, statistics, and other fields going back to the early days of computing—and, often, on seed funding from the U.S. Department of Defense. But today’s tools are hardly the intentional product of the diverse generations of innovators that came before. We agree with Morozov that the “refuseniks,” as he calls them, are wrong to see AI as “irreparably tainted” by its origins. AI is better understood as a creative, global field of human endeavor that has been largely captured by U.S. venture capitalists, private equity, and Big Tech. But that was never the inevitable outcome, and it doesn’t need to stay that way.

The internet is a case in point. The fact that it originated in the military is a historical curiosity, not an indication of its essential capabilities or social significance. Yes, it was created to connect different, incompatible Department of Defense networks. Yes, it was designed to survive the sorts of physical damage expected from a nuclear war. And yes, back then it was a bureaucratically controlled space where frivolity was discouraged and commerce was forbidden.

Over the decades, the internet transformed from military project to academic tool to the corporate marketplace it is today. These forces, each in turn, shaped what the internet was and what it could do. For most of us billions online today, the only internet we have ever known has been corporate—because the internet didn’t flourish until the capitalists got hold of it.

AI followed a similar path. It was originally funded by the military, with the military’s goals in mind. But the Department of Defense didn’t design the modern ecosystem of AI any more than it did the modern internet. Arguably, its influence on AI was even less because AI simply didn’t work back then. While the internet exploded in usage, AI hit a series of dead ends. The research discipline went through multiple “winters” when funders of all kinds—military and corporate—were disillusioned and research money dried up for years at a time. Since the release of ChatGPT, AI has reached the same endpoint as the internet: it is thoroughly dominated by corporate power. Modern AI, with its deep reinforcement learning and large language models, is shaped by venture capitalists, not the military—nor even by idealistic academics anymore.

We agree with much of Morozov’s critique of corporate control, but it does not follow that we must reject the value of instrumental reason. Solving problems and pursuing goals is not a bad thing, and there is real cause to be excited about the uses of current AI. Morozov illustrates this from his own experience: he uses AI to pursue the explicit goal of language learning.

AI tools promise to increase our individual power, amplifying our capabilities and endowing us with skills, knowledge, and abilities we would not otherwise have. This is a peculiar form of assistive technology, kind of like our own personal minion. It might not be that smart or competent, and occasionally it might do something wrong or unwanted, but it will attempt to follow your every command and give you more capability than you would have had without it.

Of course, for our AI minions to be valuable, they need to be good at their tasks. On this, at least, the corporate models have done pretty well. They have many flaws, but they are improving markedly on a timescale of mere months. ChatGPT’s initial November 2022 model, GPT-3.5, scored about 30 percent on a multiple-choice scientific reasoning benchmark called GPQA. Five months later, GPT-4 scored 36 percent; by May this year, GPT-4o scored about 50 percent, and the most recently released o1 model reached 78 percent, surpassing the level of experts with PhDs. There is no one singular measure of AI performance, to be sure, but other metrics also show improvement.

That’s not enough, though. Regardless of their smarts, we would never hire a human assistant for important tasks, or use an AI, unless we can trust them. And while we have millennia of experience dealing with potentially untrustworthy humans, we have practically none dealing with untrustworthy AI assistants. This is the area where the provenance of the AI matters most. A handful of for-profit companies—OpenAI, Google, Meta, Anthropic, among others—decide how to train the most celebrated AI models, what data to use, what sorts of values they embody, whose biases they are allowed to reflect, and even what questions they are allowed to answer. And they decide these things in secret, for their benefit.

It’s worth stressing just how closed, and thus untrustworthy, the corporate AI ecosystem is. Meta has earned a lot of press for its “open-source” family of LLaMa models, but there is virtually nothing open about them. For one, the data they are trained with is undisclosed. You’re not supposed to use LLaMa to infringe on someone else’s copyright, but Meta does not want to answer questions about whether it violated copyrights to build it. You’re not supposed to use it in Europe, because Meta has declined to meet the regulatory requirements anticipated from the EU’s AI Act. And you have no say in how Meta will build its next model.

The company may be giving away the use of LLaMa, but it’s still doing so because it thinks it will benefit from your using it. CEO Mark Zuckerberg has admitted that eventually, Meta will monetize its AI in all the usual ways: charging to use it at scale, fees for premium models, advertising. The problem with corporate AI is not that the companies are charging “a hefty entrance fee” to use these tools: as Morozov rightly points out, there are real costs to anyone building and operating them. It’s that they are built and operated for the purpose of enriching their proprietors, rather than because they enrich our lives, our wellbeing, or our society.

But some emerging models from outside the world of corporate AI are truly open, and may be more trustworthy as a result. In 2022 the research collaboration BigScience developed an LLM called BLOOM with freely licensed data and code as well as public compute infrastructure. The collaboration BigCode has continued in this spirit, developing LLMs focused on programming. The government of Singapore has built SEA-LION, an open-source LLM focused on Southeast Asian languages. If we imagine a future where we use AI models to benefit all of us—to make our lives easier, to help each other, to improve our public services—we will need more of this. These may not be “eolithic” pursuits of the kind Morozov imagines, but they are worthwhile goals. These use cases require trustworthy AI models, and that means models built under conditions that are transparent and with incentives aligned to the public interest.

Perhaps corporate AI will never satisfy those goals; perhaps it will always be exploitative and extractive by design. But AI does not have to be solely a profit-generating industry. We should invest in these models as a public good, part of the basic infrastructure of the twenty-first century. Democratic governments and civil society organizations can develop AI to offer a counterbalance to corporate tools. And the technology they build, for all the flaws it may have, will enjoy a superpower that corporate AI never will: it will be accountable to the public interest and subject to public will in the transparency, openness, and trustworthiness of its development.

AI Servers Robot Dogs and Liquid Cooling at the ASUS SC24 Booth

Post Syndicated from Patrick Kennedy original https://www.servethehome.com/asus-sc24-ai-servers-liquid-cooling-cpu-gpu-robot-dog-intel-amd-nvidia/

We take a quick look at some of the unique servers in the ASUS SC24 booth ranging from AI, to storage, to dense compute, and even a robot dog

The post AI Servers Robot Dogs and Liquid Cooling at the ASUS SC24 Booth appeared first on ServeTheHome.

Does AI-assisted coding boost novice programmers’ skills or is it just a shortcut?

Post Syndicated from Isabella Grassl original https://www.raspberrypi.org/blog/does-ai-assisted-coding-boost-novice-programmers-skills-or-is-it-just-a-shortcut/

Artificial intelligence (AI) is transforming industries, and education is no exception. AI-driven development environments (AIDEs), like GitHub Copilot, are opening up new possibilities, and educators and researchers are keen to understand how these tools impact students learning to code. 

In our 50th research seminar, Nicholas Gardella, a PhD candidate at the University of Virginia, shared insights from his research on the effects of AIDEs on beginner programmers’ skills.

Nicholas Gardella focuses his research on understanding human interactions with artificial intelligence-based code generators to inform responsible adoption in computer science education.

Measuring AI’s impact on students

AI tools are becoming a big part of software development, but what does that mean for students learning to code? As tools like GitHub Copilot become more common, it’s crucial to ask: Do these tools help students to learn better and work more effectively, especially when time is tight?

This is precisely what Nicholas’s research aims to identify by examining the impact of AIDEs on four key areas:

  • Performance (how well students completed the tasks)
  • Workload (the effort required)
  • Emotion (their emotional state during the task)
  • Self-efficacy (their belief in their own abilities to succeed)

Nicholas conducted his study with 17 undergraduate students of varied genders and backgrounds from an introductory computer science course, most of whom were first-time programmers.


The students completed programming tasks both with and without the assistance of GitHub Copilot. Nicholas selected the tasks from OpenAI’s HumanEval data set, ensuring they represented a range of difficulty levels. He also used a repeated measures design for the study, meaning that each student had the opportunity to program both independently and with AI assistance multiple times. This design helped him to compare individual progress and attitudes towards using AI in programming.

Less workload, more performance and self-efficacy in learning

The results were promising for those advocating AI’s role in education. Nicholas’s research found that participants who used GitHub Copilot performed better overall, completing tasks with less mental workload and effort compared to solo programming.

Nicholas used several measures to find out whether AIDEs affected students’ emotional states.

However, the immediate impact on students’ emotional state and self-confidence was less pronounced. Initially, participants did not report feeling more confident while coding with AI. Over time, though, as they became more familiar with the tool, their confidence in their abilities improved slightly. This indicates that students need time and practice to fully integrate AI into their learning process. Students increasingly attributed their progress not to the AI doing the work for them, but to their own growing proficiency in using the tool effectively. This suggests that with sustained practice, students can gain confidence in their abilities to work with AI, rather than becoming overly reliant on it.

Graphic depicting Nicholas' RQ1 results.
Students who used AI tools seemed to improve more quickly than students who worked on the exercises themselves.

A particularly important takeaway from the talk was the reduction in workload when using AI tools. Novice programmers, who often find programming challenging, reported that AI assistance lightened the workload. This reduced effort could create a more relaxed learning environment, where students feel less overwhelmed and more capable of tackling challenging tasks.

However, while workload decreased, use of the AI tool did not significantly boost emotional satisfaction or happiness during the coding process. Nicholas explained that although students worked more efficiently, using the AI tool did not necessarily make coding a more enjoyable experience. This highlights a key challenge for educators: finding ways to make learning both effective and engaging, even when using advanced tools like AI.

AI as a tool for collaboration, not replacement

Nicholas’s findings raise interesting questions about how AI should be introduced in computer science education. While tools like GitHub Copilot can enhance performance, they should not be seen as shortcuts for learning. Students still need guidance in how to use these tools responsibly. Importantly, the study showed that students did not take credit for the AI tool’s work — instead, they felt responsible for their own progress, especially as they improved their interactions with the tool over time.

Seventeen multicoloured post-it notes are roughly positioned in a strip shape on a white board. Each one of them has a hand drawn sketch in pen on them, answering the prompt on one of the post-it notes "AI is...." The sketches are all very different, some are patterns representing data, some are cartoons, some show drawings of things like data centres, or stick figure drawings of the people involved.
Rick Payne and team / Better Images of AI / Ai is… Banner / CC-BY 4.0

Students might become better programmers when they learn how to work alongside AI systems, using them to enhance their problem-solving skills rather than relying on them for answers. This suggests that educators should focus on teaching students how to collaborate with AI, rather than fearing that these tools will undermine the learning process.

Bridging research and classroom realities

The talk also touched on the limits of the study’s findings. Since the experiment was conducted in a controlled environment with only 17 participants, further studies are needed to explore how AI tools perform in real-world classroom settings. For example, students’ internet usage plays a fundamental role. It will also be relevant to understand how factors such as class size, varying prior experience, and the age of students affect their ability to integrate AI into their learning.

In the follow-up discussion, Nicholas also demonstrated how AI tools are becoming more accessible within browsers and how teachers can integrate AI-driven development environments more easily into their courses. By making AI technology more readily available, these tools are democratising access to advanced programming aids, enabling students to build applications directly in their web browsers with minimal setup.

The path ahead

Nicholas’s talk provided an insightful look into the evolving relationship between AI tools and novice programmers. While AI can improve performance and reduce workload, it is not a magic solution to all the challenges of learning to code.

Based on the discussion after the talk, educators should support students in developing the skills to use these tools effectively, shaping an environment where they can feel confident working with AI systems. The researchers and educators agreed that more research is needed to expand on these findings, particularly in more diverse and larger-scale educational settings. 

As AI continues to shape the future of programming education, educators will remain crucial in guiding students towards responsible and effective use of these technologies. We are only at the beginning.

Join our next seminar

In our current seminar series, we are exploring how to teach programming with and without AI technology. Join us at our next seminar on Tuesday, 10 December at 17:00–18:30 GMT to hear Leo Porter (UC San Diego) and Daniel Zingaro (University of Toronto) discuss how they are working to create an introductory programming course for majors and non-majors that fully incorporates generative AI into the learning goals of the course. 

To sign up and take part in the seminar, click the button below — we’ll then send you information about joining. We hope to see you there.

The schedule of our upcoming seminars is online. You can catch up on past seminars on our previous seminars and recordings page.

The post Does AI-assisted coding boost novice programmers’ skills or is it just a shortcut? appeared first on Raspberry Pi Foundation.