Tag Archives: artificial intelligence

5 Things Security Leaders Need to Know About Agentic AI

2025-06-09 Emma Burdett

Post Syndicated from Emma Burdett original https://blog.rapid7.com/2025/06/09/5-things-security-leaders-need-to-know-about-agentic-ai/

5 Things Security Leaders Need to Know About Agentic AI

From writing assistance to intelligent summarization, generative AI has already transformed the way businesses work. But we’re now entering a new phase where AI doesn’t just generate content, but takes independent action on our behalf.

This next evolution is called ‘agentic AI’, and it’s moving fast. Amazon recently announced a dedicated R&D group focused on agentic systems. OpenAI is advancing its Codex Agent SDK to build more capable AI “workers.” And a growing number of businesses are actively experimenting with autonomous agents to handle everything from code generation to system orchestration.

While the potential is significant, so are the risks. These new systems bring fresh challenges for security teams, from unpredictable behavior and decision-making to new forms of supply chain exposure.

Here are five things every security leader needs to know right now.

1. Agentic AI is moving from research to reality

Unlike traditional generative AI, which responds to single prompts, agentic AI systems operate more autonomously, often over longer durations and with less human supervision. They can make decisions, learn from feedback, and complete multi-step tasks using reasoning and planning capabilities.

Some agents even have memory and goal-setting functions, enabling them to adapt to changing conditions and take initiative. This has huge implications for productivity but also opens the door to a new class of operational and security risks.

According to Forrester(1), agentic AI represents a shift “from words to actions,” with agents poised to become embedded across knowledge work, development, cloud operations, and customer-facing systems. Security teams must now consider not just what AI is generating, but what it’s doing.

2. Emerging use cases span development, robotics, and IT automation

Agentic AI has been surrounded by hype, but we’re already seeing practical use cases emerge across development, automation, and robotics.

Amazon’s new R&D group is focused on building AI agents for robotics and software orchestration, aiming to automate real-world tasks with physical and digital components.
OpenAI’s Codex Agent SDK is enabling developers to build custom agents that can interact with APIs, browse the web, and execute instructions without human involvement.
In enterprise IT, some early agentic tools are being used to generate and deploy scripts, configure systems, and resolve tickets across helpdesk platforms.

As these systems become more capable, they also become harder to predict. Agentic AI doesn’t just follow rules; it works toward outcomes. That makes it both valuable and volatile in enterprise environments.

3. The attack surface is expanding in new and subtle ways

One of the most critical risks that agentic AI introduces is decision unpredictability. These systems operate with a degree of autonomy, which means they can take action based on reasoning that isn’t always traceable or transparent. That creates blind spots for traditional controls.

Other risks include:

Prompt injection and manipulation, where attackers feed malicious instructions into agent workflows
Unintended lateral movement, especially when agents interact with APIs or third-party services
Supply chain exposure, as agents increasingly rely on external tools, plugins, and data sources to function

As noted at Infosecurity Europe, many of today’s AI threat models don’t yet account for agents that can generate, interpret, and act on instructions in dynamic environments. Traditional AppSec and identity controls will need to evolve to monitor not just access, but behavior over time.

4. Governance, observability, and containment are critical

As with earlier generations of AI, governance will define how successfully agentic systems can be adopted and secured.

Experts across MIT Sloan and Thoughtworks agree: organizations must rethink how they apply principles like least privilege, role-based access, and anomaly detection in an agentic context. That includes:

Observing how agents reason and make decisions
Restricting the actions they’re allowed to take (especially with sensitive data or infrastructure)
Implementing containment strategies that limit blast radius in case of failure or manipulation

Agent-based systems can’t be treated like static applications. Security teams need tools that provide ongoing insight into agent activity, and the ability to intervene when needed.

This is especially important when agents are integrated into security workflows themselves. If an agent is responsible for triaging alerts or executing playbooks, who’s accountable when it fails? And how do you audit its decisions?

5. Security teams have an opportunity to lead — but the window is narrow

We’re still in the early stages of agentic AI adoption, which gives security leaders a rare opportunity to influence how these systems are implemented from the outset. That includes building safe defaults, engaging with developers early, and applying threat modeling and testing before agents are deployed in production.

At Rapid7, we’ve already begun evaluating agent behavior through the lens of exposure, intent, and exploitability — the same principles that guide how we think about modern attack surfaces. Our goal is to help customers harness the speed and scale of AI without sacrificing visibility or control.

We’ve also introduced AI-powered application coverage in Exposure Command to help customers identify misconfigurations and application-layer weaknesses that could be exploited by or through autonomous tools.

Where security goes from here

Agentic AI represents the next wave of transformation. It’s not just generating output; it’s taking action. And while the business potential is huge, so is the responsibility to deploy it safely.

The attackers of 2025 are not just writing better phishing emails. They’re weaponizing automation, scaling social engineering, and skipping the learning curve. Security teams need to respond with visibility, control, and collaboration. Because when everyone has access to the same technology, it’s those who use it responsibly and defensively that come out ahead.

The time to prepare is now. Agentic AI is moving quickly…and it’s not waiting for security to catch up.

(1) Forrester (2025) With Agentic AI, Generative AI Is Evolving From Words to Actions. [online] Available at: https://reprint.forrester.com/reports/with-agentic-ai-generative-ai-is-evolving-from-words-to-actions-9c6cf2d9/index.html

India’s cyber leaders prepare for AI-driven threats

2025-06-06 Rapid7

Post Syndicated from Rapid7 original https://blog.rapid7.com/2025/06/06/indias-cyber-leaders-prepare-for-ai-driven-threats/

India's cyber leaders prepare for AI-driven threats

As India’s economy rapidly digitizes, cybersecurity challenges are becoming increasingly complex. This May, Rapid7 launched our inaugural Global Security Day series across India, bringing together top security leaders in Mumbai, Delhi, and Bengaluru to address the most pressing cyber threats facing organizations in 2025.

Key insights that emerged

Across all three cities, several critical themes emerged that are shaping India’s cybersecurity landscape:

AI is No Longer Optional: Organizations recognize that AI has become essential for threat detection, exposure management, and SOC operations. The question is no longer whether to adopt AI, but how to implement it effectively.

Attack Surface Explosion: Cloud misconfigurations, insecure APIs, and identity misuse are driving today’s biggest risks. Organizations are struggling to maintain visibility and control across increasingly complex environments.

SOC Modernization is Urgent: Traditional Security Operations Centers need fundamental transformation, with automation and AI at their core to handle the volume of modern threats.

Talent Gap Challenges: Upskilling and reskilling initiatives are critical to closing the cybersecurity talent gap that’s affecting organizations globally, but particularly acutely in India’s booming tech sector.

Regulatory Evolution: India’s evolving cybersecurity regulatory landscape is shaping how organizations approach their security investments and strategy development.

A journey across India’s cyber capital cities

Our three-city roadshow, organized in collaboration with Information Security Media Group (ISMG), focused on the theme “2025 Cyber Threat Predictions: AI-Driven Attacks, Ransomware Evolution, and Expanding Attack Surface.” The response from India’s cybersecurity community was overwhelming, with 138 security leaders and delegates participating across all three cities.

Launching with impact in Mumbai (May 8)

Our Mumbai kickoff set the tone for the entire series, drawing 43 security leaders eager to dive into critical cybersecurity challenges. Rob Dooley, General Manager APJ, welcomed attendees before Regional CTO Robin Long delivered comprehensive insights on:

Global and Asia-Pacific threat landscape trends
The evolution of ransomware from double extortion to hybrid attacks
Expanding attack surfaces driven by cloud misconfigurations and insecure APIs
Next-generation defense strategies leveraging AI and continuous threat exposure management (CTEM)

The highlight was our fireside chat featuring Starlin Ponpandy, CISO of Orion Systems and Rapid7 customer, discussing ‘Building a New-Age SOC: Practical Applications of AI’. The conversation explored choosing the right SOC model, building effective teams, and navigating the complexities of AI trust and explainability.

The main focus of the Q&A was the evolving cyber threat landscape and how organizations can prepare for 2025’s AI-driven, increasingly complex attack environment.

The conversation was dominated by leaders sharing insights on the rise of AI-powered threats, the shift in ransomware tactics to double and hybrid extortion and the urgent need for proactive threat exposure management. Rapid7’s emphasis on real-time, AI-enabled defenses and automated risk management strategies sparked strong engagement.

Strategic dialogue in Delhi (May 13)

Our Delhi event brought together 43 delegates for candid, strategic discussions about 2025’s top cyber threats. Security leaders engaged in deep conversations about AI-powered detection and defense, proactive exposure management, and building resilient SOCs with automation.

The panel discussion on ‘Building a New-Age SOC’ addressed critical challenges including the cybersecurity talent gap and integrating security into DevOps workflows, a thought-provoking conversation examining identity-centric security models and the shift from traditional SOCs to Managed Detection and Response solutions.

Attendees posed incisive questions about upskilling teams in an AI-driven environment, managing tool sprawl, and operationalizing security by design – highlighting the sophisticated thinking of India’s cybersecurity leadership.

Tactical discussions in India’s Silicon Valley – Bengaluru (May 15)

Our Bengaluru finale drew the largest crowd with 52 delegates, including CISOs and cybersecurity executives from across South India. The discussions were highly tactical, focusing on:

Modernizing SOCs through AI-led threat detection
Countering double and triple extortion ransomware
Risk automation and secure cloud transformation

Veteran industry speaker Satish Kumar Dwibhashi joined Robin Long for discussions that reinforced a clear theme: security strategy must evolve in lockstep with attacker innovation.

Building for the future

The success of our India Security Days reflects not just the hunger for cybersecurity knowledge in the region, but also Rapid7’s commitment to supporting India’s digital transformation journey. We’re excited to announce that we’re expanding our presence with aGlobal Capability Center (GCC) in Pune, which will serve as a hub for innovation and home to teams across engineering, business support, and our Security Operations Center (SOC).

This initiative represents more than just business expansion – it’s about building cybersecurity capability and expertise right here in India, that will shape a secure digital future for organizations around the world.

The road ahead

The conversations, connections, and insights from our India Security Days have reinforced our belief that India’s cybersecurity community is among the most forward-thinking globally. The challenges are significant – from AI-powered attacks to evolving ransomware tactics – but so is the talent, innovation, and determination to address them.

As we look toward 2025 and beyond, events like these remind us that cybersecurity is ultimately about people: the security leaders making tough decisions, the practitioners implementing defenses, and the communities sharing knowledge and supporting each other.

Thank you to all the security leaders who joined us in Mumbai, Delhi, and Bengaluru. Your engagement, questions, and insights made these events truly impactful. We look forward to continuing these conversations and supporting India’s cybersecurity community as we navigate the challenges and opportunities ahead.

Interested in joining our growing team in India? Learn more about career opportunities at our new GCC in Pune.

Many voices, one community: Three themes from RSA Conference 2025

2025-06-05 Anne Grahn

Post Syndicated from Anne Grahn original https://aws.amazon.com/blogs/security/many-voices-one-community-three-themes-from-rsa-conference-2025/

RSA Conference (RSAC) 2025 drew 730 speakers, 650 exhibitors, and 44,000 attendees from across the globe to the Moscone Center in San Francisco, California from April 28 through May 1.

The keynote lineup was eclectic, with 37 presentations featuring speakers ranging from NBA Hall of Famer Earvin “Magic” Johnson to public and private-sector luminaries such as former US National Cyber Director Chris Inglis, U.S. Secretary of Homeland Security Kristi Noem, and cryptography experts Tal Rabin, Whitfield Diffie, and Adi Shamir.

Topics aligned with this year’s conference theme, “Many Voices. One Community,” and focused on the security industry’s shared drive to foresee risks, counter threats, and embrace new challenges.

Three themes caught our attention: agentic AI, cryptography, and public-private collaboration.

Agentic AI

The potential of agentic AI to augment human decision-making was a common thread among conversations at the conference. Numerous sessions touched on the topic, and the desire of attendees to understand the technology and learn how to balance its risks and opportunities was clear.

Separating hype from reality

An AI agent is a software program that can interact with its environment (as detailed in Figure 1), collect data, and use the data to perform self-determined tasks to meet predetermined goals.

Figure 1: Generative AI agents

Agentic systems offer a fundamentally different approach compared to traditional software, particularly in their ability to handle complex, dynamic, and domain-specific challenges. While traditional systems rely on rule-based automation and structured data, agentic systems use large language models (LLMs)—a subset of generative AI—to operate autonomously. Agents can learn from interactions with users, and make nuanced, context-aware decisions while keeping human analysts in the loop.

Numerous RSAC speakers alluded to AI agents as the next frontier in enterprise transformation. Gartner® predicts that: “By 2028, 33% of enterprise software applications will include agentic AI, up from less than 1% in 2024,” and “at least 15% of day-to-day work decisions will be made autonomously through agentic AI, up from zero percent in 2024.”

However, as organizations build AI agents, understanding the concerns that come with them is critical.

“Agentic AI presents tremendous opportunities to deliver business value and innovative security outcomes. Production deployments require a balance between its capabilities, and robust security and trust mechanisms.”
—Hart Rossman, Global Services Security Vice President at AWS

In the RSAC keynote session The Five Most Dangerous New Attack Techniques…and What to Do for Each, Rob Lee, Chief of Research and Head of Faculty at SANS Institute noted that while security teams are embracing AI to amplify productivity, threat actors are doing the same. He pointed to MIT research that shows adversarial agent systems executing attack sequences are 47 times faster than human operators, with a 93 percent success rate in privilege escalation paths.

Safeguarding GenAI & Agentic Apps, Top 10 Risks in 2025, a half-day Open Worldwide Application Security Project (OWASP) event, focused on helping attendees distinguish real threats from hype. OWASP Gen AI Security Project team members and industry experts reviewed the 2025 OWASP Top 10 List for LLM and GenAI (shown in Figure 2), and introduced Agentic AI—Threats and Mitigations—the first in a series of guides from the OWASP Agentic Security Initiative (ASI) to provide a threat-model-based reference of emerging agentic threats and mitigations. Content feedback can be submitted to ASI in advance of the guide’s next release.

Figure 2: 2025 OWASP Top 10 for LLM Applications

Agentic AI wins Cybersecurity Startup Accelerator

The second annual AWS and CrowdStrike Cybersecurity Startup Accelerator, in collaboration with the NVIDIA Inception program, took place during RSAC. A panel of judges—including George Kurtz, Founder and CEO of CrowdStrike, CJ Moses, Chief Information Security Officer at Amazon, and David Reber Jr., Chief Security Officer at NVIDIA—evaluated startups on innovation, market relevance, and go-to-market potential. Terra Security, a provider of agentic AI-powered, continuous web application penetration testing, was selected from a group of 10 finalists who pitched live. Two runners-up, Kenzo Security and Rig Security, were also recognized for their standout approaches to agentic AI-driven security.

Addressing AI risks

The need to consider your security posture when assessing overall AI readiness was emphasized throughout the conference. A defense-in-depth architecture can help mitigate risks with multiple layers of protection across both traditional and AI software components. Innovative solutions such as AI red teaming, AI behavioral sandboxing, and advanced tracing and evaluation of generative AI agents can enhance your security strategy with a proactive approach to securing AI.

Visit the following resources to help design, build, and operate AI systems: DevsecOps Revolution: Unleashing Generative AI for Automated Excellence, AWS generative AI security, responsible AI, and the Amazon AGI Labs Blog.

Cryptography

Encryption was another key topic. The FIDO Alliance hosted a half-day seminar that focused on developments in the global movement to passwordless technology such as passkeys—cryptographic keys designed to replace passwords by combining the power of public key cryptography with biometric authentication.

In Dude, Where’s My Password? The Challenges of Getting to Passwordless, Andy Ozment, Chief Technology Risk Officer and Executive Vice President at Capital One noted that 88 percent of data compromised in basic web application attacks reported in 2024 involved stolen credentials. Ozment pointed out that “going passwordless” through a combination of X.509 device certificates and FIDO2 passkeys presented Capital One with an opportunity to nearly eliminate entire classes of threats (as detailed in Figure 3), while increasing the quality of user experience.

Figure 3: Using passkeys to reduce risk while advancing user experience

Along the way, Ozment said, Capital One’s journey to passwordless was enabled by its transition from on-premises technology to going “all-in” on the public cloud. Watch the recording of his session or view the slides to learn more.

Post-quantum encryption

The state of post-quantum encryption was detailed in the popular Cryptographer’s Panel, moderated by Tal Rabin, Senior Principal Applied Scientist at AWS.

Panelist Vinod Vaikuntanathan, Professor at MIT underscored the impact of the quantum-resistant algorithm standardization process (Figure 4) started by the National Institute of Standards and Technology (NIST) in 2016. “We now have two public key encryption algorithms, and three new digital signature algorithms that are standardized,” he pointed out.

Figure 4: Post-quantum encryption algorithms

The panelists agreed that even though quantum computers aren’t here yet, the time to deploy these algorithms is now. NIST recommends phasing out existing encryption methods by 2030 in its Transition to Post-Quantum Cryptography Standards report. However, Vaikuntanathan and Adi Shamir, the “s” in the Rivest–Shamir–Adleman (RSA) public-key cryptosystem, advise organizations to take a hybrid approach that combines classic encryption algorithms such as RSA or Elliptic-curve Diffie–Hellman (ECDH) with post-quantum algorithms such as Module-Lattice-based Key Encapsulation Mechanism (ML-KEM). This approach, which is used by AWS and recommended by The European Commission, offers protection against both current and future threats.

RSAC Award for Excellence in the Field of Mathematics

Dr. Shai Halevi, Senior Principal Applied Scientist at AWS, was presented with the Award for Excellence in the Field of Mathematics for remarkable contributions to many areas of cryptography, including fundamental theory, advanced cryptographic primitives, secure multi-party computations, homomorphic encryption, and cryptographic code obfuscation.

Figure 5: Dr. Shai Halevi receives RSAC Award for Excellence in the Field of Mathematics

End-to-end encryption

Concerns about the recent US government group chat leak were also raised during the discussion. Public-key cryptography pioneer Whitfield Diffie noted that the use of an encrypted consumer messaging app to communicate classified information broke archiving laws. Because some commercial tools use 256-bit Advanced Encryption Standard (AES) encryption, which is “good enough” to protect communications, he predicted an increase in the use of consumer applications to protect sensitive information in unapproved ways.

The Cybersecurity and Infrastructure Security Agency (CISA) and the Federal Bureau of Investigation (FBI) recently advised individuals and organizations to start using encrypted messaging apps. However, as the role of these applications in business communication expands, it’s important not to lose sight of recordkeeping and compliance obligations. Organizations should consider solutions that offer administrative controls and data retention capabilities along with encryption.

AWS Wickr, for example, is a messaging and collaboration service that protects messaging, calling, file sharing, screen sharing, and location sharing with 256-bit end-to-end encryption. The data retention and administrative controls that it provides help customers meet regulatory requirements and manage user and device data remotely.

Wickr is Department of Defense Cloud Computing Security Requirements Guide Impact Level 5 (DoD CC SRG IL5) and Federal Risk and Authorization Management Program (FedRAMP) High authorized in the AWS GovCloud (US-West) Region. It also meets compliance programs and standards such as Health Insurance Portability and Accountability Act (HIPAA) eligibility, International Organization for Standardization (ISO) 27001, and System and Organization Controls (SOC) 1, 2, and 3.

Visit the AWS News Blog and the AWS Security Blog to learn about AWS passkey multi-factor authentication, how AWS is migrating to post quantum cryptography (PQC), and how we can help you implement a layered encryption strategy for your organization.

Public-private collaboration

Numerous sessions underlined the importance of collaboration to strengthening security. In his keynote, Johnson called attention to a lesson he learned on the basketball court—his peers made him stronger. “Larry Bird made me a better basketball player,” he said, relating his experience to the need for security teams to assist and learn from each other.

In Making America Safe Again Through Cyber Defense, Kristi Noem, U.S. Secretary of Homeland Security equated cybersecurity with national security, and insisted that building on public-private partnerships is “incredibly important.” “Our goal,” she said, “is to use our maximum effect of cooperation to make sure that we’re going after bad actors.”

After assuring attendees that CISA will continue to be America’s cyber defense agency, she urged congress to reauthorize the Cybersecurity Information Sharing Act of 2015. The law, which is set to expire in September, incentivizes businesses to share threat indicators with the Department of Homeland Security (DHS) and helps make sure that both the federal government and companies can take collaborative steps to address threats.

Panelists at an offsite threat intelligence discussion reiterated the ability of private industry to supplement government security capabilities. Adam Meyers, Senior VP, Counter Adversary Operations at CrowdStrike pointed out that technology companies often have more data and signals than governments. The CrowdStrike Falcon solution, he said, processes over 6 trillion events per day, and 55 million events per second at peak. This volume facilitates the detection of threat patterns that might otherwise go unnoticed.

Similarly, Moses noted that the size and scale of AWS infrastructure gives us unique visibility into internet traffic. Our global network of sensors and associated disruption tools observe over 700 million threat interactions every day, out of which 450 million can be classified as malicious. Internal threat intelligence tools such as MadPot, our sophisticated global honeypot system, produce high-fidelity findings (pieces of relevant information) that can be used to drive proactive intelligence sharing, and reduce investigative workloads.

“We’ll work together in order to be able to put a bow on a case and hand it to the FBI and DOJ, such that they don’t have to expend a great amount of resources in order to go forward and try to figure things out that we already know.” —CJ Moses, Chief Information Security Officer and VP of Security Engineering at Amazon

An example of this is the disruption of the cybercriminal group known as Anonymous Sudan. The group was responsible for tens of thousands of distributed denial-of-service (DDoS) attacks against critical infrastructure, corporate networks, and government agencies. With the help of tools like MadPot, AWS experts were able to identify the hosting provider infrastructure that the group used to launch the DDos attacks, and work with providers to disrupt them. Akamai SIRT, Cloudflare, CrowdStrike, DigitalOcean, Flashpoint, Google, Microsoft, PayPal, SpyCloud, and other private sector entities also assisted law enforcement, leading to the indictment of two Anonymous Sudan leaders.

The value of combined perspectives

RSA Conference 2025 might be over, but the learning continues. Additional highlights that include the west stage keynotes, the Innovation Sandbox, and dozens of insightful sessions on topics such as the changing role of the CISO, women in cyber, and of course—cloud security—are available on demand.

If there’s one key takeaway, it’s a collective sense of transition. As we explore the benefits and risks of emerging AI technologies, encryption strategies, and information sharing, it’s important to remember that we cannot effectively combat threats in isolation. Security is a collective endeavor; only by working together can we adapt to evolving challenges and build cyber resilience.

For more information about cloud security, register to join AWS, Google Cloud, and Microsoft online at the SANS 2025 Cloud Security Exchange on August 21.

Introducing AI Attack Coverage in Exposure Command: Secure what traditional AppSec Tools miss

2025-06-03 Michael Chroney

Post Syndicated from Michael Chroney original https://blog.rapid7.com/2025/06/03/introducing-ai-attack-coverage-in-exposure-command-secure-what-traditional-appsec-tools-miss/

Introducing AI Attack Coverage in Exposure Command: Secure what traditional AppSec Tools miss

The rise of GenAI-powered applications – from internal copilots to customer-facing chatbots – is changing how businesses operate. While these tools drive innovation, they also introduce a fast moving, often invisible layer of risk.

Most traditional AppSec tools were never built to handle the unique threats of conversational AI interfaces. As attackers get savvier, security teams need the right kind of coverage.

That’s why we’re excited to introduce AI Attack Coverage in Exposure Command and InsightAppSec.

This release brings purpose built protection for AI driven applications into your existing AppSec workflows, so you can uncover vulnerabilities that legacy tools miss – and stop AI specific threats before they become business problems.

A new class of risk requires a new kind of coverage

As organizations embrace GenAI, they’re also expanding their attack surface – often without realizing it. LLMs (large language models) and AI integrations create new opportunities for attackers to exploit vulnerabilities like:

Prompt injection: Tricking the model into revealing sensitive data or bypassing security controls.
Plugin abuse: Misusing connected tools through AI interfaces.
Data leakage: Inadvertent exposure of sensitive information in responses.

The problem? These aren’t issues most scanners can detect, and manual reviews don’t scale. AI Attack Coverage addresses this gap head-on with capabilities designed to tackle the evolving threat landscape.

Built to secure what matters most

AI Attack Coverage in Exposure Command introduces a suite of enhancements that work seamlessly within your existing DevSecOps pipelines:

Smarter scanning for smarter apps: Our enhanced R7Crawler interacts with LLMs and chatbots in real-world ways – uncovering vulnerabilities traditional scanners can’t see.
Purpose built LLM testing: With 6 new attack modules, comprising 25+ new attack techniques, that will target six of the OWASP Top 10 for LLMs, we help you find prompt injection, improper output handling, and more.
AI aware validation: Reduce false positives with intelligent validation powered by AWS Nova Pro, so teams can focus on what’s real and actionable.
Developer first remediation: Features like Attack Replay and CI/CD integrations help teams fix faster – without slowing down releases.

Introducing AI Attack Coverage in Exposure Command: Secure what traditional AppSec Tools miss

Complete visibility, from code to cloud

Exposure Command doesn’t stop at the app layer. With integrated telemetry from InsightCloudSec, you also get:

Full-stack visibility into where GenAI services live across your environment.
Automated enforcement of security best practices for AI/ML environments.
Unified context to prioritize what’s truly risky in your hybrid estate.

Get started with AI Attack Coverage

If you’re building with AI – or thinking about it – now’s the time to make sure your security strategy keeps up. AI Attack Coverage gives your team the visibility, context, and control to manage risk in a world where apps are getting smarter, and attackers are more adept at exploiting them.

Whether you’re an AppSec engineer, a risk leader, or a CISO trying to future-proof your security posture, Exposure Command brings it all together.

Learn More About Rapid7’s Exposure Command

Confidently identify and prioritize exposures from endpoint to cloud with full attack surface visibility and threat-aware risk context.

Analyze media content using AWS AI services

2025-06-02 Jack Bradham

Post Syndicated from Jack Bradham original https://aws.amazon.com/blogs/architecture/analyze-media-content-using-aws-ai-services/

Organizations managing large audio and video archives face significant challenges in extracting value from their media content. Consider a radio network with thousands of broadcast hours across multiple stations and the challenges they face to efficiently verify ad placements, identify interview segments, and analyze programming patterns.

In this post, we demonstrate how you can automatically transform unstructured media files into searchable, analyzable content. By combining Amazon Transcribe, Amazon Bedrock, Amazon QuickSight, and Amazon Q, organizations can achieve the following:

Process and transcribe media files upon upload
Identify commercials, interviews, and program segments
Extract insights using foundation models (FMs)
Create a searchable knowledge base
Generate rich visualizations for decision-making
Enable natural language queries across their media archive
Visualize complex information with intuitive graphics

In the following sections, we explore how these AWS services work together to help organizations unlock the full potential of their media content, whether for advertising compliance, content analysis, or discovering specific segments within thousands of hours of recordings.

Solution overview

This solution provides an event-driven media analysis pipeline that transforms how you manage and extract value from your content:

Streamline content management – Automatically process media files the moment they’re uploaded, saving time and reducing manual work
Unlock deeper insights – Generate accurate transcriptions that capture not just words, but the full context of your content—including speakers, timing, and key moments
Harness AI – Automatically extract meaningful insights and uncover hidden patterns in your media without extensive manual review
Build a searchable knowledge base – Turn scattered media files into a discoverable catalog that your entire team can use
Build a customizable interface – Create a customizable UI to search the catalog
Create powerful visualizations – Bring your insights to life with intuitive visualizations that make complex information immediately understandable

The following diagram illustrates our architecture.

Media Analysis Architecture

This event-driven architecture automatically processes and analyzes multimedia content using AWS services. The workflow consists of the following steps:

A user uploads media files to an Amazon Simple Storage Service (Amazon S3) bucket. A “New Media” event triggers the first AWS Step Functions workflow. This workflow handles the initial cataloging based on values in the file name and launches the transcription process.
Amazon Transcribe converts the audio into accurate, readable text. The transcribed content is securely saved to an S3 bucket for further analysis. A “Transcription Complete” event triggers the next step.
A second Step Functions workflow processes the transcription. Using predefined prompts, Amazon Bedrock analyzes the transcripts to extract meaningful information. Key insights extracted from the transcript are stored in an S3 data lake.
The processed results are organized systematically, structured by date (year/month/day) and tagged with relevant attributes. This organized data enables natural language queries through Amazon Q when used as a knowledge base, interactive visualizations using QuickSight, and straightforward content discovery and analysis.
Amazon Athena serves as the data exploration tool to query the data lake. Athena is used as the data source in QuickSight, which turns complex data into clear, compelling visuals.

This architecture automatically transforms raw media content into searchable, analyzable data while maintaining an organized hierarchy for efficient access and analysis. The event-driven design provides automatic processing of new content as it arrives, and the combination of AWS AI services enables deep content understanding and insight extraction. Each AWS service plays a crucial role in transforming your media content:

Amazon Bedrock – Reviews content after transcription for insights and entity extraction:
- Uses advanced FMs to analyze transcripts
- Identifies commercials, interviews, and program segments
- Extracts meaningful insights from content
Amazon EventBridge – Triggers actions in the cataloging workflow:
- Monitors for new media files and completed transcriptions
- Automatically triggers Step Functions workflows
AWS Lambda – Handles custom code actions needed in the workflow:
- Facilitates interaction with Amazon Bedrock
- Executes custom prompts on transcripts
- Enables flexible, scalable processing
Amazon Q – Serves as the frontend and Retrieval Augmented Generation (RAG) engine:
- Addresses enterprise generative AI needs by providing a turnkey solution with built-in security features like single sign-on (SSO) integration and responsible AI governance policies
- Allows businesses to quickly deploy AI assistance while maintaining compliance, data privacy, and security standards
- Enables natural language queries across the media archive
- Links results to the source media files
- Provides conversational access to content
Amazon QuickSight – Turns insights in beautiful visualization for better consumption:
- Creates interactive dashboards and visualizations
- Displays comprehensive media analytics
- Helps track advertising, programming, and content patterns
Amazon S3 – Stores assets and the catalog:
- Securely stores raw media files, transcripts, and processed data
- Automatically triggers processing when new content is uploaded
AWS Step Functions – Orchestrates the entire content processing workflow:
- Manages transcription and AI analysis steps
- Provides robust error handling and automatic retries
Amazon Transcribe – Converts speech to accurate, readable text:
- Identifies speakers and timestamps
- Provides accurate transcriptions of audio content

Security considerations

Although this post focuses on the technical implementation of media content analysis, it’s important to acknowledge that production deployments should include comprehensive security measures:

Data storage security (Amazon S3):
- Enable server-side encryption using AWS Key Management Service (AWS KMS) keys
- Apply bucket policies restricting access to authorized principals only
- Enable Amazon S3 Block Public Access at account and bucket levels
- Enable versioning for data recovery
- Implement lifecycle policies for data retention
- Enable S3 access logging
- Use presigned URLs for temporary access
Identity and Access Management (IAM):
- Create dedicated service roles with minimum required permissions for:
  - Step Functions execution
  - Amazon Transcribe jobs
  - Amazon Bedrock API calls
  - Athena queries
- Implement role-based access control
- Regularly rotate credentials
- Enable multi-factor authentication (MFA) for all users
- Use AWS Organizations for multi-account management
Network security:
- Deploy virtual private cloud (VPC) endpoints for:
  - Amazon S3
  - Athena
  - QuickSight
- Implement network access control lists (ACLs) and security groups
- Enable VPC Flow Logs
- Use AWS PrivateLink where applicable
- Configure route tables to control traffic flow
Data encryption:
- Implement AWS KMS encryption for S3 objects
- Use TLS 1.2+ for all API communications
- Enable automatic key rotation in AWS KMS
- Implement envelope encryption for sensitive data
Monitoring and detection:
- Enable AWS CloudTrail for API activity logging
- Configure Amazon GuardDuty for threat detection
- Set up Amazon CloudWatch:
  - Metrics for service health
  - Alarms for security events
  - Log groups for application logs
- Enable S3 server access logging
- Configure VPC Flow Logs
Access controls:
- Implement fine-grained access controls for:
  - Amazon Bedrock model access
  - Athena query permissions
  - QuickSight dashboard sharing
- Conduct regular access reviews

Additionally, compliance requirements and data governance policies might impact how you implement this solution in your environment.

These security considerations are crucial but beyond the scope of this post. We recommend consulting AWS security best practices and working with your security team to implement appropriate measures for your specific use case. For more information on AWS security best practices, refer to Best Practices for Security, Identity, & Compliance.

The following sections walk you through setting up each component of the architecture to help you transform raw media into actionable insights.

Prerequisites

The following are the prerequisites to follow along this post:

An AWS account with an IAM user who has permissions to Lambda, Amazon Bedrock, Amazon S3, Amazon Transcribe, Step Functions, and IAM.
Access to the FMs that you are planning to use in Amazon Bedrock. For instructions, see Access Amazon Bedrock foundation models.
A QuickSight enterprise account. For instructions, see Setting up for Amazon QuickSight.

Create S3 buckets

For this solution, we create three distinct buckets to support the media analytics workflow:

Raw media bucket for incoming files
Transcription outputs bucket
Processed insights bucket

For instructions on creating buckets, refer to Creating a general purpose bucket.

Configure EventBridge

You can enable event notifications on the raw media bucket to trigger your automated workflow through EventBridge. Establish your automation backbone by monitoring S3 bucket activities. When new media arrives or transcription completes, EventBridge will trigger the appropriate workflow, providing continuous processing. For further instructions, refer to Creating rules that react to events in Amazon EventBridge.

The following are two example triggers that can be used to filter events and trigger Step Functions workflows. The following is an example filter for new media files:

{
  "source": ["aws.s3"],
  "detail-type": ["Object Created"],
  "detail": {
    "bucket": {
      "name": ["rawinputbucket"]
    }
  }
}

The following is an example filter for new transcripts added to the data lake:

{
  "source": ["aws.s3"],
  "detail-type": ["Object Created"],
  "detail": {
    "bucket": {
      "name": ["business-data-lake"]
    },
    "object": {
      "key": [{
        "suffix": ".transcription"
      }]
    }
  }
}

Create Step Functions workflows

We design the orchestration layer with two key workflows. The first handles media intake and transcription, and the second manages AI analysis. Each workflow includes safeguards for potential failures and retry mechanisms. For further instructions, refer to Learn how to get started with Step Functions.

The following diagram shows an example of processing new media uploads for indexing and transcription.

Media Analysis - Step Function Example

The following diagram shows an example of the Step Functions workflow that analyzes the transcription.

Set up Amazon Transcribe

To create an Amazon Transcribe job, you need permissions to do so. You can implement a speech-to-text conversion with powerful features like language detection, speaker identification, and custom vocabulary support to provide accurate transcription of your media content. For further instructions, refer to How Amazon Transcribe works.

Configure Amazon Bedrock

You can power your AI analysis engine by setting up precise prompts that extract meaningful insights. Amazon Bedrock processes transcripts to identify key segments, speakers, and topics, transforming raw text into structured data. For instructions, refer to Design a prompt.

The following is a sample prompt:

You will be reviewing a radio transcription to identify advertisements and extract relevant details. Your task is to analyze the provided transcript and output the results in a specific JSON format based on a given schema.
Please follow these steps to complete the task:
1. Carefully read through the entire transcript.
2. Identify all advertisements within the transcript. Look for clear indicators such as product mentions, promotional language, or transitions from regular content to commercial content.
3. For each advertisement you identify, determine the following information:
    - Company: The name of the company being advertised
    - Start time: The timestamp in the transcript where the ad begins
    - End time: The timestamp in the transcript where the ad ends
    - Product: The specific product or service being advertised
4. Format your findings into a JSON object that follows the provided schema. Each advertisement should be a separate object within an array.
5. Ensure these fields in your response are provided for each advertisement.. All are required fields: company, starttime, endtime, product. 
6. Use precise timestamps for start and end times. If exact times are not available, make a best estimate based on the transcript's context.
7. If a particular field is unclear or not explicitly mentioned in the transcript, you may use "Unknown" as the value.
8. Only respond with json and nothing else. Do not provide comments or explain your answer. 
9. Surround the JSON response with standard ```json markers
Here's an example of how your output should be formatted:
{
"advertisements": [
        {
            "company": "TechGadgets Inc.",
            "starttime": "00:05:30",
            "endtime": "00:06:15",
            "product": "SmartHome Hub"
        },
        {
            "company": "FreshFoods Market",
            "starttime": "00:15:45",
            "endtime": "00:16:30",
            "product": "Organic Produce Delivery Service"
        }
    ]
}
Do not add any fields that are not specified in the schema, and ensure all required fields are present for each advertisement.

Create a structured data lake

We create a hierarchical data organization strategy that enables efficient access and analysis. You can use AWS Glue crawlers to automatically discover and catalog your media metadata. For instructions, refer to Using crawlers to populate the Data Catalog. Configure Athena tables to enable SQL-based querying of your media insights:

CREATE OR REPLACE VIEW "commercials_view" AS 
SELECT
  metadata.market market
, metadata.station_call station_call
, metadata.format_type format_type
, CAST(metadata.timestamp AS timestamp) timestamp
, ads.company adCompany
, ads.product adProduct
, ads.starttime
, ads.endtime
FROM
  (commercials
CROSS JOIN UNNEST(advertisements) t (ads))

Set Up Amazon Q

You can enable natural language interaction with your media archive using Amazon Q Business. Configure the knowledge base and metadata to make your content searchable and accessible through conversational queries. Use the processed insights S3 buckets to configure the knowledge base. For instructions, refer to Getting started with Amazon Q Business.

The following screenshot shows example conversations with an AI assistant.

Build QuickSight dashboards

With QuickSight, you can create visual analytics that bring your data to life. Connect to Athena views to display advertising patterns, content analysis, and performance metrics in interactive dashboards. For more information, refer to Tutorial: Create an Amazon QuickSight dashboard.

The following screenshots are a few examples of dashboards created for a fictitious radio station as part of our use case.

Validate and optimize your media analytics solution

After you implement your media analytics architecture, follow these critical steps to achieve robust performance and alignment with your organization’s needs. First, configure a comprehensive testing approach. Imagine you’re preparing to launch your media analysis solution. Your testing journey begins with accuracy validation:

Compare transcription outputs against original media
Verify AI-generated insights for precision
Use representative sample sets from your content library

You start by taking a recently processed radio show and comparing its transcription against the original broadcast. Your team meticulously reviews the AI-generated insights, checking if key moments like ad transitions or interview segments are correctly identified. To make sure your system works across all content types, you select diverse samples from your library—perhaps a morning talk show, an evening news segment, and a weekend sports broadcast. Next, you delve into performance benchmarking:

Measure processing time for different media types
Evaluate resource utilization across AWS services
Identify potential bottlenecks in the workflow

Time how long it takes to process different types of media files, from short commercial segments to lengthy program recordings. As you watch how your AWS services respond under various loads, you can monitor resource consumption patterns. This helps you identify processing bottlenecks—for instance, you might discover that certain file types take longer to transcribe or that concurrent processing needs optimization. Finally, you put yourself in your users’ shoes for a thorough user experience assessment:

Test natural language queries with Amazon Q
Validate search result relevance
Gather feedback from potential end-users

Team members can interact with Amazon Q, asking questions they would naturally pose when searching for specific content. For example, you can test whether searching for “interviews about climate change last week” returns relevant results. Gathering feedback from potential users—perhaps a content manager with different needs than a compliance officer—provides invaluable insights. Their real-world experiences guide your refinements and make sure the system serves its intended purpose. This comprehensive testing approach, combining structured evaluation with real-world scenarios, sets the stage for a robust and user-friendly media analysis solution. As your media analysis solution moves from initial deployment to production, optimizing its performance becomes crucial for both cost-efficiency and user satisfaction. A radio network processing thousands of hours of content weekly might find that even small improvements in transcription accuracy or processing speed can lead to significant cost savings and better content discoverability. Similarly, a marketing team analyzing ad placements across multiple stations needs precise insights to make data-driven decisions about advertising effectiveness. With these business imperatives in mind, consider the following configuration optimization strategies:

Transcription refinement:
- Adjust language models for domain-specific terminology
- Fine-tune speaker identification settings
- Implement custom vocabularies for improved accuracy
AI insight generation:
- Refine prompts for more targeted analysis
- Experiment with different AI models
- Align extraction parameters with business objectives
Scalability considerations:
- Test workflow performance with increasing media volumes
- Implement appropriate auto scaling configurations
- Monitor cost-effectiveness of your architecture
Continuous improvement:
- Establish regular review cycles
- Track key performance metrics
- Iterate on your solution based on real-world usage

We recommend starting with a pilot implementation and gradually expanding your media analytics capabilities.

Clean up

To avoid incurring ongoing charges, clean up the resources you created as part of this solution:

Delete QuickSight resources:
1. Delete dashboards created for media analytics.
2. Delete the datasets connected to Athena.
3. If no longer needed, delete the QuickSight Enterprise subscription.
Delete S3 buckets:
1. Empty and delete the raw media bucket, transcription outputs bucket, and processed insights bucket.
Remove EventBridge rules:
1. Delete the rules created for monitoring S3 bucket activities.
2. Remove targets associated with these rules.
Delete Step Functions workflows:
1. Delete the media intake and transcription workflow.
2. Delete the AI analysis workflow.
Remove Lambda functions:
1. Delete Lambda functions created for interaction with Amazon Bedrock.
2. Remove associated IAM roles and policies.
Clean up data lake components:
1. Delete Athena views and tables.
2. Remove AWS Glue crawlers and databases.
3. Delete stored query results.
Remove Amazon Q configurations:
1. Delete knowledge bases created.
2. Remove custom configurations.
Remove Amazon Bedrock settings:
1. Remove custom prompts.
2. Disable access to FMs if no longer needed.
Delete Amazon Transcribe settings:
1. Remove custom vocabularies.
2. Delete stored transcription jobs.
Remove IAM resources:
1. Delete custom IAM roles created for this solution.
2. Remove associated IAM policies.
Complete additional cleanup:
1. Delete CloudWatch Logs groups associated with Lambda functions.
2. Remove CloudWatch alarms or metrics created for monitoring.
3. Delete saved queries in Athena.

Common use cases

Organizations in different sectors can use this architecture to unlock value from their audio and video content. You can adapt this solution to meet your specific needs, such as managing broadcast media, corporate communications, educational materials, and more. Let’s explore how different industries might apply this technology:

Media and broadcasting:
- Track advertising compliance
- Verify media placement accuracy
- Analyze broadcast content at scale
Corporate and enterprise:
- Convert meeting recordings into searchable knowledge bases
- Identify key decisions and action items
- Enhance organizational knowledge management
Education and training:
- Create comprehensive, topic-based course catalogs
- Index training materials for quick retrieval
- Support continuous learning initiatives
Legal services:
- Generate precise, timestamped transcripts
- Develop searchable legal proceeding archives
- Improve document review efficiency
Healthcare:
- Extract critical medical insights from consultations
- Categorize patient interaction data
- Support clinical documentation processes
Government and public sector:
- Build comprehensive public meeting archives
- Implement automated topic categorization
- Enhance transparency and accessibility
Customer service:
- Analyze call recordings for quality improvement
- Identify service trends and customer pain points
- Drive continuous customer experience enhancement

This media analytics architecture demonstrates notable versatility. By using AI, organizations can transform raw audio and video content into structured, meaningful insights that drive decision-making across industries.

Conclusion

In this post, we demonstrated how to use AWS services to convert unstructured media content into actionable intelligence. By combining Amazon Transcribe, Amazon Bedrock, QuickSight, and Amazon Q, you can create a scalable, automated solution for media analysis that adapts to your organizational needs.

This solution offers the following key architectural advantages:

Automated media file processing at scale
AI-powered insight generation
Natural language search capabilities
Interactive decision-making visualizations
Flexible, maintainable infrastructure

Organizations can now convert content into searchable knowledge, extract insights automatically, develop data-driven content strategies, and enhance operational efficiency through automation.

As audio and video content generation continues to accelerate, the ability to efficiently process and extract value becomes increasingly critical. This architecture provides a robust foundation for current needs while remaining adaptable to future technological innovations.

We invite you to explore how this media analytics solution can address your organization’s unique challenges. Consider your specific use cases and unlock the insights waiting to be discovered in your media archives.

About the authors

Bridging the divide: Connecting global communities with Experience AI

2025-05-29 Katherine Leadbetter

Post Syndicated from Katherine Leadbetter original https://www.raspberrypi.org/blog/bridging-the-divide-connecting-global-communities-with-experience-ai/

From smart devices to workplace tools, AI is becoming part of everyday life and a major part of how people are thinking about the future — raising big questions about access, skills, and readiness.

As governments around the world create AI strategies for the decade ahead, many are seeing an urgent need to address the large gap between how AI tools are already impacting jobs and people’s lives, and making sure young people have the chance to gain the skills and knowledge to keep up with this rapid pace of technological change. This gap is larger still when it comes to opportunities for educationally underserved communities.

A group of students and educators holding an Experience AI poster.

That’s why we’re excited to share how Experience AI, our AI literacy programme, is helping organisations around the world create these much-needed opportunities for young people.

The value of a global network

Experience AI was co-developed in 2022 by us and industry experts at Google DeepMind with a clear mission: to equip teachers with free, accessible, easy-to-use classroom resources that build AI literacy from the ground up. The programme offers a suite of materials to help students understand real-world applications of AI, the basics of machine learning, and the ethical considerations around these technologies.

A picture of Philip Colligan delivering a talk.

In 2023, we started building an international Experience AI network by collaborating with a group of our existing educational partners. We saw a huge amount of interest and received very positive feedback, and through our partnerships we reached an estimated one million young people. In late 2024, with support from Google.org, we tripled the size of our Experience AI partner network to 21, with new organisations joining from across Europe, the Middle East, and Africa. In this way, we aim to reach an additional 2.3 million young people by December 2026, helping them to gain the knowledge and skills to confidently engage with AI in an ever-changing world.

Each partner in the Experience AI network is a unique educational organisation looking to create lasting social change. Through their local knowledge and networks, we can present Experience AI to educators and students in a way that is engaging and relevant for local communities.

A group of students participating in an Experience AI session.

Partners help us to adapt and translate our resources, all while making sure that the core pedagogy and design principles of Experience AI are preserved. Just as importantly, these organisations train thousands of teachers on how to use the materials, providing educators with free support. With their work, they reach communities that otherwise may have never had the opportunity to learn about AI.

We asked some of our partners to share their insights on the impact Experience AI is having on the teachers and young people in their communities.

Building communities

The Latvian Safer Internet Centre (LSIC), an initiative of our partner, the Latvian Internet Association (LIA), is dedicated to helping young people protect themselves online, and to preparing them for a fast-changing digital economy. As an Experience AI partner, they aim to train 850 teachers and support 43,000 students to build a strong foundation in AI literacy through the programme.

“We hope to spark a cultural shift in how AI is […] taught in Latvian schools. Our goal is for AI literacy to become a natural part of digital competence education, not an optional extra.”

A woman is delivering a presentation about Experience AI.

Based in Riga, the team is travelling to 18 different regions across Latvia to bring in-person professional development to teachers, including those in rural communities far from major cities. By meeting teachers where they are, the LIA are creating invaluable networks for learning and support between communities. Through hands-on training, they are also supporting teachers to bring Experience AI into their own classroom, creating examples which are suited for their learners.

“We chose an in-person training model because it fosters a more collaborative and engaging environment, especially for teachers who are new to AI. Many educators, particularly those who are less confident with digital tools, benefit from direct interaction, real-time discussions, and the chance to ask questions in a supportive setting.”

As an Experience AI partner, the Latvian Internet Association is not just delivering content but working to strengthen digital competency across the country and ensure that no teacher or student is left behind in Latvia’s AI journey.

One teacher shares: “The classroom training was truly valuable: it gave us the chance to exchange ideas and reflect on our diverse experiences. Hearing different perspectives was enriching, and I’m glad we’re shaping the future of our schools together.”

“AI is for everyone”

EdCamp Ukraine’s mission is to unite educators and help them to grow. Operating from their main base in Kharkiv, near the Eastern border and the frontline of the ongoing war in Ukraine, they see AI as both a tool for new technological breakthroughs and as something that can help build a fairer, more efficient, and resilient society.

“We firmly believe AI should not only be an object of study — it must become a tool for amplifying human potential. AI should also not be a privilege, but a resource for everyone. We believe the Experience AI programme can truly transform education from the bottom up.”

A man is delivering a presentation about Experience AI to a group of educators.

Within their community of 50,000 teachers, EdCamp Ukraine ensures that every educator, regardless of their living conditions or where they work, can access high-quality, relevant, and accessible support. For the organisation, the ongoing situation in Ukraine means being flexible with planning, preparing for a range of different outcomes, and being ready to pivot delivery to different locations or to an online setting when needed. These same considerations apply to EdCamp Ukraine’s teacher community, who need to be ready to adapt their lessons for any scenario.

“Recognising these war-related challenges helps us see the bigger picture and always have contingency plans in place. We think ahead and develop flexible scenarios.”

Two educators looking at a laptop screen.

This year, the team piloted Experience AI through their community of trainers, who, when they’re not training, are busy teaching in the classroom. Teacher Yuliia shared how her students valued the opportunity to be creators, rather than just users of technology:

“One student, who is an active AI user, kept silent during the lesson. I thought he wasn’t interested, but during the reflection he shared a lot of positive feedback and expressed his gratitude. Other students said it was important that they weren’t just told about AI — they were using it, creating images, and working with apps.”

A group of educators looking at a laptop screen.

EdCamp Ukraine plans to roll out training for Ukrainian teachers this autumn, reaching 2,000 teachers and 40,000 young people by the end of next year.

More countries, more classrooms

Two new partners in Nigeria are about to join the Experience AI network, and there are many more organisations in more countries coming soon. As our partner network continues to grow, we are excited to reach more communities and give more young people around the world the chance to build AI literacy skills and knowledge.

You can find out more about Experience AI on the website. If your organisation is interested in partnering with us to deliver Experience AI, please register your interest and we will let you know about opportunities to work with us.

The post Bridging the divide: Connecting global communities with Experience AI appeared first on Raspberry Pi Foundation.

Introducing Claude 4 in Amazon Bedrock, the most powerful models for coding from Anthropic

2025-05-22 Sébastien Stormacq

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/claude-opus-4-anthropics-most-powerful-model-for-coding-is-now-in-amazon-bedrock/

Anthropic launched the next generation of Claude models today—Opus 4 and Sonnet 4—designed for coding, advanced reasoning, and the support of the next generation of capable, autonomous AI agents. Both models are now generally available in Amazon Bedrock, giving developers immediate access to both the model’s advanced reasoning and agentic capabilities.

Amazon Bedrock expands your AI choices with Anthropic’s most advanced models, giving you the freedom to build transformative applications with enterprise-grade security and responsible AI controls. Both models extend what’s possible with AI systems by improving task planning, tool use, and agent steerability.

With Opus 4’s advanced intelligence, you can build agents that handle long-running, high-context tasks like refactoring large codebases, synthesizing research, or coordinating cross-functional enterprise operations. Sonnet 4 is optimized for efficiency at scale, making it a strong fit as a subagent or for high-volume tasks like code reviews, bug fixes, and production-grade content generation.

When building with generative AI, many developers work on long-horizon tasks. These workflows require deep, sustained reasoning, often involving multistep processes, planning across large contexts, and synthesizing diverse inputs over extended timeframes. Good examples of these workflows are developer AI agents that help you to refactor or transform large projects. Existing models may respond quickly and fluently, but maintaining coherence and context over time—especially in areas like coding, research, or enterprise workflows—can still be challenging.

Claude Opus 4
Claude Opus 4 is the most advanced model to date from Anthropic, designed for building sophisticated AI agents that can reason, plan, and execute complex tasks with minimal oversight. Anthropic benchmarks show it is the best coding model available on the market today. It excels in software development scenarios where extended context, deep reasoning, and adaptive execution are critical. Developers can use Opus 4 to write and refactor code across entire projects, manage full-stack architectures, or design agentic systems that break down high-level goals into executable steps. It demonstrates strong performance on coding and agent-focused benchmarks like SWE-bench and TAU-bench, making it a natural choice for building agents that handle multistep development workflows. For example, Opus 4 can analyze technical documentation, plan a software implementation, write the required code, and iteratively refine it—while tracking requirements and architectural context throughout the process.

Claude Sonnet 4
Claude Sonnet 4 complements Opus 4 by balancing performance, responsiveness, and cost, making it well-suited for high-volume production workloads. It’s optimized for everyday development tasks with enhanced performance, such as powering code reviews, implementing bug ﬁxes, and new feature development with immediate feedback loops. It can also power production-ready AI assistants for near real-time applications. Sonnet 4 is a drop-in replacement from Claude Sonnet 3.7. In multi-agent systems, Sonnet 4 performs well as a task-speciﬁc subagent—handling responsibilities like targeted code reviews, search and retrieval, or isolated feature development within a broader pipeline. You can also use Sonnet 4 to manage continuous integration and delivery (CI/CD) pipelines, perform bug triage, or integrate APIs, all while maintaining high throughput and developer-aligned output.

Opus 4 and Sonnet 4 are hybrid reasoning models offering two modes: near-instant responses and extended thinking for deeper reasoning. You can choose near-instant responses for interactive applications, or enable extended thinking when a request benefits from deeper analysis and planning. Thinking is especially useful for long-context reasoning tasks in areas like software engineering, math, or scientific research. By configuring the model’s thinking budget—for example, by setting a maximum token count—you can tune the tradeoff between latency and answer depth to fit your workload.

How to get started
To see Opus 4 or Sonnet 4 in action, enable the new model in your AWS account. Then, you can start coding using the Bedrock Converse API with model IDanthropic.claude-opus-4-20250514-v1:0 for Opus 4 and anthropic.claude-sonnet-4-20250514-v1:0 for Sonnet 4. We recommend using the Converse API, because it provides a consistent API that works with all Amazon Bedrock models that support messages. This means you can write code one time and use it with different models.

For example, let’s imagine I write an agent to review code before merging changes in a code repository. I write the following code that uses the Bedrock Converse API to send a system and user prompts. Then, the agent consumes the streamed result.

private let modelId = "us.anthropic.claude-sonnet-4-20250514-v1:0"

// Define the system prompt that instructs Claude how to respond
let systemPrompt = """
You are a senior iOS developer with deep expertise in Swift, especially Swift 6 concurrency. Your job is to perform a code review focused on identifying concurrency-related edge cases, potential race conditions, and misuse of Swift concurrency primitives such as Task, TaskGroup, Sendable, @MainActor, and @preconcurrency.

You should review the code carefully and flag any patterns or logic that may cause unexpected behavior in concurrent environments, such as accessing shared mutable state without proper isolation, incorrect actor usage, or non-Sendable types crossing concurrency boundaries.

Explain your reasoning in precise technical terms, and provide recommendations to improve safety, predictability, and correctness. When appropriate, suggest concrete code changes or refactorings using idiomatic Swift 6
"""
let system: BedrockRuntimeClientTypes.SystemContentBlock = .text(systemPrompt)

// Create the user message with text prompt and image
let userPrompt = """
Can you review the following Swift code for concurrency issues? Let me know what could go wrong and how to fix it.
"""
let prompt: BedrockRuntimeClientTypes.ContentBlock = .text(userPrompt)

// Create the user message with both text and image content
let userMessage = BedrockRuntimeClientTypes.Message(
    content: [prompt],
    role: .user
)

// Initialize the messages array with the user message
var messages: [BedrockRuntimeClientTypes.Message] = []
messages.append(userMessage)

// Configure the inference parameters
let inferenceConfig: BedrockRuntimeClientTypes.InferenceConfiguration = .init(maxTokens: 4096, temperature: 0.0)

// Create the input for the Converse API with streaming
let input = ConverseStreamInput(inferenceConfig: inferenceConfig, messages: messages, modelId: modelId, system: [system])

// Make the streaming request
do {
    // Process the stream
    let response = try await bedrockClient.converseStream(input: input)

    // Iterate through the stream events
    for try await event in stream {
        switch event {
        case .messagestart:
            print("AI-assistant started to stream"")

        case let .contentblockdelta(deltaEvent):
            // Handle text content as it arrives
            if case let .text(text) = deltaEvent.delta {
                self.streamedResponse + = text
                print(text, termination: "")
            }

        case .messagestop:
            print("\n\nStream ended")
            // Create a complete assistant message from the streamed response
            let assistantMessage = BedrockRuntimeClientTypes.Message(
                content: [.text(self.streamedResponse)],
                role: .assistant
            )
            messages.append(assistantMessage)

        default:
            break
        }
    }

To help you get started, my colleague Dennis maintains a broad range of code examples for multiple use cases and a variety of programming languages.

Available today in Amazon Bedrock
This release gives developers immediate access in Amazon Bedrock, a fully managed, serverless service, to the next generation of Claude models developed by Anthropic. Whether you’re already building with Claude in Amazon Bedrock or just getting started, this seamless access makes it faster to experiment, prototype, and scale with cutting-edge foundation models—without managing infrastructure or complex integrations.

Claude Opus 4 is available in the following AWS Regions in North America: US East (Ohio, N. Virginia) and US West (Oregon). Claude Sonnet 4 is available not only in AWS Regions in North America but also in APAC, and Europe: US East (Ohio, N. Virginia), US West (Oregon), Asia Pacific (Hyderabad, Mumbai, Osaka, Seoul, Singapore, Sydney, Tokyo), and Europe (Spain). You can access the two models through cross-Region inference. Cross-Region inference helps to automatically select the optimal AWS Region within your geography to process your inference request.

Opus 4 tackles your most challenging development tasks, while Sonnet 4 excels at routine work with its optimal balance of speed and capability.

Learn more about the pricing and how to use these new models in Amazon Bedrock today!

— seb

Empower financial analytics by creating structured knowledge bases using Amazon Bedrock and Amazon Redshift

2025-05-20 Nita Shah

Post Syndicated from Nita Shah original https://aws.amazon.com/blogs/big-data/empower-financial-analytics-by-creating-structured-knowledge-bases-using-amazon-bedrock-and-amazon-redshift/

Traditionally, financial data analysis could require deep SQL expertise and database knowledge. Now with Amazon Bedrock Knowledge Bases integration with structured data, you can use simple, natural language prompts to query complex financial datasets. By combining the AI capabilities of Amazon Bedrock with an Amazon Redshift data warehouse, individuals with varied levels of technical expertise can quickly generate valuable insights, making sure that data-driven decision-making is no longer limited to those with specialized programming skills.

With the support for structured data retrieval using Amazon Bedrock Knowledge Bases, you can now use natural language querying to retrieve structured data from your data sources, such as Amazon Redshift. This enables applications to seamlessly integrate natural language processing capabilities on structured data through simple API calls. Developers can rapidly implement sophisticated data querying features without complex coding—just connect to the API endpoints and let users explore financial data using plain English. From customer portals to internal dashboards and mobile apps, this API-driven approach makes enterprise-grade data analysis accessible to everyone in your organization. Using structured data from a Redshift data warehouse, you can efficiently and quickly build generative AI applications for tasks such as text generation, sentiment analysis, or data translation.

In this post, we showcase how financial planners, advisors, or bankers can now ask questions in natural language, such as, “Give me the name of the customer with the highest number of accounts?” or “Give me details of all accounts for a specific customer.” These prompts will receive precise data from the customer databases for accounts, investments, loans, and transactions. Amazon Bedrock Knowledge Bases automatically translates these natural language queries into optimized SQL statements, thereby accelerating time to insight, enabling faster discoveries and efficient decision-making.

Solution overview

To illustrate the new Amazon Bedrock Knowledge Bases integration with structured data in Amazon Redshift, we will build a conversational AI-powered assistant for financial assistance that is designed to help answer financial inquiries, like “Who has the most accounts?” or “Give details of the customer with the highest loan amount.”

We will build a solution using sample financial datasets and set up Amazon Redshift as the knowledge base. Users and applications will be able to access this information using natural language prompts.

The following diagram provides an overview of the solution.

For building and running this solution, the steps include:

Load sample financial datasets.
Enable Amazon Bedrock large language model (LLM) access for Amazon Nova Pro.
Create an Amazon Bedrock knowledge base referencing structured data in Amazon Redshift.
Ask queries and get responses in natural language.

To implement the solution, we use a sample financial dataset that is for demonstration purposes only. The same implementation approach can be adapted to your specific datasets and use cases.

Download the SQL script to run the implementation steps in Amazon Redshift Query Editor V2. If you’re using another SQL editor, you can copy and paste the SQL queries either from this post or from the downloaded notebook.

Prerequisites

Make sure your meet the following prerequisites:

Have an AWS account.
Create an Amazon Redshift Serverless workgroup or provisioned cluster. For setup instructions, see Creating a workgroup with a namespace or Create a sample Amazon Redshift database, respectively. The Amazon Bedrock integration feature is supported in both Amazon Redshift provisioned and serverless.
Create an AWS Identity and Access Management (IAM) role. For instructions, see Creating or updating an IAM role for Amazon Redshift ML integration with Amazon Bedrock.
Associate the IAM role to a Redshift instance.
Set up the required permissions for Amazon Bedrock Knowledge Bases to connect with Amazon Redshift.

Load sample financial data

To load the finance datasets to Amazon Redshift, complete the following steps:

Open the Amazon Redshift Query Editor V2 or another SQL editor of your choice and connect to the Redshift database.

Run the following SQL to create the finance data tables and load sample data:

-- Create table
CREATE TABLE accounts (
    id integer ,
    account_id integer PRIMARY KEY,
    customer_id integer,
    account_type character varying(256),
    opening_date date,
    balance bigint,
    currency character varying(256)
);

CREATE TABLE customer (
    id integer,
    customer_id integer PRIMARY KEY ,
    name character varying(256) ,
    age integer,
    gender character varying(256) ,
    address character varying(256) ,
    phone character varying(256) ,
    email character varying(256)
);

CREATE TABLE investments (
    id integer ,
    investment_id integer PRIMARY KEY,
    customer_id integer ,
    investment_type character varying(256) ,
    investment_name character varying(256) ,
    purchase_date date ,
    purchase_price bigint ,
    quantity integer 
);


CREATE TABLE loans (
    id integer ,
    loan_id integer PRIMARY KEY,
    customer_id integer ,
    loan_type character varying(256) ,
    loan_amount bigint ,
    interest_rate integer ,
    start_date date ,
    end_date date 
);

CREATE TABLE orders (
    id integer ,
    order_id integer PRIMARY KEY,
    customer_id integer ,
    order_type character varying(256) ,
    order_date date ,
    investment_id integer ,
    quantity integer ,
    price integer 
);

CREATE TABLE transactions (
    id integer ,
    transaction_id integer PRIMARY KEY ,
    account_id integer REFERENCES accounts(account_id),
    transaction_type character varying(256) ,
    transaction_date date ,
    amount integer ,
    description character varying(256) 
);

Download the sample financial dataset to your local storage and unzip the zipped folder.
Create an Amazon Simple Storage Service (Amazon S3) bucket with a unique name. For instructions, refer to Creating a general purpose bucket.
Upload the downloaded files into your newly created S3 bucket.

Using the following COPY command statements, load the datasets from Amazon S3 into the new tables you created in Amazon Redshift. Replace <<your_s3_bucket>> with the name of your S3 bucket and <<your_region>> with your AWS Region.

-- Load sample data
COPY accounts FROM 's3://<<your_s3_bucket>>/accounts.csv' IAM_ROLE DEFAULT FORMAT AS CSV DELIMITER ',' QUOTE '"' IGNOREHEADER 1 REGION AS '<<your_region>>';

COPY customer FROM 's3://<<your_s3_bucket>>/customer.csv' IAM_ROLE DEFAULT FORMAT AS CSV DELIMITER ',' QUOTE '"' IGNOREHEADER 1 REGION AS '<<your_region>>';
COPY investments FROM 's3://<<your_s3_bucket>>/investments.csv' IAM_ROLE DEFAULT FORMAT AS CSV DELIMITER ',' QUOTE '"' IGNOREHEADER 1 REGION AS '<<your_region>>';
COPY loans FROM 's3://<<your_s3_bucket>>/loans.csv' IAM_ROLE DEFAULT FORMAT AS CSV DELIMITER ',' QUOTE '"' IGNOREHEADER 1 REGION AS '<<your_region>>';
COPY orders FROM 's3://<<your_s3_bucket>>/orders.csv' IAM_ROLE DEFAULT FORMAT AS CSV DELIMITER ',' QUOTE '"' IGNOREHEADER 1 REGION AS '<<your_region>>';
COPY transactions FROM 's3://<<your_s3_bucket>>/transactions.csv' IAM_ROLE DEFAULT FORMAT AS CSV DELIMITER ',' QUOTE '"' IGNOREHEADER 1 REGION AS '<<your_region>>';

Enable LLM access

With Amazon Bedrock, you can access state-of-the-art AI models from providers like Anthropic, AI21 Labs, Stability AI, and Amazon’s own foundation models (FMs). These include Anthropic’s Claude 2, which excels at complex reasoning and content generation; Jurassic-2 from AI21 Labs, known for its multilingual capabilities; Stable Diffusion from Stability AI for image generation; and Amazon Titan models for various text and embedding tasks. For this demo, we use Amazon Bedrock to access the Amazon Nova FMs. Specifically, we use the Amazon Nova Pro model, which is a highly capable multimodal model designed for a wide range of tasks like video summarization, Q&A, mathematical reasoning, software development, and AI agents, including high speed and accuracy for text summarization tasks.

Make sure you have the required IAM permissions to enable access to available Amazon Bedrock Nova FMs. Then complete the following steps to enable model access in Amazon Bedrock:

On the Amazon Bedrock console, in the navigation pane, choose Model access.
Choose Enable specific models.
Search for Amazon Nova models, select Nova Pro, and choose Next.
Review the selection and choose Submit.

Create an Amazon Bedrock knowledge base referencing structured data in Amazon Redshift

Amazon Bedrock Knowledge Bases uses Amazon Redshift as the query engine to query your data. It reads metadata from your structured data store to generate SQL queries. There are different supported authentication methods to create the Amazon Bedrock knowledge base using Amazon Redshift. For more information, refer to the Set up query engine for your structured data store in Amazon Bedrock Knowledge Bases.

For this post, we create an Amazon Bedrock knowledge base for the Redshift database and sync the data using IAM authentication.

If you’re creating an Amazon Bedrock knowledge base through the AWS Management Console, you can skip the service role setup mentioned in the previous section. It automatically creates one with the necessary permissions for Amazon Bedrock Knowledge Bases to retrieve data from your new knowledge base and generate SQL queries for structured data stores.

When creating an Amazon Bedrock knowledge base using an API, you must attach IAM policies that grant permissions to create and manage knowledge bases with connected data stores. Refer to Prerequisites for creating an Amazon Bedrock Knowledge Base with a structured data store for instructions.

Complete the following steps to create an Amazon Bedrock knowledge base using structured data:

On the Amazon Bedrock console, choose Knowledge Bases in the navigation pane.
Choose Create and choose Knowledge Base with structure data store from the dropdown menu.
Provide the following details for your knowledge base:
1. Enter a name and optional description.
2. Select Amazon Redshift as the query engine.
3. Select Create and use a new service role for resource management.
4. Make note of this newly created IAM role.
5. Choose Next to proceed to the next part of the setup process.
6. Configure the query engine:
  - Select Redshift Serverless (Amazon Redshift provisioned is also supported).
  - Choose your Redshift workgroup.
  - Use the IAM role created earlier.
  - Under Default storage metadata, select Amazon Redshift databases and for Database, choose dev.
  - You can customize settings by adding specific contexts to enhance the accuracy of the results.
  - Choose Next.
7. Complete creating your knowledge base.
8. Record the generated service role details.
9. Next, grant appropriate access to the service role for Amazon Bedrock Knowledge Bases through the Amazon Redshift Query Editor V2. Update <your Service Role name> in the following statements with your service role, and update the value for <your schema>.
```
CREATE USER "IAMR:<your Service Role name>" WITH PASSWORD DISABLE;
SELECT * FROM PG_USER; -- To verify that the user is created.
GRANT SELECT ON ALL TABLES IN SCHEMA <your schema> TO "IAMR:<your Service Role name>";
--You can also Restricting access to certain tables for finer-grained control on the tables that can be accessed as shown below
GRANT SELECT ON TABLE customer to "IAMR:<your Service Role name>";
GRANT SELECT ON TABLE loan to "IAMR:<your Service Role name>";
```

Now you can update the knowledge base with the Redshift database.

On the Amazon Bedrock console, choose Knowledge Bases in the navigation pane.
Open the knowledge base you created.
Select the dev Redshift database and choose Sync.

It may take a few minutes for the status to display as COMPLETE.

Ask queries and get responses in natural language

You can set up your application to query the knowledge base or attach the knowledge base to an agent by deploying your knowledge base for your AI application. For this demo, we use a native testing interface on the Amazon Bedrock Knowledge Bases console.

To ask questions in natural language on the knowledge base for Redshift data, complete the following steps:

On the Amazon Bedrock console, open the details page for your knowledge base.
Choose Test.
Choose your category (Amazon), model (Nova Pro), and inference settings (On demand), and choose Apply.
In the right pane of the console, test the knowledge base setup with Amazon Redshift by asking a few simple questions in natural language, such as “How many tables do I have in the database?” or “Give me list of all tables in the database.”

The following screenshot shows our results.

To view the generated query from your Amazon Redshift based knowledge base, choose Show details next to the response.
Next, ask questions related to the financial datasets loaded in Amazon Redshift using natural language prompts, such as, “Give me the name of the customer with the highest number of accounts” or “Give the details of all accounts for customer Deanna McCoy.”

The following screenshot shows the responses in natural language.

Using natural language queries in Amazon Bedrock, you were able to retrieve responses from the structured financial data stored in Amazon Redshift.

Considerations

In this section, we discuss some important considerations when using this solution.

Security and compliance

When integrating Amazon Bedrock with Amazon Redshift, implementing robust security measures is crucial. To protect your systems and data, implement essential safeguards including restricted database roles, read-only database instances, and proper input validation. These measures help prevent unauthorized access and potential system vulnerabilities. For more information, see Allow your Amazon Bedrock Knowledge Bases service role to access your data store.

Cost

You incur a cost for converting natural language to text based on SQL. To learn more, refer to Amazon Bedrock pricing.

Use custom contexts

To improve query accuracy, you can enhance SQL generation by providing custom context in two key ways. First, specify which tables to include or exclude, focusing the model on relevant data structures. Second, supply curated queries as examples, demonstrating the types of SQL queries you expect. These curated queries serve as valuable reference points, guiding the model to generate more accurate and relevant SQL outputs tailored to your specific needs. For more information, refer to Create a knowledge base by connecting to a structured data store.

For different workgroups, you can create separate knowledge bases for each group, with access only to their specific tables. Control data access by setting up role-based permissions in Amazon Redshift, verifying each role can only view and query authorized tables.

Clean up

To avoid incurring future charges, delete the Redshift Serverless instance or provisioned data warehouse created as part of the prerequisite steps.

Conclusion

Generative AI applications provide significant advantages in structured data management and analysis. The key benefits include:

Using natural language processing – This makes data warehouses more accessible and user-friendly
Enhancing customer experience – By providing more intuitive data interactions, it boosts overall customer satisfaction and engagement
Simplifying data warehouse navigation – Users can understand and explore data warehouse content through natural language interactions, improving ease of use
Improving operational efficiency – By automating routine tasks, it allows human resources to focus on more complex and strategic activities

In this post, we showed how the natural language querying capabilities of Amazon Bedrock Knowledge Bases when integrated with Amazon Redshift enables rapid solution development. This is particularly valuable for the finance industry, where financial planners, advisors, or bankers face challenges in accessing and analyzing large volumes of financial data in a secured and performant manner.

By enabling natural language interactions, you can bypass the traditional barriers of understanding database structures and SQL queries, and quickly access insights and provide real-time support. This streamlined approach accelerates decision-making and drives innovation by making complex data analysis accessible to non-technical users.

For additional details on Amazon Bedrock and Amazon Redshift integration, refer to Amazon Redshift ML integration with Amazon Bedrock.

About the authors

Nita Shah is an Analytics Specialist Solutions Architect at AWS based out of New York. She has been building data warehouse solutions for over 20 years and specializes in Amazon Redshift. She is focused on helping customers design and build enterprise-scale well-architected analytics and decision support platforms.

Sushmita Barthakur is a Senior Data Solutions Architect at Amazon Web Services (AWS), supporting Strategic customers architect their data workloads on AWS. With a background in data analytics, she has extensive experience helping customers architect and build enterprise data lakes, ETL workloads, data warehouses and data analytics solutions, both on-premises and the cloud. Sushmita is based in Florida and enjoys traveling, reading and playing tennis.

Jonathan Katz is a Principal Product Manager – Technical on the Amazon Redshift team and is based in New York. He is a Core Team member of the open source PostgreSQL project and an active open source contributor, including PostgreSQL and the pgvector project.

Insights from a teacher trainer: Schools are ready to engage in AI — what they need is support

2025-05-15 Dan Shilling

Post Syndicated from Dan Shilling original https://www.raspberrypi.org/blog/insights-from-a-teacher-trainer-experience-ai/

Today’s blog post is written by Dan Shilling, Programmes Manager at Parent Zone, one of our global partners for Experience AI.

“Educators have been struggling to find resources and support to teach young people about AI.”

This is something I’ve heard a lot when delivering Experience AI teacher training through Parent Zone’s partnership with the Raspberry Pi Foundation.

An educator is delivering a presentation during a workshop.

Our partnership with the Raspberry Pi Foundation

Experience AI is an artificial intelligence (AI) literacy programme, co-developed by the Raspberry Pi Foundation and Google DeepMind, that teaches students aged 11 to 14 about AI and machine learning. Thanks to funding from Google.org, Parent Zone has partnered with the Raspberry Pi Foundation to provide free training to UK educators, equipping them with the skills they need to effectively deliver the programme in their settings.

The Experience AI resources help educators, including those from non-technical backgrounds, to deliver impactful lessons on AI and machine learning. Lesson resources span technical elements (e.g. data-driven models, bias) and practical elements (e.g. careers, safety).

Our face-to-face and virtual training sessions show teachers how to use the programme resources, as well as helping them feel more confident in the subject matter.

The sessions also give me an opportunity to hear from teachers about how AI is being used and taught in classrooms, and the opportunities and challenges it’s creating.

Curiosity and experimentation

AI has a major presence in many schools now.

Teachers tell me they’re seeing students use AI to support their homework. One teacher spoke about a student using a chatbot to help break down a maths problem, describing it like “having a tutor at home.”

Teachers are also using AI themselves to assist in their work — for example, to plan lessons, generate activities, and get ideas on how to explain complex topics more clearly.

Openness to experimentation is clearly there.

Addressing concerns

For all the benefits of AI, teachers also have concerns about it.

Some have told me their students have no idea how easily these tools can be used to mislead or manipulate, through disinformation and deepfakes, for example.

This is why Experience AI resources are meeting educator needs. Not only do they explain how AI and machine learning actually work, but they also address many pressing concerns around AI, from responsible usage and media literacy, to how data bias can affect the final output.

Positive changes

In all the workshops, what stands out to me most is how ready teachers are to engage. They want to understand more. They want to help their students make sense of AI, and use it positively.

They’re grateful for practical, grounded training and support that doesn’t assume they’ve all got computer science degrees. After one of our sessions, a teacher said:

“The better we educate ourselves, the better we’re able to help young people. It’s important because it’s affecting their day-to-day lives. We can help them navigate AI platforms, but in a safe way.”

Join a network of AI-ready educators

If you’re a UK secondary school teacher, you can sign up for free training from Parent Zone, with dates available until November 2025. You can choose from:

For more information about Experience AI, visit our website.

The post Insights from a teacher trainer: Schools are ready to engage in AI — what they need is support appeared first on Raspberry Pi Foundation.

How to enhance your application resiliency using Amazon Q Developer

2025-05-14 Dr. Rahul Sharad Gaikwad

Post Syndicated from Dr. Rahul Sharad Gaikwad original https://aws.amazon.com/blogs/devops/how-to-enhance-your-application-resiliency-using-amazon-q-developer/

“Everything fails, all the time” – Werner Vogels, Amazon.com CTO

In today’s digital landscape, designing applications with resilience in mind is crucial. Resiliency is the ability of applications to handle failures gracefully, adapt to changing conditions, and recover swiftly from disruptions. By integrating resilience into your application architecture, you can minimize downtime, mitigate the impact of failures, and ensure continuous availability and performance for end-users.

Amazon Q Developer, a generative AI-powered assistant for software development lifecycle (SDLC), helps design resilient architectures and enhance application availability. It recommends best practices, analyzes code, and identifies potential failure points, serving as an expert companion to strengthen application architecture and boost system availability through the following key resiliency practices.

Resilient design pattern recommendations: Access tailored design patterns like distributed systems, microservices, and serverless architectures. Amazon Q offers recommendations across redundancy, robust failovers, and circuit breakers to boost resilience in your environment.
Disaster Recovery planning: Amazon Q offers expert guidance on comprehensive disaster recovery (DR), including efficient backups, systematic restorations, strategic data replication, and seamless failovers to ensure rapid recovery from disruptions with minimal impact.
Customized Resiliency testing frameworks: Create custom templates to simulate diverse failure scenarios, such as network degradation and infrastructure outages. This streamlines thorough resilience verification across your systems.
Failure mode evaluation: Use Amazon Q to conduct comprehensive Failure Mode and Effects Analysis (FMEA) identifying infrastructure vulnerabilities and assessing their impact. Amazon Q then ranks these issues by severity, enabling you to prioritize and address the most critical risks to protect your production environment.

In the following sections, we will demonstrate how Amazon Q improves the resiliency of a foundational application architecture.

Prerequisites

To begin using Amazon Q, the following are required:

An AWS Account
An AWS Builder ID or an AWS IAM Identity Center login controlled by your organization
Visual Studio Code or supported IDEs
How to set up and chat with Amazon Q Developer

Application Overview

We have a three-tier web application shown below that is running on AWS in a single Availability Zone (AZ). The architecture consists of Application Layer hosted on Amazon Elastic Kubernetes Service (Amazon EKS) cluster with two Amazon Elastic Compute Cloud (Amazon EC2) nodes in a single-AZ and the Data Layer uses Amazon Relational Database Service (Amazon RDS) instance deployed in single-AZ configuration. The architecture is functional but has several limitations. It poses a single point of failure and offers limited application availability with no fault tolerance. High response times may occur because there is no caching layer in front of the database. Additionally, the lack of auto-scaling can lead to resource contention.

A three-tier web application basic architecture running on AWS in a single Availability Zone.

Basic Application Overview

Enhance Application Resiliency

Let’s explore how Amazon Q helps incorporate resiliency best practices that enhance system availability in our basic application architecture.

Resilient architecture recommendations

The initial architecture faced challenges with reliability, performance and scalability, largely due to its single-point of failure and lacked redundancy. To address this, we described the existing application design and its challenges to Amazon Q using a natural language prompt to seek resiliency recommendations.

Prompt for improving the architecture design:

I have manually setup an application that runs within an EKS cluster on two EC2 nodes in single AZ. My application is not highly available and scalable. It talks to an RDS database which is single AZ. However, there is high response times from database. Provide me only the recommendations to re-design this application architecture at each layer that will addresses all these issues.

Amazon Q offering resiliency architecture recommendations

Amazon Q analyzed the provided context and recommended improvements such as introducing Multi-AZ deployments for high availability, adding auto-scaling groups for elasticity, and incorporating caching layers to enhance performance. These targeted recommendations helped redesign the architecture to be more resilient and scalable, directly addressing the initial shortcomings.

Disaster Recovery (DR) recommendations to improve the architecture

To further enhance resiliency, we prompted Amazon Q for disaster recovery (DR) recommendations. We asked for guidance aligned with the AWS Well-Architected Framework. This built upon the previously improved architecture design.

Prompt for recommendations on Disaster Recovery (DR) and architecture based on RTO/RPO

Based on the above improvements on AWS architecture design, share recommendations for Disaster Recovery (DR) based on AWS Well Architected Framework

Optionally, we can use advanced prompts like the below with additional context:

Please provide a recommendations to redesign my application that is running on an EKS cluster with two EC2 nodes and a single-AZ RDS database, addressing high database latency, low availability, and scalability issues. Suggest improvements across all architectural layers including presentation tier, application tier and data tier to enhance performance, resiliency, and scalability. Also, recommend DR strategies aligned with the AWS Well-Architected Framework focusing on resilience, data protection, and recovery.

Amazon Q tailoring recommendations based on business requirements using AWS Well-Architected Framework

Amazon Q provided detailed DR strategies. These included multi-region configuration, backup and restore procedures, and best practices for meeting specific Recovery Time Objective (RTO) and Recovery Point Objective (RPO) requirements.

Prepare DR strategy based on RTO and RPO requirements:

Diving further, asking for a specific disaster recovery strategy that meets the application RTO requirements of 2 hours and RPO requirements of 30 minutes.

Prompt for DR strategy based on RTO/RPO values

Which DR strategy should I use if my RTO is less than 2 hours and RPO is less than 30 minutes?

Amazon Q recommending disaster recovery strategy

Amazon Q recommended a Pilot light approach, detailing the setup and components needed to achieve the specified disaster recovery objectives.

Define resiliency testing workflow, identify key metrics and tools

As we incorporate resiliency best practices into the application architecture, its is important to employ a resiliency test workflow to ensure application’s resiliency requirements are met. To do this, we are asking for guidance to define an end-to-end resiliency testing process workflow. We also want to identify the key metrics and tools needed to test the resilience of each AWS service involved in the architecture.

Prompt for defining the resiliency testing workflow:

Define the end-to-end resiliency testing process workflow. Also, identify the key metrics and tools that should be used to test the resilience of each AWS service involved in the improved architecture design.

Amazon Q offering resiliency testing best practices and tools

Amazon Q offers a step-by-step approach to define resiliency testing experiments and prepare the environment for testing.

Failure mode evaluation to prioritize resiliency tests

Failure Mode and Effects Analysis (FMEA) can further assist with designing the resiliency tests. It is a proactive method to identify potential failures in processes or systems, assess their impact, and prioritize critical issues. It evaluates failure modes across hardware, software, human factors, and external events, enabling teams to develop strategies for prevention, detection, and mitigation, ultimately enhancing system resilience.

Leveraging Amazon Q, we requested a comprehensive FMEA report that includes components, cause, effect and their respective Risk Priority Numbers (RPN). RPNs are calculated by multiplying three key factors: Severity (S), Occurrence (O), and Detection (D). It helps organizations understand and prioritize which risks to address first.

Prompt for designing the FMEA template and scoring:

Create the FMEA in tabular format with scoring for improved architecture design above keeping in mind the RTO/RPO values and provide the steps for execution as well.

Amazon Q assisting with systematic risk assessment and FMEA report

Amazon Q intelligently incorporated previously defined RTO and RPO requirements to identify critical failure scenarios and calculated RPN for each potential incident.

Enhanced Architecture Implementing Resiliency Best Practices

After identifying the key pain points in our original architecture such as single points of failure, limited scalability, and lack of automated recovery, we leveraged Amazon Q to analyze our architecture to get targeted recommendations to elevate the resiliency. By describing our requirements and challenges to Amazon Q, we received actionable guidance on AWS best practices and service configurations, which we then implemented to transform our infrastructure for high resilience and availability.

Resilient Application Architecture

The original Application Layer was running in a single Availability Zone without auto-scaling, leading to potential downtime and performance bottlenecks. Amazon Q recommended distributing Amazon EKS worker nodes across multiple Availability Zones and enabling the Cluster Autoscaler to dynamically adjust node capacity based on traffic patterns. Additionally, it suggested implementing horizontal pod autoscaling within Amazon EKS to automatically scale application resources according to CPU utilization and custom metrics. Following these recommendations, we deployed Amazon EKS worker nodes across three Availability Zones, configured Cluster Autoscaler and horizontal pod autoscaling, and integrated an Application Load Balancer, to intelligently distribute incoming traffic. These changes significantly improved scalability, fault tolerance, and performance.

The Data Layer initially relied on a single-instance Amazon RDS deployment, which posed a risk of downtime and limited read performance. Upon review, Amazon Q advised implementing a Multi-AZ Amazon RDS configuration to enable automated failover and improve availability. It also recommended deploying read replicas to offload read-heavy workloads and enhance performance. Furthermore, Amazon Q suggested adding a Multi-AZ Amazon ElastiCache for Redis to reduce database load and speed up data access. We incorporated these recommendations, resulting in a more resilient and performant data layer capable of handling failover scenarios and scaling read operations efficiently.

The Presentation Layer lacked an optimized content delivery mechanism and comprehensive security controls. Amazon Q recommended integrating Amazon CloudFront as a content delivery network to accelerate the delivery of static content and reduce load on application servers. It also suggested deploying AWS WAF to protect against common web exploits. To improve operational visibility, Amazon Q emphasized the importance of comprehensive monitoring using Amazon CloudWatch, combining logs, metrics, and traces for rapid issue detection and resolution. Implementing these recommendations enhanced both the performance and security posture of the presentation layer.

Conclusion

Amazon Q Developer transforms how teams build resilient applications by serving as your expert companion throughout the development journey. Its guidance helps create systems that excel in resilience, scalability, and availability—critical factors for today’s demanding digital landscape. Amazon Q goes beyond theoretical advice by providing practical, step-by-step implementation guidance. In the above, we’ve witnessed how Amazon Q’s expertise can transform basic architectures into robust, failure-resistant systems. Its recommendations such as Multi-AZ redundancy, elastic scaling, strategic caching, and proactive resilience testing create applications that maintain performance and availability even during significant disruptions.

Ready to strengthen your applications against unexpected challenges? Harness Amazon Q’s capabilities to create resilient infrastructure that consistently delivers for your customers, regardless of conditions. Unlock the full potential of your AWS infrastructure and deliver uninterrupted service to your customers, today. To learn more about Amazon Q refer to the documentation.

About the authors:

AI lifecycle risk management: ISO/IEC 42001:2023 for AI governance

2025-05-13 Abdul Javid

Post Syndicated from Abdul Javid original https://aws.amazon.com/blogs/security/ai-lifecycle-risk-management-iso-iec-420012023-for-ai-governance/

As AI becomes central to business operations, so does the need for responsible AI governance. But how can you make sure that your AI systems are ethical, resilient, and aligned with compliance standards?

ISO/IEC 42001, the international management system standard for AI, offers a framework to help organizations implement AI governance across the lifecycle. In this post, we walk through how ISO/IEC 42001 enables effective AI governance, review the risk management requirements, and explore how you can use threat modeling as a practical technique to meet those expectations.

AI governance

AI governance refers to the organizational structures, policies, and controls that enable AI systems to be used responsibly, ethically, and safely. Governance spans the entire AI lifecycle and includes the following activities:

Setting the intended purpose and stakeholder alignment
Managing data, models, and deployment risks
Designing in explainability, bias mitigation, and traceability
Establishing accountability, monitoring, and decommissioning practices

These activities are the foundation of a formal framework that you can use to establish governance processes, identify and manage risk, and implement processes for continuous improvement

AI lifecycle

While ISO 42001 provides a framework for AI governance, ISO/IEC 22989:2022 describes what an AI system is and how it evolves. Governance should be implemented at every stage of the AI lifecycle to manage AI risks effectively. According to the ISO/IEC 22989:2022 standard, an organization’s AI life cycle might include these stages:

Inception: Identifying needs, goals, and feasibility
Design and development: Defining system architecture, data flows, and training models
Verification and validation: Testing and confirming that the system meets requirements and performs as intended
Deployment: Releasing the system into its operational environment
Operation and monitoring: Running the system, logging activity, and monitoring performance and outcomes
Re-evaluation: Assessing whether the system continues to meet objectives under changing conditions
Retirement: Decommissioning the system and addressing long-term data and access risks

Understanding the AI lifecycle, shown in Figure 1 that follows, is critical for identifying and mitigating AI risks. While these seven stages are provided directly in ISO 22989:2022, your organization might define its AI lifecycle stages differently to suit its business context. We refer to these stages as we explore the components of an AI management system, from initial AI system scoping, through threat monitoring and risk assessment, to monitoring the established governance program.

Figure 1: Example of AI system lifecycle model stages and high-level processes based on ISO/IEC 22989:2022

Risk management in ISO/IEC 42001:2023

After an organization has identified and assessed AI risks (Clause 6.1 of ISO/IEC 42001:2023), operational controls to mitigate those risks must be implemented (Clause 8.2), and those controls and the AI system itself should be continuously monitored, documented, and improved (Clauses 9 and 10). AI impact assessments (AIIAs) are critical in high-risk use cases, complementing baseline risk assessments by focusing on societal, ethical, and legal impacts. AIIAs are like data protection impact assessments (DPIAs) for high-risk personal data processing under many privacy regulations. DPIAs are specifically designed to assess risks to individuals’ privacy and data protection rights under laws such as the GDPR. While AIIAs help organizations maintain responsible AI governance, DPIAs can be used in parallel to help verify that AI systems comply with data protection laws, together providing a holistic view of risks and safeguards across both ethical and legal dimensions.

You are free to select the AIIA tools or methodologies that best fit your use case. Two widely accepted frameworks are:

ISO 31000: A general-purpose enterprise risk management standard that helps identify, evaluate, and treat risks in a structured and repeatable way. It aligns well with organizations seeking to embed AI risk into their broader enterprise risk management (ERM) programs.

NIST AI Risk Management Framework (AI RMF): A NIST framework specifically designed for AI systems. It introduces tailored concepts such as explainability, robustness, fairness, and accountability, with actionable guidance organized into four core functions: Map, measure, manage, and govern.

ISO 42001 provides structured methods to conduct risk and impact assessments. Threat modeling tools such as:

STRIDE (spoofing, tampering, repudiation, information disclosure, denial of service, and elevation of privilege). STRIDE aims to make sure that a system meets security requirements for confidentiality, integrity and availability.
DREAD (damage potential, reproducibility, exploitability, affected users, and discoverability) is a framework that can assess severity of individual threats.
OWASP (Open Worldwide Application Security Project) for machine learning (ML) enables analysis of AI system vulnerabilities, adversarial risks, and privacy threats.

Trustworthy AI is the result of strategic governance, structured methodologies, and technical analysis.

Figure 2 that follows shows the tiered structure of AI risk governance, moving from high-level governance to detailed technical assessments. On the left side, there’s a downward flow representing the increasing depth of controls, while the right side shows an upward scale indicating escalating AI risks.

At the top layer, ISO/IEC 42001:2023 defines formal requirements for AI governance, including risk assessment mandates, control implementation, and lifecycle oversight.
The middle layer features widely adopted risk assessment methodologies and frameworks, such as ISO 31000 and the NIST AI Risk Management Framework (RMF), which provide structured methods to identify, evaluate, and mitigate AI risks.
At the base, are detailed threat modeling tools—including STRIDE, DREAD, PASTA, LINDDUN, and OWASP for ML—that support deep analysis of AI systems for vulnerabilities related to security, privacy, data protection, and adversarial threats.

Together, these layers form a comprehensive approach to AI risk governance, aligning strategic oversight with operational and technical defenses.

Figure 2: A layered approach to AI risk management aligned with ISO/IEC 42001. ISO/IEC 42001 defines AI governance for responsible AI

Threat modeling for AI risk identification

Threat modeling identifies AI lifecycle technical risks such as exploit surfaces, adversarial threats, and misuse scenarios that complement organizational risk analysis and impact assessments. This post takes a broader AI lifecycle view, showing you how threat modeling complements other risk strategies within the context of ISO/IEC 42001:2023. Additionally, AWS has published AI threat modeling guidance, such as:

The following table is an example STRIDE threat model for a generative AI resource using AWS services by AI lifecycle stage and risk type. This illustrates technical threat remediation through AWS cloud native governance features.

STRIDE category	Example threat	Lifecycle stage	Risk type	AWS feature for governance
Spoofing	A fake identity uses the AI system to generate phishing emails or misinformation	Inception	Security	AWS IAM Identity Center and Amazon Cognito for multi-factor authentication (MFA), Amazon GuardDuty for threat detection
Tampering	A malicious prompt injection or API injection alters the model behavior or bypasses filters	Design development	Integrity	Amazon Bedrock Guardrails, Amazon API Gateway and AWS WAF rules, AWS CloudTrail for input auditing
Repudiation	Users deny prompt activity or content creation, and there’s no logging	Verification and validation	Accountability	CloudTrail, Amazon Bedrock invocation logs, Amazon SageMaker ML Lineage Tracking for traceability
Information disclosure	Sensitive internal data—such as code or personally identifiable information (PII)—accidentally learned and reproduced by the large language model (LLM)	Operation and monitoring	Privacy, Security	SageMaker Clarify, AWS VPC PrivateLink, AWS Key Management Service (AWS KMS) encryption, Amazon Bedrock data handling commitments
Denial of service	Bad actors overload the AI endpoint with prompt spam, degrading service	Deployment	Availability	AWS Shield, API rate limiting using API Gateway, auto scaling with SageMaker endpoints
Elevation of privilege	An internal user modifies system prompts or updates to override content filters	Reevaluation	Ethics and access control	AWS Identity and Access Management (IAM) roles, Amazon Bedrock Guardrails, AWS Config, service control policies (SCPs)

While STRIDE is used here for illustrative clarity, it’s just one of several threat modeling approaches that can be applied depending on the system context. Other widely recognized methods include:

OWASP Top 10 for LLMs: A threat-focused list targeting large language models
MITRE ATLAS: A framework for adversarial threat modeling in AI/ML systems
NIST AI Risk Management Framework (AI RMF): A United States standards-based approach focusing on trustworthy and responsible AI development
PASTA (Process for Attack Simulation and Threat Analysis): A risk-centric threat modeling methodology
LINDDUN: A privacy threat modeling framework addressing data protection risks

By integrating these threat modeling practices into ISO/IEC 42001’s risk-based approach, organizations are not just “checking compliance boxes” they’re operationalizing trustworthy, secure, and accountable AI governance throughout the full system lifecycle.

Threat modeling touchpoints across the AI lifecycle

ISO 42001:2023 uses the STRIDE threat modeling framework to align specific security threats to each stage. Each lifecycle stage is associated with particular threat types, relevant Annex references from the ISO standard, and examples of what to monitor.

Inception (Annex A.8.1): Focuses on spoofing and fake identity input risks.
Design and Development (Annex A.9.1): Linked to tampering threats.
Verification and Validation (Annex A.7.1): Concerns around repudiation, such as lack of model decision logs.
Deployment (Annex A.5.1): Addresses information disclosure vulnerabilities.
Operation and Monitoring (Annex A.10.3): Maps to denial-of-service attacks.
Re-evaluation (Annex A.8.6): Highlights risks of privilege escalation.

AI threat modeling isn’t a one-time task but must be applied continuously across each lifecycle stage, supported by ISO 42001’s annexes and STRIDE categories.

Figure 3: An illustration of how organizations can use ISO/IEC 42001:2023 as a structured framework for AI risk management, using threat modeling as a key technique across the AI lifecycle

AWS tools for AI governance and risk management

AWS governance service capabilities support the controls required in the Statement of Applicability (SoA) under ISO/IEC 42001. These services and features help organizations operationalize responsible AI practices at scale and align with ISO/IEC 42001’s emphasis on structured, accountable AI lifecycle management.

Amazon SageMaker Model Cards: Provides standardized documentation for ML models including purpose, performance, and limitations. In the governance context, model cards help maintain transparency, accountability, and auditability of model behavior and use.
Amazon SageMaker Clarify: Detects bias in datasets and models and supports explainability of predictions. This directly supports governance controls related to fairness, non-discrimination, and explainability.
Amazon SageMaker Ground Truth: Provides high-quality, human-in-the-loop data labeling workflows. It supports data governance by making sure labeled datasets are accurate, consistent, and traceable.
Amazon Bedrock Guardrails: Can be used to define safety filters for generative AI, such as avoiding toxic content or harmful outputs. This facilitates alignment with ethical and content governance policies.
AWS CloudTrail and AWS Config: Enable audit logging and continuous monitoring of system changes. These are essential for accountability, traceability, and compliance reporting within AI governance frameworks.
AWS Identity and Access Management (IAM), AWS Key Management Service (AWS KMS), and AWS PrivateLink: IAM controls access, AWS KMS provides encryption and key management, and PrivateLink enables private connectivity. These features are critical for enforcing access governance, securing data, and maintaining privacy standards.
AWS Generative AI Lens: A part of the AWS Well-Architected Framework tool. It provides structured guidance for evaluating and improving the design of generative AI systems. It helps organizations implement responsible AI practices, manage risks

Conducting AI impact assessments for high risk use cases

While general risk assessments (Clause 6.1 of ISO/IEC 42001) are required for AI systems, ISO/IEC 42001 also calls for AIIAs in situations where the AI system poses high potential impact to individuals, groups, or society. AIIAs should result in a documented report of identified risks associated with the target AI activity, in addition to the severity of potential negative outcomes. These risks should be integrated into the AI management system (AIM) and monitored over time. Several stakeholders and specialists might need to provide input in the assessment process, such as legal, risk, compliance, data management, and security teams. Identified risks should be mitigated where possible, and a determination made about whether the residual risk is acceptable.

AIIAs help answer questions such as:

Is the AI use justifiable, ethical, and proportionate?
Could the system cause discrimination, exclusion, or loss of rights?
What safeguards should be built to protect affected people?

AIIA is required:

If the system makes or informs decisions that materially affect people
If the system is deployed in sensitive domains (such as healthcare, finance, or public services)
If risks to fundamental rights, fairness, or trust are flagged during initial risk assessments

AIAA should cover:

Purpose and scope of the AI system
Stakeholder and impact mapping
Legal, ethical, and social risk evaluation
Transparency and recourse mechanisms
Recommendations for mitigation

AIIA process workflow

Figure 4 that follows illustrates a generic AIIA workflow that includes initiating, scoping, assessing impact, planning mitigation, and documenting the outcome to evaluate how an AI system can affect individuals, groups, and society. Organizations can tailor this process to the AI system context, business objectives, and compliance requirements for their use case.

Figure 4: Sample prescriptive process with key phases on conducting an AIIA

AIIA outcome

AIIA reports should capture the core purpose of the exercise: to evaluate how an AI system might affect individuals, communities, and society at large and to make sure that potential risks are addressed through appropriate mitigation strategies. While formats might vary across industries, an AIIA outcome typically includes key sections such as summary of system purpose, a mapping of affected stakeholders, a contextual analysis of legal and social factors, an evaluation of likely impacts (including fairness, bias, and autonomy risks) and a plan for a mitigation, oversight, and monitoring. Governance details such as sign off responsibility and reassessment triggers should also be included.

Whether you’re starting from scratch or adapting an existing template, these foundational elements will help make sure that your documentation supports transparency, accountability, and ethical AI deployment.

Templates:

Mapping AI lifecycle risks to ISO/IEC 42001 controls

After you have identified risks through techniques such as threat modeling and impact assessments, the next step is to make sure that they’re mitigated through the appropriate ISO/IEC 42001 controls. Using the lifecycle stages defined in ISO/IEC 22989:2022, you can map AI risks identified during the threat hunting process to the corresponding ISO/IEC 42001:2023 clauses and Annex A controls. This mapping helps you align your AI development and governance efforts with a standards-based risk framework.

AI lifecycle stage	Identified risk	Relevant ISO/IEC 42001 clauses	Risk mitigation – Annex A controls
Inception	Spoofing: Impersonation	Clause 4, Clause 5	A.6.1 (Governance roles), A.5.1
Design and development	Tampering: Unauthorized changes	Clause 6.1, Clause 8.2	A.8.2, A.9.1
Verification and validation	Repudiation: No traceability	Clause 8.2	A.8.5, A.7.1
Deployment	Elevation of privilege: Unauthorized model tweaks	Clause 8.2, Clause 9.1	A.10.2, A.6.1
Operation and monitoring	Denial of service: System overload	Clause 9.1, Clause 10.1	A.8.3, A.10.3
Re-evaluation	Drift and new threat vectors	Clause 9.3, Clause 10.2	A.10.2, A.6.4
Retirement	Information disclosure: Residual risks	Clause 8.3, Clause 10.2	A.9.4, A.5.2

Maintaining AI governance

Like most technology risk and governance programs, AI management must be continuously monitored and maintained. ISO 42001 requires an organization to have leadership support and sufficient resources to operate effectively over time. This means that AI governance should be built into every process in the AI development and maintenance journey. AIIAs and threat modeling should be conducted at least annually on existing systems, and prior to the deployment of any new AI function. Policies should be reviewed at least annually and after major change to the AI system. Internal audits should review and monitor compliance with controls continuously, and organizations seeking ISO certification will require annual external audits. Progress toward governance goals and metrics on the status of known AI risks should be reported to the highest level of leadership in a live dashboard, and incidents of negative outcomes related to AI use should be tracked and analyzed to improve the AI system.

Conclusion

Managing AI risk effectively means aligning technical, organizational, and ethical considerations throughout the AI system lifecycle. ISO/IEC 42001 provides structure and accountability. Threat modeling techniques such as STRIDE, MITRE ATLAS, and OWASP for LLM surface deep technical risks. AWS services and features such as SageMaker Model Cards, SageMaker Clarify, and Amazon Bedrock Guardrails help embed governance into layers of AI development.

By combining technical tools, structured assessments, and standards-driven controls, you can build AI systems that are trustworthy, resilient, and aligned with societal expectations.

For additional guidance on achieving, maintaining, and automating compliance in the cloud, contact AWS Security Assurance Services (AWS SAS) or their account team. AWS SAS is a PCI QSAC and HITRUST Assessor Firm that can help by tying together applicable audit standards to AWS service specific features and functionality. They help you build on frameworks such as ISO 42001, PCI DSS, HITRUST CSF, NIST-CSF and Privacy Framework, SOC 2, HIPAA, ISO 27001 and 27701, and more. In addition, AWS Professional Services can also help you plan and map your compliance journey.

Disclaimer: The risk strategies and threat modeling guidance shared in this blog are intended to provide general direction and practical insight into implementing AI risk management under ISO/IEC 42001:2023. However, organizations are responsible for conducting their own context-specific risk assessments, as mandated by the standard. This blog should not be interpreted as an exhaustive approach to or guarantee of compliance with ISO/IEC 42001.

If you have feedback about this post, submit comments in the Comments section below.

Introducing the AWS User Guide to Governance, Risk and Compliance for Responsible AI Adoption within Financial Services Industries

2025-05-07 Krish De

Post Syndicated from Krish De original https://aws.amazon.com/blogs/security/introducing-the-aws-user-guide-to-governance-risk-and-compliance-for-responsible-ai-adoption-within-financial-services-industries/

Financial services institutions (FSIs) are increasingly adopting AI technologies to drive innovation and improve customer experiences. However, this adoption brings new governance, risk, and compliance (GRC) considerations that organizations need to address. To help FSI customers navigate these challenges, AWS is excited to announce the launch of the AWS User Guide to Governance, Risk and Compliance for Responsible AI Adoption within Financial Services Industries.

This comprehensive guide provides FSI customers practical considerations for responsible AI adoption across key dimensions including governance, risk management, compliance, data management, model management and AI agent management. It includes detailed AWS service capabilities that customers can use to address these considerations, such as Amazon Bedrock Guardrails, Amazon Bedrock Agents, Amazon SageMaker Autopilot, and Amazon SageMaker Model Monitor.

The guide is available through AWS Artifact and is complementary to other AWS resources such as the AWS Responsible Use of AI Guide, AWS Cloud Adoption Framework for AI, AWS Well-Architected Framework Generative AI Lens, and AWS Well-Architected Framework Machine Learning Lens.

As the regulatory environment and leading practices continue to evolve, we’ll provide further updates on the AWS Security Blog and AWS Compliance Center. You can also reach out to your AWS account team for help finding the resources you need.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Amazon OpenSearch Service launches flow builder to empower rapid AI search innovation

2025-05-02 Dylan Tong

Post Syndicated from Dylan Tong original https://aws.amazon.com/blogs/big-data/amazon-opensearch-service-launches-flow-builder-to-empower-rapid-ai-search-innovation/

You can now access the AI search flow builder on OpenSearch 2.19+ domains with Amazon OpenSearch Service and begin innovating AI search applications faster. Through a visual designer, you can configure custom AI search flows—a series of AI-driven data enrichments performed during ingestion and search. You can build and run these AI search flows on OpenSearch to power AI search applications on OpenSearch without you having to build and maintain custom middleware.

Applications are increasingly using AI and search to reinvent and improve user interactions, content discovery, and automation to uplift business outcomes. These innovations run AI search flows to uncover relevant information through semantic, cross-language, and content understanding; adapt information ranking to individual behaviors; and enable guided conversations to pinpoint answers. Nonetheless, search engines are limited in native AI-enhanced search support, so builders develop middleware to complement search engines to fill in functional gaps. This middleware consists of custom code that runs data flows to stitch data transformations, search queries, and AI enrichments in varying combinations tailored to use cases, datasets, and requirements.

With the new AI search flow builder for OpenSearch, you have a collaborative environment to design and run AI search flows on OpenSearch. You can find the visual designer within OpenSearch Dashboards under AI Search Flows, and get started quickly by launching preconfigured flow templates for popular use cases like semantic, multimodal or hybrid search, and retrieval augmented generation (RAG). Through configurations, you can create customize flows to enrich search and index processes through AI providers like Amazon Bedrock, Amazon SageMaker, Amazon Comprehend, OpenAI, DeepSeek, and Cohere. Flows can be programmatically exported, deployed, and scaled on any OpenSearch 2.19+ cluster through OpenSearch’s existing ingest, index, workflow and search APIs.

In the remainder of the post, we’ll walk through a couple of scenarios to demonstrate the flow builder. First, we’ll enable semantic search on your old keyword-based OpenSearch application without client-side code changes. Next, we’ll create a multi-modal RAG flow, to showcase how you can redefine image discovery within your applications.

AI search flow builder key concepts

Before we get started, let’s cover some key concepts. You can use the flow builder through APIs or a visual designer. The visual designer is recommended for helping you manage workflow projects. Each project contains at least one ingest or search flow. Flows are a pipeline of processor resources. Each processor applies a type of data transform such as encoding text into vector embeddings, or summarizing search results with a chatbot AI service.

Ingest flows are created to enrich data as it’s added to an index. They consist of:

A data sample of the documents you want to index.
A pipeline of processors that apply transforms on ingested documents.
An index constructed from the processed documents.

Search flows are created to dynamically enrich search request and results. They consist of:

A query interface based on the search API, defining how the flow is queried and ran.
A pipeline of processors that transform the request context or search results.

Generally, the path from prototype to production starts with deploying your AI connectors, designing flows from a data sample, then exporting your flows from a development cluster to a preproduction environment for testing at-scale.

Scenario 1: Enable semantic search on an OpenSearch application without client-side code changes

In this scenario, we have a product catalog that was built on OpenSearch a decade ago. We aim to improve its search quality, and in turn, uplift purchases. The catalog has search quality issues, for instance, a search for “NBA,” doesn’t surface basketball merchandise. The application is also untouched for a decade, so we aim to avoid changes to client-side code to reduce risk and implementation effort.

A solution requires the following:

An ingest flow to generate text embeddings (vectors) from text in an existing index.
A search flow that encodes search terms into text embeddings, and dynamically rewrites keyword-type match queries into a k-NN (vector) query to run a semantic search on the encoded terms. The rewrite allows your application to transparently run semantic-type queries through keyword-type queries.

We will also evaluate a second-stage reranking flow, which uses a cross-encoder to rerank results as it can potentially boost search quality.

We’ll accomplish our task through the flow builder. We begin by navigating to AI Search Flows in the OpenSearch Dashboard, and selecting Semantic Search from the template catalog.

image of the flow template catalog.

This template requires us to select a text embedding model. We’ll use Amazon Bedrock Titan Text, which was deployed as a prerequisite. Once the template is configured, we enter the designer’s main interface. From the preview, we can see that the template consists of a preset ingestion and search flow.

image of the visual flow designer.

The ingest flow requires us to provide a data sample. Our product catalog is currently served by an index containing the Amazon product dataset, so we import a data sample from this index.

importing a data sample from an existing index.

The ingest flow includes a ML Inference Ingest Processor, which generates machine learning (ML) model outputs such as embeddings (vectors) as your data is ingested into OpenSearch. As previously configured, the processor is set to use Amazon Titan Text to generate text embeddings. We map the data field that holds our product descriptions to the model’s inputText field to enable embedding generation.

Configuring the ML Inference Ingest Processor to generate text embeddings.

We can now run our ingest flow, which builds a new index containing our data sample embeddings. We can inspect the index’s contents to confirm that the embeddings were successfully generated.

Inspect your new index and embeddings from the flow designer.

Once we have an index, we can configure our search flow. We’ll start with updating the query interface, which is preset to a basic match query. The placeholder my_text has to be replaced with the product descriptions. With this update, our search flow can now respond to queries from our legacy application.

Update the search flow’s query interface

The search flow includes an ML Inference Search Processor. As previously configured, it’s set to use Amazon Titan Text. Since it’s added under Transform query, it’s applied to query requests. In this case, it will transform search terms into text embeddings (a query vector). The designer lists the variables from the query interface, allowing us to map the search terms (query.match.text.query), to the model’s inputText field. Text embeddings will now be generated from the search terms whenever our index is queried.

Configure a ML Inference Search Processor to generate query vectors.

Next, we update the query rewrite configurations, which is preset to rewrite the match query into a k-NN query. We replace the placeholder my_embedding with the query field assigned to your embeddings. Note that we could rewrite this to another query type, including a hybrid query, which may improve search quality.

Let’s compare our semantic and keyword solutions from the search comparison tool. Both solutions are able to find basketball merchandise when we search for “basketball.”

Keyword versus semantic search results on the term “basketball”.

But what happens if we search for “NBA?” Only our semantic search flow returns results because it detects the semantic similarities between “NBA” and “basketball.”

Keyword versus semantic search results on the term “NBA”.

We’ve managed improvements, but we might be able to do better. Let’s see if reranking our search results with a cross-encoder helps. We’ll add a ML Inference Search Processor under Transform response, so that the processor applies to search results, and select Cohere Rerank. From the designer, we see that Cohere Rerank requires a list of documents and the query context as input. Data transformations are needed to package the search results into a format that can be processed by Cohere Rerank. So, we apply JSONPath expressions to extract the query context, flatten data structures, and pack the product descriptions from our documents into a list.

configure a ML Inference Search Processor with a reranker and apply JSONPath expressions.

Let’s return to the search comparison tool to compare our flow variations. We don’t observe any meaningful difference in our previous search for “basketball” and “NBA.” However, improvements are observed when we search, “hot weather.” On the right, we see that the second and fifth search hit moved 32 and 62 spots up, and returned “sandals” that are well suited for “hot weather.”

Reranked search results for “hot weather” demonstrate search quality gains.

We’re ready to proceed to production, so we export our flows from our development cluster into our preproduction environment, use the workflow APIs to integrate our flows into automations, and scale our test processes through the bulk, ingest and search APIs.

Scenario 2: Use generative AI to redefine and elevate image search

In this scenario, we have photos of millions of fashion designs. We’re looking for a low-maintenance image search solution. We will use generative multimodal AI to modernize image search, eliminating the need for labor to maintain image tags and other metadata.

Our solution requires the following:

An ingest flow which uses a multimodal model like Amazon Titan Multimodal Embeddings G1 to generate image embeddings.
A search flow which generates text embeddings with a multimodal model, runs a k-NN query for text to image matching, and sends matching images to a generative model like Anthropic’s Claude Sonnet 3.7 that can operate on text and images.

We’ll start from the RAG with Vector Retrieval template. With this template, we can quickly configure a basic RAG flow. The template requires an embedding and large language model (LLM) that can process text and image content. We use Amazon Bedrock Titan Multimodal G1 and Anthropic’s Claude Sonnet 3.7, respectively.

From the designer’s preview panel, we can see similarities between this template and the semantic search template. Again, we seed the ingest flow with a data sample. Like the previous example, we use the Amazon product dataset except we replace the production descriptions with base64 encoded images because our models require base64 images, and this solution doesn’t require text. We map the base64 image data to the corresponding Amazon Titan G1 inputs to generate embeddings. We then run our ingest flow and confirm that our index contains base64 images and corresponding embeddings.

Ingest flow creates a new index containing base64 images and corresponding embeddings.

The initial steps for configuring this search flow are similar to the previous scenario: we update the query interface, map the query text fields to the model inputs for the ML Inference Search Processor, and revise the query rewrite settings. The main difference with this flow is the additional response processor set to use Anthropic’s Claude Sonnet 3.7 to process images.

We need to configure an LLM prompt that includes the query context and instructions for the LLM to play the role of a fashion advisor, and provide commentary about the image payload.

configure a LLM prompt to generate commentary for product images.

Next, we map the prompt and the base64 image data field to the model’s inputs accordingly.

Configure the ML Inference Search Processor to process search results with Anthropic Claude Sonnet 3.7.

Let’s test our multimodal RAG flow by searching for “sunset colored dresses.” We observe the following results.

This beautiful dress perfectly captures the essence of a sunset with its gorgeous ombre color gradient. Starting with a soft peachy pink at the top, it transitions into a vibrant orange in the middle, and finishes with a bright yellow at the hem – just like the colors of the sky as the sun dips below the horizon. The dress features a simple, casual silhouette with short cap sleeves and appears to have a comfortable, relaxed fit that would flatter many body types. The tie-dye effect gives it a bohemian, carefree vibe that’s perfect for summer outings, beach vacations, or casual evening events. I recommend this sunset-colored dress because it’s eye-catching, versatile, and embodies that magical sunset glow that can brighten any occasion. The colors would complement warm skin tones particularly well, and you could easily dress it up with gold accessories or keep it casual with simple sandals.

This vibrant tie-dye dress perfectly captures the essence of a sunset with its beautiful gradient of colors. The dress features bands of golden yellow, fiery orange, and rich pink that blend together seamlessly, reminiscent of the sky at dusk. The strapless design showcases the shoulders elegantly, while the gathered waist creates a flattering silhouette that flows into a comfortable, knee-length skirt. This dress would be an excellent choice for summer events, beach vacations, or casual outings. The sunset color palette is not only on-trend but also versatile enough to pair with neutral accessories. I recommend this piece for its eye-catching colors, comfortable fit, and the way it embodies the warm, relaxed feeling of watching a beautiful sunset.

Without any image metadata, OpenSearch finds images of sunset-colored dresses, and responds with accurate and colorful commentary.

Conclusion

The AI search flow builder is available in all AWS Regions that support OpenSearch 2.19+ on OpenSearch Service. To learn more, refer to Building AI search workflows in OpenSearch Dashboards, and the available tutorials on GitHub, which demonstrate how to integrate various AI models from Amazon Bedrock, SageMaker, and other AWS and third-party AI services.

About the authors

Dylan Tong is a Senior Product Manager at Amazon Web Services. He leads the product initiatives for AI and machine learning (ML) on OpenSearch including OpenSearch’s vector database capabilities. Dylan has decades of experience working directly with customers and creating products and solutions in the database, analytics and AI/ML domain. Dylan holds a BSc and MEng degree in Computer Science from Cornell University.

Tyler Ohlsen is a software engineer at Amazon Web Services focusing mostly on the OpenSearch Anomaly Detection and Flow Framework plugins.

Mingshi Liu is a Machine Learning Engineer at OpenSearch, primarily contributing to OpenSearch, ML Commons and Search Processors repo. Her work focuses on developing and integrating machine learning features for search technologies and other open-source projects.

Ka Ming Leung (Ming) is a Senior UX designer at OpenSearch, focusing on ML-powered search developer experiences as well as designing observability and cluster management features.

Use an Amazon Bedrock powered chatbot with Amazon Security Lake to help investigate incidents

2025-05-01 Madhunika Reddy Mikkili

Post Syndicated from Madhunika Reddy Mikkili original https://aws.amazon.com/blogs/security/use-an-amazon-bedrock-powered-chatbot-with-amazon-security-lake-to-help-investigate-incidents/

In part 2 of this series, we showed you how to use Amazon SageMaker Studio notebooks with natural language input to assist with threat hunting. This is done by using SageMaker Studio to automatically generate and run SQL queries on Amazon Athena with Amazon Bedrock and Amazon Security Lake. The Security Lake service team and the Open Cybersecurity Schema Framework (OCSF) community continue to add additional log sources and OCSF mappings to enable Security Lake to provide a consolidated source for customers to conduct security investigation.

Because security logging data sources continually grow, organizations need to provide a mechanism for their security teams to understand and query those data sources. You might have existing investigation and response playbooks that your security teams need to be well-versed in and know when to use. It can take security teams an extended period of time to onboard and understand the available security data sources and playbooks and how to efficiently use them to reduce the mean time to respond.

In this post, we show you how to extend the functionality from the previous post. You will learn how to deploy a security chatbot with a graphical user interface (GUI) and a serverless backend powered by an Amazon Bedrock agent that incorporates existing playbooks to investigate or respond to a security event. The chatbot demonstrates purpose-built Amazon Bedrock agents that help address security concerns depending on the user’s natural language input. The solution has a single GUI that provides a direct interface with the Amazon Bedrock agent to create and invoke SQL queries or provide recommendations for internal incident response playbooks to investigate or respond to possible security events.

Security chatbot sample solution overview

Figure 1: Security chatbot sample solution architecture diagram

Application flow as shown in Figure 1:

User submits a query through the React UI.

Note: The React UI used in this solution doesn’t have authentication built in. It’s recommended that you add authentication capabilities that follow your organization’s security requirements. You can add authentication capabilities by using Amazon Cognito and AWS Amplify UI.
The user’s query is sent to an Amazon API Gateway REST API, which invokes the Invoke Agent AWS Lambda function.
The Lambda function invokes the Amazon Bedrock agent with the user’s query.
The Amazon Bedrock agent (using Anthropic’s Claude 3 Sonnet) processes the query and decides between retrieving information from the playbooks or by querying Security Lake using Amazon Athena.

For playbook knowledge base queries:

The Amazon Bedrock agent queries the playbooks knowledge base and retrieves relevant results.

For Security Lake data queries:

The Amazon Bedrock agent queries the schema knowledge base and retrieves the Security Lake table schemas to create an SQL query.
The Amazon Bedrock agent invokes the SQL query action from the Amazon Bedrock action group, passing the SQL query as a parameter.
The action group invokes the Execute SQL on Athena Lambda function, which executes the query on Athena and returns the results to the Amazon Bedrock agent.

After retrieving results from the knowledge base or action group:

The Amazon Bedrock agent uses the retrieved information to formulate the final response and sends it back to the Invoke Agent Lambda function.
The Lambda function sends the response back to the client using an API Gateway WebSocket API.
API Gateway delivers the response to the React UI using a WebSocket connection to the client.
The agent’s response is displayed to the user in the chat interface.

Prerequisites

Before deploying the sample solution, complete the following prerequisites:

Enable Security Lake in your organization in AWS Organizations and specify a delegated administrator account to manage the Security Lake configuration for all member accounts in your organization. Configure Security Lake with the appropriate log sources: Amazon Virtual Private Cloud (Amazon VPC) Flow Logs, AWS Security Hub, AWS CloudTrail, and Amazon Route53.
Create subscriber query access from the source Security Lake AWS account to the subscriber AWS account.
Accept a resource share request in the subscriber AWS account in AWS Resource Access Manager (AWS RAM).
Create a database link in AWS Lake Formation in the subscriber AWS account and grant access for the Athena tables in the Security Lake AWS account.
Grant Anthropic’s Claude v3 model access for Amazon Bedrock in the AWS subscriber account where you will deploy the solution. If you try to use a model before you enable it in your AWS account, you will get an error message.

With the prerequisites in place, the sample solution architecture provisions the following resources:

Amazon CloudFront with an Amazon Simple Storage Service (Amazon S3) origin.
An Amazon S3 static website for the chatbot UI.
An API Gateway to call a Lambda function.
A Lambda function to invoke the Amazon Bedrock agent.
An Amazon Bedrock agent with a knowledge base.
1. An Amazon Bedrock agent action group to generate and invoke SQL queries on Athena.
  1. An Amazon Bedrock knowledge base to reference example Athena table schemas in Security Lake. Although the Amazon Bedrock agent can get rows directly from the Athena table, providing example table schemas improves SQL query generation accuracy for table columns in Security Lake.
  2. An Amazon Bedrock knowledge base to reference existing incident response playbooks. By incorporating this knowledge base, the Amazon Bedrock agent can suggest actions for investigation or response based on existing playbooks that have already been approved by your organization.

Cost

Before deploying the sample solution and walking through this post, it’s important to understand the cost of the AWS services being used. The cost will largely depend on the amount of data you interact with in Amazon Bedrock and by querying Security Lake with Athena.

Security Lake costs are determined by the volume of log and event data ingested from AWS services. Security Lake orchestrates other AWS services on your behalf, which incur separate charges. You can find more information on pricing for the respective services: Amazon S3, AWS Glue, Amazon EventBridge, AWS Lambda, Amazon Simple Query Service (Amazon SQS), and Amazon Simple Notification Service (Amazon SNS).
Amazon Bedrock on-demand pricing is based on the selected large language model (LLM) and the number of input and output tokens. A token is comprised of a few characters and refers to the basic unit of text that a model learns to understand the user input and prompts. For more details, see Amazon Bedrock pricing.
The SQL queries generated by Amazon Bedrock are invoked using Athena. Athena cost is based on the amount of data scanned within Security Lake for that query. For more details, see Athena pricing.

Deploy the sample chatbot

You can deploy the sample solution by using AWS Cloud Development Kit (AWS CDK). For instructions and more information on using the AWS CDK, see Get Started with AWS CDK.

Clone the sample-generative-ai-chatbot-for-amazon-security-lake repository.
Navigate to the project’s root folder.
Install the project dependencies.
Build and deploy the app using the following commands:
```
npm install -g aws-cdk
npm install 
cdk synth
```
Run the following commands in your terminal while signed in to your subscriber AWS account. Replace <INSERT_AWS_ACCOUNT> with your account number and replace <INSERT_REGION> with the AWS Region that you want the solution deployed to.
```
cdk bootstrap aws://<INSERT_AWS_ACCOUNT>/<INSERT_REGION>
cdk deploy –all
```

As part of the CDK deployment, there is an Output value for the React Application URL (FrontendAppStack.ReactAppUrl). You will use this value to interact with the GenAI application. Wait up to 5 mins for the URL to be live.

Post-deployment configuration steps

Now that you’ve deployed the solution, you need to add permissions to allow the Lambda function’s AWS Identity and Access Management (IAM) role and Amazon Bedrock to interact with your Security Lake data.

Grant permission to the Security Lake database

Copy the Lambda’s role ARN from the “BedrockAppStack” CloudFormation stack. The resource in the stack is named “athenaAgentSecurityLakeActionGroupLambdaServiceRole********”.
Go to the Lake Formation console.
Select the amazon_security_lake_glue_db_<YOUR-REGION> database. For example, if your Security Lake is in us-east-1, the value would be amazon_security_lake_glue_db_us_east_1
For Actions, select Grant.
In Grant Data Permissions, select SAML Users and Groups.
Paste the Lambda function IAM role ARN from Step 1.
In Database Permissions, select Describe, and then choose Grant.

Grant permission to Security Lake tables

You must repeat the following steps for each source configured within Security Lake. For example, if you have four sources configured within Security Lake, you must grant permissions for the Lambda function IAM role to each table. If you have multiple sources that are in separate Regions and you don’t have a rollup Region configured in Security Lake, you must repeat the steps for each source in each Region.

The following example grants permissions to the Security Hub table within Security Lake. For more information about granting table permissions, see the AWS Lake Formation user guide.

Copy the Lambda’s role ARN from the “BedrockAppStack” CloudFormation stack. The resource in the stack is named as “athenaAgentSecurityLakeActionGroupLambdaServiceRole********”.
Go to the Lake Formation console.
Select the amazon_security_lake_glue_db_<YOUR-REGION> database.
For example, if your Security Lake database is in us-east-1, the value would be amazon_security_lake_glue_db_us_east-1
Choose View Tables.
Select the amazon_security_lake_table_<YOUR-REGION>_sh_findings_1_0 table.
For example, if your Security Lake table is in us-east-1, the value would be amazon_security_lake_table_us_east_1_sh_findings_1_0

Note: Each table must be granted access individually. Selecting All Tables won’t grant the access needed to query Security Lake.
For Actions, select Grant.
In Grant Data Permissions, select SAML Users and Groups.
Paste the Lambda function IAM role ARN from Step 1.
In Table Permissions, select Describe, and then Grant.

Sync data sources

After you deploy the infrastructure, you need to sync the data sources in the Amazon Bedrock knowledge bases so that the data in Amazon S3 can be vectorized and made available in Amazon OpenSearch Serverless, which is the service used as a vector source by the knowledge bases in this solution.

In the Amazon Bedrock console, select Knowledge base and find the two Amazon Bedrock knowledge bases deployed in this solution: gen-ai-sec-lake-table-schema and gen-ai-sec-lake-runbooks. Navigate to each knowledge base and its data source. Then choose Sync for each data source.

Get the CloudFront distribution URL

As part of the sample solution, the chatbot uses an externally available CloudFront distribution URL. It’s recommended that you add appropriate security controls that align to your organization’s security requirements to the sample solution. For example, you might want to add authentication to CloudFront using Amazon Cognito and Lambda@Edge to help prevent unauthorized users from accessing this chatbot. You can also configure secure access and restrict access to the content.

Navigate to CloudFormation in the console.
In the Stacks section, select the FrontendAppStack.
Select the Outputs tab.
Copy the value ReactAppUrl.

Investigate with your security chatbot

Now that you’ve deployed the sample solution and configured the appropriate permissions, you’re ready to use natural language input to generate and invoke SQL queries and to recommend internal incident response playbooks.

Generate and invoke SQL queries

Although most security professionals already know how to create and invoke SQL queries, time is extremely important during a possible security event. Adding natural-language-to-SQL capability to the security agent can help reduce your mean time to respond, which can reduce the overall impact of a security event. Furthermore, you can use these same techniques for threat hunting in your AWS environment.

Question 1: Which AWS accounts have the most security findings for EC2 instances in the last 7 days?

Answer 1: The query results show the top two AWS accounts with the most security findings related to EC2 instances in the last 7 days. The account with ID XXXXXX10 had the highest number of findings at 252, followed by account XXXXXX28 with 197 findings.

The query covered the key requirements, so I provided the final answer summarizing the top accounts and their finding counts.

The question and response are shown in Figure 2.

Figure 2: Response for most EC2 security findings

The more context you provide to the chatbot, the better the query that will be generated. Providing a time window also gives a smaller data set to review and give quicker results.

Question 2: In those AWS accounts, what were the top EC2 security finding types?

Answer 2: Based on the query results, the top EC2 security finding types in the AWS accounts with the most EC2 instance findings in the last 7 days were:

For account XXXXXX10:

“Compliance Finding: Update” with 252 findings

For account XXXXXX28:

“Compliance Finding Update” with 166 findings
“Compliance Finding Create” with 24 findings
“Vulnerability Finding: Update” with 2 findings
“Vulnerability Finding: Create” with 2 findings.

The question and response are shown in Figure 3.

Figure 3: Response for top EC2 security finding types

The results show that the issues are primarily around Security Hub controls mapped to the type_name OCSF column. If you want to look at a different column, you can provide additional instructions or update the question.

Question 3: Which EC2 instances had vulnerability security findings?

Answer 3: Based on the query results, the following EC2 instances had vulnerability security findings in the AWS account with the most EC2 instance findings in the last 7 days:

In account XXXXXX28:

arn:aws:ec2:us-east-1: XXXXXX28:instance/<instance-id>
arn:aws:ec2:us-east-1: XXXXXX28:instance/<instance-id>

There were no vulnerability findings for EC2 instances in account XXXXXX10 during this time period.

The question and response are shown in Figure 4.

Figure 4: Response for EC2 instances with vulnerability security findings

In this question, the search is narrowed down to two Amazon Elastic Compute Cloud (Amazon EC2) instances within a specific AWS account to take a deeper look into.

As part of this sample solution, we used the AWS Incident Response Playbook Samples GitHub repository and uploaded them to an S3 bucket. Alternatively, you can create or reference your own playbooks to investigate and respond to various security events to refer to. After each question, the chatbot provides an answer along with an Amazon S3 URL to the playbook. The security analyst must have the appropriate permissions to download the playbook from Amazon S3.

Question 1: Do I have any playbooks that would help investigate a compromised EC2 instance for crypto mining or malware?

Answer 1: Yes, the search results include a relevant playbook for investigating a potentially compromised EC2 instance that could be used for crypto mining, malware, or ransomware. The key steps outlined in the playbook are:

Acquire, preserve, and document evidence – Identify indicators of compromise, determine attack vector, document impacted applications/users
Contain the incident – Isolate the compromised instance if possible without service disruption, remove from auto-scaling group if applicable
Eradicate the incident – Follow the steps to remove the malware/crypto miner, restore from known good backups/AMIs
Recovery – Spin up new instances from trusted AMIs, restore the data, and monitor for persistence of the threat
Lessons learned – Analyze the incident, identify root causes, and implement preventative measures

The playbook provides detailed guidance on each of these phases in the context of an EC2 ransomware incident, but the overall process can be adapted for investigating crypto mining, malware, or other threats on EC2 as well.

The question and response are shown in Figure 5.

Figure 5: Response for EC2 IR playbooks

The previous threat-hunting exercise identified two EC2 instances that had vulnerabilities present. After further investigation, if a security analyst believes the EC2 instance is compromised, they can ask for an IR playbook to respond to the security event. Once they determine which playbook makes the most sense, they can either click on the S3 URL or look at other playbooks to use.

Question 2: Do I have any playbooks that would help respond to ransomware in my S3 bucket?

Answer 2: Yes, the search results include a playbook that can help respond to a ransomware incident involving an S3 bucket. The “Incident Response Playbook Template for Unintended Access to an Amazon S3 Bucket” outlines steps for:

Gathering evidence of the unintended access
Containing the incident by restricting access to the affected bucket(s)
Eradicating the issue by removing any malicious data/objects and restoring from backup
Recovering by re-enablement access to the bucket for legitimate users
Conducting post-incident activities like reviewing logs, determining root cause, and implementing preventative measures

While focused on generic unintended access rather than ransomware specifically, many of these steps could be adapted and applied to a ransomware scenario targeting data in S3. The playbook provides a solid framework for incident response on S3.

Additionally, the general “Ransomware Incident Response Playbook” template covers common steps for any ransomware incident, such as:

Acquiring and preserving evidence of the ransomware infection
Containing the incident by isolating affected systems
Identifying and remove the ransomware
Recovering data from backups or known good sources
Conducting post-incident review

While not S3-specific, this playbook provides complementary guidance that could be used in conjunction with the S3 access playbook for an S3 ransomware incident.

The question and response are shown in Figure 6.

Figure 6: Response for S3 ransomware IR playbooks

The response provides a high-level overview of the specific playbook that’s being referenced to provide the analyst with context before diving deeper into any single playbook.

Adding functionality to the security chatbot

This sample solution was developed to show the art of the possible. Each customer uses AWS resources to address their business needs in their own way, and security teams must be appropriately equipped to help secure their respective environments. Here are some possible enhancements that you can incorporate into the sample solution to align to your security use-cases and needs.

Incorporate an Amazon DynamoDB table to use as part of reporting interactions tied to a specific event or finding GUID. By incorporating an audit trail, you can tie actions taken by the agent and associated resources to a security event and validate the outcome of the investigation before taking action.
Tuning the backend chatbot agent to query Amazon Linux Security Center’s Common Vulnerabilities and Exposures (CVE) list or MITRE’s CVE list to see which AWS resources might be in scope and send out consolidated messages to resource owners with recommended actions.
Tuning the backend chatbot agent to take natural language requests and respond with detectors or correlation rules for Amazon OpenSearch or query language for custom detections in your security information and event management (SIEM) tool.
Adding a new data source to Athena, such as AWS Config, to provide the analyst with additional capabilities to query AWS resource configuration across the AWS environment that might have been impacted by a security event. For example, if a security finding shows that an S3 bucket has been made public, querying what and when other configuration changes were made to the S3 bucket.
Incorporating multi-agent-orchestration to scale the use of multiple Amazon Bedrock agents that can be tuned towards niche security use cases by respective teams. The chatbot can speak directly to a classifier or controller, which then addresses the user’s natural language request and orchestrates across one or more agents to generate a response. For example, if a user asks which EC2 instances might have been impacted by a security event and which playbook to use to respond, the classifier agent could direct the initial query to the agent in this sample solution. In the same chat window, the analyst could ask if there are any open CVEs for the EC2 instances in scope to get a list of CVEs to address within the AWS account.
For long running Athena queries, you can incorporate an AWS Step Function to the workflow and incorporate a task token to wait for the Athena results to return.

Clean up

If you deployed the security chatbot sample solution by using the Launch Stack button and the console with the CloudFormation template security_genai_chatbot_cfn, do the following to clean up:

In the CloudFormation console for the account and Region where you deployed the solution, choose the SecurityGenAIChatbot stack.
Choose the option to Delete the stack.

If you deployed the solution by using the AWS CDK, run the command cdk destroy --all.

Conclusion

The sample solution demonstrates how you can use task-oriented Amazon Bedrock agents and natural language input to help accelerate investigation and analysis and increase your overall security posture. We provided an example of a sample solution with a user interface that is powered by an Amazon Bedrock agent, which you can extend to add additional task-oriented agents, each with their own instructions, knowledge bases, and models. By extending the use of AI-powered agents, you can help your security team operate more efficiently across multiple security domains within your AWS environment.

The backend for the chatbot to investigate security events uses Security Lake, which normalizes data into Open Cybersecurity Schema Framework (OCSF); as long as the data schema is normalized, the solution can be applied to other data lakes within your AWS environment.

To learn more, see the other posts in this series:

Use the comments section to provide feedback. If you have questions about this post, start a new thread on the Generative AI on AWS re:Post or contact AWS Support.

Amazon Nova Premier: Our most capable model for complex tasks and teacher for model distillation

2025-05-01 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/amazon-nova-premier-our-most-capable-model-for-complex-tasks-and-teacher-for-model-distillation/

Today we’re expanding the Amazon Nova family of foundation models announced at AWS re:Invent with the general availability of Amazon Nova Premier, our most capable model for complex tasks and teacher for model distillation.

Nova Premier joins the existing Amazon Nova understanding models available in Amazon Bedrock. Similar to Nova Lite and Pro, Premier can process input text, images, and videos (excluding audio). With its advanced capabilities, Nova Premier excels at complex tasks that require deep understanding of context, multistep planning, and precise execution across multiple tools and data sources. With a context length of one million tokens, Nova Premier can process extremely long documents or large code bases.

With Nova Premier and Amazon Bedrock Model Distillation, you can create highly capable, cost-effective, and low-latency versions of Nova Pro, Lite, and Micro, for your specific needs. For example, we used Nova Premier to distill Nova Pro for complex tool selection and API calling. The distilled Nova Pro had a 20% higher accuracy for API invocations compared to the base model and consistently matched the performance of the teacher, with the speed and cost benefits of Nova Pro.

Amazon Nova Premier benchmark evaluation
We evaluated Nova Premier on a broad range of benchmarks across text intelligence, visual intelligence, and agentic workflows. Nova Premier is the most capable model in the Nova family as measured across 17 benchmarks as shown in the table below.

Nova Premier is also comparable to the best non-reasoning models in the industry and is equal or better on approximately half of these benchmarks when compared to other models in the same intelligence tier. Details of these evaluations are in the technical report.

Nova Premier is also the fastest and the most cost-effective model in Amazon Bedrock for its intelligence tier. For further details and comparison on pricing, please refer to the Bedrock pricing page.

Nova Premier can also be used as a teacher model for distillation, which means you can transfer its advanced capabilities for a specific use case into smaller, faster, and more efficient models like Nova Pro, Micro, and Lite for production deployments.

Using Amazon Nova Premier
To get started with Nova Premier, you first need to request access to the model in the Amazon Bedrock console. Navigate to Model access in the navigation pane, find Nova Premier, and toggle access.

Once you have access, you can use Nova Premier through the Amazon Bedrock Converse API providing in input a list of messages from the user and the assistant. Messages can include text, images, and videos. Here’s an example of a straightforward invocation using the AWS SDK for Python (Boto3):

import boto3
import json

AWS_REGION = "us-east-1"
MODEL_ID = "us.amazon.nova-premier-v1:0"

bedrock_runtime = boto3.client('bedrock-runtime', region_name=AWS_REGION)
messages = [
    {
        "role": "user",
        "content": [
            {
                "text": "Explain the differences between vector databases and traditional relational databases for AI applications."
            }
        ]
    }
]

response = bedrock_runtime.converse(
    modelId=MODEL_ID,
    messages=messages
)

response_text = response["output"]["message"]["content"][-1]["text"]

print(response_text)

This example shows how Nova Premier can provide detailed explanations for complex technical questions. But the real power of Premier comes with its ability to handle sophisticated workflows.

Multi-agent collaboration use case
Let’s explore a more complex scenario that showcases how Nova Premier works a multi-agent collaboration architecture for investment research.

The equity research process typically involves multiple stages: identifying relevant data sources for specific investments, retrieving required information from those sources, and synthesizing the data into actionable insights. This process becomes increasingly complex when dealing with different types of financial instruments like stock indices, individual equities, and currencies.

We can build this type of application using multi-agent collaboration in Amazon Bedrock, with Nova Premier powering the supervisor agent that orchestrates the entire workflow. The supervisor agent analyzes the initial query (for example, “What are the emerging trends in renewable energy investments?”), breaks it down into logical steps, determines which specialized subagents to engage, and synthesizes the final response.

For this scenario, I’ve created a system with the following components:

A supervisor agent powered by Nova Premier
Multiple specialized subagents powered by Nova Pro, each focusing on different financial data sources
Tools that connect to financial databases, market analysis tools, and other relevant information sources

When I submit a query about emerging trends in renewable energy investments, the supervisor agent powered by Nova Premier does the following:

Analyzes the query to determine the underlying topics and sources to cover
Selects the appropriate subagents specific to those topics and sources
Each subagent retrieves their relevant economic indicators, technical analysis, and market sentiment data
The supervisor agent synthesizes this information into a comprehensive report for review by a financial professional

Utilizing Nova Premier in a multi-agent collaboration architecture such as this streamlines the financial professional’s work and helps them formulate their investment analysis faster. The following video provides a visual description of this scenario.

The key advantage of using Nova Premier for the supervisor role is its accuracy in coordinating complex workflows, so that the right data sources are consulted in the optimal sequence and each subagent receives in input the correct information for their work, resulting in higher quality insights.

Multi-agent collaboration with model distillation
Although Nova Premier provides the highest level of accuracy of its family of models, you might want to optimize latency and cost in production environments. This is where the strength of Nova Premier as a teacher model for distillation becomes interesting. Using Amazon Bedrock Model Distillation, we can customize Nova Micro from the results of Nova Premier for this specific investment research use case.

Unlike traditional fine-tuning that requires human feedback and labeled examples, with model distillation you can generate high-quality training data by having a teacher model produce the desired outputs, streamlining the data acquisition process.

The process to distill a model involves:

Generating synthetic training data by capturing input and output from Nova Premier runs across multiple financial instruments
Using this data as a reference to train a customized version of Nova Micro through custom fine-tuning tools
Evaluating the difference in latency and performance of the customized Micro model
Deploying the customized Micro model as the supervisor agent in production

With Amazon Bedrock, you can further streamline the process and use invocation logs for data preparation. To do that, you need to set the model invocation logging on and set up an Amazon Simple Storage Service (Amazon S3) bucket as the destination for the logs.

Customer voices
Some of our customers had early access to Nova Premier. This is what they shared with us:

“Amazon Nova Premier has been outstanding in its ability to execute interactive analysis workflows, while still being faster and nearly half the cost compared to other leading models in our tests,” said Curtis Allen, Senior Staff Engineer at Slack, a company bringing conversations, apps, and customers together in one place.

“Implementing new solutions built on top of Amazon Nova has helped us with our mission of democratizing finance for all,” said Dev Tagare, Head of AI and Data at Robinhood Markets, a company on a mission to democratize finance for all. “We’re particularly excited about the ability to explore new avenues like complex multi-agent collaborations that are not just highly performing but also cost effective and fast. The intelligence of Nova Premier and what it can transfer to the other models like Nova Micro, Nova Lite, and Nova Pro unlocks multi-agent collaboration at a performance, price, and speed that will make it accessible to everyday customers.”

“Accelerating real-world AI deployments—not just prototypes—requires the ability to build models that are specialized for the unique needs of real world applications,” said Henry Ehrenberg, co-founder of Snorkel AI, a technology company that empowers data scientists and developers to quickly turn data into accurate and adaptable AI applications. “We’re excited to see AWS pushing efficient model customization forward with Amazon Bedrock Model Distillation and Amazon Nova Premier. These new model capabilities have the potential to accelerate our enterprise customers in building production AI applications, including Q&A applications with multimodal data and more.”

Things to know

Nova Premier is available in Amazon Bedrock in the US East (N. Virginia), US East (Ohio), and US West (Oregon) AWS Regions today via cross-Region inference. With Amazon Bedrock, you only pay for what you use. For more information, visit Amazon Bedrock pricing.

Customers in the US can also access Amazon Nova models at https://nova.amazon.com, a website to easily explore our FMs.

Nova Premier is our best teacher for distilling custom variants of Nova Pro, Micro, and Lite, which means you can capture the capabilities offered by Premier in smaller, faster models for production deployment.

Nova Premier includes built-in safety controls to promote responsible AI use, with content moderation capabilities that help maintain appropriate outputs across a wide range of applications.

To get started with Nova Premier, visit the Amazon Bedrock console today. For more information, see the Amazon Nova User Guide and send feedback to AWS re:Post for Amazon Bedrock. Explore the generative AI section of our community.aws site to see how our Builder communities are using Amazon Bedrock in their solutions.

– Danilo

How is the News Blog doing? Take this 1 minute survey!

(This survey is hosted by an external company. AWS handles your information as described in the AWS Privacy Notice. AWS will own the data gathered via this survey and will not share the information collected with survey respondents.)

Llama 4 models from Meta now available in Amazon Bedrock serverless

2025-04-29 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/llama-4-models-from-meta-now-available-in-amazon-bedrock-serverless/

The newest AI models from Meta, Llama 4 Scout 17B and Llama 4 Maverick 17B, are now available as a fully managed, serverless option in Amazon Bedrock. These new foundation models (FMs) deliver natively multimodal capabilities with early fusion technology that you can use for precise image grounding and extended context processing in your applications.

Llama 4 uses an innovative mixture-of-experts (MoE) architecture that provides enhanced performance across reasoning and image understanding tasks while optimizing for both cost and speed. This architectural approach enables Llama 4 to offer improved performance at lower cost compared to Llama 3, with expanded language support for global applications.

The models were already available on Amazon SageMaker JumpStart, and you can now use them in Amazon Bedrock to streamline building and scaling generative AI applications with enterprise-grade security and privacy.

Llama 4 Maverick 17B – A natively multimodal model featuring 128 experts and 400 billion total parameters. It excels in image and text understanding, making it suitable for versatile assistant and chat applications. The model supports a 1 million token context window, giving you the flexibility to process lengthy documents and complex inputs.

Llama 4 Scout 17B – A general-purpose multimodal model with 16 experts, 17 billion active parameters, and 109 billion total parameters that delivers superior performance compared to all previous Llama models. Amazon Bedrock currently supports a 3.5 million token context window for Llama 4 Scout, with plans to expand in the near future.

Use cases for Llama 4 models
You can use the advanced capabilities of Llama 4 models for a wide range of use cases across industries:

Enterprise applications – Build intelligent agents that can reason across tools and workflows, process multimodal inputs, and deliver high-quality responses for business applications.

Multilingual assistants – Create chat applications that understand images and provide high-quality responses across multiple languages, making them accessible to global audiences.

Code and document intelligence – Develop applications that can understand code, extract structured data from documents, and provide insightful analysis across large volumes of text and code.

Customer support – Enhance support systems with image analysis capabilities, enabling more effective problem resolution when customers share screenshots or photos.

Content creation – Generate creative content across multiple languages, with the ability to understand and respond to visual inputs.

Research – Build research applications that can integrate and analyze multimodal data, providing insights across text and images.

Using Llama 4 models in Amazon Bedrock
To use these new serverless models in Amazon Bedrock, I first need to request access. In the Amazon Bedrock console, I choose Model access from the navigation pane to toggle access to Llama 4 Maverick 17B and Llama 4 Scout 17B models.

The Llama 4 models can be easily integrated into your applications using the Amazon Bedrock Converse API, which provides a unified interface for conversational AI interactions.

Here’s an example of how to use the AWS SDK for Python (Boto3) with Llama 4 Maverick for a multimodal conversation:

import boto3
import json
import os

AWS_REGION = "us-west-2"
MODEL_ID = "us.meta.llama4-maverick-17b-instruct-v1:0"
IMAGE_PATH = "image.jpg"


def get_file_extension(filename: str) -> str:
    """Get the file extension."""
    extension = os.path.splitext(filename)[1].lower()[1:] or 'txt'
    if extension == 'jpg':
        extension = 'jpeg'
    return extension


def read_file(file_path: str) -> bytes:
    """Read a file in binary mode."""
    try:
        with open(file_path, 'rb') as file:
            return file.read()
    except Exception as e:
        raise Exception(f"Error reading file {file_path}: {str(e)}")

bedrock_runtime = boto3.client(
    service_name="bedrock-runtime",
    region_name=AWS_REGION
)

request_body = {
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "text": "What can you tell me about this image?"
                },
                {
                    "image": {
                        "format": get_file_extension(IMAGE_PATH),
                        "source": {"bytes": read_file(IMAGE_PATH)},
                    }
                },
            ],
        }
    ]
}

response = bedrock_runtime.converse(
    modelId=MODEL_ID,
    messages=request_body["messages"]
)

print(response["output"]["message"]["content"][-1]["text"])

This example demonstrates how to send both text and image inputs to the model and receive a conversational response. The Converse API abstracts away the complexity of working with different model input formats, providing a consistent interface across models in Amazon Bedrock.

For more interactive use cases, you can also use the streaming capabilities of the Converse API:

response_stream = bedrock_runtime.converse_stream(
    modelId=MODEL_ID,
    messages=request_body['messages']
)

stream = response_stream.get('stream')
if stream:
    for event in stream:

        if 'messageStart' in event:
            print(f"\nRole: {event['messageStart']['role']}")

        if 'contentBlockDelta' in event:
            print(event['contentBlockDelta']['delta']['text'], end="")

        if 'messageStop' in event:
            print(f"\nStop reason: {event['messageStop']['stopReason']}")

        if 'metadata' in event:
            metadata = event['metadata']
            if 'usage' in metadata:
                print(f"Usage: {json.dumps(metadata['usage'], indent=4)}")
            if 'metrics' in metadata:
                print(f"Metrics: {json.dumps(metadata['metrics'], indent=4)}")

With streaming, your applications can provide a more responsive experience by displaying model outputs as they are generated.

Things to know
The Llama 4 models are available today with a fully managed, serverless experience in Amazon Bedrock in the US East (N. Virginia) and US West (Oregon) AWS Regions. You can also access Llama 4 in US East (Ohio) via cross-region inference.

As usual with Amazon Bedrock, you pay for what you use. For more information, see Amazon Bedrock pricing.

These models support 12 languages for text (English, French, German, Hindi, Italian, Portuguese, Spanish, Thai, Arabic, Indonesian, Tagalog, and Vietnamese) and English when processing images.

To start using these new models today, visit the Meta Llama models section in the Amazon Bedrock User Guide. You can also explore how our Builder communities are using Amazon Bedrock in their solutions in the generative AI section of our community.aws site.

— Danilo

How is the News Blog doing? Take this 1 minute survey!

Writer Palmyra X5 and X4 foundation models are now available in Amazon Bedrock

2025-04-28 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/writer-palmyra-x5-and-x4-foundation-models-are-now-available-in-amazon-bedrock/

One thing we’ve witnessed in recent months is the expansion of context windows in foundation models (FMs), with many now handling sequence lengths that would have been unimaginable just a year ago. However, building AI-powered applications that can process vast amounts of information while maintaining the reliability and security standards required for enterprise use remains challenging.

For these reasons, we’re excited to announce that Writer Palmyra X5 and X4 models are available today in Amazon Bedrock as a fully managed, serverless offering. AWS is the first major cloud provider to deliver fully managed models from Writer. Palmyra X5 is a new model launched today by Writer. Palmyra X4 was previously available in Amazon Bedrock Marketplace.

Writer Palmyra models offer robust reasoning capabilities that support complex agent-based workflows while maintaining enterprise security standards and reliability. Palmyra X5 features a one million token context window, and Palmyra X4 supports a 128K token context window. With these extensive context windows, these models remove some of the traditional constraints for app and agent development, enabling deeper analysis and more comprehensive task completion.

With this launch, Amazon Bedrock continues to bring access to the most advanced models and the tools you need to build generative AI applications with security, privacy, and responsible AI.

As a pioneer in FM development, Writer trains and fine-tunes its industry leading models on Amazon SageMaker HyperPod. With its optimized distributed training environment, Writer reduces training time and brings its models to market faster.

Palmyra X5 and X4 use cases
Writer Palmyra X5 and X4 are designed specifically for enterprise use cases, combining powerful capabilities with stringent security measures, including System and Organization Controls (SOC) 2, Payment Card Industry Data Security Standard (PCI DSS), and Health Insurance Portability and Accountability Act (HIPAA) compliance certifications.

Palmyra X5 and X4 models excel in various enterprise use cases across multiple industries:

Financial services – Palmyra models power solutions across investment banking and asset and wealth management, including deal transaction support, 10-Q, 10-K and earnings transcript highlights, fund and market research, and personalized client outreach at scale.

Healthcare and life science – Payors and providers use Palmyra models to build solutions for member acquisition and onboarding, appeals and grievances, case and utilization management, and employer request for proposal (RFP) response. Pharmaceutical companies use these models for commercial applications, medical affairs, R&D, and clinical trials.

Retail and consumer goods – Palmyra models enable AI solutions for product description creation and variation, performance analysis, SEO updates, brand and compliance reviews, automated campaign workflows, and RFP analysis and response.

Technology – Companies across the technology sector implement Palmyra models for personalized and account-based marketing, content creation, campaign workflow automation, account preparation and research, knowledge support, job briefs and candidate reports, and RFP responses.

Palmyra models support a comprehensive suite of enterprise-grade capabilities, including:

Adaptive thinking – Hybrid models combining advanced reasoning with enterprise-grade reliability, excelling at complex problem-solving and sophisticated decision-making processes.

Multistep tool-calling – Support for advanced tool-calling capabilities that can be used in complex multistep workflows and agentic actions, including interaction with enterprise systems to perform tasks like updating systems, executing transactions, sending emails, and triggering workflows.

Enterprise-grade reliability – Consistent, accurate results while maintaining strict quality standards required for enterprise use, with models specifically trained on business content to align outputs with professional standards.

Using Palmyra X5 and X4 in Amazon Bedrock
As for all new serverless models in Amazon Bedrock, I need to request access first. In the Amazon Bedrock console, I choose Model access from the navigation pane to enable access to Palmyra X5 and Palmyra X4 models.

When I have access to the models, I can start building applications with any AWS SDKs using the Amazon Bedrock Converse API. The models use cross-Region inference with these inference profiles:

For Palmyra X5: us.writer.palmyra-x5-v1:0
For Palmyra X4: us.writer.palmyra-x4-v1:0

Here’s a sample implementation with the AWS SDK for Python (Boto3). In this scenario, there is a new version of an existing product. I need to prepare a detailed comparison of what’s new. I have the old and new product manuals. I use the large input context of Palmyra X5 to read and compare the two versions of the manual and prepare a first draft of the comparison document.

import sys
import os
import boto3
import re

AWS_REGION = "us-west-2"
MODEL_ID = "us.writer.palmyra-x5-v1:0"
DEFAULT_OUTPUT_FILE = "product_comparison.md"

def create_bedrock_runtime_client(region: str = AWS_REGION):
    """Create and return a Bedrock client."""
    return boto3.client('bedrock-runtime', region_name=region)

def get_file_extension(filename: str) -> str:
    """Get the file extension."""
    return os.path.splitext(filename)[1].lower()[1:] or 'txt'

def sanitize_document_name(filename: str) -> str:
    """Sanitize document name."""
    # Remove extension and get base name
    name = os.path.splitext(filename)[0]
    
    # Replace invalid characters with space
    name = re.sub(r'[^a-zA-Z0-9\s\-\(\)\[\]]', ' ', name)
    
    # Replace multiple spaces with single space
    name = re.sub(r'\s+', ' ', name)
    
    # Strip leading/trailing spaces
    return name.strip()

def read_file(file_path: str) -> bytes:
    """Read a file in binary mode."""
    try:
        with open(file_path, 'rb') as file:
            return file.read()
    except Exception as e:
        raise Exception(f"Error reading file {file_path}: {str(e)}")

def generate_comparison(client, document1: bytes, document2: bytes, filename1: str, filename2: str) -> str:
    """Generate a markdown comparison of two product manuals."""
    print(f"Generating comparison for {filename1} and {filename2}")
    try:
        response = client.converse(
            modelId=MODEL_ID,
            messages=[
                {
                    "role": "user",
                    "content": [
                        {
                            "text": "Please compare these two product manuals and create a detailed comparison in markdown format. Focus on comparing key features, specifications, and highlight the main differences between the products."
                        },
                        {
                            "document": {
                                "format": get_file_extension(filename1),
                                "name": sanitize_document_name(filename1),
                                "source": {
                                    "bytes": document1
                                }
                            }
                        },
                        {
                            "document": {
                                "format": get_file_extension(filename2),
                                "name": sanitize_document_name(filename2),
                                "source": {
                                    "bytes": document2
                                }
                            }
                        }
                    ]
                }
            ]
        )
        return response['output']['message']['content'][0]['text']
    except Exception as e:
        raise Exception(f"Error generating comparison: {str(e)}")

def main():
    if len(sys.argv) < 3 or len(sys.argv) > 4:
        cmd = sys.argv[0]
        print(f"Usage: {cmd} <manual1_path> <manual2_path> [output_file]")
        sys.exit(1)

    manual1_path = sys.argv[1]
    manual2_path = sys.argv[2]
    output_file = sys.argv[3] if len(sys.argv) == 4 else DEFAULT_OUTPUT_FILE
    paths = [manual1_path, manual2_path]

    # Check each file's existence
    for path in paths:
        if not os.path.exists(path):
            print(f"Error: File does not exist: {path}")
            sys.exit(1)

    try:
        # Create Bedrock client
        bedrock_runtime = create_bedrock_runtime_client()

        # Read both manuals
        print("Reading documents...")
        manual1_content = read_file(manual1_path)
        manual2_content = read_file(manual2_path)

        # Generate comparison directly from the documents
        print("Generating comparison...")
        comparison = generate_comparison(
            bedrock_runtime,
            manual1_content,
            manual2_content,
            os.path.basename(manual1_path),
            os.path.basename(manual2_path)
        )

        # Save comparison to file
        with open(output_file, 'w') as f:
            f.write(comparison)

        print(f"Comparison generated successfully! Saved to {output_file}")

    except Exception as e:
        print(f"Error: {str(e)}")
        sys.exit(1)

if __name__ == "__main__":
    main()

To learn how to use Amazon Bedrock with AWS SDKs, browse the code samples in the Amazon Bedrock User Guide.

Things to know
Writer Palmyra X5 and X4 models are available in Amazon Bedrock today in the US West (Oregon) AWS Region with cross-Region inference. For the most up-to-date information on model support by Region, refer to the Amazon Bedrock documentation. For information on pricing, visit Amazon Bedrock pricing.

These models support English, Spanish, French, German, Chinese, and multiple other languages, making them suitable for global enterprise applications.

Using the expansive context capabilities of these models, developers can build more sophisticated applications and agents that can process extensive documents, perform complex multistep reasoning, and handle sophisticated agentic workflows.

To start using Writer Palmyra X5 and X4 models today, visit the Writer model section in the Amazon Bedrock User Guide. You can also explore how our Builder communities are using Amazon Bedrock in their solutions in the generative AI section of our community.aws site.

Let us know what you build with these powerful new capabilities!

— Danilo

How is the News Blog doing? Take this 1 minute survey!

Amazon introduces SWE-PolyBench, a multilingual benchmark for AI Coding Agents

2025-04-23 Christian Bock

Post Syndicated from Christian Bock original https://aws.amazon.com/blogs/devops/amazon-introduces-swe-polybench-a-multi-lingual-benchmark-for-ai-coding-agents/

Coding agents powered by large language models have shown impressive capabilities in software engineering tasks, but evaluating their performance across diverse programming languages and real-world scenarios remains challenging. This led to a recent explosion in benchmark creation to assess the coding effectiveness of said systems in controlled environments. In particular, SWE-Bench which measures the performance of systems in the context of GitHub issues has spurred the development of capable coding agents resulting in over 50 leaderboard submissions, thereby becoming the de-facto standard for coding agent benchmarking. Despite its significant impact as a pioneering benchmark, SWE-Bench, and in particular its “verified” subset, also shows some limitations. It contains only Python repositories, the majority of tasks are bug fixes, and at over 45% of all tasks, the Django repository is significantly over-represented.

Today, Amazon introduces SWE-PolyBench, the first industry benchmark to evaluate AI coding agents’ ability to navigate and understand complex codebases, introducing rich metrics to advance AI performance in real-world scenarios. SWE-PolyBench contains over 2,000 curated issues in four languages. In addition, it contains a stratified subset of 500 issues (SWE-PolyBench500) for the purpose of rapid experimentation. SWE-PolyBench evaluates the performance of AI coding agents through a comprehensive set of metrics: pass rates across different programming languages and task complexity levels, along with precision and recall measurements for code/file context identification. These evaluation metrics can help the community address challenges in understanding how well AI coding agents can navigate through and comprehend complex codebases

The leaderboard is accessible here. The SWE-PolyBench dataset is available on Hugging Face and the paper at arxiv. Evaluations can be run using the SWE-PolyBench codebase.

Below, we describe the key features, characteristics, and the creation process of our dataset alongside the new evaluation metrics, and performance of open source agents from our experiments.

Key features of SWE-PolyBench at a glance

Multi-Language Support: Java (165 tasks), JavaScript (1017 tasks), TypeScript (729 tasks), and Python (199 tasks).
Extensive Dataset: 2110 instances from 21 repositories ranging from web frameworks to code editors and ML tools, on the same scale as SWE-Bench full with more repository.
Task Variety: Includes bug fixes, feature requests, and code refactoring.
Faster Experimentation: SWE-PolyBench500 is a stratified subset for efficient experimentation.
Leaderboard: A leaderboard with a rich set of metrics for transparent benchmarking.

Building a comprehensive dataset

The creation of SWE-PolyBench involved a data collection and filtering process designed to ensure the quality and relevance of the benchmark tasks. SWE-Bench, a benchmark for Python code generation, evaluates agents on real-world programming tasks by utilizing GitHub issues and their corresponding code and test modifications. We extended the SWE-Bench data acquisition pipeline to support 3 additional languages besides Python and used it to gather and process coding challenges from real-world repositories as shown in Figure 1.

A flowchart diagram showing a software development process. It starts with an issue (#3039) and pull request (#3147) on the left, goes through a metadata filter in the middle, then splits into a runtime setup and testing phase on the right. The testing phase feeds into a test-based filter at the end. The diagram includes icons for programming languages like JavaScript, TypeScript, Python, and Java.

Figure 1: Overview of the SWE-PolyBench data generation pipeline, illustrating the process of collecting, filtering, and validating coding tasks.

The data acquisition pipeline collects pull requests (PRs) that close issues from popular repositories across Java, JavaScript, TypeScript, and Python. These PRs undergo filtering and are set up in containerized environments for consistent test execution. The process categorizes tests as fail-to-pass (F2P) or pass-to-pass (P2P) based on their outcomes before and after patch application. Only PRs with at least one F2P test are included in the final dataset, ensuring that each task represents a meaningful coding challenge. This streamlined approach results in a dataset that closely mimics real-world coding scenarios, providing a robust foundation for evaluating AI coding assistants.

Dataset characteristics

When constructing SWE-PolyBench, we aimed to collect GitHub issues that represent diverse programming scenarios: issues involving modifications across multiple code files and spanning different task categories (such as bug fixes, feature requests, and refactoring). Tables 1 and 2 provide descriptive statistics on the composition and complexity of SWE-PolyBench full (PB) and SWE-PolyBench500 (PB500). To offer a point of reference, we compare these statistics with those of SWE-Bench (SWE) and SWE-Bench verified (SWEv). Tasks in SWE-PolyBench require on average more files to be modified and more nodes to be changed, which indicates that they have higher complexity and are closer to tasks in real-world projects. The distribution of tasks is also more diverse, in particular for SWE-PolyBench500.

A comparison table showing statistics for different software benchmarks (SWE-PolyBench, SWE-PolyBench500, SWE-Bench, and SWE-Bench verified). The table has two main sections: Modified Files showing average changes across programming languages (Python, Java, JavaScript, TypeScript), and Task Category distribution showing percentages for Bug Fix, Feature Request, Refactoring, and Miscellaneous tasks

New evaluation metrics

To comprehensively evaluate AI coding assistants, SWE-Polybench introduces multiple new metrics in addition to the pass rate. The pass rate is the proportion of tasks successfully solved as measured by the generated patch passing all relevant tests. It is the primary metric for assessing coding agent performance, but it doesn’t provide a complete picture of an agent’s capabilities. In particular, it doesn’t give much information on an agent’s ability to navigate and understand complex code repositories. SWE-PolyBench introduces a new set of metrics based on Concrete Syntax Tree (CST) node analysis and the established file-level localization metric:

File-level Localization: assesses the agent’s ability to identify the correct files that need to be modified within a repository. Let us assume that we would need to modify file.py to solve our problem. If our coding agent implements a change in any other file, it would receive a file retrieval score of 0.
CST Node-level Retrieval: evaluates the agent’s ability to identify specific code structures that require changes. It uses the Concrete Syntax Tree (CST) representation of the code to measure how accurately the agent can locate the exact functions or classes that need modification.

A side-by-side comparison showing two Git version control diffs. Each diff shows a line being removed (in red, prefixed with '-') where my_var equals 3, and a line being added (in green, prefixed with '+') where my_var equals 2. Above the diffs are connected dots in different colors (green, pink, blue, and yellow) representing Git commit history visualization.

Figure 2: Illustration of CST node changes.

In Figure 2, we see a change in class node A materialized by a change in its initialization function on the left path starting from the file node. In contrast to the first change, the change in class B is considered a function node change as it doesn’t impact class construction.

Let us assume the change that would solve our problem is the change in the __init__ function. If our coding agent implements the change in my_func, it receives both a class and function node retrieval score of 0.

By combining pass rate assessment with both file-level and CST node-level retrieval metrics, SWE-PolyBench offers a detailed evaluation of AI coding assistants’ capabilities in real-world scenarios. This approach provides deeper insights into how well agents navigate and comprehend complex codebases, going beyond simple task completion to assess their understanding of code structure and organization.

Performance of open-source coding agents

Key Findings

Language Proficiency: Python is the strongest language for all agents, likely due to its prevalence in training data and existing benchmarks.
Complexity Challenges: Performance degrades as task complexity increases, particularly when modifications to 3 or more files are required.
Task Specialization: Different agents show strengths in various task categories (bug fixes, feature requests, refactoring).
Context Importance: The informativeness of problem statements impacts success rates across all agents (refer to Figure 5 of the appendix paper for details about this analysis).

Many existing open-source agents are designed primarily for Python. Adapting them to work for all four languages of SWE-PolyBench required adjusting test execution commands, modifying parsing mechanisms, and adapting containerization strategies for each language. We adapted and evaluated three open-source agents on SWE-PolyBench. The aforementioned adjustments are reflected by the added “-PB” suffix to the original agent names.

Two radar charts comparing three AI models: Aider-PB Sonnet 3.5, Agentless-PB Sonnet 3.5, and SWE-agent-PB Sonnet 3.5. The left chart shows performance across programming languages (Java, JavaScript, TypeScript, Python). The right chart displays performance in different coding styles (Functional only, Single Function, All, Mixed, No nodes, Single Class, Class only). Each model is represented by a different colored line, with Aider-PB generally showing the highest performance across categories.

Figure 3: Performance of coding agents across programming languages and task complexities, highlighting strengths and areas for improvement.

Figure 3 provides a visual representation of agent performance across different dimensions:

Language Proficiency: The left side of the chart shows that all three agents perform best in Python, with significantly lower pass rates in other languages. This highlights the current bias towards Python in many coding agents and their underlying large language models.
Task Complexity: The right side of the chart illustrates how performance degrades as task complexity increases. Agents show higher pass rates for tasks involving single class or function changes, but struggle with tasks requiring modifications to multiple classes or functions and in instances where both class and function changes are required.

This comprehensive view of agent performance underscores the value of SWE-PolyBench in identifying specific strengths and weaknesses of different coding assistants, paving the way for targeted improvements in future iterations.

In addition to these insights, the evaluation revealed interesting patterns across different task categories as shown in Table 2. The performance data across bug fixes, feature requests, and refactoring tasks reveals varying strengths among AI coding assistants. The performance on bug fixing tasks is relatively consistent. There is more variability between different agents and between multiple runs of a given agent for feature request tasks and refactoring tasks.

Table 3 showing average pass rates with standard error by task category for three agents: Agentless-PB, SWE-Agent-PB, and Aider-PB. The task categories are Bug Fix, Feature Request, and Refactoring. Aider-PB has the highest pass rates for Bug Fix (13.8) and Feature Request (15.1), while SWE-Agent-PB leads in Refactoring (16.1). Standard errors are provided for each value.

Join the SWE-PolyBench community

SWE-PolyBench and its evaluation framework are publicly available. This open approach invites the global developer community to build upon this work and advance the field of AI-assisted software engineering. As coding agents continue to evolve, benchmarks like SWE-PolyBench play a crucial role in ensuring they can meet the diverse needs of real-world software development across multiple programming languages and task types.

Explore SWE-PolyBench today and contribute to the future of AI-powered software engineering!

Resources

Authors

Announcing AWS Security Reference Architecture Code Examples for Generative AI

2025-04-17 Ievgeniia Ieromenko

Post Syndicated from Ievgeniia Ieromenko original https://aws.amazon.com/blogs/security/announcing-aws-security-reference-architecture-code-examples-for-generative-ai/

Amazon Web Services (AWS) is pleased to announce the release of new Security Reference Architecture (SRA) code examples for securing generative AI workloads. The examples include two comprehensive capabilities focusing on secure model inference and RAG implementations, covering a wide range of security controls and best practices for AWS generative AI services.

These new code examples are available in the AWS SRA Examples Repository and include ready-to-deploy CloudFormation templates for implementing detective security controls such as network segmentation, identity management, encryption, prompt injection detection, and logging and monitoring. The solutions align with the AWS SRA Design Guidance page and demonstrate our commitment to helping customers secure their generative AI implementations.

Customers can get started with these examples by following the implementation instructions for each solution in the AWS SRA Examples Repository Solutions GenAI page. Additional documentation and implementation guidance is available in the AWS SRA Design Guidance Generative AI Architecture Deep Dive.

AWS strives to continuously provide security solutions that help customers meet their security architecture needs. Customers can reach out to the team by submitting an issue in the code repository.

If you have feedback about this post, submit comments in the Comments section below.

Introducing Amazon Nova Sonic: Human-like voice conversations for generative AI applications

2025-04-08 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/introducing-amazon-nova-sonic-human-like-voice-conversations-for-generative-ai-applications/

Voice interfaces are essential to enhance customer experience in different areas such as customer support call automation, gaming, interactive education, and language learning. However, there are challenges when building voice-enabled applications.

Traditional approaches in building voice-enabled applications require complex orchestration of multiple models, such as speech recognition to convert speech to text, language models to understand and generate responses, and text-to-speech to convert text back to audio.

This fragmented approach not only increases development complexity but also fails to preserve crucial linguistic context such as tone, prosody, and speaking style that are essential for natural conversations. This can affect conversational AI applications that need low latency and nuanced understanding of verbal and non-verbal cues for fluid dialog handling and natural turn-taking.

To streamline the implementation of speech-enabled applications, today we are introducing Amazon Nova Sonic, the newest addition to the Amazon Nova family of foundation models (FMs) available in Amazon Bedrock.

Amazon Nova Sonic unifies speech understanding and generation into a single model that developers can use to create natural, human-like conversational AI experiences with low latency and industry-leading price performance. This integrated approach streamlines development and reduces complexity when building conversational applications.

Its unified model architecture delivers expressive speech generation and real-time text transcription without requiring a separate model. The result is an adaptive speech response that dynamically adjusts its delivery based on prosody, such as pace and timbre, of input speech.

When using Amazon Nova Sonic, developers have access to function calling (also known as tool use) and agentic workflows to interact with external services and APIs and perform tasks in the customer’s environment, including knowledge grounding with enterprise data using Retrieval-Augmented Generation.

At launch, Amazon Nova Sonic provides robust speech understanding for American and British English across various speaking styles and acoustic conditions, with additional languages coming soon.

Amazon Nova Sonic is developed with responsible AI at the forefront of innovation, featuring built-in protections for content moderation and watermarking.

Amazon Nova Sonic in action
The scenario for this demo is a contact center in the telecommunication industry. A customer reaches out to improve their subscription plan, and Amazon Nova Sonic handles the conversation.

With tool use, the model can interact with other systems and use agentic RAG with Amazon Bedrock Knowledge Bases to gather updated, customer-specific information such as account details, subscription plans, and pricing info.

The demo shows streaming transcription of speech input and displays streaming speech responses as text. The sentiment of the conversation is displayed in two ways: a time chart illustrating how it evolves, and a pie chart representing the overall distribution. There’s also an AI insights section providing contextual tips for a call center agent. Other interesting metrics shown in the web interface are the overall talk time distribution between the customer and the agent, and the average response time.

During the conversation with the support agent, you can observe through the metrics and hear in the voices how customer sentiment improves.

The video includes an example of how Amazon Nova Sonic handles interruptions smoothly, stopping to listen and then continuing the conversation in a natural way.

Now, let’s explore how you can integrate voice capabilities in your applications.

Using Amazon Nova Sonic
To get started with Amazon Nova Sonic, you first need to toggle model access in the Amazon Bedrock console, similar to how you would enable other FMs. Navigate to the Model access section of the navigation pane, find Amazon Nova Sonic under the Amazon models, and enable it for your account.

Amazon Bedrock provides a new bidirectional streaming API (InvokeModelWithBidirectionalStream) to help you implement real-time, low-latency conversational experiences on top of the HTTP/2 protocol. With this API, you can stream audio input to the model and receive audio output in real time, so that the conversation flows naturally.

You can use Amazon Nova Sonic with the new API with this model ID: amazon.nova-sonic-v1:0

After the session initialization, where you can configure inference parameters, the model operate through an event-driven architecture on both the input and output streams.

There are three key event types in the input stream:

System prompt – To set the overall system prompt for the conversation

Audio input streaming – To process continuous audio input in real-time

Tool result handling – To send the result of tool use calls back to the model (after tool use is requested in the output events)

Similarly, there are three groups of events in the output streams:

Automatic speech recognition (ASR) streaming – Speech-to-text transcript is generated, containing the result of realtime speech recognition.

Tool use handling – If there are a tool use events, they need to be handled using the information provided here, and the results sent back as input events.

Audio output streaming – To play output audio in real-time, a buffer is needed, because Amazon Nova Sonic model generates audio faster than real-time playback.

You can find examples of using Amazon Nova Sonic in the Amazon Nova model cookbook repository.

Prompt engineering for speech
When crafting prompts for Amazon Nova Sonic, your prompts should optimize content for auditory comprehension rather than visual reading, focusing on conversational flow and clarity when heard rather than seen.

When defining roles for your assistant, focus on conversational attributes (such as warm, patient, concise) rather than text-oriented attributes (detailed, comprehensive, systematic). A good baseline system prompt might be:

You are a friend. The user and you will engage in a spoken dialog exchanging the transcripts of a natural real-time conversation. Keep your responses short, generally two or three sentences for chatty scenarios.

More generally, when creating prompts for speech models, avoid requesting visual formatting (such as bullet points, tables, or code blocks), voice characteristic modifications (accent, age, or singing), or sound effects.

Things to know
Amazon Nova Sonic is available today in the US East (N. Virginia) AWS Region. Visit Amazon Bedrock pricing to see the pricing models.

Amazon Nova Sonic can understand speech in different speaking styles and generates speech in expressive voices, including both masculine-sounding and feminine-sounding voices, in different English accents, including American and British. Support for additional languages will be coming soon.

Amazon Nova Sonic handles user interruptions gracefully without dropping the conversational context and is robust to background noise. The model supports a context window of 32K tokens for audio with a rolling window to handle longer conversations and has a default session limit of 8 minutes.

The following AWS SDKs support the new bidirectional streaming API:

Python developers can use this new experimental SDK that makes it easier to use the bidirectional streaming capabilities of Amazon Nova Sonic. We’re working to add support to the other AWS SDKs.

I’d like to thank Reilly Manton and Chad Hendren, who set up the demo with the contact center in the telecommunication industry, and Anuj Jauhari, who helped me understand the rich landscape in which speech-to-speech models are being deployed.

To learn more, these articles that enter into the details of how to use the new bidirectional streaming API with compelling demos:

Whether you’re creating customer service solutions, language learning applications, or other conversational experiences, Amazon Nova Sonic provides the foundation for natural, engaging voice interactions. To get started, visit the Amazon Bedrock console today. To learn more, visit the Amazon Nova section of the user guide.

– Danilo

How is the News Blog doing? Take this 1 minute survey!