Tag Archives: launch

AWS Lambda enhances event processing with provisioned mode for SQS event-source mapping

2025-11-14 Micah Walter

Post Syndicated from Micah Walter original https://aws.amazon.com/blogs/aws/aws-lambda-enhances-sqs-processing-with-new-provisioned-mode-3x-faster-scaling-16x-higher-capacity/

Today, we’re announcing the general availability of provisioned mode for AWS Lambda with Amazon Simple Queue Service (Amazon SQS) Event Source Mapping (ESM), a new feature that customers can use to optimize the throughput of their event-driven applications by configuring dedicated polling resources. Using this new capability, which provides 3x faster scaling, and 16x higher concurrency, you can process events with lower latency, handle sudden traffic spikes more effectively, and maintain precise control over your event processing resources.

Modern applications increasingly rely on event-driven architectures where services communicate through events and messages. Amazon SQS is commonly used as an event source for Lambda functions, so developers can build loosely coupled, scalable applications. Although the SQS ESM automatically handles queue polling and function invocation, customers with stringent performance requirements have asked for more control over the polling behavior to handle spiky traffic patterns and maintain low processing latency.

Provisioned mode for SQS ESM addresses these needs by introducing event pollers, which are dedicated resources that remain ready to handle expected traffic patterns. These event pollers can auto scale up to 1000 per concurrent executions per minute, more than three times faster than before to handle sudden spikes in event traffic and provide up to 20,000 concurrency–16 times higher capacity to process millions of events with Lambda functions. This enhanced scaling behavior helps customers maintain predictable low latency even during traffic surges.

Enterprises across various industries, from financial services to gaming companies, are using AWS Lambda with Amazon SQS to process real-time events for their mission-critical applications. These organizations, which include some of the largest online gaming platforms and financial institutions, require consistent subsecond processing times for their event-driven workloads, particularly during periods of peak usage. Provisioned mode for SQS ESM is a capability you can use to meet your stringent performance requirements while maintaining cost controls.

Enhanced control and performance

With provisioned mode, you can configure both minimum and maximum numbers of event pollers for your SQS ESM. Each event poller represents a unit of compute that handles queue polling, event batching, and filtering before invoking Lambda functions. Each event poller can handle up to 1 MB/sec of throughput, up to 10 concurrent invokes, or up to 10 SQS polling API calls per second. By setting a minimum number of event pollers, you enable your application to maintain a baseline processing capacity that can immediately handle sudden traffic increases. We recommend that you set the minimum event pollers required to handle your known peak workload requirements. The optional maximum setting helps prevent overloading downstream systems by limiting the total processing throughput.

The new mode delivers significant improvements in how your event-driven applications handle varying workloads. When traffic increases, your ESM detects the growing backlog within seconds and dynamically scales event pollers between your configured minimum and maximum values three times faster than before. This enhanced scaling capability is complemented by a substantial increase in processing capacity, with support for up to 2 GBps of aggregate traffic, and up to 20K concurrent requests—16x higher than previously possible. By maintaining a minimum number of ready-to-use event pollers, your application achieves predictable performance, handling sudden traffic spikes without the delay typically associated with scaling up resources. During low traffic periods, your ESM automatically scales down to your configured minimum number of event pollers, which means you can optimize costs while maintaining responsiveness.

Let’s try it out

Enabling provisioned mode is straightforward in the AWS Management Console. You need to already have an SQS queue configured and a Lambda function. To get started, in the Configuration tab for your Lambda function, choose Triggers, then Add trigger. This will bring up a user interface where you can configure your trigger. Choose SQS from the dropdown menu for source and then select the SQS queue you want to use.

Under Event poller configuration, you will now see a new option called Provisioned mode. Select Configure to reveal settings for Minimum event pollers and Maximum event pollers, each with defaults and minimum and maximum values displayed.

Configuration panel for SQS provisioned Mode

After you have configured Provisioned mode, you can save your trigger. If you need to make changes later, you can find the current configuration under the Triggers tab in the AWS Lambda configuration section, and you can modify your current settings there.

SQS Provisioned Poller confiig

Monitoring and observability

You can monitor your provisioned mode usage through Amazon CloudWatch metrics. The ProvisionedPollers metric shows the number of active event pollers processing events in one-minute windows.

Now available

Provisioned mode for Lambda SQS ESM is available today in all commercial AWS Regions. You can start using this feature through the AWS Management Console, AWS Command Line Interface (AWS CLI), or AWS SDKs. Pricing is based on the number of event pollers provisioned and the duration they’re provisioned for, measured in Event Poller Units (EPUs). Each EPU supports up to 1 MB per second throughput capacity per event poller, with minimum 2 event pollers per ESM. See the AWS pricing page for more information on EPU charges.

To learn more about provisioned mode for SQS ESM, visit the AWS Lambda documentation. Start building more responsive event-driven applications today with enhanced control over your event processing resources.

Introducing AWS IoT Core Device Location integration with Amazon Sidewalk

2025-11-13 Channy Yun (윤석찬)

Post Syndicated from Channy Yun (윤석찬) original https://aws.amazon.com/blogs/aws/introducing-aws-iot-core-device-location-integration-with-amazon-sidewalk/

Today, I’m happy to announce a new capability to resolve location data for Amazon Sidewalk enabled devices with the AWS IoT Core Device Location service. This feature removes the requirement to install GPS modules in a Sidewalk device and also simplifies the developer experience of resolving location data. Devices powered by small coin cell batteries, such as smart home sensor trackers, use Sidewalk to connect. Supporting built-in GPS modules for products that move around is not only expensive, it can creates challenge in ensuring optimal battery life performance and longevity.

With this launch, Internet of Things (IoT) device manufacturers and solution developers can build asset tracking and location monitoring solutions using Sidewalk-enabled devices by sending Bluetooth Low Energy (BLE), Wi-Fi, or Global Navigation Satellite System (GNSS) information to AWS IoT for location resolution. They can then send the resolved location data to an MQTT topic or AWS IoT rule and route the data to other Amazon Web Services (AWS) services, thus using different capabilities of AWS Cloud through AWS IoT Core. This would simplify their software development and give them more options to choose the optimal location source, thereby improving their product performance.

This launch addresses previous challenges and architecture complexity. You don’t need location sensing on network-based devices when you use the Sidewalk network infrastructure itself to determine device location, which eliminates the need for power-hungry and costly GPS hardware on the device. And, this feature also allows devices to efficiently measure and report location data from GNSS and Wi-Fi, thus extending the product battery life. Therefore, you can build a more compelling solution for asset tracking and location-aware IoT applications with these enhancements.

For those unfamiliar with Amazon Sidewalk and the AWS IoT Core Device Location service, I’ll briefly explain their history and context. If you’re already familiar with them, you can skip to the section on how to get started.

AWS IoT Core integrations with Amazon Sidewalk
Amazon Sidewalk is a shared network that helps devices work better through improved connectivity options. It’s designed to support a wide range of customer devices with capabilities ranging from locating pets or valuables, to smart home security and lighting control and remote diagnostics for appliances and tools.

Amazon Sidewalk is a secure community network that uses Amazon Sidewalk Gateways (also called Sidewalk Bridges), such as compatible Amazon Echo and Ring devices, to provide cloud connectivity for IoT endpoint devices. Amazon Sidewalk enables low-bandwidth and long-range connectivity at home and beyond using BLE for short-distance communication and LoRa and frequency-shift keying (FSK) radio protocols at 900MHz frequencies to cover longer distances.

Sidewalk now provides coverage to more than 90% of the US population and supports long-range connected solutions for communities and enterprises. Users with Ring cameras or Alexa devices that act as a Sidewalk Bridge can choose to contribute a small portion of their internet bandwidth, which is pooled to create a shared network that benefits all Sidewalk-enabled devices in a community.

In March 2023, AWS IoT Core deepened its integration with Amazon Sidewalk to seamlessly provision, onboard, and monitor Sidewalk devices with qualified hardware development kits (HDKs), SDKs, and sample applications. As of this writing, AWS IoT Core is the only way for customers to connect the Sidewalk network.

In the AWS IoT Core console, you can add your Sidewalk device, provision and register your devices, and connect your Sidewalk endpoint to the cloud. To learn more about onboarding your Sidewalk devices, visit the Getting started with AWS IoT Core for Amazon Sidewalk in the AWS IoT Wireless Developer Guide.

In November 2022, we announced AWS IoT Core Device Location service, a new feature that you can use to get the geo-coordinates of their IoT devices even when the device doesn’t have a GPS module. You can use the Device Location service as a simple request and response HTTP API, or you can use it with IoT connectivity pathways like MQTT, LoRaWAN, and now with Amazon Sidewalk.

In the AWS IoT Core console, you can test the Device Location service to resolve the location of your device by importing device payload data. Resource location is reported as a GeoJSON payload. To learn more, visit the AWS IoT Core Device Location in the AWS IoT Core Developer Guide.

Customers across multiple industries like automotive, supply chain, and industrial tools have requested a simplified solution such as the Device Location service to extract location-data from Sidewalk products. This would streamline customer software development and give them more options to choose the optimal location source, thereby improving their product.

Get started with a Device Location integration with Amazon Sidewalk
To enable Device Location for Sidewalk devices, go to the AWS IoT Core for Amazon Sidewalk section under LPWAN devices in the AWS IoT Core console. Choose Provision device or your existing device to edit the setting and select Activate positioning in the Geolocation option when creating and updating your Sidewalk devices.

While activating position, you need to specify a destination where you want to send your location data. The destination can either be an AWS IoT rule or an MQTT topic.

Here is a sample AWS Command Line Interface (AWS CLI) command to enable position while provisioning a new Sidewalk device:

$ aws iotwireless createwireless device --type Sidewalk \
  --name "demo-1" --destination-name "New-1" \
  --positioning Enabled

After your Sidewalk device establishes a connection to the Amazon Sidewalk network, the device SDK will send the GNSS-, Wi-Fi- or BLE-based information to AWS IoT Core for Amazon Sidewalk. If the customer has enabled Positioning, then AWS IoT Core Device Location will resolve the location data and send the location data to the specified Destination. After your Sidewalk device transmits location measurement data, the resolved geographic coordinates and a map pin will also be displayed in the Position section for the selected device.

You will also get location information delivered to your destination in GeoJSON format, as shown in the following example:

{
    "coordinates": [
        13.376076698303223,
        52.51823043823242
    ],
    "type": "Point",
    "properties": {
        "verticalAccuracy": 45,
        "verticalConfidenceLevel": 0.68,
        "horizontalAccuracy": 303,
        "horizontalConfidenceLevel": 0.68,
        "country": "USA",
        "state": "CA",
        "city": "Sunnyvale",
        "postalCode": "91234",
        "timestamp": "2025-11-18T12:23:58.189Z"
    }
}

You can monitor the Device Location data between your Sidewalk devices and AWS Cloud by enabling Amazon CloudWatch Logs for AWS IoT Core. To learn more, visit the AWS IoT Core for Amazon Sidewalk in the AWS IoT Wireless Developer Guide.

Now available
AWS IoT Core Device Location integration with Amazon Sidewalk is now generally available in the US East (N. Virginia) Region. To learn more about use cases, documentation, sample codes, and partner devices, visit the AWS IoT Core for Amazon Sidewalk product page.

Give it a try in the AWS IoT Core console and send feedback to AWS re:Post for AWS IoT Core or through your usual AWS Support contacts.

— Channy

Introducing Our Final AWS Heroes of 2025

2025-11-12 Taylor Jacobsen

Post Syndicated from Taylor Jacobsen original https://aws.amazon.com/blogs/aws/introducing-our-final-aws-heroes-of-2025/

With AWS re:Invent approaching, we’re celebrating three exceptional AWS Heroes whose diverse journeys and commitment to knowledge sharing are empowering builders worldwide. From advancing women in tech and rural communities to bridging academic and industry expertise and pioneering enterprise AI solutions, these leaders exemplify the innovative spirit that drives our community forward. Their stories showcase how technical excellence, combined with passionate advocacy and mentorship, strengthens the global AWS community.

Dimple Vaghela – Ahmedabad, India

Community Hero Dimple Vaghela leads both the AWS User Group Ahmedabad and AWS User Group Vadodara, where she drives cloud education and technical growth across the region. Her impact spans organizing numerous AWS meetups, workshops, and AWS Community Days that have helped thousands of learners advance their cloud careers. Dimple launched the “Cloud for Her” project to empower girls from rural areas in technology careers and serves as co-organizer of the Women in Tech India User Group. Her exceptional leadership and community contributions were recognized at AWS re:Invent 2024 with the AWS User Group Leader Award in the Ownership category, while she continues building a more inclusive cloud community through speaking, mentoring, and organizing impactful tech events.

Rola Dali – Montreal, Canada

Community Hero Rola Dali is a senior Data, ML, and AI expert specializing in AWS cloud, bringing unique perspective from her PhD in neuroscience and bioinformatics with expertise in human genomics. As co-organizer of the AWS Montreal User Group and a former AWS Community Builder, her commitment to the cloud community earned her the prestigious Golden Jacket recognition in 2024. She actively shapes the tech community by architecting AWS solutions, sharing knowledge through blogs and lectures, and mentoring women entering tech, academics transitioning to industry, and students starting their careers.

Vivek Velso – Toronto, Canada

Machine Learning Hero Vivek Velso is a seasoned technology leader with over 27 years of IT industry experience, specializing in helping organizations modernize their cloud infrastructure for generative AI workloads. His deep AWS expertise earned him the prestigious Golden Jacket award for completing all AWS certifications, and he actively contributes to the AWS Subject Matter Expert (SME) program for multiple certification exams. A former AWS Community Builder and AWS Ambassador, he continues to share his knowledge through more than 100 technical blogs, articles, conference engagements, and AWS livestreams, helping the community confidently embrace cloud innovation.

Learn More

Visit the AWS Heroes webpage if you’d like to learn more about the AWS Heroes program, or to connect with a Hero near you.

— Taylor

Secure EKS clusters with the new support for Amazon EKS in AWS Backup

2025-11-10 Veliswa Boya

Post Syndicated from Veliswa Boya original https://aws.amazon.com/blogs/aws/secure-eks-clusters-with-the-new-support-for-amazon-eks-in-aws-backup/

Today, we’re announcing support for Amazon EKS in AWS Backup to provide the capability to secure Kubernetes applications using the same centralized platform you trust for your other Amazon Web Services (AWS) services. This integration eliminates the complexity of protecting containerized applications while providing enterprise-grade backup capabilities for both cluster configurations and application data. AWS Backup is a fully managed service to centralize and automate data protection across AWS and on-premises workloads. Amazon Elastic Kubernetes Service (Amazon EKS) is a fully managed Kubernetes service to manage availability and scalability of the Kubernetes clusters. With this new capability, you can centrally manage and automate data protection across your Amazon EKS environments alongside other AWS services.

Until now, for backups, customers relied on custom solutions or third-party tools to back up their EKS clusters, requiring complex scripting and maintenance for each cluster. The support for Amazon EKS in AWS Backup eliminates this overhead by providing a single, centralized, and policy-driven solution that protects both EKS clusters (Kubernetes deployments and resources) and stateful data (stored in Amazon Elastic Block Store (Amazon EBS), Amazon Elastic File System (Amazon EFS), and Amazon Simple Storage Service (Amazon S3) only) without the need to manage custom scripts across clusters. For restores, customers were previously required to restore their EKS backups to a target EKS cluster which was either the source EKS cluster, or a new EKS cluster, requiring that an EKS cluster infrastructure is provisioned ahead of time prior to the restore. With this new capability, during a restore of EKS cluster backups, customers also have the option to create a new EKS cluster based on previous EKS cluster configuration settings and restore to this new EKS cluster, with AWS Backup managing the provisioning of the EKS cluster on the customer’s behalf.

This support includes policy-based automation for protecting single or multiple EKS clusters. This single data protection policy provides a consistent experience across all services AWS Backup supports. It allows creation of immutable backups to prevent malicious or inadvertent changes, helping customers meet their regulatory compliance needs. In case there is a customer data loss or cluster downtime event, customers can easily recover their EKS cluster data from encrypted, immutable backups using an easy-to-use interface and maintain business continuity of running their EKS clusters at scale.

How it works
Here’s how I set up support for on-demand backup of my EKS cluster in AWS Backup. First, I’ll show a walkthrough of the backup process, then demonstrate a restore of the EKS cluster.

Backup
In the AWS Backup console, in the left navigation pane, I choose Settings and then Configure resources to opt in to enable protection of EKS clusters in AWS Backup.

Now that I’ve enabled Amazon EKS, in Protected resources I choose Create on-demand backup to create a backup for my already existing EKS cluster floral-electro-unicorn.

Enabling EKS in Settings ensures that it shows up as a Resource type when I create on-demand backup for the EKS cluster. I proceed to select the EKS resource type and the cluster.

I leave the rest of the information as default, then select Choose an IAM role to select a role (test-eks-backup) that I’ve created and customized with the necessary permissions for AWS Backup to assume when creating and managing backups on my behalf. I choose Create on-demand backup to finalize the process.

The job is initiated, and it will start running to back up both the EKS cluster state and the persistent volumes. If Amazon S3 buckets are attached to the backup, you’ll need to add the additional Amazon S3 backup permissions AWSBackupServiceRolePolicyForS3Backup to your role. This policy contains the permissions necessary for AWS Backup to back up any Amazon S3 bucket, including access to all objects in a bucket and any associated AWS KMS key.

The job is completed successfully and now EKS clusterfloral-electro-unicorn is backed up by AWS Backup.

Restore
Using the AWS Backup Console, I choose the EKS backup composite recovery point to start the process of restoring the EKS cluster backups, then choose Restore.

I choose Restore full EKS cluster to restore the full EKS backup. To restore to an existing cluster, I Choose an existing cluster then select the cluster from the drop-down list. I choose the Default order as the order in which individual Kubernetes resources will be restored.

I then configure the restore for the persistent storage resources, that will be restored alongside my EKS clusters.

Next, I Choose an IAM role to execute the restore action. The Protected resource tags checkbox is selected by default and I’ll leave it as is, then choose Next.

I review all the information before I finalize the process by choosing Restore, to start the job.

Selecting the drop-down arrow gives details of the restore status for both the EKS cluster state and persistent volumes attached. In this walkthrough, all the individual recovery points are restored successfully. If portions of the backup fail, it’s possible to restore the successfully backed up persistent stores (for example, Amazon EBS volumes) and cluster configuration settings individually. However, it’s not possible to restore full EKS backup. The successfully backed up resources will be available for restore, listed as nested recovery points under the EKS cluster recovery point. If there’s a partial failure, there will be a notification of the portion(s) that failed.

Benefits
Here are some of the benefits provided by the support for Amazon EKS in AWS Backup:

A fully managed multi-cluster backup experience, removing the overhead associated with managing custom scripts and third-party solutions.
Centralized, policy-based backup management that simplifies backup lifecycle management and makes it seamless to back up and recover your application data across AWS services, including EKS.
The ability to store and organize your backups with backup vaults. You assign policies to the backup vaults to grant access to users to create backup plans and on-demand backups but limit their ability to delete recovery points after they’re created.

Good to know
The following are some helpful facts to know:

Use either the AWS Backup Console, API, or AWS Command Line Interface (AWS CLI) to protect EKS clusters using AWS Backup. Alternatively, you can create an on-demand backup of the cluster after it has been created.
You can create secondary copies of your EKS backups across different accounts and AWS Regions to minimize risk of accidental deletion.
Restoration of EKS backups is available using the AWS Backup Console, API, or AWS CLI.
Restoring to an existing cluster will not override the Kubernetes versions, or any data as restores are non-destructive. Instead, there will be a restore of the delta between the backup and source resource.
Namespaces can only be restored to an existing cluster to ensure a successful restore as Kubernetes resources may be scoped at the cluster level.

Voice of the customer

Srikanth Rajan, Sr. Director of Engineering at Salesforce said “Losing a Kubernetes control plane because of software bugs or unintended cluster deletion can be catastrophic without a solid backup and restore plan. That’s why it’s exciting to see AWS rolling out the new EKS Backup and Restore feature, it’s a big step forward in closing a critical resiliency gap for Kubernetes platforms.”

Now available
Support for Amazon EKS in AWS Backup is available today in all AWS commercial Regions (except China) and in the AWS GovCloud (US) where AWS Backup and Amazon EKS are available. Check the full Region list for future updates.

To learn more, check out the AWS Backup product page and the AWS Backup pricing page.

Try out this capability for protecting your EKS clusters in AWS Backup and let us know what you think by sending feedback to AWS re:Post for AWS Backup or through your usual AWS Support contacts.

– Veliswa.

Introducing AWS Capabilities by Region for easier Regional planning and faster global deployments

2025-11-06 Channy Yun (윤석찬)

Post Syndicated from Channy Yun (윤석찬) original https://aws.amazon.com/blogs/aws/introducing-aws-capabilities-by-region-for-easier-regional-planning-and-faster-global-deployments/

At AWS, a common question we hear is: “Which AWS capabilities are available in different Regions?” It’s a critical question whether you’re planning Regional expansion, ensuring compliance with data residency requirements, or architecting for disaster recovery.

Today, I’m excited to introduce AWS Capabilities by Region, a new planning tool that helps you discover and compare AWS services, features, APIs, and AWS CloudFormation resources across Regions. You can explore service availability through an interactive interface, compare multiple Regions side-by-side, and view forward-looking roadmap information. This detailed visibility helps you make informed decisions about global deployments and avoid project delays and costly rework.

Getting started with Regional comparison
To get started, go to AWS Builder Center and choose AWS Capabilities and Start Exploring. When you select Services and features, you can choose the AWS Regions you’re most interested in from the dropdown list. You can use the search box to quickly find specific services or features. For example, I chose US (N. Virginia), Asia Pacific (Seoul), and Asia Pacific (Taipei) Regions to compare Amazon Simple Storage Service (Amazon S3) features.

Now I can view the availability of services and features in my chosen Regions and also see when they’re expected to be released. Select Show only common features to identify capabilities consistently available across all selected Regions, ensuring you design with services you can use everywhere.

The result will indicate availability using the following states: Available (live in the region); Planning (evaluating launch strategy); Not Expanding (will not launch in region); and 2026 Q1 (directional launch planning for the speciﬁed quarter).

In addition to exploring services and features, AWS Capabilities by Region also helps you explore available APIs and CloudFormation resources. As an example, to explore API operations, I added Europe (Stockholm) and Middle East (UAE) Regions to compare Amazon DynamoDB features across different geographies. The tool lets you view and search the availability of API operations in each Region.

The CloudFormation resources tab helps you verify Regional support for specific resource types before writing your templates. You can search by Service, Type, Property, and Config.For instance, when planning an Amazon API Gateway deployment, you can check the availability of resource types like AWS::ApiGateway::Account.

You can also search detailed resources such as Amazon Elastic Compute Cloud (Amazon EC2) instance type availability, including specialized instances such as Graviton-based, GPU-enabled, and memory-optimized variants. For example, I searched 7th generation compute-optimized metal instances and could find c7i.metal-24xl and c7i.metal-48xl instances are available across all targeted Regions.

Beyond the interactive interface, the AWS Capabilities by Region data is also accessible through the AWS Knowledge MCP Server. This allows you to automate Region expansion planning, generate AI-powered recommendations for Region and service selection, and integrate Regional capability checks directly into your development workflows and CI/CD pipelines.

Now available
You can begin exploring AWS Capabilities by Region in AWS Builder Center immediately. The Knowledge MCP server is also publicly accessible at no cost and does not require an AWS account. Usage is subject to rate limits. Follow the getting started guide for setup instructions.

We would love to hear your feedback, so please send us any suggestions through the Builder Support page.

— Channy

AWS Weekly Roundup: Project Rainier online, Amazon Nova, Amazon Bedrock, and more (November 3, 2025)

2025-11-03 Betty Zheng (郑予彬)

Post Syndicated from Betty Zheng (郑予彬) original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-project-rainier-online-amazon-nova-amazon-bedrock-and-more-november-3-2025/

Last week I met Jeff Barr at the AWS Shenzhen Community Day. Jeff shared stories about how builders around the world are experimenting with generative AI and encouraged local developers to keep pushing ideas into real prototypes. Many attendees stayed after the sessions to discuss model grounding, evaluation, and how to bring generative AI into real applications.

Community builders showcased creative Kiro-themed demos, AI-powered IoT projects, and student-led experiments. It was inspiring to see new developers, students, and long-time Amazon Web Services (AWS) community leaders connecting over shared curiosity and excitement for generative AI innovation.

Project Rainier, one of the world’s most powerful operational AI supercomputers is now online. Built by AWS in close collaboration with Anthropic, Project Rainier brings nearly 500,000 AWS custom-designed Trainium2 chips into service using a new Amazon Elastic Compute (Amazon EC2) UltraServer and EC2 UltraCluster architecture designed for high-bandwidth, low-latency model training at hyperscale.

Anthropic is already training and running inference for Claude on Project Rainier, and is expected to scale to more than one million Trainium2 chips across direct usage and Amazon Bedrock by the end of 2025. For architecture details, deployment insights, and behind-the-scenes video of an UltraServer coming online, refer to AWS activates Project Rainier for the full announcement.

Last week’s launches
Here are the launches that got my attention this week:

Amazon Nova – Adds Web Grounding as a new built-in tool for real-time, citation-based web retrieval, and introduces Multimodal Embeddings, a state-of-the-art model that produces unified cross-modal vectors, improving accuracy for Retrieval Augmented Generation (RAG) and semantic search. Both capabilities are available in Amazon Bedrock.
Amazon Bedrock – TwelveLabs’ Marengo Embed 3.0 is now available for long-form, video-native multimodal embeddings across video, images, audio, and text with improved domain accuracy. Stability AI Image Services added four new tools: Outpaint, Fast Upscale, Conservative Upscale, and Creative Upscale for high-resolution upscaling, outpainting, and controlled variations.
Model Context Protocol (MCP) Proxy for AWS – Now generally available as a client-side proxy that connects MCP clients to remote AWS hosted MCP servers using SigV4 authentication. It works with tools like Amazon Q Developer CLI, Kiro, Cursor, and Strands Agents, and provides safety controls such as read-only mode, retry logic, and logging. The Proxy is open-source. You can visit the AWS GitHub repository to view the installation and configuration options and start connecting with remote AWS MCP servers.
Amazon Elastic Container Service (Amazon ECS) – Now supports built-in linear and canary deployment strategies, providing gradual traffic shifting, canary testing with small production slices, deployment bake times for safe rollback, and Amazon CloudWatch alarm-based automated rollbacks.
Amazon DocumentDB – Adds a new query planner in Amazon DocumentDB 5.0 that delivers up to 10 times faster query performance with more optimal index plans and support for $neq, $nin, and nested $elementMatch, and can be enabled through cluster parameter groups without downtime.
Amazon Elastic Block Store (Amazon EBS) – You can now use new per-volume CloudWatch metrics, VolumeAvgIOPS and VolumeAvgThroughput, to get minute-level visibility into average IOPS and throughput for EBS volumes on AWS Nitro based instances. These metrics help monitor performance trends, troubleshoot bottlenecks, and optimize provisioned capacity.
Amazon Kinesis Data Streams – You can now send individual records up to 10 MiB, a tenfold increase from the previous limit, helping support larger Internet of Things (IoT), change data capture (CDC), and AI-generated payloads.
Amazon SageMaker – Unified Studio search results now provide additional search context, showing matched metadata fields and ranking rationale to improve transparency and relevance in data discovery.

Additional updates
Here are some additional projects, blog posts, and news items that I found interesting:

Building production-ready 3D pipelines with AWS VAMS and 4D Pipeline – A reference architecture for creating scalable, cloud-based 3D asset pipelines using AWS Visual Asset Management System (VAMS) and 4D Pipeline, supporting ingest, validation, collaborative review, and distribution across games, visual effects (VFX), and digital twins.
Amazon Location Service introduces new API key restrictions – You can now create granular security policies with bundle IDs to restrict API access to specific mobile applications, improving access control and strengthening application-level security across location-based workloads.
AWS Clean Rooms launches advanced SQL configurations – A performance enhancement for Spark SQL workloads that supports runtime customization of Spark properties and compute sizes, plus table caching for faster and more cost-efficient processing of large analytical queries.
AWS Serverless MCP Server adds event source mappings (ESM) tools – A capability for event-driven serverless applications that supports configuration, performance tuning, and troubleshooting of AWS Lambda event source mappings, including AWS Serverless Application Model (AWS SAM) template generation and diagnostic insights.
AWS IoT Greengrass releases an AI agent context pack – A development accelerator for cloud-connected edge applications that provides ready-to-use instructions, examples, and templates, helping teams integrate generative AI tools such as Amazon Q for faster software creation, testing, and fleet-wide deployment. It’s available as open source on the GitHub repository.
AWS Step Functions introduces a new metrics dashboard – You can now view usage, billing, and performance metrics at the state-machine level for standard and express workflows in a single console view, improving visibility and troubleshooting for distributed applications.

Upcoming AWS events
Check your calendars so that you can sign up for these upcoming events:

AWS Builder Loft – A community tech space in San Francisco where you can learn from expert sessions, join hands-on workshops, explore AI and emerging technologies, and collaborate with other builders to accelerate their ideas. Browse the upcoming sessions and join the events that interest you.
AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by experienced AWS users and industry leaders from around the world: Hong Kong (November 2), Abuja (November 8), Cameroon (November 8), and Spain (November 15).
AWS Skills Center Seattle 4th Anniversary Celebration – A free, public event on November 20 with a keynote, learned panels, recruiter insights, raffles, and virtual participation options.

Join the AWS Builder Center to learn, build, and connect with builders in the AWS community. Browse here for upcoming in-person events, developer-focused events, and events for startups.

That’s all for this week. Check back next Monday for another Weekly Roundup!

– Betty

Build more accurate AI applications with Amazon Nova Web Grounding

2025-10-29 Matheus Guimaraes

Post Syndicated from Matheus Guimaraes original https://aws.amazon.com/blogs/aws/build-more-accurate-ai-applications-with-amazon-nova-web-grounding/

Imagine building AI applications that deliver accurate, current information without the complexity of developing intricate data retrieval systems. Today, we’re excited to announce the general availability of Web Grounding, a new built-in tool for Nova models on Amazon Bedrock.

Web Grounding provides developers with a turnkey Retrieval Augmented Generation (RAG) option that allows the Amazon Nova foundation models to intelligently decide when to retrieve and incorporate relevant up-to-date information based on the context of the prompt. This helps to ground the model output by incorporating cited public sources as context, aiming to reduce hallucinations and improve accuracy.

When should developers use Web Grounding?

Developers should consider using Web Grounding when building applications that require access to current, factual information or need to provide well-cited responses. The capability is particularly valuable across a range of applications, from knowledge-based chat assistants providing up-to-date information about products and services, to content generation tools requiring fact-checking and source verification. It’s also ideal for research assistants that need to synthesize information from multiple current sources, as well as customer support applications where accuracy and verifiability are crucial.

Web Grounding is especially useful when you need to reduce hallucinations in your AI applications or when your use case requires transparent source attribution. Because it automatically handles the retrieval and integration of information, it’s an efficient solution for developers who want to focus on building their applications rather than managing complex RAG implementations.

Getting started
Web Grounding seamlessly integrates with supported Amazon Nova models to handle information retrieval and processing during inference. This eliminates the need to build and maintain complex RAG pipelines, while also providing source attributions that verify the origin of information.

Let’s see an example of asking a question to Nova Premier using Python to call the Amazon Bedrock Converse API with Web Grounding enabled.

First, I created an Amazon Bedrock client using AWS SDK for Python (Boto3) in the usual way. For good practice, I’m using a session, which helps to group configurations and make them reusable. I then create a BedrockRuntimeClient.

try:
    session = boto3.Session(region_name='us-east-1')
    client = session.client(
        'bedrock-runtime')

I then prepare the Amazon Bedrock Converse API payload. It includes a “role” parameter set to “user”, indicating that the message comes from our application’s user (compared to “assistant” for AI-generated responses).

For this demo, I chose the question “What are the current AWS Regions and their locations?” This was selected intentionally because it requires current information, making it useful to demonstrate how Amazon Nova can automatically invoke searches using Web Grounding when it determines that up-to-date knowledge is needed.

# Prepare the conversation in the format expected by Bedrock
question = "What are the current AWS regions and their locations?"
conversation = [
   {
     "role": "user",  # Indicates this message is from the user
     "content": [{"text": question}],  # The actual question text
      }
    ]

First, let’s see what the output is without Web Grounding. I make a call to Amazon Bedrock Converse API.

# Make the API call to Bedrock 
model_id = "us.amazon.nova-premier-v1:0" 
response = client.converse( 
    modelId=model_id, # Which AI model to use 
    messages=conversation, # The conversation history (just our question in this case) 
    )
print(response['output']['message']['content'][0]['text'])

I get a list of all the current AWS Regions and their locations.

Now let’s use Web Grounding. I make a similar call to the Amazon Bedrock Converse API, but declare nova_grounding as one of the tools available to the model.

model_id = "us.amazon.nova-premier-v1:0" 
response = client.converse( 
    modelId=model_id, 
    messages=conversation, 
    toolConfig= {
          "tools":[ 
              {
                "systemTool": {
                   "name": "nova_grounding" # Enables the model to search real-time information
                 }
              }
          ]
     }
)

After processing the response, I can see that the model used Web Grounding to access up-to-date information. The output includes reasoning traces that I can use to follow its thought process and see where it automatically queried external sources. The content of the responses from these external calls appear as [HIDDEN] – a standard practice in AI systems that both protects sensitive information and helps manage output size.

Additionally, the output also includes citationsContent objects containing information about the sources queried by Web Grounding.

Finally, I can see the list of AWS Regions. It finishes with a message right at the end stating that “These are the most current and active AWS regions globally.”

Web Grounding represents a significant step forward in making AI applications more reliable and current with minimum effort. Whether you’re building customer service chat assistants that need to provide up-to-date accurate information, developing research applications that analyze and synthesize information from multiple sources, or creating travel applications that deliver the latest details about destinations and accommodations, Web Grounding can help you deliver more accurate and relevant responses to your users with a convenient turnkey solution that is straightforward to configure and use.

Things to know
Amazon Nova Web Grounding is available today in US East (N. Virginia). Web Grounding will also soon launch on US East (Ohio), and US West (Oregon).

Web Grounding incurs additional cost. Refer to the Amazon Bedrock pricing page for more details.

Currently, you can only use Web Grounding with Nova Premier but support for other Nova models will be added soon.

If you haven’t used Amazon Nova before or are looking to go deeper, try this self-paced online workshop where you can learn how to effectively use Amazon Nova foundation models and related features for text, image, and video processing through hands-on exercises.

Matheus Guimaraes | @codingmatheus

Amazon Nova Multimodal Embeddings: State-of-the-art embedding model for agentic RAG and semantic search

2025-10-28 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/amazon-nova-multimodal-embeddings-now-available-in-amazon-bedrock/

Today, we’re introducing Amazon Nova Multimodal Embeddings, a state-of-the-art multimodal embedding model for agentic retrieval-augmented generation (RAG) and semantic search applications, available in Amazon Bedrock. It is the first unified embedding model that supports text, documents, images, video, and audio through a single model to enable crossmodal retrieval with leading accuracy.

Embedding models convert textual, visual, and audio inputs into numerical representations called embeddings. These embeddings capture the meaning of the input in a way that AI systems can compare, search, and analyze, powering use cases such as semantic search and RAG.

Organizations are increasingly seeking solutions to unlock insights from the growing volume of unstructured data that is spread across text, image, document, video, and audio content. For example, an organization might have product images, brochures that contain infographics and text, and user-uploaded video clips. Embedding models are able to unlock value from unstructured data, however traditional models are typically specialized to handle one content type. This limitation drives customers to either build complex crossmodal embedding solutions or restrict themselves to use cases focused on a single content type. The problem also applies to mixed-modality content types such as documents with interleaved text and images or video with visual, audio, and textual elements where existing models struggle to capture crossmodal relationships eﬀectively.

Nova Multimodal Embeddings supports a unified semantic space for text, documents, images, video, and audio for use cases such as crossmodal search across mixed-modality content, searching with a reference image, and retrieving visual documents.

Evaluating Amazon Nova Multimodal Embeddings performance
We evaluated the model on a broad range of benchmarks, and it delivers leading accuracy out-of-the-box as described in the following table.

Nova Multimodal Embeddings supports a context length of up to 8K tokens, text in up to 200 languages, and accepts inputs via synchronous and asynchronous APIs. Additionally, it supports segmentation (also known as “chunking”) to partition long-form text, video, or audio content into manageable segments, generating embeddings for each portion. Lastly, the model oﬀers four output embedding dimensions, trained using Matryoshka Representation Learning (MRL) that enables low-latency end-to-end retrieval with minimal accuracy changes.

Let’s see how the new model can be used in practice.

Using Amazon Nova Multimodal Embeddings
Getting started with Nova Multimodal Embeddings follows the same pattern as other models in Amazon Bedrock. The model accepts text, documents, images, video, or audio as input and returns numerical embeddings that you can use for semantic search, similarity comparison, or RAG.

Here’s a practical example using the AWS SDK for Python (Boto3) that shows how to create embeddings from different content types and store them for later retrieval. For simplicity, I’ll use Amazon S3 Vectors, a cost-optimized storage with native support for storing and querying vectors at any scale, to store and search the embeddings.

Let’s start with the fundamentals: converting text into embeddings. This example shows how to transform a simple text description into a numerical representation that captures its semantic meaning. These embeddings can later be compared with embeddings from documents, images, videos, or audio to find related content.

To make the code easy to follow, I’ll show a section of the script at a time. The full script is included at the end of this walkthrough.

import json
import base64
import time
import boto3

MODEL_ID = "amazon.nova-2-multimodal-embeddings-v1:0"
EMBEDDING_DIMENSION = 3072

# Initialize Amazon Bedrock Runtime client
bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

print(f"Generating text embedding with {MODEL_ID} ...")

# Text to embed
text = "Amazon Nova is a multimodal foundation model"

# Create embedding
request_body = {
    "taskType": "SINGLE_EMBEDDING",
    "singleEmbeddingParams": {
        "embeddingPurpose": "GENERIC_INDEX",
        "embeddingDimension": EMBEDDING_DIMENSION,
        "text": {"truncationMode": "END", "value": text},
    },
}

response = bedrock_runtime.invoke_model(
    body=json.dumps(request_body),
    modelId=MODEL_ID,
    contentType="application/json",
)

# Extract embedding
response_body = json.loads(response["body"].read())
embedding = response_body["embeddings"][0]["embedding"]

print(f"Generated embedding with {len(embedding)} dimensions")

Now we’ll process visual content using the same embedding space using a photo.jpg file in the same folder as the script. This demonstrates the power of multimodality: Nova Multimodal Embeddings is able to capture both textual and visual context into a single embedding that provides enhanced understanding of the document.

Nova Multimodal Embeddings can generate embeddings that are optimized for how they are being used. When indexing for a search or retrieval use case, embeddingPurpose can be set to GENERIC_INDEX. For the query step, embeddingPurpose can be set depending on the type of item to be retrieved. For example, when retrieving documents, embeddingPurpose can be set to DOCUMENT_RETRIEVAL.

# Read and encode image
print(f"Generating image embedding with {MODEL_ID} ...")

with open("photo.jpg", "rb") as f:
    image_bytes = base64.b64encode(f.read()).decode("utf-8")

# Create embedding
request_body = {
    "taskType": "SINGLE_EMBEDDING",
    "singleEmbeddingParams": {
        "embeddingPurpose": "GENERIC_INDEX",
        "embeddingDimension": EMBEDDING_DIMENSION,
        "image": {
            "format": "jpeg",
            "source": {"bytes": image_bytes}
        },
    },
}

response = bedrock_runtime.invoke_model(
    body=json.dumps(request_body),
    modelId=MODEL_ID,
    contentType="application/json",
)

# Extract embedding
response_body = json.loads(response["body"].read())
embedding = response_body["embeddings"][0]["embedding"]

print(f"Generated embedding with {len(embedding)} dimensions")

To process video content, I use the asynchronous API. That’s a requirement for videos that are larger than 25MB when encoded as Base64. First, I upload a local video to an S3 bucket in the same AWS Region.

aws s3 cp presentation.mp4 s3://my-video-bucket/videos/

This example shows how to extract embeddings from both visual and audio components of a video file. The segmentation feature breaks longer videos into manageable chunks, making it practical to search through hours of content efficiently.

# Initialize Amazon S3 client
s3 = boto3.client("s3", region_name="us-east-1")

print(f"Generating video embedding with {MODEL_ID} ...")

# Amazon S3 URIs
S3_VIDEO_URI = "s3://my-video-bucket/videos/presentation.mp4"
S3_EMBEDDING_DESTINATION_URI = "s3://my-embedding-destination-bucket/embeddings-output/"

# Create async embedding job for video with audio
model_input = {
    "taskType": "SEGMENTED_EMBEDDING",
    "segmentedEmbeddingParams": {
        "embeddingPurpose": "GENERIC_INDEX",
        "embeddingDimension": EMBEDDING_DIMENSION,
        "video": {
            "format": "mp4",
            "embeddingMode": "AUDIO_VIDEO_COMBINED",
            "source": {
                "s3Location": {"uri": S3_VIDEO_URI}
            },
            "segmentationConfig": {
                "durationSeconds": 15  # Segment into 15-second chunks
            },
        },
    },
}

response = bedrock_runtime.start_async_invoke(
    modelId=MODEL_ID,
    modelInput=model_input,
    outputDataConfig={
        "s3OutputDataConfig": {
            "s3Uri": S3_EMBEDDING_DESTINATION_URI
        }
    },
)

invocation_arn = response["invocationArn"]
print(f"Async job started: {invocation_arn}")

# Poll until job completes
print("\nPolling for job completion...")
while True:
    job = bedrock_runtime.get_async_invoke(invocationArn=invocation_arn)
    status = job["status"]
    print(f"Status: {status}")

    if status != "InProgress":
        break
    time.sleep(15)

# Check if job completed successfully
if status == "Completed":
    output_s3_uri = job["outputDataConfig"]["s3OutputDataConfig"]["s3Uri"]
    print(f"\nSuccess! Embeddings at: {output_s3_uri}")

    # Parse S3 URI to get bucket and prefix
    s3_uri_parts = output_s3_uri[5:].split("/", 1)  # Remove "s3://" prefix
    bucket = s3_uri_parts[0]
    prefix = s3_uri_parts[1] if len(s3_uri_parts) > 1 else ""

    # AUDIO_VIDEO_COMBINED mode outputs to embedding-audio-video.jsonl
    # The output_s3_uri already includes the job ID, so just append the filename
    embeddings_key = f"{prefix}/embedding-audio-video.jsonl".lstrip("/")

    print(f"Reading embeddings from: s3://{bucket}/{embeddings_key}")

    # Read and parse JSONL file
    response = s3.get_object(Bucket=bucket, Key=embeddings_key)
    content = response['Body'].read().decode('utf-8')

    embeddings = []
    for line in content.strip().split('\n'):
        if line:
            embeddings.append(json.loads(line))

    print(f"\nFound {len(embeddings)} video segments:")
    for i, segment in enumerate(embeddings):
        print(f"  Segment {i}: {segment.get('startTime', 0):.1f}s - {segment.get('endTime', 0):.1f}s")
        print(f"    Embedding dimension: {len(segment.get('embedding', []))}")
else:
    print(f"\nJob failed: {job.get('failureMessage', 'Unknown error')}")

With our embeddings generated, we need a place to store and search them efficiently. This example demonstrates setting up a vector store using Amazon S3 Vectors, which provides the infrastructure needed for similarity search at scale. Think of this as creating a searchable index where semantically similar content naturally clusters together. When adding an embedding to the index, I use the metadata to specify the original format and the content being indexed.

# Initialize Amazon S3 Vectors client
s3vectors = boto3.client("s3vectors", region_name="us-east-1")

# Configuration
VECTOR_BUCKET = "my-vector-store"
INDEX_NAME = "embeddings"

# Create vector bucket and index (if they don't exist)
try:
    s3vectors.get_vector_bucket(vectorBucketName=VECTOR_BUCKET)
    print(f"Vector bucket {VECTOR_BUCKET} already exists")
except s3vectors.exceptions.NotFoundException:
    s3vectors.create_vector_bucket(vectorBucketName=VECTOR_BUCKET)
    print(f"Created vector bucket: {VECTOR_BUCKET}")

try:
    s3vectors.get_index(vectorBucketName=VECTOR_BUCKET, indexName=INDEX_NAME)
    print(f"Vector index {INDEX_NAME} already exists")
except s3vectors.exceptions.NotFoundException:
    s3vectors.create_index(
        vectorBucketName=VECTOR_BUCKET,
        indexName=INDEX_NAME,
        dimension=EMBEDDING_DIMENSION,
        dataType="float32",
        distanceMetric="cosine"
    )
    print(f"Created index: {INDEX_NAME}")

texts = [
    "Machine learning on AWS",
    "Amazon Bedrock provides foundation models",
    "S3 Vectors enables semantic search"
]

print(f"\nGenerating embeddings for {len(texts)} texts...")

# Generate embeddings using Amazon Nova for each text
vectors = []
for text in texts:
    response = bedrock_runtime.invoke_model(
        body=json.dumps({
            "taskType": "SINGLE_EMBEDDING",
            "singleEmbeddingParams": {
                "embeddingDimension": EMBEDDING_DIMENSION,
                "text": {"truncationMode": "END", "value": text}
            }
        }),
        modelId=MODEL_ID,
        accept="application/json",
        contentType="application/json"
    )

    response_body = json.loads(response["body"].read())
    embedding = response_body["embeddings"][0]["embedding"]

    vectors.append({
        "key": f"text:{text[:50]}",  # Unique identifier
        "data": {"float32": embedding},
        "metadata": {"type": "text", "content": text}
    })
    print(f"  ✓ Generated embedding for: {text}")

# Add all vectors to store in a single call
s3vectors.put_vectors(
    vectorBucketName=VECTOR_BUCKET,
    indexName=INDEX_NAME,
    vectors=vectors
)

print(f"\nSuccessfully added {len(vectors)} vectors to the store in one put_vectors call!")

This final example demonstrates the capability of searching across different content types with a single query, finding the most similar content regardless of whether it originated from text, images, videos, or audio. The distance scores help you understand how closely related the results are to your original query.

# Text to query
query_text = "foundation models"  

print(f"\nGenerating embeddings for query '{query_text}' ...")

# Generate embeddings
response = bedrock_runtime.invoke_model(
    body=json.dumps({
        "taskType": "SINGLE_EMBEDDING",
        "singleEmbeddingParams": {
            "embeddingPurpose": "GENERIC_RETRIEVAL",
            "embeddingDimension": EMBEDDING_DIMENSION,
            "text": {"truncationMode": "END", "value": query_text}
        }
    }),
    modelId=MODEL_ID,
    accept="application/json",
    contentType="application/json"
)

response_body = json.loads(response["body"].read())
query_embedding = response_body["embeddings"][0]["embedding"]

print(f"Searching for similar embeddings...\n")

# Search for top 5 most similar vectors
response = s3vectors.query_vectors(
    vectorBucketName=VECTOR_BUCKET,
    indexName=INDEX_NAME,
    queryVector={"float32": query_embedding},
    topK=5,
    returnDistance=True,
    returnMetadata=True
)

# Display results
print(f"Found {len(response['vectors'])} results:\n")
for i, result in enumerate(response["vectors"], 1):
    print(f"{i}. {result['key']}")
    print(f"   Distance: {result['distance']:.4f}")
    if result.get("metadata"):
        print(f"   Metadata: {result['metadata']}")
    print()

Crossmodal search is one of the key advantages of multimodal embeddings. With crossmodal search, you can query with text and find relevant images. You can also search for videos using text descriptions, find audio clips that match certain topics, or discover documents based on their visual and textual content. For your reference, the full script with all previous examples merged together is here:

import json
import base64
import time
import boto3

MODEL_ID = "amazon.nova-2-multimodal-embeddings-v1:0"
EMBEDDING_DIMENSION = 3072

# Initialize Amazon Bedrock Runtime client
bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

print(f"Generating text embedding with {MODEL_ID} ...")

# Text to embed
text = "Amazon Nova is a multimodal foundation model"

# Create embedding
request_body = {
    "taskType": "SINGLE_EMBEDDING",
    "singleEmbeddingParams": {
        "embeddingPurpose": "GENERIC_INDEX",
        "embeddingDimension": EMBEDDING_DIMENSION,
        "text": {"truncationMode": "END", "value": text},
    },
}

response = bedrock_runtime.invoke_model(
    body=json.dumps(request_body),
    modelId=MODEL_ID,
    contentType="application/json",
)

# Extract embedding
response_body = json.loads(response["body"].read())
embedding = response_body["embeddings"][0]["embedding"]

print(f"Generated embedding with {len(embedding)} dimensions")
# Read and encode image
print(f"Generating image embedding with {MODEL_ID} ...")

with open("photo.jpg", "rb") as f:
    image_bytes = base64.b64encode(f.read()).decode("utf-8")

# Create embedding
request_body = {
    "taskType": "SINGLE_EMBEDDING",
    "singleEmbeddingParams": {
        "embeddingPurpose": "GENERIC_INDEX",
        "embeddingDimension": EMBEDDING_DIMENSION,
        "image": {
            "format": "jpeg",
            "source": {"bytes": image_bytes}
        },
    },
}

response = bedrock_runtime.invoke_model(
    body=json.dumps(request_body),
    modelId=MODEL_ID,
    contentType="application/json",
)

# Extract embedding
response_body = json.loads(response["body"].read())
embedding = response_body["embeddings"][0]["embedding"]

print(f"Generated embedding with {len(embedding)} dimensions")
# Initialize Amazon S3 client
s3 = boto3.client("s3", region_name="us-east-1")

print(f"Generating video embedding with {MODEL_ID} ...")

# Amazon S3 URIs
S3_VIDEO_URI = "s3://my-video-bucket/videos/presentation.mp4"

# Amazon S3 output bucket and location
S3_EMBEDDING_DESTINATION_URI = "s3://my-video-bucket/embeddings-output/"

# Create async embedding job for video with audio
model_input = {
    "taskType": "SEGMENTED_EMBEDDING",
    "segmentedEmbeddingParams": {
        "embeddingPurpose": "GENERIC_INDEX",
        "embeddingDimension": EMBEDDING_DIMENSION,
        "video": {
            "format": "mp4",
            "embeddingMode": "AUDIO_VIDEO_COMBINED",
            "source": {
                "s3Location": {"uri": S3_VIDEO_URI}
            },
            "segmentationConfig": {
                "durationSeconds": 15  # Segment into 15-second chunks
            },
        },
    },
}

response = bedrock_runtime.start_async_invoke(
    modelId=MODEL_ID,
    modelInput=model_input,
    outputDataConfig={
        "s3OutputDataConfig": {
            "s3Uri": S3_EMBEDDING_DESTINATION_URI
        }
    },
)

invocation_arn = response["invocationArn"]
print(f"Async job started: {invocation_arn}")

# Poll until job completes
print("\nPolling for job completion...")
while True:
    job = bedrock_runtime.get_async_invoke(invocationArn=invocation_arn)
    status = job["status"]
    print(f"Status: {status}")

    if status != "InProgress":
        break
    time.sleep(15)

# Check if job completed successfully
if status == "Completed":
    output_s3_uri = job["outputDataConfig"]["s3OutputDataConfig"]["s3Uri"]
    print(f"\nSuccess! Embeddings at: {output_s3_uri}")

    # Parse S3 URI to get bucket and prefix
    s3_uri_parts = output_s3_uri[5:].split("/", 1)  # Remove "s3://" prefix
    bucket = s3_uri_parts[0]
    prefix = s3_uri_parts[1] if len(s3_uri_parts) > 1 else ""

    # AUDIO_VIDEO_COMBINED mode outputs to embedding-audio-video.jsonl
    # The output_s3_uri already includes the job ID, so just append the filename
    embeddings_key = f"{prefix}/embedding-audio-video.jsonl".lstrip("/")

    print(f"Reading embeddings from: s3://{bucket}/{embeddings_key}")

    # Read and parse JSONL file
    response = s3.get_object(Bucket=bucket, Key=embeddings_key)
    content = response['Body'].read().decode('utf-8')

    embeddings = []
    for line in content.strip().split('\n'):
        if line:
            embeddings.append(json.loads(line))

    print(f"\nFound {len(embeddings)} video segments:")
    for i, segment in enumerate(embeddings):
        print(f"  Segment {i}: {segment.get('startTime', 0):.1f}s - {segment.get('endTime', 0):.1f}s")
        print(f"    Embedding dimension: {len(segment.get('embedding', []))}")
else:
    print(f"\nJob failed: {job.get('failureMessage', 'Unknown error')}")
# Initialize Amazon S3 Vectors client
s3vectors = boto3.client("s3vectors", region_name="us-east-1")

# Configuration
VECTOR_BUCKET = "my-vector-store"
INDEX_NAME = "embeddings"

# Create vector bucket and index (if they don't exist)
try:
    s3vectors.get_vector_bucket(vectorBucketName=VECTOR_BUCKET)
    print(f"Vector bucket {VECTOR_BUCKET} already exists")
except s3vectors.exceptions.NotFoundException:
    s3vectors.create_vector_bucket(vectorBucketName=VECTOR_BUCKET)
    print(f"Created vector bucket: {VECTOR_BUCKET}")

try:
    s3vectors.get_index(vectorBucketName=VECTOR_BUCKET, indexName=INDEX_NAME)
    print(f"Vector index {INDEX_NAME} already exists")
except s3vectors.exceptions.NotFoundException:
    s3vectors.create_index(
        vectorBucketName=VECTOR_BUCKET,
        indexName=INDEX_NAME,
        dimension=EMBEDDING_DIMENSION,
        dataType="float32",
        distanceMetric="cosine"
    )
    print(f"Created index: {INDEX_NAME}")

texts = [
    "Machine learning on AWS",
    "Amazon Bedrock provides foundation models",
    "S3 Vectors enables semantic search"
]

print(f"\nGenerating embeddings for {len(texts)} texts...")

# Generate embeddings using Amazon Nova for each text
vectors = []
for text in texts:
    response = bedrock_runtime.invoke_model(
        body=json.dumps({
            "taskType": "SINGLE_EMBEDDING",
            "singleEmbeddingParams": {
                "embeddingPurpose": "GENERIC_INDEX",
                "embeddingDimension": EMBEDDING_DIMENSION,
                "text": {"truncationMode": "END", "value": text}
            }
        }),
        modelId=MODEL_ID,
        accept="application/json",
        contentType="application/json"
    )

    response_body = json.loads(response["body"].read())
    embedding = response_body["embeddings"][0]["embedding"]

    vectors.append({
        "key": f"text:{text[:50]}",  # Unique identifier
        "data": {"float32": embedding},
        "metadata": {"type": "text", "content": text}
    })
    print(f"  ✓ Generated embedding for: {text}")

# Add all vectors to store in a single call
s3vectors.put_vectors(
    vectorBucketName=VECTOR_BUCKET,
    indexName=INDEX_NAME,
    vectors=vectors
)

print(f"\nSuccessfully added {len(vectors)} vectors to the store in one put_vectors call!")
# Text to query
query_text = "foundation models"  

print(f"\nGenerating embeddings for query '{query_text}' ...")

# Generate embeddings
response = bedrock_runtime.invoke_model(
    body=json.dumps({
        "taskType": "SINGLE_EMBEDDING",
        "singleEmbeddingParams": {
            "embeddingPurpose": "GENERIC_RETRIEVAL",
            "embeddingDimension": EMBEDDING_DIMENSION,
            "text": {"truncationMode": "END", "value": query_text}
        }
    }),
    modelId=MODEL_ID,
    accept="application/json",
    contentType="application/json"
)

response_body = json.loads(response["body"].read())
query_embedding = response_body["embeddings"][0]["embedding"]

print(f"Searching for similar embeddings...\n")

# Search for top 5 most similar vectors
response = s3vectors.query_vectors(
    vectorBucketName=VECTOR_BUCKET,
    indexName=INDEX_NAME,
    queryVector={"float32": query_embedding},
    topK=5,
    returnDistance=True,
    returnMetadata=True
)

# Display results
print(f"Found {len(response['vectors'])} results:\n")
for i, result in enumerate(response["vectors"], 1):
    print(f"{i}. {result['key']}")
    print(f"   Distance: {result['distance']:.4f}")
    if result.get("metadata"):
        print(f"   Metadata: {result['metadata']}")
    print()

For production applications, embeddings can be stored in any vector database. Amazon OpenSearch Service offers native integration with Nova Multimodal Embeddings at launch, making it straightforward to build scalable search applications. As shown in the examples before, Amazon S3 Vectors provides a simple way to store and query embeddings with your application data.

Things to know
Nova Multimodal Embeddings offers four output dimension options: 3,072, 1,024, 384, and 256. Larger dimensions provide more detailed representations but require more storage and computation. Smaller dimensions offer a practical balance between retrieval performance and resource efficiency. This flexibility helps you optimize for your specific application and cost requirements.

The model handles substantial context lengths. For text inputs, it can process up to 8,192 tokens at once. Video and audio inputs support segments of up to 30 seconds, and the model can segment longer files. This segmentation capability is particularly useful when working with large media files—the model splits them into manageable pieces and creates embeddings for each segment.

The model includes responsible AI features built into Amazon Bedrock. Content submitted for embedding goes through Amazon Bedrock content safety filters, and the model includes fairness measures to reduce bias.

As described in the code examples, the model can be invoked through both synchronous and asynchronous APIs. The synchronous API works well for real-time applications where you need immediate responses, such as processing user queries in a search interface. The asynchronous API handles latency insensitive workloads more efficiently, making it suitable for processing large content such as videos.

Availability and pricing
Amazon Nova Multimodal Embeddings is available today in Amazon Bedrock in the US East (N. Virginia) AWS Region. For detailed pricing information, visit the Amazon Bedrock pricing page.

To learn more, see the Amazon Nova User Guide for comprehensive documentation and the Amazon Nova model cookbook on GitHub for practical code examples.

If you’re using an AI–powered assistant for software development such as Amazon Q Developer or Kiro, you can set up the AWS API MCP Server to help the AI assistants interact with AWS services and resources and the AWS Knowledge MCP Server to provide up-to-date documentation, code samples, knowledge about the regional availability of AWS APIs and CloudFormation resources.

Start building multimodal AI-powered applications with Nova Multimodal Embeddings today, and share your feedback through AWS re:Post for Amazon Bedrock or your usual AWS Support contacts.

— Danilo

Customer Carbon Footprint Tool Expands: Additional emissions categories including Scope 3 are now available

2025-10-22 Channy Yun (윤석찬)

Post Syndicated from Channy Yun (윤석찬) original https://aws.amazon.com/blogs/aws/aws-customer-carbon-footprint-tool-now-includes-scope-3-emissions/

Since it launched in 2022, the Customer Carbon Footprint Tool (CCFT) has supported our customers’ sustainability journey to track, measure, and review their carbon emissions by providing the estimated carbon emissions associated with their usage of Amazon Web Services (AWS) services.

In April, we made major updates in the CCFT, including easier access to carbon emissions data, visibility into emissions by AWS Region, inclusion of location-based emissions (LBM), an updated, independently-verified methodology as well as moving to a dedicated page in the AWS Billing console.

The CCFT is informed by the Greenhouse Gas (GHG) Protocol’s classification of emissions, which classifies a company’s emissions. Today, we’re announcing the inclusion of Scope 3 emissions data and an update to Scope 1 emissions in the CCFT. The new emission categories complement the existing Scope 1 and 2 data, and they’ll give our customers a comprehensive look into their carbon emissions data.

In this updated methodology we incorporate new emissions categories. We’ve added Scope 1 refrigerants and natural gas, alongside the existing Scope 1 emissions from fuel combustion in emergency backup generators (diesel). Although Scope 1 emissions represent a small share of overall emissions, we provide our customers with a complete image of their carbon emissions.

To decide which categories of Scope 3 to include in our model we looked at how material each of them were to the overall carbon impact and confirmed the vast majority of emissions were represented. With that in mind, the methodology now includes:

Fuel- and energy-related activities (“FERA” under the GHG Protocol) – This includes upstream emissions from purchased fuels, upstream emissions of purchased electricity, and transmission and distribution (T&D) losses. AWS calculates these emissions using both LBM and the market-based method (MBM).
IT hardware – AWS uses a comprehensive cradle-to-gate approach that tracks emissions from raw material extraction through manufacturing and transportation to AWS data centers. We use four calculation pathways: process-based life cycle assessment (LCA) with engineering attributes, extrapolation, representative category average LCA, and economic input-output LCA. AWS prioritizes the most detailed and accurate methods for components that contribute significantly to overall emissions.
Buildings and equipment – AWS follows established whole building life cycle assessment (wbLCA) standards, considering emissions from construction, use, and end-of-life phases. The analysis covers data center shells, rooms, and long-lead equipment such as air handling units and generators. The methodology uses both process-based life cycle assessment models and economic input-output analysis to provide comprehensive coverage.

The Scope 3 emissions are then amortized over the assets’ service life (6 years for IT hardware, 50 years for buildings) to calculate monthly emissions that can be allocated to customers. This amortization means that we fairly distribute the total embodied carbon of each asset across its operational lifetime, accounting for scenarios such as early retirement or extended use.

All these updates are part of methodology version 3.0.0 and are explained in detail in our methodology document, which has been independently verified by a third party.

How to access the CCFT
To get started, go to the AWS Billing and Cost Management console and choose Customer Carbon Footprint Tool under Cost and Usage Analysis. You can access your carbon emissions data in the dashboard, download a csv file, or export all data using basic SQL and visualize your data by integrating with AWS Data Exports and Amazon Quick Sight.

To ensure you can make meaningful year-over-year comparisons, we’ve recalculated historical data back to January 2022 using version 3 of the methodology. All the data displayed in the CCFT now uses version 3. To see historical data using v3, choose Create custom data export. A new data export now includes new columns breaking down emissions by Scope 1, 2, and 3.

You can see estimated AWS emissions and estimated emissions savings. The tool shows emissions calculated using the MBM for 38 months of data by default. You can find your emissions calculated using the LBM by choosing LBM in the Calculation method filter on the dashboard. The unit of measurement for carbon emissions is metric tons of carbon dioxide equivalent (MTCO2e), an industry-standard measure.

In the Carbon emissions summary, it shows trends of your carbon emissions over time. You can also find emissions resulting from your usage of AWS services and across all AWS Regions. To learn more, visit Viewing your carbon footprint in the AWS documentation.

Voice of the customer
Some of our customers had early access to these updates. This is what they shared with us:

Sunya Norman, senior vice president, Impact at Salesforce shared “Effective decarbonization begins with visibility into our carbon footprint, especially in Scope 3 emissions. Industry averages are only a starting point. The granular carbon data we get from cloud providers like AWS are critical to helping us better understand the actual emissions associated with our cloud infrastructure and focus reductions where they matter most.”

Gerhard Loske, Head of Environmental Management at SAP said “The latest updates to the CCFT are a big step forward in helping us managing SAP’s sustainability goals. With new Region-specific data, we can now see better where emissions are coming from and take targeted action. The upcoming addition of Scope 3 emissions will give us a much fuller picture of our carbon footprint across AWS workloads. These improvements make it easier for us to turn data into meaningful climate action.”

Pinterest’s Global Sustainability Lead, Mia Ketterling highlighted the benefits of the Scope 3 emission data, saying, “By including Scope 3 emissions data in their CCFT, AWS empowers customers like Pinterest to more accurately measure and report the full carbon footprint of our digital operations. Enhanced transparency helps us drive meaningful climate action across our value chain.”

If you’re attending AWS re:Invent in person in December, join technical leaders from AWS, Adobe, and Salesforce as they reveal how the Customer Carbon Footprint Tool supports their environmental initiatives.

Now available
With Scope 1, 2, and 3 coverage in the CCFT, you can track your emissions over time to understand how you’re trending towards your sustainability goals and see the impact of any carbon reduction projects you’ve implemented. To learn more, visit the Customer Carbon Footprint Tool (CCFT) page.

Give these new features a try in the AWS Billing and Cost Management console and send feedback to AWS re:Post for the CCFT or through your usual AWS Support contacts.

— Channy

AWS Weekly Roundup: Kiro waitlist, EBS Volume Clones, EC2 Capacity Manager, and more (October 20, 2025)

2025-10-20 Veliswa Boya

Post Syndicated from Veliswa Boya original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-kiro-waitlist-ebs-volume-clones-ec2-capacity-manager-and-more-october-20-2025/

I’ve been inspired by all the activities that tech communities around the world have been hosting and participating in throughout the year. Here in the southern hemisphere we’re starting to dream about our upcoming summer breaks and closing out on some of the activities we’ve initiated this year. The tech community in South Africa is participating in Amazon Q Developer coding challenges that my colleagues and I are hosting throughout this month as a fun way to wind down activities for the year. The first one was hosted in Johannesburg last Friday with Durban and Cape Town coming up next.

Last week’s launches
These are the launches from last week that caught my attention:

Kiro is now available for every developer — Since its launch more than 90 days ago, more than 100,000 developers have joined the waitlist to try Kiro out. The waitlist is gone so if you want to try out this spec-driven approach to coding with AI, sign up now.
Amazon EC2 Capacity Manager — If you’re using Amazon Elastic Compute Cloud (Amazon EC2) at scale operate hundreds of instance types across multiple Availability Zones and accounts, using On-Demand Instances, Spot Instances, and Capacity Reservations, you’ll be pleased to learn that EC2 Capacity Manager is now available to provide you a centralized solution to monitor, analyze, and manage capacity usage across all account and AWS Regions from a single interface.
Amazon EBS Volume Clones — Sometimes you need production data to test a fix in a non-production environment before implementing it in production. Usually you’d take an EBS snapshot of this data and then create a new volume from that snapshot, meanwhile dealing with the operational overhead of this process. Learn about the availability of Amazon EBS Volume Clones, a new capability for you to create instant point-in-time copies of your EBS volumes within the same Availability Zone.

Additional updates
I thought these projects, blog posts, and news items were also interesting:

AWS Transfer Family SFTP connectors now support VPC-based connectivity — AWS Transfer Family SFTP connectors now support connectivity to remote SFTP servers through Amazon Virtual Private Cloud (Amazon VPC) environments.
As your business evolves, you might need to migrate workloads between AWS Regions. Perhaps you’re looking to reduce latency for users in new geographic areas, meet Region-specific compliance requirements, or you’re an ISV expanding your product’s availability. Whatever your reason, cross-Region migration needs careful planning, especially when dealing with encrypted resources. Read how to migrate encrypted Amazon EC2 instances across AWS Regions without sharing AWS KMS keys.
Internet of Things (IoT) devices have transformed how we interact with our environments, from homes to industrial settings. However, as the number of connected devices grows, so does the complexity of managing them. Learn how to build a device management agent with Amazon Bedrock AgentCore.

Upcoming AWS events
Keep a look out and be sure to sign up for these upcoming events:

AWS re:Invent 2025 (December 1-5, 2025, Las Vegas) — AWS flagship annual conference offering collaborative innovation through peer-to-peer learning, expert-led discussions, and invaluable networking opportunities.

Join the AWS Builder Center to learn, build, and connect with builders in the AWS community. Browse here for upcoming in-person and virtual developer-focused events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

– Veliswa.

Introducing Amazon EBS Volume Clones: Create instant copies of your EBS volumes

2025-10-15 Sébastien Stormacq

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/introducing-amazon-ebs-volume-clones-create-instant-copies-of-your-ebs-volumes/

As someone that used to work at Sun Microsystems, where ZFS was invented, I’ve always loved working with storage systems that offer instant volume copies for my development and testing needs.

Today, I’m excited to share that AWS is bringing similar capabilities to Amazon Elastic Block Store (Amazon EBS) with the launch of Amazon EBS Volume Clones, a new capability that lets you create instant point-in-time copies of your EBS volumes within the same Availability Zone.

Many customers need to create copies of their production data to support development and testing activities in a separate nonproduction environment. Until now, this process required taking an EBS snapshot (stored in Amazon Simple Storage Service (Amazon S3)) and then creating a new volume from that snapshot. Although this approach works, the process creates operational overhead due to multiple steps.

With Amazon EBS Volume Clones, you can now create copies of your EBS volumes with a single API call or console click. The copied volumes are available within seconds and provide immediate access to your data with single-digit millisecond latency. This makes Volume Clones particularly useful for quickly setting up test environments with production data or creating temporary copies of databases for development purposes.

Let me show you how Volume Clones works
For this post, I created a small Amazon Elastic Compute Cloud (Amazon EC2) instance, with an attached volume. I created a file on the root file system with the command echo "Hello CopyVolumes" > hello.txt.

To initiate the copy, I open a browser on the AWS Management Console and I navigate to EC2, Elastic Block Store, Volumes. I select the volume I want to copy.

Note that, at the time of publication of this post, only encrypted volumes can be copied.

On the Actions menu, I choose the Copy Volume option.

Next, I choose the details of the target volume. I can change the Volume type and adjust the Size, IOPS, and Throughput parameters. I choose Copy volume to start the Volume Clone operation.

The copied volume enters the Creating state and becomes available within seconds. I can then attach it to an EC2 instance and start using it immediately.

Data blocks are copied from the source volume and written to the volume copy in the background. The volume remains in the Initializing state until the process is complete. I can monitor its progress with the describe-volume-status API. The initializing operation doesn’t affect the performance of the source volume. I can continue using it normally during the copy process.

I love that the copied volume is available immediately. I don’t need to wait for its initialization to complete. During the initialization phase, my copied volume delivers performance based on the lowest of: a baseline of 3,000 IOPS and 125 MiB/s, the source volume’s provisioned performance, or the copied volume’s provisioned performance.

After initialization is completed, the copied volume becomes fully independent of the source volume and delivers its full provisioned performance.

Alternatively, I can use the AWS Command Line Interface (AWS CLI) to initiate the copy:

aws ec2 copy-volumes                          \
     --source-volume-id vol-1234567890abcdef0 \
     --size 500                               \
     --volume-type gp3

After the volume copy is created, I attach it to my EC2 instance and mount it. I can check the file I created at start is present.

First, I attach the volume from my laptop, using the attach-volume command:

aws ec2 attach-volume \
         --volume-id 'vol-09b700e3a23a9b4ad' \
         --instance-id 'i-079e6504ad25b029e'   \
         --device '/dev/sdb'

Then, I connect to the instance, and I type these commands:

$ sudo lsblk -f
NAME          FSTYPE FSVER LABEL UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
nvme0n1                                                                              
├─nvme0n1p1   xfs          /     49e26d9d-0a9d-4667-b93e-a23d1de8eacd    6.2G    22% /
└─nvme0n1p128 vfat   FAT16       3105-2F44                               8.6M    14% /boot/efi
nvme1n1                                                                              
├─nvme1n1p1   xfs          /     49e26d9d-0a9d-4667-b93e-a23d1de8eacd                
└─nvme1n1p128 vfat   FAT16       3105-2F44     

$ sudo mount -t xfs /dev/nvme1n1p1 /data

$ df -h
Filesystem        Size  Used Avail Use% Mounted on
devtmpfs          4.0M     0  4.0M   0% /dev
tmpfs             924M     0  924M   0% /dev/shm
tmpfs             370M  476K  369M   1% /run
/dev/nvme0n1p1    8.0G  1.8G  6.2G  22% /
tmpfs             924M     0  924M   0% /tmp
/dev/nvme0n1p128   10M  1.4M  8.7M  14% /boot/efi
tmpfs             185M     0  185M   0% /run/user/1000
/dev/nvme1n1p1    8.0G  1.8G  6.2G  22% /data

$ cat /data/home/ec2-user/hello.txt 
Hello CopyVolumes

Things to know
Volume Clones creates copies within the same Availability Zone as your source volume. You can create copies from encrypted volumes only, and the size of your copy must be equal to or greater than the source volume.

Volume Clones creates crash-consistent copies of your volumes, exactly like snapshots. For application consistency, you need to pause application I/O operations before creating the copy. For example, with PostgreSQL databases, you can use the pg_start_backup() and pg_stop_backup() functions to pause writes and create a consistent copy. At the operating system level on Linux with XFS, you can use the xfs_freeze command to temporarily suspend and resume access to the file system and ensure all cached updates are written to disk.

Although Volume Clones creates point-in-time copies, it complements rather than replaces EBS snapshots for backup purposes. EBS snapshots remain the recommended solution for data backup and protection against AZ-level and volume failures. Snapshots provide incremental backups to Amazon S3 with 11 nines of durability, compared to Volume Clones which maintains EBS volume durability (99.999% for io2, 99.9% for other volume types). Consider using Volume Clones specifically for test and development environment scenarios where you need instant access to volume copies.

Copied volumes exist independently of their source volumes and continue to incur standard EBS volume charges until you delete them. To manage costs effectively, implement governance rules to identify and remove copied volumes that are no longer needed for your development or testing activities.

Pricing and availability
Volume Clones supports all EBS volume types and works with volumes in the same AWS account and Availability Zone. This new capability is available in all AWS commercial Regions, selected Local Zones, and in the AWS GovCloud (US).

For pricing, you’re charged a one-time fee per GiB of data on the source volume at initiation and standard EBS pricing for the new volume.

I find Volume Clones particularly valuable for database workloads and continuous integration (CI) scenarios. For instance, you can quickly create a copy of your production database for testing new features or troubleshooting issues without impacting your production environment or waiting for data to hydrate from Amazon S3.

To get started with Amazon EBS Volume Clones, visit the Amazon EBS section on the console or check out the EBS documentation. I look forward to hearing how you use this capability to improve your development workflows.

— seb

AWS Weekly Roundup: Amazon Quick Suite, Amazon EC2, Amazon EKS, and more (October 13, 2025)

2025-10-13 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-amazon-quick-suite-amazon-ec2-amazon-eks-and-more-october-13-2025/

This week I was at the inaugural AWS AI in Practice meetup from the AWS User Group UK. AI-assisted software development and agents were the focus of the evening! Next week I’ll be in Italy for Codemotion (Milan) and an AWS User Group meetup (Rome). I am also excited to try the new Amazon Quick Suite that brings AI-powered research, business intelligence, and automation capabilities into a single workspace.

Last week’s launches
Here are the launches that got my attention this week:

Amazon Quick Suite – A new agentic teammate that quickly answers your questions at work and turns those insights into actions for you. Read more in Esra’s launch post.
Amazon EC2 – General-purpose M8a instances powered by the 5th Generation AMD EPYC (codename Turin) processors and compute-optimized C8i and C8i-flex instances powered by custom Intel Xeon 6 processors are now available.
Amazon EKS – EKS and EKS Distro now support Kubernetes version 1.34 with several improvements.
AWS IAM Identity Center – AWS Key Management Service keys can now be used to encrypt identity data stored in IAM Identity Center organization instances.
Amazon VPC Lattice – You can now configure the number of IPv4 addresses assigned to resource gateway elastic network interfaces (ENIs). The IPv4 addresses are used for network address translation and determine the maximum number of concurrent IPv4 connections to a resource
Amazon Q Developer – Amazon Q Developer can help you get information about AWS product and service pricing, availability, and attributes, making it easier to select the right resources and estimate workload costs using natural language. More info in this blog post.
Amazon RDS for Db2 – You can now perform native database-level backups, offering greater flexibility in database management and migration.
AWS Service Quotas – Get notified of your quota usage with automatic quota management. Configure your preferred notifications channels, such as email, SMS, or Slack. Notifications are also available in AWS Health, and you can subscribe to related AWS Cloudtrail events for automation workflows.
Amazon Connect – You can now programmatically enrich case data with the new case APIs to link related cases, add custom related items, and search across them. You can now also customize service level calculations to your specific needs. New capabilities that have just been introduced include copy and bulk edit of agent scheduling configuration and agent schedule adherence notifications.
AWS Client VPN – Now supports MacOS Tahoe.

Additional updates
Here are some additional projects, blog posts, and news items that I found interesting:

Serverless ICYMI Q3 2025 – A quarterly recap of serverless news, in case you missed it.
Best practices for migrating from Apache Airflow 2.x to Apache Airflow 3.x on Amazon MWAA – A guide to help get the benefit of the new release.
Building self-managed RAG applications with Amazon EKS and Amazon S3 Vectors – A reference architecture for building and deploying a self-managed RAG application using open source tools such as Ray, Hugging Face, and LangChain.
BBVA: Building a multi-region, multi-country global Data and ML Platform at scale – A six-part series of posts describing the journey to transform BBVA entire data analytics infrastructure with one of the largest and most complex cloud migrations in the banking sector.
Customizing text content moderation with Amazon Nova – Fine-tuned for content moderation tasks tailored to your requirements using domain-specific training data and organization-specific moderation guidelines.

Upcoming AWS events
Check your calendars so that you can sign up for these upcoming events:

AWS AI Agent Global Hackathon – This is your chance to dive deep into our powerful generative AI stack and create something truly awesome. From September 8th to October 20th, you have the opportunity to create AI agents using AWS suite of AI services, competing for over $45,000 in prizes and exclusive go-to-market opportunities.
AWS Gen AI Lofts – You can learn AWS AI products and services with exclusive sessions, meet industry-leading experts, and have valuable networking opportunities with investors and peers. Register in your nearest city: Paris (October 7–21), London (Oct 13–21), and Tel Aviv (November 11–19).
AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Budapest (October 16).

Join the AWS Builder Center to learn, build, and connect with builders in the AWS community. Browse here upcoming in-person events, developer-focused events, and events for startups.

That’s all for this week. Check back next Monday for another Weekly Roundup!

– Danilo

Announcing Amazon Quick Suite: your agentic teammate for answering questions and taking action

2025-10-09 Esra Kayabali

Post Syndicated from Esra Kayabali original https://aws.amazon.com/blogs/aws/reimagine-the-way-you-work-with-ai-agents-in-amazon-quick-suite/

Today, we’re announcing Amazon Quick Suite, a new agentic teammate that quickly answers your questions at work and turns those insights into actions for you. Instead of switching between multiple applications to gather data, find important signals and trends, and complete manual tasks, Quick Suite brings AI-powered research, business intelligence, and automation capabilities into a single workspace. You can now analyze data through natural language queries, find critical information across enterprise and external sources in minutes, and automate processes from simple tasks to complex multi-department workflows.

Here’s a look into Quick Suite.

Business users often need to gather data across multiple applications—pulling customer details, checking performance metrics, reviewing internal product information, and performing competitive intelligence. This fragmented process often requires consultation with specialized teams to analyze advanced datasets, and in some cases, must be repeated regularly, reducing efficiency and leading to incomplete insights for decision-making.

Quick Suite helps you overcome these challenges by combining agentic teammates for research, business intelligence, and automation into a unified digital workspace for your day-to-day work.

Integrated capabilities that power productivity
Quick Suite includes the following integrated capabilities:

Research – Quick Research accelerates complex research by combining enterprise knowledge, premium third-party data, and data from the internet for more comprehensive insights.
Business intelligence – Quick Sight provides AI-powered business intelligence capabilities that transform data into actionable insights through natural language queries and interactive visualizations, helping everyone make faster decisions and achieve better business outcomes.
Automation – Quick Flows and Quick Automate help users and technical teams to automate any business process from simple, routine tasks to complex multi-department workflows, enabling faster execution and reducing manual work across the organization.

Let’s dive into some of these key capabilities.

Quick Index: Your unified knowledge foundation
Quick Index creates a secure, searchable repository that consolidates documents, files, and application data to power AI-driven insights and responses across your organization.

As a foundational component of Quick Suite, Quick Index operates in the background to bring together all your data—from databases and data warehouses to documents and email. This creates a single, intelligent knowledge base that makes AI responses more accurate and reduces time spent searching for information.

Quick Index automatically indexes and prepares any uploaded files or unstructured data you add to your Quick Suite, enabling efficient searching, sorting, and data access. For example, when you search for a specific project update, Quick Index instantly returns results from uploaded documents, meeting notes, project files, and reference materials—all from one unified search instead of checking different repositories and file systems.

To learn more, visit the Quick Index overview page.

Quick Research: From complex business challenges to expert-level insights
Quick Research is a powerful agent that conducts comprehensive research across your enterprise data and external sources to deliver contextualized, actionable insights in minutes or hours — work that previously could take longer.

Quick Research systematically breaks down complex questions into organized research plans. Starting with a simple prompt, it automatically creates detailed research frameworks that outline the approach and data sources needed for comprehensive analysis.

After Quick Research creates the plan, you can easily refine it through natural language conversations. When you are happy with the plan, it works in the background to gather information from multiple sources, using advanced reasoning to validate findings and provide thorough analysis with citations.

Quick Research integrates with your enterprise data connected to Quick Suite, the unified knowledge foundation that connects to your dashboards, documents, databases, and external sources, including Amazon S3, Snowflake, Google Drive, and Microsoft SharePoint. Quick Research grounds key insights to original sources and reveals clear reasoning paths, helping you verify accuracy, understand the logic behind recommendations, and present findings with confidence. You can trace findings back to their original sources and validate conclusions through source citations. This makes it ideal for complex topics requiring in-depth analysis.

To learn more, visit the Quick Research overview page.

Quick Sight: AI-powered business intelligence
Quick Sight provides AI-powered business intelligence capabilities that transform data into actionable insights through natural language queries and interactive visualizations.

You can create dashboards and executive summaries using conversational prompts, reducing dashboard development time while making advanced analytics accessible without specialized skills.

Quick Sight helps you ask questions about your data in natural language and receive instant visualizations, executive summaries, and insights. This generative AI integration provides you with answers from your dashboards and datasets without requiring technical expertise.

Using the scenarios capability, you can perform what-if analysis in natural language with step-by-step guidance, exploring complex business scenarios and finding answers faster than before.

Additionally, you can respond to insights with one-click actions by creating tickets, sending alerts, updating records, or triggering automated workflows directly from your dashboards without switching applications.

To learn more, visit Quick Sight overview page.

Quick Flows: Automation for everyone
With Quick Flows, any user can automate repetitive tasks by describing their workflow using natural language without requiring any technical knowledge. Quick Flows fetches information from internal and external sources, takes action in business applications, generates content, and handles process-specific requirements.

Starting with straightforward business requirements, it creates a multi-step flow including input steps for gathering information, reasoning groups for AI-powered processing, and output steps for generating and presenting results.

After the flow is configured, you can share it with a single click to your coworkers and other teams. To execute the flow, users can open it from the library or invoke it from chat, provide the necessary inputs, and then chat with the agent to refine the outputs and further customize the results.

To learn more, visit the Quick Flows overview page.

Quick Automate: Enterprise-scale process automation
Quick Automate helps technical teams build and deploy sophisticated automation for complex, multistep processes that span departments, systems, and third-party integrations. Using AI-powered natural language processing, Quick Automate transforms complex business processes into multi-agent workflows that can be created merely by describing what you want to automate or uploading process documentation.

While Quick Flows handles straightforward workflows, Quick Automate is designed for comprehensive and complex business processes like customer onboarding, procurement automations, or compliance procedures that involve multiple approval steps, system integrations, and cross-departmental coordination. Quick Automate offers advanced orchestration capabilities with extensive monitoring, debugging, versioning, and deployment features.

Quick Automate then generates a comprehensive automation plan with detailed steps and actions. You will find a UI agent that understands natural language instructions to autonomously navigate websites, complete form inputs, extract data, and produces structured outputs for downstream automation steps.

Additionally, you can define a custom agent, complete with instructions, knowledge, and tools, to complete process-specific tasks using the visual building experience – no code required.

Quick Automate includes enterprise-grade features such as user role management and human-in-the-loop capabilities that route specific tasks to users or groups for review and approval before continuing workflows. The service provides comprehensive observability with real-time monitoring, success rate tracking, and audit trails for compliance and governance.

To learn more, visit the Quick Automate overview page.

Additional foundational capabilities
Quick Suite includes other foundational capabilities that deliver seamless data organization and contextual AI interactions across your enterprise.

Spaces – Spaces provide a straightforward way for every business user to add their own context by uploading files or connecting to specific datasets and repositories specific to their work or to a particular function. For example, you might create a space for quarterly planning that includes budget spreadsheets, market research reports, and strategic planning documents. Or you could set up a product launch space that connects to your project management system and customer feedback databases. Spaces can scale from personal use to enterprise-wide deployment while maintaining access permissions and seamless integration with Quick Suite capabilities.

Chat agents – Quick Suite includes insights agents that you can use to interact with your data and workflows through natural language. Quick Suite includes a built-in agent to answer questions across all of your data and custom chat agents that you can configure with specific expertise and business context. Custom chat agents can be tailored for particular departments or use cases—such as a sales agent connected to your product catalog data and pricing information stored in a space or a compliance agent configured with your regulatory requirements and actions to request approvals.

Additional things to know
If you’re an existing Amazon QuickSight customer – Amazon QuickSight customers will be upgraded to Quick Suite, a unified digital workspace that includes all your existing QuickSight business intelligence capabilities (now called “Quick Sight”) plus new agentic AI capabilities. This is an interface and capability change—your data connectivity, user access, content, security controls, user permissions, and privacy settings remain exactly the same. No data is moved, migrated, or changed.

Quick Suite offers per-user subscription-based pricing with consumption-based charges for the Quick Index and other optional features. You can find more detail on the Quick Suite pricing page.

Now available
Amazon Quick Suite gives you a set of agentic teammates that helps you get the answers you need using all your data and move instantly from answers to action so you can focus on high value activities that drive better business and customer outcomes.

Visit the getting started page to start using Amazon Quick Suite today.

Happy building
— Esra and Donnie

New general-purpose Amazon EC2 M8a instances are now available

2025-10-08 Betty Zheng (郑予彬)

Post Syndicated from Betty Zheng (郑予彬) original https://aws.amazon.com/blogs/aws/new-general-purpose-amazon-ec2-m8a-instances-are-now-available/

Today, we’re announcing the availability of Amazon Elastic Compute Cloud (Amazon EC2) M8a instances, the latest addition to the general-purpose M instance family. These instances are powered by the 5th Generation AMD EPYC (codename Turin) processors with a maximum frequency of 4.5GHz. Customers can expect up to 30% higher performance and up to 19% better price performance compared to M7a instances. They also provide higher memory bandwidth, improved networking and storage throughput, and flexible configuration options for a broad set of general-purpose workloads.

Improvements in M8a
M8a instances deliver up to 30% better performance per vCPU compared to M7a instances, making them ideal for applications that require benefit from high performance and high throughput such as financial applications, gaming, rendering, application servers, simulation modeling, midsize data stores, application development environments, and caching fleets.

They provide 45% more memory bandwidth compared to M7a instances, accelerating in-memory databases, distributed caches, and real-time analytics.

For workloads with high I/O requirements, M8a instances provide up to 75 Gbps of networking bandwidth and 60 Gbps of Amazon Elastic Block Store (Amazon EBS) bandwidth, a 50% improvement over the previous generation. These enhancements support modern applications that rely on rapid data transfer and low-latency network communication.

Each vCPU on an M8a instance corresponds to a physical CPU core, meaning there is no simultaneous multithreading (SMT). In application benchmarks, M8a instances delivered up to 60% faster performance for GroovyJVM and up to 39% faster performance for Cassandra compared to M7a instances.

M8a instances support instance bandwidth configuration (IBC), which provides flexibility to allocate resources between networking and EBS bandwidth. This gives customers the flexibility to scale network or EBS bandwidth by up to 25% and improve database performance, query processing, and logging speeds.

M8a is available in ten virtualized sizes and two bare metal options (metal-24xl and metal-48xl), providing deployment choices that scale from small applications to large enterprise workloads. All of these improvements are built on the AWS Nitro System, which delivers low virtualization overhead, consistent performance, and advanced security across all instance sizes. These instances are built using the latest sixth generation AWS Nitro Cards, which offload and accelerate I/O for functions, increasing overall system performance.

M8a instances feature sizes of up to 192 vCPU with 768GiB RAM. Here are the detailed specs:

M8a	vCPUs	Memory (GiB)	Network bandwidth (Gbps)	EBS bandwidth (Gbps)
medium	1	4	Up to 12.5	Up to 10
large	2	8	Up to 12.5	Up to 10
xlarge	4	16	Up to 12.5	Up to 10
2xlarge	8	32	Up to 15	Up to 10
4xlarge	16	64	Up to 15	Up to 10
8xlarge	32	128	15	10
12xlarge	48	192	22.5	15
16xlarge	64	256	30	20
24xlarge	96	384	40	30
48xlarge	192	768	75	60
metal-24xl	96	384	40	30
metal-48xl	192	768	75	60

For a complete list of instance sizes and specifications, refer to the Amazon EC2 M8a instances page.

When to use M8a instances
M8a is a strong fit for general-purpose applications that need a balance of compute, memory, and networking. M8a instances are ideal for web and application hosting, microservices architectures, and databases where predictable performance and efficient scaling are important.

These instances are SAP certified and also well suited for enterprise workloads such as financial applications and enterprise resource planning (ERP) systems. They’re equally effective for in-memory caching and customer relationship management (CRM), in addition to development and test environments that require cost efficiency and flexibility. With this versatility, M8a supports a wide spectrum of workloads while helping customers improve price performance.

Now available
Amazon EC2 M8a instances are available today in US East (Ohio) US West (Oregon) and Europe (Spain) AWS Regions. M8a instances can be purchased as On-Demand, Savings Plans, and Spot Instances. M8a instances are also available on Dedicated Hosts. To learn more, visit the Amazon EC2 Pricing page.

To learn more, visit the Amazon EC2 M8a instances page and send feedback to AWS re:Post for EC2 or through your usual AWS support contacts.

— Betty

Introducing new compute-optimized Amazon EC2 C8i and C8i-flex instances

2025-10-06 Channy Yun (윤석찬)

Post Syndicated from Channy Yun (윤석찬) original https://aws.amazon.com/blogs/aws/introducing-new-compute-optimized-amazon-ec2-c8i-and-c8i-flex-instances/

After launching Amazon Elastic Compute Cloud (Amazon EC2) memory-optimized R8i and R8i-flex instances and general-purpose M8i and M8i-flex instances, I am happy to announce the general availability of compute-optimized C8i and C8i-flex instances powered by custom Intel Xeon 6 processors available only on AWS with sustained all-core 3.9 GHz turbo frequency and feature a 2:1 ratio of memory to vCPU. These instances deliver the highest performance and fastest memory bandwidth among comparable Intel processors in the cloud.

The C8i and C8i-flex instances offer up to 15 percent better price-performance, and 2.5 times more memory bandwidth compared to C7i and C7i-flex instances. The C8i and C8i-flex instances are up to 60 percent faster for NGINX web applications, up to 40 percent faster for AI deep learning recommendation models, and 35 percent faster for Memcached stores compared to C7i and C7i-flex instances.

C8i and C8i-flex instances are ideal for running compute-intensive workloads, such as web servers, caching, Apache.Kafka, ElasticSearch, batch processing, distributed analytics, high performance computing (HPC), ad serving, highly scalable multiplayer gaming, and video encoding.

As like other 8th generation instances, these instances use the new sixth generation AWS Nitro Cards, delivering up to two times more network and Amazon Elastic Block Storage (Amazon EBS) bandwidth compared to the previous generation instances. They also support bandwidth configuration with 25 percent allocation adjustments between network and Amazon EBS bandwidth, enabling better database performance, query processing, and logging speeds.

C8i instances
C8i instances provide up to 384 vCPUs and 768 TB memory including bare metal instances that provide dedicated access to the underlying physical hardware. These instances help you to run compute-intensive workloads, such as CPU-based inference, and video streaming that need the largest instance sizes or high CPU continuously.

Here are the specs for C8i instances:

Instance size	vCPUs	Memory (GiB)	Network bandwidth (Gbps)	EBS bandwidth (Gbps)
c8i.large	2	4	Up to 12.5	Up to 10
c8i.xlarge	4	8	Up to 12.5	Up to 10
c8i.2xlarge	8	16	Up to 15	Up to 10
c8i.4xlarge	16	32	Up to 15	Up to 10
c8i.8xlarge	32	64	15	10
c8i.12xlarge	48	96	22.5	15
c8i.16xlarge	64	128	30	20
c8i.24xlarge	96	192	40	30
c8i.32xlarge	128	256	50	40
c8i.48xlarge	192	384	75	60
c8i.96xlarge	384	768	100	80
c8i.metal-48xl	192	384	75	60
c8i.metal-96xl	384	768	100	80

C8i-flex instances
C8i-flex instances are a lower-cost variant of the C8i instances, with 5 percent better price performance at 5 percent lower prices. These instances are designed for workloads that benefit from the latest generation performance but don’t fully utilize all compute resources. These instances can reach up to the full CPU performance 95 percent of the time.

Here are the specs for the C8i-flex instances:

Instance size	vCPUs	Memory (GiB)	Network bandwidth (Gbps)	EBS bandwidth (Gbps)
c8i-flex.large	2	4	Up to 12.5	Up to 10
c8i-flex.xlarge	4	8	Up to 12.5	Up to 10
c8i-flex.2xlarge	8	16	Up to 15	Up to 10
c8i-flex.4xlarge	16	32	Up to 15	Up to 10
c8i-flex.8xlarge	32	64	Up to 15	Up to 10
c8i-flex.12xlarge	48	96	Up to 22.5	Up to 15
c8i-flex.16xlarge	64	128	Up to 30	Up to 20

If you’re currently using earlier generations of compute-optimized instances, you can adopt C8i-flex instances without having to make changes to your application or your workload.

Now available
Amazon EC2 C8i and C8i-flex instances are available today in the US East (N. Virginia), US East (Ohio), US West (Oregon), and Europe (Spain) AWS Regions. C8i and C8i-flex instances can be purchased as On-Demand, Savings Plan, and Spot instances. C8i instances are also available in Dedicated Instances and Dedicated Hosts. To learn more, visit the Amazon EC2 Pricing page.

Give C8i and C8i-flex instances a try in the Amazon EC2 console. To learn more, visit the Amazon EC2 C8i instances page and send feedback to AWS re:Post for EC2 or through your usual AWS Support contacts.

— Channy

AWS IAM Identity Center now supports customer-managed KMS keys for encryption at rest

2025-10-06 Sébastien Stormacq

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/aws-iam-identity-center-now-supports-customer-managed-kms-keys-for-encryption-at-rest/

Starting today, you can use your own AWS Key Management Service (AWS KMS) keys to encrypt identity data, such as user and group attributes, stored in AWS IAM Identity Center organization instances.

Many organizations operating in regulated industries need complete control over encryption key management. While Identity Center already encrypts data at rest using AWS-owned keys, some customers require the ability to manage their own encryption keys for audit and compliance purposes.

With this launch, you can now use customer-managed KMS keys (CMKs) to encrypt Identity Center identity data at rest. CMKs provide you with full control over the key lifecycle, including creation, rotation, and deletion. You can configure granular access controls to keys with AWS Key Management Service (AWS KMS) key policies and IAM policies, helping to ensure that only authorized principals can access your encrypted data. At launch time, the CMK must reside in the same AWS account and Region as your IAM Identity Center instance. The integration between Identity Center and KMS provides detailed AWS CloudTrail logs for auditing key usage and helps meet regulatory compliance requirements.

Identity Center supports both single-Region and multi-Region keys to match your deployment needs. While Identity Center instances can currently only be deployed in a single Region, we recommend using multi-Region AWS KMS keys unless your company policies restrict you to single-Region keys. Multi-Region keys provide consistent key material across Regions while maintaining independent key infrastructure in each Region. This gives you more flexibility in your encryption strategy and helps future-proof your deployment.

Let’s get started
Let’s imagine I want to use a CMK to encrypt the identity data of my Identity Center organization instance. My organization uses Identity Center to give employees access to AWS managed applications, such as Amazon Q Business or Amazon Athena.

As of today, some AWS managed applications cannot be used with Identity Center configured with a customer managed KMS key. See AWS managed applications that you can use with Identity Center to keep you updated with the ever evolving list of compatible applications.

The high-level process requires first to create a symmetric customer managed key (CMK) in AWS KMS. The key must be configured for encrypt and decrypt operations. Next, I configure the key policies to grant access to Identity Center, AWS managed applications, administrators, and other principals who need access the Identity Center and IAM Identity Center service APIs. Depending on your usage of Identity Center, you’ll have to define different policies for the key and IAM policies for IAM principals. The service documentation has more details to help you cover the most common use cases.

This demo is in three parts. I first create a customer managed key in AWS KMS and configure it with permissions that will authorize Identity Center and AWS managed applications to use it. Second, I update the IAM policies for the principals that will use the key from another AWS account, such as AWS applications administrators. Finally, I configure Identity Center to use the key.

Part 1: Create the key and define permissions

First, let’s create a new CMK in AWS KMS.

The key must be in the same AWS Region and AWS account as the Identity Center instance. You must create the Identity Center instance and the key in the management account of your organization within AWS Organization.

I navigate to the AWS Key Management Service (AWS KMS) console in the same Region as my Identity Center instance, then I choose Create a key. This launches me into the key creation wizard.

Under Step 1–Configure key, I select the key type–either Symmetric (a single key used for both encryption and decryption) or Asymmetric (a public-private key pair for encryption/decryption and signing/verification). Identity Center requires symmetric keys for encryption at rest. I select Symmetric.

For key usage, I select Encrypt and decrypt which allows the key to be used only for encrypting and decrypting data.

Under Advanced options, I select KMS – recommended for Key material origin, so AWS KMS creates and manages the key material.

For Regionality, I choose between Single-Region or Multi-Region key. I select Multi-Region key to allow key administrators to replicate the key to other Regions. As explained already, Identity Center doesn’t require this today but it helps to future-proof your configuration. Remember that you can not transform a single-Region key to a multi-Region one after its creation (but you can change the key used by Identity Center).

Then, I choose Next to proceed with additional configuration steps, such as adding labels, defining administrative permissions, setting usage permissions, and reviewing the final configuration before creating the key.

Under Step 2–Add Labels, I enter an Alias name for my key and select Next.

In this demo, I am editing the key policy by adding policy statements using templates provided in the documentation. I skip Step 3 and Step 4 and navigate to Step 5–Edit key policy.

Identity Center requires, at the minimum, permissions allowing Identity Center and its administrators to use the key. Therefore, I add three policy statements, the first and second authorize the administrators of the service, the third one to authorize the Identity Center service itself.

{
	"Version": "2012-10-17",
	"Id": "key-consolepolicy-3",
	"Statement": [
		{
			"Sid": "Allow_IAMIdentityCenter_Admin_to_use_the_KMS_key_via_IdentityCenter_and_IdentityStore",
			"Effect": "Allow",
			"Principal": {
				"AWS": "ARN_OF_YOUR_IDENTITY_CENTER_ADMIN_IAM_ROLE"
			},
			"Action": [
				"kms:Decrypt",
				"kms:Encrypt",
				"kms:GenerateDataKeyWithoutPlaintext"
			],
			"Resource": "*",
			"Condition": {
				"StringLike": {
					"kms:ViaService": [
						"sso.*.amazonaws.com",
						"identitystore.*.amazonaws.com"
					]
				}
			}
		},
		{
			"Sid": "Allow_IdentityCenter_admin_to_describe_the_KMS_key",
			"Effect": "Allow",
			"Principal": {
				"AWS": "ARN_OF_YOUR_IDENTITY_CENTER_ADMIN_IAM_ROLE"
			},
			"Action": "kms:DescribeKey",
			"Resource": "*"
		},
		{
			"Sid": "Allow_IdentityCenter_and_IdentityStore_to_use_the_KMS_key",
			"Effect": "Allow",
			"Principal": {
				"Service": [
					"sso.amazonaws.com",
					"identitystore.amazonaws.com"
				]
			},
			"Action": [
				"kms:Decrypt",
				"kms:ReEncryptTo",
				"kms:ReEncryptFrom",
				"kms:GenerateDataKeyWithoutPlaintext"
			],
			"Resource": "*",
            "Condition": {
    	       "StringEquals": { 
                      "aws:SourceAccount": "<Identity Center Account ID>" 
	           }
            }		
		},
		{
			"Sid": "Allow_IdentityCenter_and_IdentityStore_to_describe_the_KMS_key",
			"Effect": "Allow",
			"Principal": {
				"Service": [
					"sso.amazonaws.com",
					"identitystore.amazonaws.com"
				]
			},
			"Action": [
				"kms:DescribeKey"
			],
			"Resource": "*"
		}		
	]
}

I also have to add additional policy statements to allow my use case: the use of AWS managed applications. I add these two policy statements to authorize AWS managed applications and their administrators to use the KMS key. The document lists additional use cases and their respective policies.

{
    "Sid": "Allow_AWS_app_admins_in_the_same_AWS_organization_to_use_the_KMS_key",
    "Effect": "Allow",
    "Principal": "*",
    "Action": [
        "kms:Decrypt"
    ],
    "Resource": "*",
    "Condition": {
        "StringEquals" : {
           "aws:PrincipalOrgID": "MY_ORG_ID (format: o-xxxxxxxx)"
        },
        "StringLike": {
            "kms:ViaService": [
                "sso.*.amazonaws.com", "identitystore.*.amazonaws.com"
            ]
        }
    }
},
{
   "Sid": "Allow_managed_apps_to_use_the_KMS_Key",
   "Effect": "Allow",
   "Principal": "*",
   "Action": [
      "kms:Decrypt"
    ],
   "Resource": "*",
   "Condition": {
      "Bool": { "aws:PrincipalIsAWSService": "true" },
      "StringLike": {
         "kms:ViaService": [
             "sso.*.amazonaws.com", "identitystore.*.amazonaws.com"
         ]
      },
      "StringEquals": { "aws:SourceOrgID": "MY_ORG_ID (format: o-xxxxxxxx)" }
   }
}

You can further restrict the key usage to a specific Identity Center instance, specific application instances, or specific application administrators. The documentation contains examples of advanced key policies for your use cases.

To help protect against IAM role name changes when permission sets are recreated, use the approach described in the Custom trust policy example.

Part 2: Update IAM policies to allow use of the KMS key from another AWS account

Any IAM principal that uses the Identity Center service APIs from another AWS account, such as Identity Center delegated administrators and AWS application administrators, need an IAM policy statement that allows use of the KMS key via these APIs.

I grant permissions to access the key by creating a new policy and attaching the policy to the IAM role relevant for my use case. You can also add these statements to the existing identity-based policies of the IAM role.

To do so, after the key is created, I locate its ARN and replace the key_ARNin the template below. Then, I attach the policy to the managed application administrator IAM principal. The documentation also covers IAM policies that grants Identity Center delegated administrators permissions to access the key.

Here is an example for managed application administrators:

{
      "Sid": "Allow_app_admins_to_use_the_KMS_key_via_IdentityCenter_and_IdentityStore",
      "Effect": "Allow",
      "Action": 
        "kms:Decrypt",
      "Resource": "<key_ARN>",
      "Condition": {
        "StringLike": {
          "kms:ViaService": [
            "sso.*.amazonaws.com",
            "identitystore.*.amazonaws.com"
          ]
        }
      }
    }

The documentation shares IAM policies template for the most common use cases.

Part 3: Configure IAM Identity Center to use the key

I can configure a CMK either during the enablement of an Identity Center organization instance or on an existing instance, and I can change the encryption configuration at any time by switching between CMKs or reverting to AWS-owned keys.

Please note that an incorrect configuration of KMS key permissions can disrupt Identity Center operations and access to AWS managed applications and accounts through Identity Center. Proceed carefully to this final step and ensure you have read and understood the documentation.

After I have created and configured my CMK, I can select it under Advanced configuration when enabling Identity Center.

To configure a CMK on an existing Identity Center instance using the AWS Management Console, I start by navigating to the Identity Center section of the AWS Management Console. From there, I select Settings from the navigation pane, then I select the Management tab, and select Manage encryption in the Key for encrypting IAM Identity Center data at rest section.

At any time, I can select another CMK from the same AWS Account, or switch back to an AWS-managed key.

After choosing Save, the key change process takes a few seconds to complete. All service functionalities continue uninterrupted during the transition. If, for whatever reasons, Identity Center can not access the new key, an error message will be returned and Identity Center will continue to use the current key, keeping your identity data encrypted with the mechanism it is already encrypted with.

Things to keep in mind
The encryption key you create becomes a crucial component of your Identity Center. When you choose to use your own managed key to encrypt identity attributes at rest, you have to verify the following points.

Have you configured the necessary permissions to use the KMS key? Without proper permissions, enabling the CMK may fail or disrupt IAM Identity Center administration and AWS managed applications.
Have you verified that your AWS managed applications are compatible with CMK keys? For a list of compatible applications, see AWS managed applications that you can use with IAM Identity Center. Enabling CMK for Identity Center that is used by AWS managed applications incompatible with CMK will result in operational disruption for those applications. If you have incompatible applications, do not proceed.
Is your organization using AWS managed applications that require additional IAM role configuration to use the Identity Center and Identity Store APIs? For each such AWS managed application that’s already deployed, check the managed application’s User Guide for updated KMS key permissions for IAM Identity Centre usage and update them as instructed to prevent application disruption.

For brevity, the KMS key policy statements in this post omit the encryption context, which allows you to restrict the use of the KMS key to Identity Center including a specific instance. For your production scenarios, you can add a condition like this for Identity Center:

"Condition": {
   "StringLike": {
      "kms:EncryptionContext:aws:sso:instance-arn": "${identity_center_arn}",
      "kms:ViaService": "sso.*.amazonaws.com"
    }
}

or this for Identity Store:

"Condition": {
   "StringLike": {
      "kms:EncryptionContext:aws:identitystore:identitystore-arn": "${identity_store_arn}",
      "kms:ViaService": "identitystore.*.amazonaws.com"
    }
}

Pricing and availability
Standard AWS KMS charges apply for key storage and API usage. Identity Center remains available at no additional cost.

This capability is now available in all AWS commercial Regions, AWS GovCloud (US), and AWS China Regions. To learn more, visit the IAM Identity Center User Guide.

We look forward to learning how you use this new capability to meet your security and compliance requirements.

— seb

AWS Weekly Roundup: Amazon Bedrock, AWS Outposts, Amazon ECS Managed Instances, AWS Builder ID, and more (October 6, 2025)

2025-10-06 Prasad Rao

Post Syndicated from Prasad Rao original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-amazon-bedrock-aws-outposts-amazon-ecs-managed-instances-aws-builder-id-and-more-october-6-2025/

Last week, Anthropic’s Claude Sonnet 4.5—the world’s best coding model according to SWE-Bench – became available in Amazon Q command line interface (CLI) and Kiro. I’m excited about this for two reasons:

First, a few weeks ago I spent 4 intensive days with a global customer delivering an AI-assisted development workshop, where I experienced firsthand how Amazon Q CLI boosts developer productivity. During the workshop, the customer was able to add a new feature in their application within a day using Amazon Q CLI, which would have traditionally taken them at least a couple of weeks. With Anthropic’s Claude Sonnet 4.5 in Amazon Q CLI, I know developer productivity will be enhanced further.

Second, I’ve started preparing for my code talk at AWS re:Invent 2025, where my co-speaker and I will show live coding to modernize a legacy codebase using Kiro. I can’t wait to use Anthropic’s Claude Sonnet 4.5 in Kiro to create a live demo. If you want to see this demo and over a thousand other sessions on cloud and AI, join us at AWS re:Invent 2025 in Las Vegas from December 1–5.

Last week’s launches
Here are some launches that got my attention:

Availability of Claude Sonnet 4.5 in Amazon Bedrock – Anthropic’s most intelligent model, best for coding and complex agents, is now available in Amazon Bedrock. By using Claude Sonnet 4.5 in Amazon Bedrock, developers gain access to a fully managed service that not only provides a unified API for foundation models (FMs) but keeps their data under complete control with enterprise-grade tools for security, and optimization.
AWS Outposts supports third-party storage integration with Dell and HPE – AWS Outposts third-party storage integration now includes Dell PowerStore and HPE Alletra Storage MP B10000 systems, joining the list of existing integrations with NetApp on-premises enterprise storage arrays and Pure Storage FlashArray. This integration serves three key purposes. First, it helps you maintain your existing storage infrastructure while migrating VMware workloads to AWS. Second, it helps you meet strict data residency requirements by keeping your data on premises while using AWS services. Third, it means you can use AWS Outposts with third-party storage arrays through AWS tooling.
Amazon ECS Managed Instances now available – Amazon ECS Managed Instances for containerized applications is a new fully managed compute option for Amazon ECS designed to eliminate infrastructure management overhead while giving you access to the full capabilities of Amazon EC2. ECS Managed Instances helps you quickly launch and scale your workloads while enhancing performance and reducing your total cost of ownership.
Application map is now generally available for Amazon CloudWatch – Amazon CloudWatch now helps you monitor large-scale distributed applications by automatically discovering and organizing services into groups based on configurations and their relationships. With this new application performance monitoring (APM) capability, you can quickly visualize which applications and dependencies to focus on while troubleshooting your distributed applications.
Amazon Bedrock AgentCore Model Context Protocol (MCP) server now available – With built-in support for runtime, gateway integration, identity management, and agent memory, the AgentCore MCP server is purpose-built to speed up creation of components compatible with Bedrock AgentCore. You can use the AgentCore MCP server for rapid prototyping, production AI solutions, or to scale your agent infrastructure.

Additional Updates
Here are some additional news items and blog posts that I found interesting:

AWS Builder ID now supports Sign in with Google – You can now create an AWS Builder ID using sign in with Google. AWS Builder ID is a personal profile that provides access to AWS applications including Kiro, AWS Builder Center, AWS Training and Certification, AWS re:Post and AWS Startups.
AWS API MCP Server v1.0.0 release – AWS API MCP server acts as a bridge between AI assistants and AWS services enabling foundation models to interact with any AWS API through natural language by creating and executing syntactically correct CLI commands. The AWS API MCP Server is open-source and available now on AWS Labs GitHub repository.
AWS Knowledge MCP Server now generally available – The AWS Knowledge server gives AI agents and MCP clients access to authoritative knowledge, including documentation, blog posts, What’s New announcements, and Well-Architected best practices, in an LLM-compatible format. With this release, the server also includes knowledge about the regional availability of AWS APIs and CloudFormation resources.
AWS Transform now enables Terraform for VMware network automation – AWS Transform now offers Terraform as an additional option to generate network infrastructure code automatically from VMware environments. The service converts your source network definitions into reusable Terraform modules, complementing current AWS CloudFormation and AWS Cloud Development Kit (CDK) support.

Upcoming AWS events
Check your calendar and sign up for upcoming AWS events:

AWS AI Agent Global Hackathon – This is your chance to dive deep into our powerful generative AI stack and create something truly awesome. From September 8th to October 20th, you have the opportunity to create AI agents using AWS suite of AI services, competing for over $45,000 in prizes and exclusive go-to-market opportunities.
AWS Gen AI Lofts – You can learn AWS AI products and services with exclusive sessions, meet industry-leading experts, and have valuable networking opportunities with investors and peers. Register in your nearest city: Paris (October 7–21), London (Oct 13–21), and Tel Aviv (November 11–19).
AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Munich (October 7), Budapest (October 16).

You can browse all upcoming AWS events and AWS startup events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— Prasad

Announcing Amazon ECS Managed Instances for containerized applications

2025-09-30 Micah Walter

Post Syndicated from Micah Walter original https://aws.amazon.com/blogs/aws/announcing-amazon-ecs-managed-instances-for-containerized-applications/

Today, we’re announcing Amazon ECS Managed Instances, a new compute option for Amazon Elastic Container Service (Amazon ECS) that enables developers to use the full range of Amazon Elastic Compute Cloud (Amazon EC2) capabilities while offloading infrastructure management responsibilities to Amazon Web Service (AWS). This new offering combines the operational simplicity of offloading infrastructure with the flexibility and control of Amazon EC2, which means customers can focus on building applications that drive innovation, while reducing total cost of ownership (TCO) and maintaining AWS best practices.

Customers running containerized workloads told us they want to combine the simplicity of serverless with the flexibility of self-managed EC2 instances. Although serverless options provide an excellent general-purpose solution, some applications require specific compute capabilities, such as GPU acceleration, particular CPU architectures, or enhanced networking performance. Additionally, customers with existing Amazon EC2 capacity investments through EC2 pricing options couldn’t fully use these commitments with serverless offerings.

Amazon ECS Managed Instances provides a fully managed container compute environment that supports a broad range of EC2 instance types and deep integration with AWS services. By default, it automatically selects the most cost-optimized EC2 instances for your workloads, but you can specify particular instance attributes or types when needed. AWS handles all aspects of infrastructure management, including provisioning, scaling, security patching, and cost optimization, enabling you to concentrate on building and running your applications.

Let’s try it out

Looking at the AWS Management Console experience for creating a new Amazon ECS cluster, I can see the new option for using ECS Managed Instances. Let’s take a quick tour of all the new options.

Creating a ECS cluster with Managed Instances

After I’ve selected Fargate and Managed Instances, I’m presented with two options. If I select Use ECS default, Amazon ECS will choose general purpose instance types based on grouping together pending Tasks, and picking the optimum instance type based on cost and resilience metrics. This is the most straightforward and recommended way to get started. Selecting Use custom – advanced opens up additional configuration parameters, where I can fine-tune the attributes of instances Amazon ECS will use.

Creating a ECS cluster with Managed Instances

By default, I see CPU and Memory as attributes, but I can select from 20 additional attributes to continue to filter the list of available instance types Amazon ECS can access.

Creating a ECS cluster with Managed Instances

After I’ve made my attribute selections, I see a list of all the instance types that match my choices.

Creating a ECS cluster with Managed Instances

From here, I can create my ECS cluster as usual and Amazon ECS will provision instances for me on my behalf based on the attributes and criteria I’ve defined in the previous steps.

Key features of Amazon ECS Managed Instances

With Amazon ECS Managed Instances, AWS takes full responsibility for infrastructure management, handling all aspects of instance provisioning, scaling, and maintenance. This includes implementing regular security patches initiated every 14 days (due to instance connection draining, the actual lifetime of the instance may be longer), with the ability to schedule maintenance windows using Amazon EC2 event windows to minimize disruption to your applications.

The service provides exceptional flexibility in instance type selection. Although it automatically selects cost-optimized instance types by default, you maintain the power to specify desired instance attributes when your workloads require specific capabilities. This includes options for GPU acceleration, CPU architecture, and network performance requirements, giving you precise control over your compute environment.

To help optimize costs, Amazon ECS Managed Instances intelligently manages resource utilization by automatically placing multiple tasks on larger instances when appropriate. The service continually monitors and optimizes task placement, consolidating workloads onto fewer instances to dry up, utilize and terminate idle (empty) instances, providing both high availability and cost efficiency for your containerized applications.

Integration with existing AWS services is seamless, particularly with Amazon EC2 features such as EC2 pricing options. This deep integration means that you can maximize existing capacity investments while maintaining the operational simplicity of a fully managed service.

Security remains a top priority with Amazon ECS Managed Instances. The service runs on Bottlerocket, a purpose-built container operating system, and maintains your security posture through automated security patches and updates. You can see all the updates and patches applied to the Bottlerocket OS image on the Bottlerocket website. This comprehensive approach to security keeps your containerized applications running in a secure, maintained environment.

Available now

Amazon ECS Managed Instances is available today in US East (North Virginia), US West (Oregon), Europe (Dublin), Africa (Cape Town), Asia Pacific (Singapore), and Asia Pacific (Tokyo) AWS Regions. You can start using Managed Instances through the AWS Management Console, AWS Command Line Interface (AWS CLI), or infrastructure as code (IaC) tools such as AWS Cloud Development Kit (AWS CDK) and AWS CloudFormation. You pay for the EC2 instances you use plus a management fee for the service.

To learn more about Amazon ECS Managed Instances, visit the documentation and get started simplifying your container infrastructure today.

Announcing AWS Outposts third-party storage integration with Dell and HPE

2025-09-30 Micah Walter

Post Syndicated from Micah Walter original https://aws.amazon.com/blogs/aws/announcing-aws-outposts-third-party-storage-integration-with-dell-and-hpe/

Since announcing second-generation AWS Outposts racks in April with breakthrough performance and scalability, we’ve continued to innovate on behalf of our customers at the edge of the cloud. Today, we’re expanding AWS Outposts third-party storage integration program to include Dell PowerStore and HPE Alletra Storage MP B10000 systems, joining our list of existing integrations with NetApp on-premises enterprise storage arrays and Pure Storage FlashArray. This program makes it easy for customers to use AWS Outposts with third-party storage arrays through AWS native tooling. The solution integration is particularly important for organizations migrating VMware workloads to AWS who need to maintain their existing storage infrastructure during the transition, and for those who must meet strict data residency requirements by keeping their data on-premises while using AWS services.

Outposts compute rack_Gen2_front_45 This announcement builds upon two significant storage integration milestones we achieved in the past year. In December 2024, we introduced the ability to attach block data volumes from third-party storage arrays to Amazon EC2 instances on Outposts directly through the AWS Management Console. Then in July 2025, we enabled booting Amazon EC2 instances directly from these external storage arrays. Now, with the addition of Dell and HPE, customers have even more choice in how they integrate their on-premises storage investments with AWS Outposts.

Enhanced storage integration capabilities

Our third-party storage integration supports both data and boot volumes, offering two boot methods: iSCSI SANboot and Localboot. The iSCSI SANboot option enables both read-only and read-write boot volumes, while Localboot supports read-only boot volumes using either iSCSI or NVMe-over-TCP protocols. With this comprehensive approach, customers can centrally manage their storage resources while maintaining the consistent hybrid experience that Outposts provides.

Through the Amazon EC2 Launch Instance Wizard in the AWS Management Console, customers can configure their instances to use external storage from any of our supported partners. For boot volumes, we provide AWS-verified AMIs for Windows Server 2022 and Red Hat Enterprise Linux 9, with automation scripts available through AWS Samples to simplify the setup process.

Support for various Outposts configurations

All third-party storage integration features are supported on Outposts 2U servers and both generations of Outposts racks. Support for second-generation Outposts racks means customers can combine the enhanced performance of our latest EC2 instances on Outposts—including twice the vCPU, memory, and network bandwidth—with their preferred storage solutions. The integration works seamlessly with both our new simplified network scaling capabilities and specialized Amazon EC2 instances designed for ultra-low latency and high throughput workloads.

Things to know

Customers can begin using these capabilities today with their existing Outposts deployments or when ordering new Outposts through the AWS Management Console. If you are using third-party storage integration with Outposts servers, you can have either your onsite personnel or a third-party IT provider install the servers for you. After the Outposts servers are connected to your network, AWS will remotely provision compute and storage resources so you can start launching applications. For Outposts rack deployments, the process involves a setup where AWS technicians verify site conditions and network connectivity before the rack installation and activation. Storage partners assist with the implementation of the third-party storage components.

Third-party storage integration for Outposts with all compatible storage vendors is available at no additional charge in all AWS Regions where Outposts is supported. See the FAQs for Outposts servers and Outposts racks for the latest list of supported Regions.

This expansion of our Outposts third-party storage integration program demonstrates our continued commitment to providing flexible, enterprise-grade hybrid cloud solutions, meeting customers where they are in their cloud migration journey. To learn more about this capability and our supported storage vendors, visit the AWS Outposts partner page and our technical documentation for Outposts servers, second-generation Outposts racks, and first-generation Outposts racks. To learn more about partner solutions, check out Dell PowerStore integration with AWS Outposts and HPE Alletra Storage MP B10000 integration with AWS Outposts.

Introducing Claude Sonnet 4.5 in Amazon Bedrock: Anthropic’s most intelligent model, best for coding and complex agents

2025-09-29 Matheus Guimaraes

Post Syndicated from Matheus Guimaraes original https://aws.amazon.com/blogs/aws/introducing-claude-sonnet-4-5-in-amazon-bedrock-anthropics-most-intelligent-model-best-for-coding-and-complex-agents/

Today, we’re excited to announce that Claude Sonnet 4.5, powered by Anthropic, is now available in Amazon Bedrock, a fully managed service that offers a choice of high- performing foundation models from leading AI companies. This new model builds upon Claude 4’s foundation to achieve state-of-the-art performance in coding and complex agentic applications.

Claude Sonnet 4.5 demonstrates advancements in agent capabilities, with enhanced performance in tool handling, memory management, and context processing. The model shows marked improvements in code generation and analysis, from identifying optimal improvements to exercising stronger judgment in refactoring decisions. It particularly excels at autonomous long-horizon coding tasks, where it can effectively plan and execute complex software projects spanning hours or days while maintaining consistent performance and reliability throughout the development cycle.

Source: https://www.anthropic.com/news/claude-sonnet-4-5

By using Claude Sonnet 4.5 in Amazon Bedrock, developers gain access to a fully managed service that not only provides a unified API for foundation models but ensures their data stays under complete control with enterprise-grade tools for security, and optimization.

Claude Sonnet 4.5 also seamlessly integrates with Amazon Bedrock AgentCore, enabling developers to maximize the model’s capabilities for building complex agents. AgentCore’s purpose-built infrastructure complements the model’s enhanced abilities in tool handling, memory management, and context understanding. Developers can leverage complete session isolation, 8-hour long-running support, and comprehensive observability features to deploy and monitor production-ready agents from autonomous security operations to complex enterprise workflows.

Business applications and use cases
Beyond its technical capabilities, Sonnet 4.5 delivers practical business value through consistent performance and advanced problem-solving abilities. The model excels at producing and editing business documents while maintaining reliable performance across complex workflows.

The model demonstrates strength in several key industries:

Cybersecurity – Claude Sonnet 4.5 can be used to deploy agents that autonomously patch vulnerabilities before exploitation, shifting from reactive detection to proactive defense.
Finance – Sonnet 4.5 handles everything from entry-level financial analysis to advanced predictive analysis, helping transform manual audit preparation into intelligent risk management.
Research – Sonnet 4.5 can better handle tools, context, and deliver ready-to-go office files to drive expert analysis into final deliverables and actionable insights.

Sonnet 4.5 features in the Amazon Bedrock API
Here are some highlights of Sonnet 4.5 in the Amazon Bedrock API:

Smart Context Window Management – The new API introduces intelligent handling when AI models reach their maximum capacity. Instead of returning errors when conversations get too long, Claude Sonnet 4.5 will now generate responses up to the available limit and clearly indicate why it stopped. This eliminates frustrating interruptions and allows users to maximize their available context window.

Tool Use Clearing for Efficiency – Claude Sonnet 4.5 enables automatic cleanup of tool interaction history during long conversations. When conversations involve multiple tool calls, the system can automatically remove older tool results while preserving recent ones. This keeps conversations efficient and prevents unnecessary token consumption, reducing costs while maintaining conversation quality.

Cross-Conversation Memory – A new memory capability enables Sonnet 4.5 to remember information across different conversations through the use of a local memory file. Users can explicitly ask the model to remember preferences, context, or important information that persists beyond a single chat session. This creates more personalized and contextually aware interactions while keeping the information safe within the local file.

With these new capabilities for managing context, developers can build AI agents capable of handling long-running tasks at higher intelligence without hitting context limits or losing critical information as frequently.

Getting started
To begin working with Claude Sonnet 4.5, you can access it through Amazon Bedrock using the correct model ID. A good practice is to use the Amazon Bedrock Converse API to write code once and seamlessly switch between different models, making it easier to experiment with Sonnet 4.5 or any of the other models available in Amazon Bedrock.

Let’s see this in action with a simple example. I’m going to use the Amazon Bedrock Converse API to send a prompt to Sonnet 4.5. I start by importing the modules I’m going to use. For this short example, I only need AWS SDK for Python (Boto3) so I can create a BedrockRuntimeClient. I’m also importing the rich package so I can format my output nicely later on.

Following best practices, I create a boto3 session and create an Amazon Bedrock client from it instead of creating one directly. This gives you explicit control over configuration, improves thread safety, and makes your code more predictable and testable compared to relying on the default session.

I want to give the model something with a bit of complexity instead of asking a simple question to demonstrate the power of Sonnet 4.5. So I’m going to give the model the current state of an imaginary legacy monolithic application written in Java with a single database and ask for a digital transformation plan which includes a migration strategy, risk assessment, estimated timeline and key milestones and specific AWS services recommendations.

Because the prompt is quite long I put it in a text file locally and just load it up in code. I then set up the Amazon Bedrock converse payload setting the role to “user” to indicate that this is a message by the user of the application and add the prompt to the content.

This is where the magic happens! We put it all together and call Claude Sonnet 4.5 using its model ID. Well, kind of. You can only access Sonnet 4.5 through an inference profile. This defines which AWS Regions will process your model requests and helps manage throughput and performance.

For this demo, I’ll be using one of Amazon Bedrock’s system-defined cross-Region inference profiles, which automatically routes requests across multiple Regions for optimal performance.

Now I just need to print to the screen to see the results. This is where I use the rich package I imported earlier just so we may have a nicely formatted output as I’m expecting a long response for this one. I also save the output to a file so I can have it handy as something to share with my teams.

Ok, let’s check the results! As expected, Sonnet 4.5 worked through my requirements and provided extensive and deep guidance for my digital transformation plan that I could start putting into practice. It included an executive summary, a step-by-step migration strategy split into phases with time estimates, and even some code samples to seed the development process and start breaking things down into microservices. It also provided the business cases for introducing technology and recommended the correct AWS services for each scenario. Here are some highlights from the report.

Claude Sonnet 4.5 is able to maintain consistency while delivering creative solutions making it an ideal choice for businesses seeking to use AI for complex problem-solving and development tasks. Its enhanced capabilities in following directions and using tools effectively translate into more reliable and innovative solutions across various business contexts.

Things to know
Claude Sonnet 4.5 represents a significant step forward in agent capabilities, particularly excelling in areas where consistent performance and creative problem-solving are essential. Its enhanced abilities in tool handling, memory management, and context processing make it particularly valuable across key industries such as finance, research, and cybersecurity. Whether handling complex development lifecycles, executing long-running tasks, or tackling business-critical workflows, Claude Sonnet 4.5 combines technical excellence with practical business value.

Claude Sonnet 4.5 is available today. For detailed information about its availability please visit the documentation.

To learn more about Amazon Bedrock explore our self-paced Amazon Bedrock Workshop and discover how to use available models and their capabilities in your applications.

Noise

Tag Archives: launch

AWS Lambda enhances event processing with provisioned mode for SQS event-source mapping

Introducing AWS IoT Core Device Location integration with Amazon Sidewalk

Introducing Our Final AWS Heroes of 2025

Dimple Vaghela – Ahmedabad, India

Rola Dali – Montreal, Canada

Vivek Velso – Toronto, Canada

Learn More

Secure EKS clusters with the new support for Amazon EKS in AWS Backup

Introducing AWS Capabilities by Region for easier Regional planning and faster global deployments

AWS Weekly Roundup: Project Rainier online, Amazon Nova, Amazon Bedrock, and more (November 3, 2025)

Build more accurate AI applications with Amazon Nova Web Grounding

Amazon Nova Multimodal Embeddings: State-of-the-art embedding model for agentic RAG and semantic search

Customer Carbon Footprint Tool Expands: Additional emissions categories including Scope 3 are now available

AWS Weekly Roundup: Kiro waitlist, EBS Volume Clones, EC2 Capacity Manager, and more (October 20, 2025)

Introducing Amazon EBS Volume Clones: Create instant copies of your EBS volumes

AWS Weekly Roundup: Amazon Quick Suite, Amazon EC2, Amazon EKS, and more (October 13, 2025)

Announcing Amazon Quick Suite: your agentic teammate for answering questions and taking action

New general-purpose Amazon EC2 M8a instances are now available

Introducing new compute-optimized Amazon EC2 C8i and C8i-flex instances

AWS IAM Identity Center now supports customer-managed KMS keys for encryption at rest

AWS Weekly Roundup: Amazon Bedrock, AWS Outposts, Amazon ECS Managed Instances, AWS Builder ID, and more (October 6, 2025)

Announcing Amazon ECS Managed Instances for containerized applications

Announcing AWS Outposts third-party storage integration with Dell and HPE

Introducing Claude Sonnet 4.5 in Amazon Bedrock: Anthropic’s most intelligent model, best for coding and complex agents

The collective thoughts of the interwebz