Tag Archives: launch

AWS Weekly Roundup: Amazon Aurora DSQL, MCP Servers, Amazon FSx, AI on EKS, and more (June 2, 2025)

2025-06-02 Prasad Rao

Post Syndicated from Prasad Rao original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-amazon-aurora-dsql-mcp-servers-amazon-fsx-ai-on-eks-and-more-june-2-2025/

It’s AWS Summit Season! AWS Summits are free in-person events that take place across the globe in major cities, bringing cloud expertise to local communities. Each AWS Summit features keynote presentations highlighting the latest innovations, technical sessions, live demos, and interactive workshops led by Amazon Web Services (AWS) experts. Last week, events took place at AWS Summit Tel Aviv and AWS Summit Singapore.

The following photo shows the packed keynote at AWS Summit Tel Aviv.

AWS Summit Tel Aviv Keynote

Find an AWS Summit near you and join thousands of AWS customers and cloud professionals taking the next step in their cloud journey.

Last week, the announcement that piqued my interest most was the general availability of Amazon Aurora DSQL, which was introduced in preview at re:Invent 2024. Aurora DSQL is the fastest serverless distributed SQL database that enables you to build always available applications with virtually unlimited scalability, the highest availability, and zero infrastructure management.

Aurora DSQL active-active distributed architecture is designed for 99.99% single-Region and 99.999% multi-Region availability with no single point of failure and automated failure recovery. This means your applications can continue to read and write with strong consistency, even in the rare case an application is unable to connect to a Region cluster endpoint.

What’s more fascinating is the journey behind building Aurora DSQL, a story that goes beyond the technology in the pursuit of engineering efficiency. Read the full story in Dr. Werner Vogels’ blog post, Just make it scale: An Aurora DSQL story.

Last week’s launches
Here are the other launches that got my attention:

Announcing new Model Context Protocol (MCP) servers for AWS Serverless and Containers – MCP servers are now available for AWS Lambda, Amazon Elastic Container Service (Amazon ECS), Amazon Elastic Kubernetes Service (Amazon EKS), and Finch. With MCP servers, you can get from idea to production faster by giving your AI assistants access to an up-to-date framework on how to correctly interact with your AWS service of choice. To download and try out the open source MCP servers, visit the aws-labs GitHub repository.
Announcing the general availability of Amazon FSx for Lustre Intelligent-Tiering – FSx for Lustre Intelligent-Tiering, a new storage class, automatically optimizes costs by tiering cold data to the applicable lower-cost storage tier based on access patterns and includes an optional SSD read cache to improve performance for your most latency-sensitive workloads.
Amazon FSx for NetApp ONTAP now supports write-back mode for ONTAP FlexCache volumes – Write-back mode is a new ONTAP capability that helps you achieve faster performance for your write-intensive workloads that are distributed across multiple AWS Regions and on-premises file systems.
AWS Network Firewall Adds Support for Multiple VPC Endpoints – AWS Network Firewall now supports configuring up to 50 Amazon Virtual Private Cloud (Amazon VPC) endpoints per Availability Zone for a single firewall. This new capability gives you more options to scale your Network Firewall deployment across multiple VPCs, using a centralized security policy.
Cost Optimization Hub now supports Savings Plans and reservations preferences – You can now use Cost Optimization Hub, a feature within the Billing and Cost Management Console, to configure preferred Savings Plans and reservation term and payment options preferences, so you can see your resulting recommendations and savings potential based on your preferred commitments.
AWS Neuron introduces NxD Inference GA, new features, and improved tools – With the release of Neuron 2.23, the NxD Inference library (NxDI) moves from beta to general availability and is now recommended for all multi-chip inference use cases. Neuron 2.23 also introduces new training capabilities, including context parallelism and Odds Ratio Preference Optimization (ORPO), and adds support for PyTorch 2.6 and JAX 0.5.3.
AWS Pricing Calculator, now generally available, supports discounts and purchase commitment – We announced the general availability of the AWS Pricing Calculator in the AWS console. You can now create more accurate and comprehensive cost estimates by providing two types of cost estimates: cost estimation for a workload, and estimation of a full AWS bill. You can also import your historical usage or create net new usage when creating a cost estimate. Additionally, with the new rate configuration inclusive of both pricing discounts and purchase commitments, you can gain a clearer picture of potential savings and cost optimizations for your cost scenarios.
AWS CDK Toolkit Library is now generally available – AWS CDK Toolkit Library provides programmatic access to core AWS CDK functionalities such as synthesis, deployment, and destruction of stacks. You can use this library to integrate CDK operations directly into your applications, custom CLIs, and automation workflows, offering greater flexibility and control over infrastructure management.
Announcing Red Hat Enterprise Linux for AWS – Red Hat Enterprise Linux (RHEL) for AWS, starting with RHEL 10, is now generally available, combining Red Hat’s enterprise-grade Linux software with native AWS integration. RHEL for AWS is built to achieve optimum performance of RHEL running on AWS.

For a full list of AWS announcements, be sure to keep an eye on the What’s New with AWS? page.

Additional updates
Here are some additional projects, blog posts, and news items that you might find interesting:

Introducing AI on EKS: powering scalable AI workloads with Amazon EKS – AI on EKS is a new open source initiative from AWS designed to help you deploy, scale, and optimize AI/ML workloads on Amazon EKS. AI on EKS repository includes deployment-ready blueprints for distributed training, LLM inference, generative AI pipelines, multi-model serving, agentic AI, GPU and Neuron-specific benchmarks, and MLOps best practices.
Revolutionizing earth observation with geospatial foundation models on AWS – Emerging transformer-based vision models for geospatial data—also called geospatial foundation models (GeoFMs)—offer a new and powerful technology for mapping the earth’s surface at a continental scale. This post explores how Clay Foundation’s Clay foundation model can be deployed for large-scale inference and fine-tuning on Amazon SageMaker. You can use the ready-to-deploy code samples to get started quickly with deploying GeoFMs in your own applications on AWS.

Going beyond AI assistants: Examples from Amazon.com reinventing industries with generative AI – Non-conversational applications offer unique advantages, such as higher latency tolerance, batch processing, and caching, but their autonomous nature requires stronger guardrails and exhaustive quality assurance compared to conversational applications, which benefit from real-time user feedback and supervision. This post examines four diverse Amazon.com examples of non-conversational generative AI applications.

Upcoming AWS events
Check your calendars and sign up for these upcoming AWS events:

AWS Summits – Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Stockholm (June 4), Sydney (June 4–5), Hamburg (June 5), Washington (June 10–11), Madrid (June 11), Milan (June 18), Shanghai (June 19–20), and Mumbai (June 19).
AWS re:Inforce – Mark your calendars for AWS re:Inforce (June 16–18) in Philadelphia, PA. AWS re:Inforce is a learning conference focused on AWS security solutions, cloud security, compliance, and identity.
AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Milwaukee, USA (June 5), Mexico (June 14), Nairobi, Kenya (June 14), and Colombia (June 28).

That’s all for this week. Check back next Monday for another Weekly Roundup!

– Prasad

Amazon FSx for Lustre launches new storage class with the lowest-cost and only fully elastic Lustre ﬁle storage

2025-05-29 Veliswa Boya

Post Syndicated from Veliswa Boya original https://aws.amazon.com/blogs/aws/amazon-fsx-for-lustre-adds-new-storage-class-with-the-lowest-cost-and-only-fully-elastic-lustre-file-storage/

Seismic imaging is a geophysical technique used to create detailed pictures of the Earth’s subsurface structure. It works by generating seismic waves that travel into the ground, reﬂect oﬀ various rock layers and structures, and return to the surface where they’re detected by sensitive instruments known as geophones or hydrophones. The huge volumes of acquired data often reach petabytes for a single survey and this presents signiﬁcant storage, processing, and management challenges for researchers and energy companies.

Customers who run these seismic imaging workloads or other high performance computing (HPC) workloads, such as weather forecasting, advanced driver-assistance system (ADAS) training, or genomics analysis, already store the huge volumes of data on either hard disk drive (HDD)-based or a combination of HDD and solid state drive (SSD) file storage on premises. However, as these on premises datasets and workloads scale, customers find it increasingly challenging and expensive due to the need to make upfront capital investments to keep up with performance needs of their workloads and avoid running out of storage capacity.

Today, we’re announcing the general availability of the Amazon FSx for Lustre Intelligent-Tiering, a new storage class that delivers virtually unlimited scalability, the only fully elastic Lustre ﬁle storage, and the lowest cost Lustre file storage in the cloud. With a starting price of less than $0.005 per GB-month, FSx for Lustre Intelligent-Tiering oﬀers the lowest cost high-performance file storage in the cloud, reducing storage costs for infrequently accessed data by up to 96 percent compared to other managed Lustre options. Elasticity means you no longer need to provision storage capacity upfront because your file system will grow and shrink as you add or delete data, and you pay only for the amount of data you store.

FSx for Lustre Intelligent-Tiering automatically optimizes costs by tiering cold data to the applicable lower-cost storage tier based on access patterns and includes an optional SSD read cache to improve performance for your most latency sensitive workloads. Intelligent-Tiering delivers high performance whether you’re starting with gigabytes of experimental data or working with large petabyte-scale datasets for your most demanding artificial intelligence/machine learning (AI/ML) and HPC workloads. With the flexibility to adjust your file system’s performance independent of storage, Intelligent-Tiering delivers up to 34 percent better price performance than on premises HDD file systems. The Intelligent-Tiering storage class is optimized for HDD-based or mixed HDD/SSD workloads that have a combination of hot and cold data. You can migrate and run such workloads to FSx for Lustre Intelligent-Tiering without application changes, eliminating storage capacity planning and management, while paying only for the resources that you use.

Prior to this launch, customers used the FSx for Lustre SSD storage class to accelerate ML and HPC workloads that need all-SSD performance and consistent low-latency access to all data. However, many workloads have a combination of hot and cold data and they don’t need all-SSD storage for colder portions of the data. FSx for Lustre is increasingly used in AI/ML workloads to increase graphics processing unit (GPU) utilization, and now it’s even more cost optimized to be one of the options for these workloads.

FSx for Lustre Intelligent-Tiering
Your data moves between three storage tiers (Frequent Access, Infrequent Access, and Archive) with no effort on your part, so you get automatic cost savings with no upfront costs or commitments. The tiering works as follows:

Frequent Access – Data that has been accessed within the last 30 days is stored in this tier.

Infrequent Access – Data that hasn’t been accessed for 30 – 90 days is stored in this tier, at a 44 percent cost reduction from Frequent Access.

Archive – Data that hasn’t been accessed for 90 or more days is stored in this tier, at a 65 percent cost reduction compared to Infrequent Access.

Regardless of the storage tier, your data is stored across multiple AWS Availability Zones for redundancy and availability, compared to typical on-premises implementations, which are usually confined within a single physical location. Additionally, your data can be retrieved instantly in milliseconds.

Creating a file system
I can create a file system using the AWS Management Console, AWS Command Line Interface (AWS CLI), API, or AWS CloudFormation. On the console, I choose Create file system to get started.

I select Amazon FSx for Lustre and choose Next.

Now, it’s time to enter the rest of the information to create the file system. I enter a name (veliswa_fsxINT_1) for my file system, and for deployment and storage class, I select Persistent, Intelligent-Tiering. I choose the desired Throughput capacity and the Metadata IOPS. The SSD read cache will be automatically configured by FSx for Lustre based on the specified throughput capacity. I leave the rest as the default, choose Next, and review my choices to create my file system.

With Amazon FSx for Lustre Intelligent-Tiering, you have the flexibility to provision the necessary performance for your workloads without having to provision any underlying storage capacity upfront.

I wanted to know which values were editable after creation, so I paid closer attention before finalizing the creation of the file system. I noted that Throughput capacity, Metadata IOPS, Security groups, SSD read cache, and a few others were editable later. After I start running the ML jobs, it might be necessary to increase the throughput capacity based on the volumes of data I’ll be processing, so this information is important to me.

The file system is now available. Considering that I’ll be running HPC workloads, I anticipate that I’ll be processing high volumes of data later, so I’ll increase the throughput capacity to 24 GB/s. After all, I only pay for the resources I use.

The SSD read cache is scaled automatically as your performance needs increase. You can adjust the cache size any time independently in user-provisioned mode or disable the read cache if you don’t need low-latency access.

Good to know

FSx for Lustre Intelligent-Tiering is designed to deliver up to multiple terabytes per second of total throughput.
FSx for Lustre with Elastic Fabric Adapter (EFA)/GPU Direct Storage (GDS) support provides up to 12x (up to 1200 Gbps) higher per-client throughput compared to the previous FSx for Lustre systems.
It can deliver up to tens of millions of IOPS for writes and cached reads. Data in the SSD read cache has submillisecond time-to-first-byte latencies, and all other data has time-to-first-byte latencies in the range of tens of milliseconds.

Now available
Here are a couple of things to keep in mind:

FSx Intelligent-Tiering storage class is available in the new FSx for Lustre file systems in the US East (N. Virginia, Ohio), US West (N. California, Oregon), Canada (Central), Europe (Frankfurt, Ireland, London, Stockholm), and Asia Pacific (Hong Kong, Mumbai, Seoul, Singapore, Sydney, Tokyo) AWS Regions.

You pay for data and metadata you store on your file system (GB/months). When you write data or when you read data that is not in the SSD read cache, you pay per operation. You pay for the total throughput capacity (in MBps/month), metadata IOPS (IOPS/month), and SSD read cache size for data and metadata (GB/month) you provision on your file system. To learn more, visit the Amazon FSx for Lustre Pricing page. To learn more about Amazon FSx for Lustre including this feature, visit the Amazon FSx for Lustre page.

Give Amazon FSx for Lustre Intelligent-Tiering a try in the Amazon FSx console today and send feedback to AWS re:Post for Amazon FSx for Lustre or through your usual AWS Support contacts.

– Veliswa.

How is the News Blog doing? Take this 1 minute survey!

(This survey is hosted by an external company. AWS handles your information as described in the AWS Privacy Notice. AWS will own the data gathered via this survey and will not share the information collected with survey respondents.)

Amazon Aurora DSQL is now generally available

2025-05-27 Channy Yun (윤석찬)

Post Syndicated from Channy Yun (윤석찬) original https://aws.amazon.com/blogs/aws/amazon-aurora-dsql-is-now-generally-available/

Today, we’re announcing the general availability of Amazon Aurora DSQL, the fastest serverless distributed SQL database with virtually unlimited scale, the highest availability, and zero infrastructure management for always available applications. You can remove the operational burden of patching, upgrades, and maintenance downtime and count on an easy-to-use developer experience to create a new database in a few quick steps.

When we introduced the preview of Aurora DSQL at AWS re:Invent 2024, our customers were excited by this innovative solution to simplify complex relational database challenges. In his keynote, Dr. Werner Vogels, CTO of Amazon.com, talked about managing complexity upfront in the design of Aurora DSQL. Unlike most traditional databases, Aurora DSQL is disaggregated into multiple independent components such as a query processor, adjudicator, journal, and crossbar.

These components have high cohesion, communicate through well-specified APIs, and scale independently based on your workloads. This architecture enables multi-Region strong consistency with low latency and globally synchronized time. To learn more about how Aurora DSQL works behind the scenes, watch Dr. Werner Vogels’ keynote and read about an Aurora DSQL story.

The architecture of Amazon Aurora DSQL
Your application can use the fastest distributed SQL reads and writes and scale to meet any workload demand without any database sharding or instance upgrades. With Aurora DSQL, its active-active distributed architecture is designed for 99.99 percent availability in a single Region and 99.999 percent availability across multiple Regions. This means your applications can continue to read and write with strong consistency, even in the rare case an application is unable to connect to a Region cluster endpoint.

In a single-Region configuration, Aurora DSQL commits all write transactions to a distributed transaction log and synchronously replicates all committed log data to user storage replicas in three Availability Zones. Cluster storage replicas are distributed across a storage fleet and automatically scale to ensure optimal read performance.

Multi-Region clusters provide the same resilience and connectivity as single-Region clusters while improving availability through two Regional endpoints, one for each peered cluster Region. Both endpoints of a peered cluster present a single logical database and support concurrent read and write operations with strong data consistency. A third Region acts as a log-only witness which means there is is no cluster resource or endpoint. This means you can balance applications and connections for geographic locations, performance, or resiliency purposes, making sure readers consistently see the same data.

Aurora DSQL is an ideal choice to support applications using microservices and event-driven architectures, and you can design highly scalable solutions for industries such as banking, ecommerce, travel, and retail. It’s also ideal for multi-tenant software as a service (SaaS) applications and data-driven services like payment processing, gaming platforms, and social media applications that require multi-Region scalability and resilience.

Getting started with Amazon Aurora DSQL
Aurora DSQL provides a easy-to-use experience, starting with a simple console experience. You can use familiar SQL clients to leverage existing skillsets, and integration with other AWS services to improve managing databases.

To create an Aurora DSQL cluster, go to the Aurora DSQL console and choose Create cluster. You can choose either Single-Region or Multi-Region configuration options to help you establish the right database infrastructure for your needs.

1. Create a single-Region cluster

To create a single-Region cluster, you only choose Create cluster. That’s all.

In a few minutes, you’ll see your Aurora DSQL cluster created. To connect your cluster, you can use your favorite SQL client such as PostgreSQL interactive terminal, DBeaver, JetBrains DataGrip, or you can take various programmable approaches with a database endpoint and authentication token as a password. You can integrate with AWS Secrets Manager for automated token generation and rotation to secure and simplify managing credentials across your infrastructure.

To get the authentication token, choose Connect and Get Token in your cluster detail page. Copy the endpoint from Endpoint (Host) and the generated authentication token after Connect as admin is chosen in the Authentication token (Password) section.

Then, choose Open in CloudShell, and with a few clicks, you can seamlessly connect to your cluster.

After you connect the Aurora DSQL cluster, test your cluster by running sample SQL statements. You can also query SQL statements for your applications using your favorite programming languages: Python, Java, JavaScript, C++, Ruby, .NET, Rust, and Golang. You can build sample applications using a Django, Ruby on Rails, and AWS Lambda application to interact with Amazon Aurora DSQL.

2. Create a multi-Region cluster

To create a multi-Region cluster, you need to add the other cluster’s Amazon Resource Name (ARN) to peer the clusters.

To create the first cluster, choose Multi-Region in the console. You will also be required to choose the Witness Region, which receives data written to any peered Region but doesn’t have an endpoint. Choose Create cluster. If you already have a remote Region cluster, you can optionally enter its ARN.

Next, add an existing remote cluster or create your second cluster in another Region by choosing Create cluster.

Now, you can create the second cluster with your peer cluster ARN as the first cluster.

When the second cluster is created, you must peer the cluster in us-east-1 in order to complete the multi-Region creation.

Go to the first cluster page and choose Peer to confirm cluster peering for both clusters.

Now, your multi-Region cluster is created successfully. You can see details about the peers that are in other Regions in the Peers tab.

To get hands-on experience with Aurora DSQL, you can use this step-by-step workshop. It walks through the architecture, key considerations, and best practices as you build a sample retail rewards point application with active-active resiliency.

You can use the AWS SDKs, AWS Comand Line Interface (AWS CLI), and Aurora DSQL APIs to create and manage Aurora DSQL programmatically. To learn more, visit Setting up Aurora DSQL clusters in the Amazon Aurora DSQL User Guide.

What did we add after the preview?
We used your feedback and suggestions during the preview period to add new capabilities. We’ve highlighted a few of the new features and capabilities:

Console experience –We improved your cluster management experience to create and peer multi-Region clusters as well as easily connect using AWS CloudShell.
PostgreSQL features – We added support for views, unique secondary indexes for tables with existing data and launched Auto-Analyze which removes the need to manually maintain accurate table statistics. Learn about Aurora DSQL PostgreSQL-compatible features.
Integration with AWS services –We integrated various AWS services such as AWS Backup for a full snapshot backup and Aurora DSQL cluster restore, AWS PrivateLink for private network connectivity, AWS CloudFormation for managing Aurora DSQL resources, and AWS CloudTrail for logging Aurora DSQL operations.

Aurora DSQL now provides a Model Context Protocol (MCP) server to improve developer productivity by making it easy for your generative AI models and database to interact through natural language. For example, install Amazon Q Developer CLI and configure Aurora DSQL MCP server. Amazon Q Developer CLI now has access to an Aurora DSQL cluster. You can easily explore the schema of your database, understand the structure of the tables, and even execute complex SQL queries, all without having to write any additional integration code.

Now available
Amazon Aurora DSQL is available today in the AWS US East (N. Virginia), US East (Ohio), US West (Oregon) Regions for single- and multi-Region clusters (two peers and one witness Region), Asia Pacific (Osaka) and Asia Pacific (Tokyo) for single-Region clusters, and Europe (Ireland), Europe (London), and Europe (Paris) for single-Region clusters.

You’re billed on a monthly basis using a single normalized billing unit called Distributed Processing Unit (DPU) for all request-based activity such as read/write. Storage is based on the total size of your database and measured in GB-months. You are only charged for one logical copy of your data per single-Region cluster or multi-Region peered cluster. As a part of the AWS Free Tier, your first 100,000 DPUs and 1 GB-month of storage each month is free. To learn more, visit Amazon Aurora DSQL Pricing.

Give Aurora DSQL a try for free in the Aurora DSQL console. For more information, visit the Aurora DSQL User Guide and send feedback to AWS re:Post for Aurora DSQL or through your usual AWS support contacts.

— Channy

Introducing Claude 4 in Amazon Bedrock, the most powerful models for coding from Anthropic

2025-05-22 Sébastien Stormacq

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/claude-opus-4-anthropics-most-powerful-model-for-coding-is-now-in-amazon-bedrock/

Anthropic launched the next generation of Claude models today—Opus 4 and Sonnet 4—designed for coding, advanced reasoning, and the support of the next generation of capable, autonomous AI agents. Both models are now generally available in Amazon Bedrock, giving developers immediate access to both the model’s advanced reasoning and agentic capabilities.

Amazon Bedrock expands your AI choices with Anthropic’s most advanced models, giving you the freedom to build transformative applications with enterprise-grade security and responsible AI controls. Both models extend what’s possible with AI systems by improving task planning, tool use, and agent steerability.

With Opus 4’s advanced intelligence, you can build agents that handle long-running, high-context tasks like refactoring large codebases, synthesizing research, or coordinating cross-functional enterprise operations. Sonnet 4 is optimized for efficiency at scale, making it a strong fit as a subagent or for high-volume tasks like code reviews, bug fixes, and production-grade content generation.

When building with generative AI, many developers work on long-horizon tasks. These workflows require deep, sustained reasoning, often involving multistep processes, planning across large contexts, and synthesizing diverse inputs over extended timeframes. Good examples of these workflows are developer AI agents that help you to refactor or transform large projects. Existing models may respond quickly and fluently, but maintaining coherence and context over time—especially in areas like coding, research, or enterprise workflows—can still be challenging.

Claude Opus 4
Claude Opus 4 is the most advanced model to date from Anthropic, designed for building sophisticated AI agents that can reason, plan, and execute complex tasks with minimal oversight. Anthropic benchmarks show it is the best coding model available on the market today. It excels in software development scenarios where extended context, deep reasoning, and adaptive execution are critical. Developers can use Opus 4 to write and refactor code across entire projects, manage full-stack architectures, or design agentic systems that break down high-level goals into executable steps. It demonstrates strong performance on coding and agent-focused benchmarks like SWE-bench and TAU-bench, making it a natural choice for building agents that handle multistep development workflows. For example, Opus 4 can analyze technical documentation, plan a software implementation, write the required code, and iteratively refine it—while tracking requirements and architectural context throughout the process.

Claude Sonnet 4
Claude Sonnet 4 complements Opus 4 by balancing performance, responsiveness, and cost, making it well-suited for high-volume production workloads. It’s optimized for everyday development tasks with enhanced performance, such as powering code reviews, implementing bug ﬁxes, and new feature development with immediate feedback loops. It can also power production-ready AI assistants for near real-time applications. Sonnet 4 is a drop-in replacement from Claude Sonnet 3.7. In multi-agent systems, Sonnet 4 performs well as a task-speciﬁc subagent—handling responsibilities like targeted code reviews, search and retrieval, or isolated feature development within a broader pipeline. You can also use Sonnet 4 to manage continuous integration and delivery (CI/CD) pipelines, perform bug triage, or integrate APIs, all while maintaining high throughput and developer-aligned output.

Opus 4 and Sonnet 4 are hybrid reasoning models offering two modes: near-instant responses and extended thinking for deeper reasoning. You can choose near-instant responses for interactive applications, or enable extended thinking when a request benefits from deeper analysis and planning. Thinking is especially useful for long-context reasoning tasks in areas like software engineering, math, or scientific research. By configuring the model’s thinking budget—for example, by setting a maximum token count—you can tune the tradeoff between latency and answer depth to fit your workload.

How to get started
To see Opus 4 or Sonnet 4 in action, enable the new model in your AWS account. Then, you can start coding using the Bedrock Converse API with model IDanthropic.claude-opus-4-20250514-v1:0 for Opus 4 and anthropic.claude-sonnet-4-20250514-v1:0 for Sonnet 4. We recommend using the Converse API, because it provides a consistent API that works with all Amazon Bedrock models that support messages. This means you can write code one time and use it with different models.

For example, let’s imagine I write an agent to review code before merging changes in a code repository. I write the following code that uses the Bedrock Converse API to send a system and user prompts. Then, the agent consumes the streamed result.

private let modelId = "us.anthropic.claude-sonnet-4-20250514-v1:0"

// Define the system prompt that instructs Claude how to respond
let systemPrompt = """
You are a senior iOS developer with deep expertise in Swift, especially Swift 6 concurrency. Your job is to perform a code review focused on identifying concurrency-related edge cases, potential race conditions, and misuse of Swift concurrency primitives such as Task, TaskGroup, Sendable, @MainActor, and @preconcurrency.

You should review the code carefully and flag any patterns or logic that may cause unexpected behavior in concurrent environments, such as accessing shared mutable state without proper isolation, incorrect actor usage, or non-Sendable types crossing concurrency boundaries.

Explain your reasoning in precise technical terms, and provide recommendations to improve safety, predictability, and correctness. When appropriate, suggest concrete code changes or refactorings using idiomatic Swift 6
"""
let system: BedrockRuntimeClientTypes.SystemContentBlock = .text(systemPrompt)

// Create the user message with text prompt and image
let userPrompt = """
Can you review the following Swift code for concurrency issues? Let me know what could go wrong and how to fix it.
"""
let prompt: BedrockRuntimeClientTypes.ContentBlock = .text(userPrompt)

// Create the user message with both text and image content
let userMessage = BedrockRuntimeClientTypes.Message(
    content: [prompt],
    role: .user
)

// Initialize the messages array with the user message
var messages: [BedrockRuntimeClientTypes.Message] = []
messages.append(userMessage)

// Configure the inference parameters
let inferenceConfig: BedrockRuntimeClientTypes.InferenceConfiguration = .init(maxTokens: 4096, temperature: 0.0)

// Create the input for the Converse API with streaming
let input = ConverseStreamInput(inferenceConfig: inferenceConfig, messages: messages, modelId: modelId, system: [system])

// Make the streaming request
do {
    // Process the stream
    let response = try await bedrockClient.converseStream(input: input)

    // Iterate through the stream events
    for try await event in stream {
        switch event {
        case .messagestart:
            print("AI-assistant started to stream"")

        case let .contentblockdelta(deltaEvent):
            // Handle text content as it arrives
            if case let .text(text) = deltaEvent.delta {
                self.streamedResponse + = text
                print(text, termination: "")
            }

        case .messagestop:
            print("\n\nStream ended")
            // Create a complete assistant message from the streamed response
            let assistantMessage = BedrockRuntimeClientTypes.Message(
                content: [.text(self.streamedResponse)],
                role: .assistant
            )
            messages.append(assistantMessage)

        default:
            break
        }
    }

To help you get started, my colleague Dennis maintains a broad range of code examples for multiple use cases and a variety of programming languages.

Available today in Amazon Bedrock
This release gives developers immediate access in Amazon Bedrock, a fully managed, serverless service, to the next generation of Claude models developed by Anthropic. Whether you’re already building with Claude in Amazon Bedrock or just getting started, this seamless access makes it faster to experiment, prototype, and scale with cutting-edge foundation models—without managing infrastructure or complex integrations.

Claude Opus 4 is available in the following AWS Regions in North America: US East (Ohio, N. Virginia) and US West (Oregon). Claude Sonnet 4 is available not only in AWS Regions in North America but also in APAC, and Europe: US East (Ohio, N. Virginia), US West (Oregon), Asia Pacific (Hyderabad, Mumbai, Osaka, Seoul, Singapore, Sydney, Tokyo), and Europe (Spain). You can access the two models through cross-Region inference. Cross-Region inference helps to automatically select the optimal AWS Region within your geography to process your inference request.

Opus 4 tackles your most challenging development tasks, while Sonnet 4 excels at routine work with its optimal balance of speed and capability.

Learn more about the pricing and how to use these new models in Amazon Bedrock today!

— seb

Centralize visibility of Kubernetes clusters across AWS Regions and accounts with EKS Dashboard

2025-05-21 Micah Walter

Post Syndicated from Micah Walter original https://aws.amazon.com/blogs/aws/centralize-visibility-of-kubernetes-clusters-across-aws-regions-and-accounts-with-eks-dashboard/

Today, we are announcing EKS Dashboard, a centralized display that enables cloud architects and cluster administrators to maintain organization-wide visibility across their Kubernetes clusters. With EKS Dashboard, customers can now monitor clusters deployed across different AWS Regions and accounts through a unified view, making it easier to track cluster inventory, assess compliance, and plan operational activities like version upgrades.

As organizations scale their Kubernetes deployments, they often run multiple clusters across different environments to enhance availability, ensure business continuity, or maintain data sovereignty. However, this distributed approach can make it challenging to maintain visibility and control, especially in decentralized setups spanning multiple Regions and accounts. Today, many customers resort to third-party tools for centralized cluster visibility, which adds complexity through identity and access setup, licensing costs, and maintenance overhead.

EKS Dashboard simplifies this experience by providing native dashboard capabilities within the AWS Console. The Dashboard provides insights into 3 different resources including clusters, managed node groups, and EKS add-ons, offering aggregated insights into cluster distribution by Region, account, version, support status, forecasted extended support EKS control plane costs, and cluster health metrics. Customers can drill down into speciﬁc data points with automatic ﬁltering, enabling them to quickly identify and focus on clusters requiring attention.

Setting up EKS Dashboard

Customers can access the Dashboard in EKS console through AWS Organizations’ management and delegated administrator accounts. The setup process is straightforward and includes simply enabling trusted access as a one-time setup in the Amazon EKS console’s organizations settings page. Trusted access is available from the Dashboard settings page. Enabling trusted access will allow the management account to view the Dashboard. For more information on setup and configuration, see the official AWS Documentation.

Screenshot of EKS Dashboard settings

A quick tour of EKS Dashboard

The dashboard provides both graphical, tabular, and map views of your Kubernetes clusters, with advanced filtering, and search capabilities. You can also export data for further analysis or custom reporting.

EKS Dashboard overview with key info about your clusters.

There is a wide variety of available widgets to help visualize your clusters.

You can visualize your managed node groups by instance type distribution, launch templates, AMI versions, and more

There is even a map view where you can see all of your clusters across the globe.

Beyond EKS clusters

EKS Dashboard isn’t limited to just Amazon EKS clusters; it can also provide visibility into connected Kubernetes clusters running on-premises or on other cloud providers. While connected clusters may have limited data fidelity compared to native Amazon EKS clusters, this capability enables truly unified visibility for organizations running hybrid or multi-cloud environments.

Available now

EKS Dashboard is available today in the US East (N. Virginia) Region and is able to aggregate data from all commercial AWS Regions. There is no additional charge for using the EKS Dashboard. To learn more, visit the Amazon EKS documentation.

This new capability demonstrates our continued commitment to simplifying Kubernetes operations for our customers, enabling them to focus on building and scaling their applications rather than managing infrastructure. We’re excited to see how customers use EKS Dashboard to enhance their Kubernetes operations.

— Micah;

Configure System Integrity Protection (SIP) on Amazon EC2 Mac instances

2025-05-21 Sébastien Stormacq

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/configure-system-integrity-protection-sip-on-amazon-ec2-mac-instances/

I’m pleased to announce developers can now programmatically disable Apple System Integrity Protection (SIP) on their Amazon EC2 Mac instances. System Integrity Protection (SIP), also known as rootless, is a security feature introduced by Apple in OS X El Capitan (2015, version 10.11). It’s designed to protect the system from potentially harmful software by restricting the power of the root user account. SIP is enabled by default on macOS.

SIP safeguards the system by preventing modification of protected files and folders, restricting access to system-owned files and directories, and blocking unauthorized software from selecting a startup disk. The primary goal of SIP is to address the security risk linked to unrestricted root access, which could potentially allow malware to gain full control of a device with just one password or vulnerability. By implementing this protection, Apple aims to ensure a higher level of security for macOS users, especially considering that many users operate on administrative accounts with weak or no passwords.

While SIP provides excellent protection against malware for everyday use, developers might occasionally need to temporarily disable it for development and testing purposes. For instance, when creating a new device driver or system extension, disabling SIP is necessary to install and test the code. Additionally, SIP might block access to certain system settings required for your software to function properly. Temporarily disabling SIP grants you the necessary permissions to fine-tune programs for macOS. However, it’s crucial to remember that this is akin to briefly disabling the vault door for authorized maintenance, not leaving it permanently open.

Disabling SIP on a Mac requires physical access to the machine. You have to restart the machine in recovery mode, then disable SIP with the csrtutil command line tool, then restart the machine again.

Until today, you had to operate with the standard SIP settings on EC2 Mac instances. The physical access requirement and the need to boot in recovery mode made integrating SIP with the Amazon EC2 control plane and EC2 API challenging. But that’s no longer the case! You can now disable and re-enable SIP at will on your Amazon EC2 Mac instances. Let me show you how.

Let’s see how it works
Imagine I have an Amazon EC2 Mac instance started. It’s a mac2-m2.metal instance, running on an Apple silicon M2 processor. Disabling or enabling SIP is as straightforward as calling a new EC2 API: CreateMacSystemIntegrityProtectionModificationTask. This API is asynchronous; it starts the process of changing the SIP status on your instance. You can monitor progress using another new EC2 API: DescribeMacModificationTasks. All I need to know is the instance ID of the machine I want to work with.

Prerequisites
On Apple silicon based EC2 Mac instances and more recent type of machines, before calling the new EC2 API, I must set the ec2-user user password and enable secure token for that user on macOS. This requires connecting to the machine and typing two commands in the terminal.

# on the target EC2 Mac instance
# Set a password for the ec2-user user
~ % sudo /usr/bin/dscl . -passwd /Users/ec2-user
New Password: (MyNewPassw0rd)

# Enable secure token, with the same password, for the ec2-user
# old password is the one you just set with dscl
~ % sysadminctl -newPassword MyNewPassw0rd -oldPassword MyNewPassw0rd
2025-03-05 13:16:57.261 sysadminctl[3993:3033024] Attempting to change password for ec2-user…
2025-03-05 13:16:58.690 sysadminctl[3993:3033024] SecKeychainCopyLogin returned -25294
2025-03-05 13:16:58.690 sysadminctl[3993:3033024] Failed to update keychain password (-25294)
2025-03-05 13:16:58.690 sysadminctl[3993:3033024] - Done

# The error about the KeyChain is expected. I never connected with the GUI on this machine, so the Login keychain does not exist
# you can ignore this error.  The command below shows the list of keychains active in this session
~ % security list
    "/Library/Keychains/System.keychain"

# Verify that the secure token is ENABLED
~ % sysadminctl -secureTokenStatus ec2-user
2025-03-05 13:18:12.456 sysadminctl[4017:3033614] Secure token is ENABLED for user ec2-user

Change the SIP status
I don’t need to connect to the machine to toggle the SIP status. I only need to know its instance ID. I open a terminal on my laptop and use the AWS Command Line Interface (AWS CLI) to retrieve the Amazon EC2 Mac instance ID.

 aws ec2 describe-instances \
         --query "Reservations[].Instances[?InstanceType == 'mac2-m2.metal' ].InstanceId" \
         --output text

i-012a5de8da47bdff7

Now, still from the terminal on my laptop, I disable SIP with the create-mac-system-integrity-protection-modification-task command:

echo '{"rootVolumeUsername":"ec2-user","rootVolumePassword":"MyNewPassw0rd"}' > tmpCredentials
aws ec2 create-mac-system-integrity-protection-modification-task \
--instance-id "i-012a5de8da47bdff7" \
--mac-credentials fileb://./tmpCredentials \
--mac-system-integrity-protection-status "disabled" && rm tmpCredentials

{
    "macModificationTask": {
        "instanceId": "i-012a5de8da47bdff7",
        "macModificationTaskId": "macmodification-06a4bb89b394ac6d6",
        "macSystemIntegrityProtectionConfig": {},
        "startTime": "2025-03-14T14:15:06Z",
        "taskState": "pending",
        "taskType": "sip-modification"
    }
}

After the task is started, I can check its status with the aws ec2 describe-mac-modification-tasks command.

{
    "macModificationTasks": [
        {
            "instanceId": "i-012a5de8da47bdff7",
            "macModificationTaskId": "macmodification-06a4bb89b394ac6d6",
            "macSystemIntegrityProtectionConfig": {
                "debuggingRestrictions": "",
                "dTraceRestrictions": "",
                "filesystemProtections": "",
                "kextSigning": "",
                "nvramProtections": "",
                "status": "disabled"
            },
            "startTime": "2025-03-14T14:15:06Z",
            "tags": [],
            "taskState": "in-progress",
            "taskType": "sip-modification"
        },
...

The instance initiates the process and a series of reboots, during which it becomes unreachable. This process can take 60–90 minutes to complete. After that, when I see the status in the console becoming available again, I connect to the machine through SSH or EC2 Instance Connect, as usual.

➜  ~ ssh [email protected]
Warning: Permanently added '54.99.9.99' (ED25519) to the list of known hosts.
Last login: Mon Feb 26 08:52:42 2024 from 1.1.1.1

    ┌───┬──┐   __|  __|_  )
    │ ╷╭╯╷ │   _|  (     /
    │  └╮  │  ___|\___|___|
    │ ╰─┼╯ │  Amazon EC2
    └───┴──┘  macOS Sonoma 14.3.1

➜  ~ uname -a
Darwin Mac-mini.local 23.3.0 Darwin Kernel Version 23.3.0: Wed Dec 20 21:30:27 PST 2023; root:xnu-10002.81.5~7/RELEASE_ARM64_T8103 arm64

➜ ~ csrutil --status 
System Integrity Protection status: disabled.

When to disable SIP
Disabling SIP should be approached with caution because it opens up the system to potential security risks. However, as I mentioned in the introduction of this post, you might need to disable SIP when developing device drivers or kernel extensions for macOS. Some older applications might also not function correctly when SIP is enabled.

Disabling SIP is also required to turn off Spotlight indexing. Spotlight can help you quickly find apps, documents, emails and other items on your Mac. It’s very convenient on desktop machines, but not so much on a server. When there is no need to index your documents as they change, turning off Spotlight will release some CPU cycles and disk I/O.

Things to know
There are a couple of additional things to know about disabling SIP on Amazon EC2 Mac:

Disabling SIP is available through the API and AWS SDKs, the AWS CLI, and the AWS Management Console.
On Apple silicon, the setting is volume based. So if you replace the root volume, you need to disable SIP again. On Intel, the setting is Mac host based, so if you replace the root volume, SIP will still be disabled.
After disabling SIP, it will be enabled again if you stop and start the instance. Rebooting an instance doesn’t change its SIP status.
SIP status isn’t transferable between EBS volumes. This means SIP will be disabled again after you restore an instance from an EBS snapshot or if you create an AMI from an instance where SIP is enabled.

These new APIs are available in all Regions where Amazon EC2 Mac is available, at no additional cost. Try them today.

— seb

How is the News Blog doing? Take this 1 minute survey!

Amazon Inspector enhances container security by mapping Amazon ECR images to running containers

2025-05-19 Elizabeth Fuentes

Post Syndicated from Elizabeth Fuentes original https://aws.amazon.com/blogs/aws/amazon-inspector-enhances-container-security-by-mapping-amazon-ecr-images-to-running-containers/

When running container workloads, you need to understand how software vulnerabilities create security risks for your resources. Until now, you could identify vulnerabilities in your Amazon Elastic Container Registry (Amazon ECR) images, but couldn’t determine if these images were active in containers or track their usage. With no visibility if these images were being used on running clusters, you had limited ability to prioritize fixes based on actual deployment and usage patterns.

Starting today, Amazon Inspector offers two new features that enhance vulnerability management, giving you a more comprehensive view of your container images. First, Amazon Inspector now maps Amazon ECR images to running containers, enabling security teams to prioritize vulnerabilities based on containers currently running in your environment. With these new capabilities, you can analyze vulnerabilities in your Amazon ECR images and prioritize findings based on whether they are currently running and when they last ran in your container environment. Additionally, you can see the cluster Amazon Resource Name (ARN), number EKS pods or ECS tasks where an image is deployed, helping you prioritize fixes based on usage and severity.

Second, we’re extending vulnerability scanning support to minimal base images including scratch, distroless, and Chainguard images, and extending support for additional ecosystems including Go toolchain, Oracle JDK & JRE, Amazon Corretto, Apache Tomcat, Apache httpd, WordPress (core, themes, plugins), and Puppeteer, helping teams maintain robust security even in highly optimized container environments.

Through continual monitoring and tracking of images running on containers, Amazon Inspector helps teams identify which container images are actively running in their environment and where they’re deployed, detecting Amazon ECR images running on containers in Amazon Elastic Container Service (Amazon ECS) and Amazon Elastic Kubernetes Service (Amazon EKS), and any associated vulnerabilities. This solution supports teams managing Amazon ECR images across single AWS accounts, cross-account scenarios, and AWS Organizations with delegated administrator capabilities, enabling centralized vulnerability management based on container images running patterns.

Let’s see it in action
Amazon ECR image scanning helps identify vulnerabilities in your container images through enhanced scanning, which integrates with Amazon Inspector to provide automated, continual scanning of your repositories. To use this new feature you have to enable enhanced scanning through the Amazon ECR console, you can do it by following the steps in the Configuring enhanced scanning for images in Amazon ECR documentation page. I already have Amazon ECR enhanced scanning, so I don’t have to do any action.

In the Amazon Inspector console, I navigate to General settings and select ECR scanning settings from the navigation panel. Here, I can configure the new Image re-scan mode settings by choosing between Last in-use date and Last pull date. I leave it as it is by default with Last in-use date and set the Image last in use date to 14 days. These settings make it so that Inspector monitors my images based on when they were running in the last 14 days in my Amazon ECS or Amazon EKS environments. After applying these settings, Amazon Inspector starts tracking information about images running on containers and incorporating it into vulnerability findings, helping me focus on images actively running in containers in my environment.

After it’s configured, I can view information about images running on containers in the Details menu, where I can see last in-use and pull dates, along with EKS pods or ECS tasks count.

When selecting the number of Deployed ECS Tasks/EKS Pods, I can see the cluster ARN, last use dates, and Type for each image.

For cross-account visibility demonstration, I have a repository with EKS pods deployed in two accounts. In the Resources coverage menu, I navigate to Container repositories, select my repository name and choose the Image tag. As before, I can see the number of deployed EKS pods/ECS tasks.

When I select the number of deployed EKS pods/ECS tasks, I can see that it is running in a different account.

In the Findings menu, I can review any vulnerabilities, and by selecting one, I can find the Last in use date and Deployed ECS Tasks/EKS Pods involved in the vulnerability under Resource affected data, helping me prioritize remediation based on actual usage.

In the All Findings menu, you can now search for vulnerabilities within account management, using filters such as Account ID, Image in use count and Image last in use at.

Key features and considerations
Monitoring based on container image lifecycle – Amazon Inspector now determines image activity based on: image push date ranging duration 14, 30, 60, 90, or 180 days or lifetime, image pull date from 14, 30, 60, 90, or 180 days, stopped duration from never to 14, 30, 60, 90, or 180 days and status of image running on the container. This flexibility lets organizations tailor their monitoring strategy based on actual container image usage rather than only repository events. For Amazon EKS and Amazon ECS workloads, last in use, push and pull duration are set to 14 days, which is now the default for new customers.

Image runtime-aware finding details – To help prioritize remediation efforts, each finding in Amazon Inspector now includes the lastInUseAt date and InUseCount, indicating when an image was last running on the containers and the number of deployed EKS pods/ ECS tasks currently using it. Amazon Inspector monitors both Amazon ECR last pull date data and images running on Amazon ECS tasks or Amazon EKS pods container data for all accounts, updating this information at least once daily. Amazon Inspector integrates these details into all findings reports and seamlessly works with Amazon EventBridge. You can filter findings based on the lastInUseAt field using rolling window or fixed range options, and you can filter images based on their last running date within the last 14, 30, 60, or 90 days.

Comprehensive security coverage – Amazon Inspector now provides unified vulnerability assessments for both traditional Linux distributions and minimal base images including scratch, distroless, and Chainguard images through a single service. This extended coverage eliminates the need for multiple scanning solutions while maintaining robust security practices across your entire container ecosystem, from traditional distributions to highly optimized container environments. The service streamlines security operations by providing comprehensive vulnerability management through a centralized platform, enabling efficient assessment of all container types.

Enhanced cross-account visibility – Security management across single accounts, cross-account setups, and AWS Organizations is now supported through delegated administrator capabilities. Amazon Inspector shares images running on container information within the same organization, which is particularly valuable for accounts maintaining golden image repositories. Amazon Inspector provides all ARNs for Amazon EKS and Amazon ECS clusters where images are running, if the resource belongs to the account with an API, providing comprehensive visibility across multiple AWS accounts. The system updates deployed EKS pods or ECS tasks information at least one time daily and automatically maintains accuracy as accounts join or leave the organization.

Availability and pricing – The new container mapping capabilities are available now in all AWS Regions where Amazon Inspector is offered at no additional cost. To get started, visit the AWS Inspector documentation. For pricing details and Regional availability, refer to the AWS Inspector pricing page.

PS: Writing a blog post at AWS is always a team effort, even when you see only one name under the post title. In this case, I want to thank Nirali Desai, for her generous help with technical guidance, and expertise, which made this overview possible and comprehensive.

— Eli

How is the News Blog doing? Take this 1 minute survey!

New Amazon EC2 P6-B200 instances powered by NVIDIA Blackwell GPUs to accelerate AI innovations

2025-05-16 Channy Yun (윤석찬)

Post Syndicated from Channy Yun (윤석찬) original https://aws.amazon.com/blogs/aws/new-amazon-ec2-p6-b200-instances-powered-by-nvidia-blackwell-gpus-to-accelerate-ai-innovations/

Today, we’re announcing the general availability of Amazon Elastic Compute Cloud (Amazon EC2) P6-B200 instances powered by NVIDIA B200 to address customer needs for high performance and scalability in artificial intelligence (AI), machine learning (ML), and high performance computing (HPC) applications.

Amazon EC2 P6-B200 instances accelerate a broad range of GPU-enabled workloads but are especially well-suited for large-scale distributed AI training and inferencing for foundation models (FMs) with reinforcement learning (RL) and distillation, multimodal training and inference, and HPC applications such as climate modeling, drug discovery, seismic analysis, and insurance risk modeling.

When combined with Elastic Fabric Adapter (EFAv4) networking, hyperscale clustering by EC2 UltraClusters, and advanced virtualization and security capabilities by AWS Nitro System, you can train and serve FMs with increased speed, scale, and security. These instances also deliver up to two times the performance for AI training (time to train) and inference (tokens/sec) compared to EC2 P5en instances.

You can accelerate time-to-market for training FMs and deliver faster inference throughput, which lowers inference cost and helps increase adoption of generative AI applications as well as increased processing performance for HPC applications.

EC2 P6-B200 instances specifications
New EC2 P6-B200 instances provide eight NVIDIA B200 GPUs with 1440 GB of high bandwidth GPU memory, 5th Generation Intel Xeon Scalable processors (Emerald Rapids), 2 TiB of system memory, and 30 TB of local NVMe storage.

Here are the specs for EC2 P6-B200 instances:

Instance size	GPUs (NVIDIA B200)	GPU memory (GB)	vCPUs	GPU Peer to peer (GB/s)	Instance storage (TB)	Network bandwidth (Gbps)	EBS bandwidth (Gbps)
P6-b200.48xlarge	8	1440 HBM3e	192	1800	8 x 3.84 NVMe SSD	8 x 400	100

These instances feature up to 125 percent improvement in GPU TFLOPs, 27 percent increase in GPU memory size, and 60 percent increase in GPU memory bandwidth compared to P5en instances.

P6-B200 instances in action
You can use P6-B200 instances in the US West (Oregon) AWS Region through EC2 Capacity Blocks for ML. To reserve your EC2 Capacity Blocks, choose Capacity Reservations on the Amazon EC2 console.

Select Purchase Capacity Blocks for ML and then choose your total capacity and specify how long you need the EC2 Capacity Block for p6-b200.48xlarge instances. The total number of days that you can reserve EC2 Capacity Blocks is 1-14 days, 21 days, 28 days, or multiples of 7 up to 182 days. You can choose your earliest start date for up to 8 weeks in advance.

Now, your EC2 Capacity Block will be scheduled successfully. The total price of an EC2 Capacity Block is charged up front, and the price doesn’t change after purchase. The payment will be billed to your account within 12 hours after you purchase the EC2 Capacity Blocks. To learn more, visit Capacity Blocks for ML in the Amazon EC2 User Guide.

When launching P6-B200 instances, you can use AWS Deep Learning AMIs (DLAMI) to support EC2 P6-B200 instances. DLAMI provides ML practitioners and researchers with the infrastructure and tools to quickly build scalable, secure, distributed ML applications in preconfigured environments.

To run instances, you can use AWS Management Console, AWS Command Line Interface (AWS CLI) or AWS SDKs.

You can integrate EC2 P6-B200 instances seamlessly with various AWS managed services such as Amazon Elastic Kubernetes Services (Amazon EKS), Amazon Simple Storage Service (Amazon S3), and Amazon FSx for Lustre. Support for Amazon SageMaker HyperPod is also coming soon.

Now available
Amazon EC2 P6-B200 instances are available today in the US West (Oregon) Region and can be purchased as EC2 Capacity blocks for ML.

Give Amazon EC2 P6-B200 instances a try in the Amazon EC2 console. To learn more, refer to the Amazon EC2 P6 instance page and send feedback to AWS re:Post for EC2 or through your usual AWS Support contacts.

— Channy

How is the News Blog doing? Take this 1 minute survey!

Accelerate CI/CD pipelines with the new AWS CodeBuild Docker Server capability

2025-05-15 Donnie Prakoso

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/accelerate-ci-cd-pipelines-with-the-new-aws-codebuild-docker-server-capability/

Starting today, you can use AWS CodeBuild Docker Server capability to provision a dedicated and persistent Docker server directly within your CodeBuild project. With Docker Server capability, you can accelerate your Docker image builds by centralizing image building to a remote host, which reduces wait times and increases overall efficiency.

From my benchmark, with this Docker Server capability, I reduced the total building time by 98 percent, from 24 minutes and 54 seconds to 16 seconds. Here’s a quick look at this feature from my AWS CodeBuild projects.

AWS CodeBuild is a fully managed continuous integration service that compiles source code, runs tests, and produces software packages ready for deployment. Building Docker images is one of the most common use cases for CodeBuild customers, and the service has progressively improved this experience over time by releasing features such as Docker layer caching and reserved capacity features to improve Docker build performance.

With the new Docker Server capability, you can reduce build time for your applications by providing a persistent Docker server with consistent caching. When enabled in a CodeBuild project, a dedicated Docker server is provisioned with persistent storage that maintains your Docker layer cache. This server can handle multiple concurrent Docker build operations, with all builds benefiting from the same centralized cache.

Using AWS CodeBuild Docker Server
Let me walk you through a demonstration that showcases the benefits with the new Docker Server capability.

For this demonstration, I’m building a complex, multi-layered Docker image based on the official AWS CodeBuild curated Docker images repository, specifically the Dockerfile for building a standard Ubuntu image. This image contains numerous dependencies and tools required for modern continuous integration and continuous delivery (CI/CD) pipelines, making it a good example of the type of large Docker builds that development teams regularly perform.


# Copyright 2020-2024 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Amazon Software License (the "License"). You may not use this file except in compliance with the License.
# A copy of the License is located at
#
#    http://aws.amazon.com/asl/
#
# or in the "license" file accompanying this file.
# This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, express or implied.
# See the License for the specific language governing permissions and limitations under the License.
FROM public.ecr.aws/ubuntu/ubuntu:20.04 AS core

ARG DEBIAN_FRONTEND="noninteractive"

# Install git, SSH, Git, Firefox, GeckoDriver, Chrome, ChromeDriver,  stunnel, AWS Tools, configure SSM, AWS CLI v2, env tools for runtimes: Dotnet, NodeJS, Ruby, Python, PHP, Java, Go, .NET, Powershell Core,  Docker, Composer, and other utilities
COMMAND REDACTED FOR BREVITY
# Activate runtime versions specific to image version.
RUN n $NODE_14_VERSION
RUN pyenv  global $PYTHON_39_VERSION
RUN phpenv global $PHP_80_VERSION
RUN rbenv  global $RUBY_27_VERSION
RUN goenv global  $GOLANG_15_VERSION

# Configure SSH
COPY ssh_config /root/.ssh/config
COPY runtimes.yml /codebuild/image/config/runtimes.yml
COPY dockerd-entrypoint.sh /usr/local/bin/dockerd-entrypoint.sh
COPY legal/bill_of_material.txt /usr/share/doc/bill_of_material.txt
COPY amazon-ssm-agent.json /etc/amazon/ssm/amazon-ssm-agent.json

ENTRYPOINT ["/usr/local/bin/dockerd-entrypoint.sh"]

This Dockerfile creates a comprehensive build environment with multiple programming languages, build tools, and dependencies – exactly the type of image that would benefit from persistent caching.

In the build specification (buildspec), I use the docker buildx build . command:

version: 0.2
phases:
  build:
    commands:
      - cd ubuntu/standard/5.0
      - docker buildx build -t codebuild-ubuntu:latest .

To enable the Docker Server capability, I navigate to the AWS CodeBuild console and select Create project. I can also enable this capability when editing existing CodeBuild projects.

I fill in all details and configuration. In the Environment section, I select Additional configuration.

Then, I scroll down and find Docker server configuration and select Enable docker server for this project. When I select this option, I can choose a compute type configuration for the Docker server. When I’m finished with the configurations, I create this project.

Now, let’s see the Docker Server capability in action.

The initial build takes approximately 24 minutes and 54 seconds to complete because it needs to download and compile all dependencies from scratch. This is expected for the first build of such a complex image.

For subsequent builds with no code changes, the build takes only 16 seconds and that shows 98% reduction in build time.

Looking at the logs, I can see that with Docker Server, most layers are pulled from the persistent cache:

The persistent caching provided by the Docker Server maintains all layers between builds, which is particularly valuable for large, complex Docker images with many layers. This demonstrates how Docker Server can dramatically improve throughput for teams running numerous Docker builds in their CI/CD pipelines.

Additional things to know
Here are a couple of things to note:

Architecture support – The feature is available for both x86 (Linux) and ARM builds.
Pricing – To learn more about pricing for Docker Server capability, refer to the AWS CodeBuild pricing page.
Availability – This feature is available in all AWS Regions where AWS CodeBuild is offered. For more information about the AWS Regions where CodeBuild is available, see the AWS Regions page.

You can learn more about the Docker Server feature in the AWS CodeBuild documentation.

Happy building! —

Donnie Prakoso

How is the News Blog doing? Take this 1 minute survey!

Accelerate the modernization of Mainframe and VMware workloads with AWS Transform

2025-05-15 Matheus Guimaraes

Post Syndicated from Matheus Guimaraes original https://aws.amazon.com/blogs/aws/accelerate-the-modernization-of-mainframe-and-vmware-workloads-with-aws-transform/

Generative AI has brought many new possibilities to organizations. It has equipped them with new abilities to retire technical debt, modernize legacy systems, and build agile infrastructure to help unlock the value that is trapped in their internal data. However, many enterprises still rely heavily on legacy IT infrastructure, particularly mainframes and VMware-based systems. These platforms have been the backbone of critical operations for decades, but they hinder organizations’ ability to innovate, scale effectively, and reduce technical debt in an era where cloud-first strategies dominate. The need to modernize these workloads is clear, but the journey has traditionally been complex and risky.

The complexity spans multiple dimensions. Financially, organizations face mounting licensing costs and expensive migration projects. Technically, they must untangle legacy dependencies while meeting compliance requirements. Organizationally, they must manage the transition of teams who’ve built careers around legacy systems and navigate undocumented institutional knowledge.

AWS Transform directly addresses these challenges with purpose-built agentic AI that accelerates and de-risks your legacy modernization. It automates the assessment, planning, and transformation of both mainframe and VMware workloads into cloud based architectures, streamlining the entire process. Through intelligent insights, automated code transformation, and human-in-the-loop workflows, organizations can now tackle even the most challenging modernization projects with greater confidence and efficiency.

Mainframe workload migration
AWS Transform for mainframe is the first agentic AI service for modernizing mainframe workloads at scale. The specialized mainframe agent accelerates mainframe modernization by automating complex, resource-intensive tasks across every phase of modernization — from initial assessment to final deployment. It streamlines the migration of legacy applications built on IBM z/OS Db2, including COBOL, CICS, DB2, and VSAM, to modern cloud environments–cutting modernization timelines from years to months.

Let’s look at a few examples of how AWS Transform can help you through different aspects of the migration process.

Code analysis – AWS Transform provides comprehensive insights into your codebase, automatically examining mainframe codebases, creating detailed dependency graphs, measuring code complexity, and identifying component relationships

Documentation – AWS Transform for mainframe creates comprehensive technical and functional documentation of mainframe applications, preserving critical knowledge about features, program logic, and data flows. You can interact with the generated documentation through an AI-powered chat interface to discover and retrieve information quickly.

Business rule extraction – AWS Transform extracts and presents complex logic in plain language so you can gain visibility into business processes embedded within legacy applications. This enables both business and technical stakeholders to gain a greater understanding of application functionality.

Code decomposition – AWS Transform offers sophisticated code decomposition tools, including interactive dependency graphs and domain separation capabilities, enabling users to visualize and modify relationships between components while identifying key business functions. The solution also streamlines migration planning through an interactive wave sequence planner that considers user preferences to generate optimized migration strategies.

Modernization Wave Planning – With its specialized agent, AWS Transform for mainframe creates prioritized modernization wave sequences based on code and data dependencies, code volume, and business priorities. It enables modernization teams to make data-driven, customized migration plans that align to their specific organizational needs.

Code refactoring – AWS Transform can refactor millions of lines of mainframe code in minutes, converting COBOL, VSAM, and DB2 systems into modern Java Spring Boot applications while maintaining functional equivalence and transforming CICS transactions into web services and JCL batch processes into Groovy scripts. The solution provides high-quality output through configurable settings and bundled runtime capabilities, producing Java code that emphasizes readability, maintainability, and technical excellence.

Deployments – AWS Transform provides customizable deployment templates that streamline the deployment process through user-defined inputs. For added efficiency, the solution bundles the selected runtime version with the migrated application, enabling seamless deployment as a complete package.

By integrating intelligent documentation analysis, business rules extraction, and human-in-the-loop collaboration capabilities, AWS Transform helps organizations accelerate their mainframe transformation while reducing risk and maintaining business continuity.

VMware modernization
With rapid changes in VMware licensing and support model, organizations are increasingly exploring alternatives despite the difficulties associated with migrating and modernizing VMware workloads. This is aggravated by the fact that the accumulation of technical debt typically creates complex, poorly documented environments managed by multiple teams, leading to vendor lock-in and collaboration challenges that hinder migration efforts further.

AWS Transform is the first agentic AI service for VMware modernization of its kind that helps you to overcome those difficulties. It can offset risk and accelerate the modernization of VMware workloads by automating application discovery, dependency mapping, migration planning, network conversion, and EC2 instance optimization, reducing manual effort and accelerating cloud adoption.

The process is organized into four phases: inventory discovery, wave planning, network conversion, and server migration. It uses agentic AI capabilities to analyze and map complex VMware environments, converting network configurations into AWS built-in constructs and helps you to orchestrate dependency-aware migration waves for seamless cutovers. In addition, it also provides a collaborative web interface that keeps AWS teams, partners, and customers aligned throughout the modernization journey.

Let’s take a quick tour to see how this works.

Setting up
Before you can start using the service, you must first enable it by navigating to the AWS Transform console. AWS Transform requires AWS IAM Identity Center (IdC) to manage users and setup appropriate permissions. If you don’t yet have IdC set up it will ask you to configure it first and return to the AWS Transform console later to continue the process.

With IdC available, you can then proceed to choosing the encryption settings. AWS Transform gives you the option to use a default AWS managed key or you can use your own custom keys through AWS Key Management Service (AWS KMS).

After completing this step, AWS Transform will be enabled. You can manage admin access to the console by navigating to Users and using the search box to find them. You must create users or groups in IdC first if they don’t already exist. The service console will help admins provision users who will get access to the web app. Each provisioned user receives an email with a link to set password and get their personalized URL for the webapp.

You interact with AWS Transform through a dedicated web experience. To get the url, navigate to Settings where you can check your configurations and copy the links to the AWS Transform web experience where you and your teams can start using the service.

Discovery
AWS Transform can discover your VMware environment either automatically through AWS Application Discovery Service collectors or you can provide your own data by importing existing RVTools export files.

To get started, choose the Create or select connectors task and provide the account IDs for one or more AWS accounts that will be used for discovery. This will generate links that you can follow to authorize each account for usage within AWS Transform. You can then move on to the Perform discovery task, where you can choose to install AWS Application Discovery Service collectors or upload your own files such as exports from RVTools.

Provisioning
The steps for the provisioning phase are similar to the ones described earlier for discovery. You connect target AWS accounts by entering their account IDs and validating the authorization requests which will then enable the next steps such as the Generate VPC configuration step. Here, you can import your RVTools files or NSX exports from Import/Export from NSX, if applicable, and enable AWS Transform to understand your networking requirements.

You should then continue working through the job plan until you reach the point where it’s ready to deploy your Amazon Virtual Private Cloud (Amazon VPC). All the infrastructure as code (IaC) code is stored in Amazon Simple Storage Service (Amazon S3) buckets in the target AWS account.

Review the proposed changes and, if you’re happy, start the deployment process of the AWS resources to the target accounts.

Deployment
AWS Transform requires you to set up AWS Application Migration Service (MGN) in the target AWS accounts to automate the migration process. Choose the Initiate VM migration task and use the link to navigate to the service console, then follow the instructions to configure it.

After setting up service permissions, you’ll proceed to the implementation phase of the waves created by AWS Transform and start the migration process. For each wave, you’ll first be asked to make various choices such as setting the sizing preference and tenancy for the Amazon Elastic Compute Cloud (Amazon EC2) instances. Confirm your selections and continue following the instructions given by AWS Transform until you reach the Deploy replication agents stage, where you can start the migration for that wave.

After you start the waves migration process, you can switch to the dashboard at any time to check on progress.

With its agentic AI capabilities, AWS Transform offers a powerful solution for accelerating and de-risking mainframe and VMware modernization workloads. By automating complex assessment and transformation processes, AWS Transform reduces the time associated with legacy system migration while minimizing the potential for errors and business disruption enabling more agile, efficient, and future-ready IT environments within your organization.

Things to know
Availability – AWS Transform for mainframe is available in US East (N. Virginia) and Europe (Frankfurt) Regions. AWS Transform for VMware offers different availability options for data collection and migrations. Please refer to the AWS Transform for VMware FAQ for more details.

Pricing – Currently, we offer our core features—including assessment and transformation—at no cost to AWS customers.

Here are a few links for further reading.

Dive deeper into mainframe modernization and learn more about about AWS Transform for mainframe.

Explore more about VMware modernization and how to get started with your VMware migration journey.

Check out this interactive demo of AWS Transform for mainframe and this interactive demo of AWS Transform for VMware.

— Matheus Guimaraes | @codingmatheus

How is the News Blog doing? Take this 1 minute survey!

AWS Transform for .NET, the first agentic AI service for modernizing .NET applications at scale

2025-05-15 Prasad Rao

Post Syndicated from Prasad Rao original https://aws.amazon.com/blogs/aws/aws-transform-for-net-the-first-agentic-ai-service-for-modernizing-net-applications-at-scale/

I started my career as a .NET developer and have seen .NET evolve over the last couple of decades. Like many of you, I also developed multiple enterprise applications in .NET Framework that ran only on Windows. I fondly remember building my first enterprise application with .NET Framework. Although it served us well, the technology landscape has significantly shifted. Now that there is an open source and cross-platform version of .NET that can run on Linux, these legacy enterprise applications built on .NET Framework need to be ported and modernized.

The benefits of porting to Linux are compelling: applications cost 40 percent less to operate because they save on Windows licensing costs, run 1.5–2 times faster with improved performance, and handle growing workloads with 50 percent better scalability. Having helped port several applications, I can say the effort is worth the rewards.

However, porting .NET Framework applications to cross-platform .NET is a labor-intensive and error-prone process. You have to perform multiple steps, such as analyzing the codebase, detecting incompatibilities, implementing fixes while porting the code, and then validating the changes. For enterprises, the challenge becomes even more complex because they might have hundreds of .NET Framework applications in their portfolio.

At re:Invent 2024, we previewed this capability as Amazon Q Developer transformation capabilities for .NET to help port your .NET applications at scale. The experience is available as a unified web experience for at-scale transformation and within your integrated development environment (IDE) for individual project and solution porting.

Now that we’ve incorporated your valuable feedback and suggestions, we’re excited to announce today the general availability of AWS Transform for .NET. We’ve also added new capabilities to support projects with private NuGet packages, port model-view-controller (MVC) Razor views to ASP .NET Core Razor views, and execute the ported unit tests.

I’ll expand on the key new capabilities in a moment, but let’s first take a quick look at the two porting experiences of AWS Transform for .NET.

Large-scale porting experience for .NET applications
Enterprise digital transformation is typically driven by central teams responsible for modernizing hundreds of applications across multiple business units. Different teams have ownership of different applications and their respective repositories. Success requires close coordination between these teams and the application owners and developers across business units. To accelerate this modernization at scale, AWS Transform for .NET provides a web experience that enables teams to connect directly to source code repositories and efficiently transform multiple applications across the organization. For select applications requiring dedicated developer attention, the same agent capabilities are available to developers as an extension for Visual Studio IDE.

Let’s start by looking at how the web experience of AWS Transform for .NET helps port hundreds of .NET applications at scale.

Web experience of AWS Transform for .NET
To get started with the web experience of AWS Transform, I onboard using the steps outlined in the documentation, sign in using my credentials, and create a job for .NET modernization.

AWS Transform for .NET creates a job plan, which is a sequence of steps that the agent will execute to assess, discover, analyze, and transform applications at scale. It then waits for me to set up a connector to connect to my source code repositories.

After the connector is in place, AWS Transform begins discovering repositories in my account. It conducts an assessment focused on three key areas: repository dependencies, required private packages and third-party libraries, and supported project types within your repositories.

Based on this assessment, it generates a recommended transformation plan. The plan orders repositories according to their last modification dates, dependency relationships, private package requirements, and the presence of supported project types.

AWS Transform for .NET then prepares for the transformation process by requesting specific inputs, such as the target branch destination, target .NET version, and the repositories to be transformed.

To select the repositories to transform, I have two options: use the recommended plan or customize the transformation plan by selecting repositories manually. For selecting repositories manually, I can use the UI or download the repository mapping and upload the customized list.

AWS Transform for .NET automatically ports the application code, builds the ported code, executes unit tests, and commits the ported code to a new branch in my repository. It provides a comprehensive transformation summary, including modified files, test outcomes, and suggested fixes for any remaining work.

While the web experience helps accelerate large-scale porting, some applications may require developer attention. For these cases, the same agent capabilities are available in the Visual Studio IDE.

Visual Studio IDE experience of AWS Transform for .NET
Now, let’s explore how AWS Transform for .NET works within Visual Studio.

To get started, I install the latest version of AWS Toolkit extension for Visual Studio and set up the prerequisites.

I open a .NET Framework solution, and in the Solution Explorer, I see the context menu item Port project with AWS Transform for an individual project.

I provide the required inputs, such as the target .NET version and the approval for the agents to autonomously transform code, execute unit tests, generate a transformation summary, and validate Linux-readiness.

I can review the code changes made by the agents locally and continue updating my codebase.

Let’s now explore some of the key new capabilities added to AWS Transform for .NET.

Support for projects with private NuGet package dependencies
During preview, only projects with public NuGet package dependencies were supported. With general availability, we now support projects with private NuGet package dependencies. This has been one of the most requested features during the preview.

The feature I really love is that AWS Transform can detect cross-repository dependencies. If it finds the source code of my private NuGet package, it automatically transforms that as well. However, if it can’t locate the source code, in the web experience, it provides me the flexibility to upload the required NuGet packages.

AWS Transform displays the missing package dependencies that need to be resolved. There are two ways to do this: I can either use the provided PowerShell script to create and upload packages, or I can build the application locally and upload the NuGet packages from the packages folder in the solution directory.

After I upload the missing NuGet packages, AWS Transform is able to resolve the dependencies. It’s best to provide both the .NET Framework and cross platform .NET versions of the NuGet packages. If the cross platform .NET version is not available, then at a minimum the .NET Framework version is required for AWS Transform to add it as an assembly reference and proceed for transformation.

Unit test execution
During preview, we supported porting unit tests from .NET Framework to cross-platform .NET. With general availability, we’ve also added support for executing unit tests after the transformation is complete.

After the transformation is complete and the unit tests are executed, I can see the results in the dashboard and view the status of the tests at each individual test project level.

Transformation visibility and summary
After the transformation is complete, I can download a detailed report in JSON format that gives me a list of transformed repositories, details about each repository, and the status of the transformation actions performed for each project within a repository. I can view the natural language transformation summary at the project level to understand AWS Transform output with project-level granularity. The summary provides me with an overview of updates along with key technical changes to the codebase.

Other new features
Let’s have a quick look at other new features we’ve added with general availability:

Support for porting UI layer – During preview, you could only port the business logic layers of MVC applications using AWS Transform, and you had to port the UI layer manually. With general availability, you can now use AWS Transform to port MVC Razor views to ASP.NET Core Razor views.
Expanded connector support – During preview, you could connect only to GitHub repositories. Now with general availability, you can connect to GitHub, GitLab, and Bitbucket repositories.
Cross repository dependency – When you select a repository for transformation, dependent repositories are automatically selected for transformation.
Download assessment report – You can download a detailed assessment report of the identified repositories in your account and private NuGet packages referenced in these repositories.
Email notifications with deep links – You’ll receive email notifications when a job’s status changes to completed or stopped. These notifications include deep links to the transformed code branches for review and continued transformation in your IDE.

Things to know
Some additional things to know are:

Regions – AWS Transform for .NET is generally available today in the Europe (Frankfurt) and US East (N. Virginia) Regions.
Pricing – Currently, there is no additional charge for AWS Transform. Any resources you create or continue to use in your AWS account using the output of AWS Transform will be billed according to their standard pricing. For limits and quotas, refer to the documentation.
.NET versions supported – AWS Transform for .NET supports transforming applications written using .NET Framework versions 3.5+, .NET Core 3.1, and .NET 5+, and the cross-platform .NET version, .NET 8.
Application types supported – AWS Transform for .NET supports porting C# code projects of the following types: console application, class library, unit tests, WebAPI, Windows Communication Foundation (WCF) service, MVC, and single-page application (SPA).
Getting started – To get started, visit AWS Transform for .NET User Guide.
Webinar – Join the webinar Accelerate .NET Modernization with Agentic AI to experience AWS Transform for .NET through a live demonstration.

– Prasad

How is the News Blog doing? Take this 1 minute survey!

AWS Weekly Roundup: South America expansion, Q Developer in OpenSearch, and more (May 12, 2025)

2025-05-12 Micah Walter

Post Syndicated from Micah Walter original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-south-america-expansion-q-developer-in-opensearch-and-more-may-12-2025/

I’ve always been fascinated by how quickly we’re able to stand up new Regions and Availability Zones at AWS. Today there are 36 launched Regions and 114 launched Availability Zones. That’s amazing!

This past week at AWS was marked by significant expansion to our global infrastructure. The announcement of a new Region in the works for South America means customers will have more options for meeting their low latency and data residency requirements. Alongside the expansion, AWS announced the availability of numerous instance types in additional Regions.

In addition to the infrastructure expansion, AWS is also expanding the reach of Amazon Q Developer into Amazon OpenSearch Service.

Last week’s launches

Introducing Amazon Q Developer in Amazon OpenSearch Service — Amazon Q Developer is now integrated with Amazon OpenSearch Service, providing AI-powered assistance to help developers quickly build search applications with relevant code suggestions and troubleshooting guidance.
In the works – AWS South America (Chile) Region — AWS has announced plans for a new South America (Chile) Region, expanding its global infrastructure to provide lower latency and enhanced data residency options for customers in that geographic region.
Introducing the AWS Zero Trust Accelerator for Government — The new AWS Zero Trust Accelerator for Government helps public sector organizations implement zero trust security architectures more efficiently, with specialized tools and guidance for compliance requirements.
Accelerate the transfer of data from an Amazon EBS snapshot to a new EBS volume — AWS has announced the general availability of Amazon EBS Provisioned Rate for Volume Initialization, a new feature that allows users to accelerate data transfer from EBS snapshots to new EBS volumes at specified rates between 100 MiB/s and 300 MiB/s.

Instance announcements

AWS expanded instance availability for an array of instance types across additional Regions.

Amazon EC2 C8gd, M8gd, and R8gd instances are now available in AWS Region Europe (Frankfurt) — Amazon EC2 C8gd, M8gd, and R8gd instances are now available in the AWS Europe (Frankfurt) Region, bringing Graviton-powered compute instances with local NVMe-based SSD storage to more customers.
Amazon EC2 R7g instances are now available in additional AWS Regions — Amazon EC2 R7g memory-optimized instances powered by AWS Graviton3 processors are now available in additional AWS Regions, expanding deployment options for memory-intensive workloads.
Amazon EC2 R7g instances are now available in AWS GovCloud (US-East) — Amazon EC2 R7g instances are now available in AWS GovCloud (US-East), bringing high-performance, memory-optimized compute options to government and regulated workloads.
Amazon EC2 X2idn instances now available in AWS Israel (Tel Aviv) Region — Amazon EC2 X2idn memory-optimized instances with Intel Xeon Scalable processors are now available in the AWS Israel (Tel Aviv) Region, supporting in-memory databases and high-performance computing workloads.
Amazon EC2 P5en instances are now available in the AWS US West (N. California) Region — Amazon EC2 P5en instances, designed for high-performance ML training and generative AI inferencing, are now available in the AWS US West (N. California) Region.

Additional updates

AWS Systems Manager adds customization options for onboarding configuration — AWS Systems Manager now offers expanded customization options for onboarding configuration, giving administrators more flexibility when setting up and managing their resources.
Amazon MSK enables seamless certificate renewals on MSK Provisioned clusters — Amazon MSK enables seamless certificate renewals on MSK Provisioned clusters, simplifying security management for Apache Kafka environments.
AWS Resource Explorer supports 41 new resource types — AWS Resource Explorer has expanded to support 41 new resource types, making it easier to discover and visualize more AWS resources across your accounts.
Amazon Managed Service for Prometheus now available in the AWS Canada (Central) Region — Amazon Managed Service for Prometheus is now available in the AWS Canada (Central) Region, offering container monitoring capabilities to more customers in North America.
AWS End User Messaging is now available in Mexico (Central) — AWS End User Messaging is now available in Mexico (Central), allowing businesses to send SMS, voice, and email messages to customers in the geographic region.
Amazon WorkSpaces is now available in AWS Europe (Paris) Region Amazon WorkSpaces, AWS’s managed desktop virtualization service, is now available in AWS Europe (Paris) Region, expanding virtual desktop infrastructure options for European customers.

Upcoming events

We are in the middle of AWS Summit season! AWS Summits run throughout the summer in cities all around the world. Be sure to check the calendar to find out when a AWS Summit is happening near you. Here are the remaining Summits for May, 2025.

Seoul, South Korea – May 14th – 15th
Dubai, United Arab Emirates – May 21st, 2025
Tel Aviv, Israel – May 28th, 2025
Singapore – May 29th, 2025

How is the News Blog doing? Take this 1 minute survey!

Save big on OpenSearch: Unleashing Intel AVX-512 for binary vector performance

2025-05-08 Akash Shankaran, Mulugeta Mammo, Noah Staveley, Assane Diop

Post Syndicated from Akash Shankaran, Mulugeta Mammo, Noah Staveley, Assane Diop original https://aws.amazon.com/blogs/big-data/save-big-on-opensearch-unleashing-intel-avx-512-for-binary-vector-performance/

With OpenSearch version 2.19, Amazon OpenSearch Service now supports hardware-accelerated enhanced latency and throughput for binary vectors. When you choose the latest-generation, Intel Xeon instances for your data nodes, OpenSearch uses AVX-512 acceleration to bring up to 48% throughput improvement vs. previous-generation R5 instances, and 10% throughput improvement compared with OpenSearch 2.17 and below. There’s no need to change your settings. You will simply see improvements when you upgrade to OpenSearch 2.19 and use c7i, m7i, and R7i instances.

In this post, we discuss the improvements these advanced processors provide to your OpenSearch workloads, and how it can help you lower your total cost of ownership (TCO).

Difference between full precision and binary vectors

When you use OpenSearch Service for semantic search, you create vector embeddings that you store in OpenSearch. OpenSearch’s k-nearest neighbors (k-NN) plugin provides engines—Facebook AI Similarity Search (FAISS), Non-Metric Space Library (NMSLib), and Apache Lucene—and algorithms—Hierarchical Navigable Small World (HNSW) and Inverted File (IVF)—that store embeddings and compute nearest neighbor matches.

Vector embeddings are high-dimension arrays of 32-bit floating-point numbers (FP32). Large language models (LLMs), foundation models (FMs), and other machine learning (ML) models generate vector embeddings from their inputs. A typical, 384-dimension embedding takes 384 * 4 = 1,536 B. As the number of vectors in the solution grows into the millions (or billions), it is costly to store and work with that much data.

OpenSearch Service supports binary vectors. These vectors use 1 bit to store each dimension. A 384-dimension, binary embedding takes 384 / 8 b = 48 B to store. Of course, in reducing the number of bits, you also lose information. Binary vectors don’t provide recall that is as accurate as full-precision vectors. In trade, binary vectors are substantially less costly and provide substantially better latency.

Hardware acceleration: AVX-512 and popcount instructions

Binary vectors rely on Hamming distance to measure similarity. The Hamming distance between 2-bit strings is the number of positions where corresponding bits differ. The Hamming distance between two binary vectors is the sum of the Hamming distances for the bytes in those vectors. Hamming distance relies on a technique called popcount (population count), which is briefly described in the next section.

For example, for finding the Hamming distance between 5 and 3:

5 = 101
3 = 011
Differences at two positions (bitwise XOR): 101 ⊕ 011 = 110 (2 ones)

Therefore, Hamming distance (5, 3) = 2.

Popcount is an operation that counts the number of 1 bits in a binary input. The Hamming distance between two binary inputs is directly equivalent to calculating the popcount of their bitwise XOR result. The AVX-512 accelerator has a native popcount operation, which makes popcount and Hamming distance calculations fast.

OpenSearch 2.19 integrates advanced Intel AVX-512 instructions in the FAISS engine. When you use binary vectors with OpenSearch 2.19 engine in OpenSearch Service, OpenSearch can maximize performance on the latest Intel Xeon processors. The OpenSearch k-NN plugin with FAISS uses a specialized build mode, avx512_spr, that enhances the Hamming distance computation with the __mm512_popcnt_epi64 vector instruction. __mm512_popcnt_epi64 counts the number of logical 1 bits in eight 64-bit integers at once. This reduces the instruction pathlength—the number of instructions the CPU executes— by eight times. The benchmarks in the next sections demonstrate the improvements seen on OpenSearch binary vectors due to this optimization.

There is no special configuration required to take advantage of the optimization, because it’s enabled by default. The requirements to using the optimization are:

OpenSearch version 2.19 and above
Intel 4th Generation Xeon or newer instances—C7i, M7i, or R7i— for data nodes

Where do binary vector workloads spend the bulk of time?

To put our system through its paces, we created a test dataset of 10 million binary vectors. We chose the Hamming space for measuring distances between vectors because it’s particularly well-suited for binary data. This substantial dataset helped us generate enough stress on the system to pinpoint exactly where performance bottlenecks might occur. If you’re interested in the details, you can find the complete cluster configuration and index settings for this analysis in Appendix 2 at the end of this post.

The following profile analysis of binary vector-based workloads using a flame graph shows that the bulk of time is spent in the FAISS library computing Hamming distances. We observe up to 66% time spent on BinaryIndices in the FAISS library.

Benchmarks and Results

In the next sections, we look at the results of optimizing this logic and the benefits to OpenSearch workloads along two aspects:

Price-performance; with reduced CPU consumption, you might be able to reduce the instances in your domain
Performance gains due to the Intel popcount instruction

Price-performance and TCO gains for OpenSearch users

If you want to take advantage of the performance gains, we recommend the R7i instances, with a high memory:core ratio, for your data nodes. The following table shows the results of benchmarking with a 10-million-vector and 100-million-vector dataset and the resulting improvements on an R7i instance compared to an R5 instance. R5 instances support avx512 instructions, but not the advanced instructions present in avx512_spr. That is only available with R7i and newer Intel instances.

On average, we observed 20% gains on indexing throughput and up to 48% gains on search throughput comparing R5 and R7i instances. R7i instances are about 13% more costly than R5 instances. The price-performance favors the R7is. The 100-million-vector dataset showed slightly better results with search throughput improving more than 40%. In Appendix 1, we document the test configuration, and we present the tabular results in Appendix 3.

The following figures visualize the results with the 10-million-vector dataset.

The following figures visualize the results with the 100-million-vector dataset.

Performance gains due to popcount instruction in AVX-512

This section is for advanced users interested in knowing the extent of improvements the new avx512_spr provides and more details on where the performance gains are coming from. The OpenSearch configuration used in this experiment is documented in Appendix 2.

We ran an OpenSearch benchmark on R7i instances with and without the Hamming distance optimization. You can disable avx512_spr by setting knn.faiss.avx512_spr.disabled in your opensearch.yaml file, as described in SIMD optimization. The data shows that the feature provides a 10% throughput improvement on indexing and search and a 10% reduction in latency if the client load is constant.

The gain is due to the use of __mm512_popcnt_epi64 hardware instruction present on Intel processors, which results in a pathlength reduction for the workloads. The hotspot identified in the earlier section is optimized with code using the hardware instruction. This results in fewer CPU cycles spent to run the same workload and translates to a 10% speed-up for binary vector indexing and latency reduction for search workloads on OpenSearch.

The following figures visualize the benchmarking results.

Conclusion

Improving storage, memory, and compute is key to optimizing vector search. Binary vectors already offer storage and memory benefits over FP32/FP16. This post detailed how our improvements to Hamming distance calculations significantly improve compute performance by up to 48% when comparing R5 and R7i instances on AWS. Whereas binary vectors fall short on matching recall for FP32 counterparts, techniques such as oversampling and rescoring help with improving recall rates. If you’re handling massive datasets, compute costs become a major expense. By migrating to Intel’s R7i and newer offerings on AWS, we’ve demonstrated substantial reductions in infrastructure costs, making these processors a highly efficient solution for users.

Hamming distance with newer AVX-512 instructions support is available on OpenSearch starting with 2.19 and later. We encourage you to give it a try on the latest Intel instances in your preferred cloud environment.

The new instructions also provide additional opportunities to use hardware acceleration in other areas of vector search, such as quantization techniques of FP16 and BF16. We are also interested in exploring the use of other hardware accelerators to vector search, such as AMX and AVX-10.

About the Authors

Akash Shankaran is a Software Architect and Tech Lead in the Xeon software team at Intel. He works on pathfinding opportunities and enabling optimizations on OpenSearch.

Mulugeta Mammo is a Senior Software Engineer and currently leads the OpenSearch Optimization team at Intel.

Noah Staveley is a Cloud Development Engineer currently working in the OpenSearch Optimization team at Intel.

Assane Diop is a Cloud Development Engineer, and currently works in the OpenSearch Optimization team at Intel.

Naveen Tatikonda is a software engineer at AWS, working on the OpenSearch Project and Amazon OpenSearch Service. His interests include distributed systems and vector search.

Vamshi Vijay Nakkirtha is a software engineering manager working on the OpenSearch Project and Amazon OpenSearch Service. His primary interests include distributed systems.

Dylan Tong is a Senior Product Manager at Amazon Web Services. He leads the product initiatives for AI and machine learning (ML) on OpenSearch including OpenSearch’s vector database capabilities. Dylan has decades of experience working directly with customers and creating products and solutions in the database, analytics and AI/ML domain. Dylan holds a BSc and MEng degree in Computer Science from Cornell University.

Notices and disclaimers

Intel and the OpenSearch team collaborated on adding the Hamming distance feature. Intel contributed by designing and implementing the feature, and Amazon contributed by updating the toolchain, including compilers, release management, and documentation. Both teams collected data points showcased in the post.

Performance varies by use, configuration, and other factors. Learn more on the Performance Index website.

Your costs and results may vary.

Intel technologies might require enabled hardware, software, or service activation.

Appendix 1

The following table summarizes the test configuration for results in Appendix 3.

	`avx512`	`avx512_spr`
vector dimension	768
ef_construction	100
ef_search	100
primary shards	8
replica	1
data nodes	2
data node instance type	R5.4xl	R7i.4xl
vCPU	16
Cluster manager nodes	3
Cluster manager node instance type	c5.xl
data type	binary
space type	Hamming

Appendix 2

The following table summarizes the OpenSearch configuration used for this benchmarking.

	`avx512`	`avx512_spr`
OpenSearch version	2.19
engine	faiss
dataset	random-768-10M
vector dimension	768
ef_construction	256
ef_search	256
primary shards	4
replica	1
data nodes	2
cluster manager nodes	1
data node instance type	R7i.2xl
client instance	m6id.16xlarge
data type	binary
space type	Hamming
Indexing clients	20
query clients	20
force merge segments	1

Appendix 3

This appendix contains the results of the 10-million-vector and 100-million-vector dataset runs.

The following table summarizes the query results in queries per second (QPS).

				Query Throughput Without Forcemerge		Query Throughput with Forcemerge to 1 Segment
Dataset	Dimension	`avx512` / `avx512_spr`	Query Clients	Mean Throughput	Median Throughput	Mean Throughput	Median Throughput
random-768-10M	768	`avx512`	10	397.00	398.00	1321.00	1319.00
random-768-10M	768	`avx512_spr`	10	516.00	525.00	1542.00	1544.00
%gain	–	–	–	29.97	31.91	16.73	17.06

random-768-10M	768	`avx512`	20	424.00	426.00	1849.00	1853.00
random-768-10M	768	`avx512_spr`	20	597.00	600.00	2127.00	2127.00
%gain	–	–	–	40.81	40.85	15.04	14.79

random-768-100M	768	`avx512`	10	219	220	668	668
random-768-100M	768	`avx512_spr`	10	324	324	879	887
%gain	–	–	–	47.95	47.27	31.59	32.78
random-768-100M	768	`avx512`	20	234	235	756	757
random-768-100M	768	`avx512_spr`	20	338	339	1054	1062
%gain	–	–	–	44.44	44.26	39.42	40.29

The following table summarizes the indexing results.

				Indexing Throughput (documents/second)
Dataset	Dimension	`avx512` / `avx512_spr`	Indexing Clients	Mean Throughput	Median Throughput	Forcemerge (minutes)
random-768-10M	768	`avx512`	20	58729	57135	61
random-768-10M	768	`avx512_spr`	20	63595	65240	57
%gain	–	–		8.29	14.19	7.02

random-768-100M	768	`avx512`	16	28006	25381	682
random-768-100M	768	`avx512_spr`	16	33477	30581	634
%gain	–	–		19.54	20.49	7.04

Accelerate the transfer of data from an Amazon EBS snapshot to a new EBS volume

2025-05-07 Channy Yun (윤석찬)

Post Syndicated from Channy Yun (윤석찬) original https://aws.amazon.com/blogs/aws/accelerate-the-transfer-of-data-from-an-amazon-ebs-snapshot-to-a-new-ebs-volume/

Today we are announcing the general availability of Amazon Elastic Block Store (Amazon EBS) Provisioned Rate for Volume Initialization, a feature that accelerates the transfer of data from an EBS snapshot, a highly durable backup of volumes stored in Amazon Simple Storage Service (Amazon S3) to a new EBS volume.

With Amazon EBS Provisioned Rate for Volume Initialization, you can create fully performant EBS volumes within a predictable amount of time. You can use this feature to speed up the initialization of hundreds of concurrent volumes and instances. You can also use this feature when you need to recover from an existing EBS Snapshot and need your EBS volume to be created and initialized as quickly as possible. You can use this feature to quickly create copies of EBS volumes with EBS Snapshots in a different Availability Zone, AWS Region, or AWS account. Provisioned Rate for Volume Initialization for each volume is charged based on the full snapshot size and the specified volume initialization rate.

This new feature expedites the volume initialization process by fetching the data from an EBS Snapshot to an EBS volume at a consistent rate that you specify between 100 MiB/s and 300 MiB/s. You can specify this volume initialization rate at which the snapshot blocks are to be downloaded from Amazon S3 to the volume.

With specifying the volume initialization rate, you can create a fully performant volume in a predictable time, enabling increased operational efficiency and visibility on the expected time of completion. If you run utilities like fio/dd to expedite volume initialization for your workflows like application recovery and volume copy for testing and development, it will remove the operational burden of managing such scripts with the consistency and predictability to your workflows.

Get started with specifying the volume initialization rate
To get started, you can choose the volume initialization rate when you launch your EC2 instance or create your volume from the snapshot.

1. Create a volume in the EC2 launch wizard
When launching new EC2 instances in the launch wizard of EC2 console, you can enter a desired Volume initialization rate in the Storage (volumes) section.

You can also set the volume initialization rate when creating and modifying the EC2 Launch Templates.

In the AWS Command Line Interface (AWS CLI), you can add VolumeInitializationRate parameter to the block device mappings when call run-instances command.

aws ec2 run-instances \
    --image-id ami-0abcdef1234567890 \
    --instance-type t2.micro \
    --subnet-id subnet-08fc749671b2d077c \
    --security-group-ids sg-0b0384b66d7d692f9 \
    --key-name MyKeyPair \
    --block-device-mappings file://mapping.json

Contents of mapping.json. This example adds /dev/sdh an empty EBS volume with a size of 8 GiB.

[
    {
        "DeviceName": "/dev/sdh",
        "Ebs": {
            "VolumeSize": 8
            "VolumeType": "gp3",            
            "VolumeInitializationRate": 300
		 } 
     } 
]

To learn more, visit block device mapping options, which defines the EBS volumes and instance store volumes to attach to the instance at launch.

2. Create a volume from snapshots
When you create a volume from snapshots, you can also choose Create volume in the EC2 console and specify the Volume initialization rate.

Confirm your new volume with the initialization rate.

In the AWS CLI, you can use VolumeInitializationRate parameter and when calling create-volume command.

aws ec2 create-volume --region us-east-1 --cli-input-json '{
    "AvailabilityZone": "us-east-1a",
    "VolumeType": "gp3",
    "SnapshotId": "snap-07f411eed12ef613a",
    "VolumeInitializationRate": 300
}'

If the command is run successfully, you will receive the result below.

{
    "AvailabilityZone": "us-east-1a",
    "CreateTime": "2025-01-03T21:44:53.000Z",
    "Encrypted": false,
    "Size": 100,
    "SnapshotId": "snap-07f411eed12ef613a",
    "State": "creating",
    "VolumeId": "vol-0ba4ed2a280fab5f9",
    "Iops": 300,
    "Tags": [],
    "VolumeType": "gp2",
    "MultiAttachEnabled": false,
    "VolumeInitializationRate": 300
}

You can also set the volume initialization rate when replacing root volumes of EC2 instances and provisioning EBS volumes using the EBS Container Storage Interface (CSI) driver.

After creation of the volume, EBS will keep track of the hydration progress and publish an Amazon EventBridge notification for EBS to your account when the hydration completes so that they can be certain when their volume is fully performant.

To learn more, visit Create an Amazon EBS volume and Initialize Amazon EBS volumes in the Amazon EBS User Guide.

Now available
Amazon EBS Provisioned Rate for Volume Initialization is now available and supported for all EBS volume types today. You will be charged based on the full snapshot size and the specified volume initialization rate. To learn more, visit Amazon EBS Pricing page.

To learn more about Amazon EBS including this feature, take the free digital course on the AWS Skill Builder portal. Course includes use cases, architecture diagrams and demos.

Give this feature a try in the Amazon EC2 console today and send feedback to AWS re:Post for Amazon EBS or through your usual AWS Support contacts.

— Channy

How is the News Blog doing? Take this 1 minute survey!

AWS Weekly Roundup: Amazon Nova Premier, Amazon Q Developer, Amazon Q CLI, Amazon CloudFront, AWS Outposts, and more (May 5, 2025)

2025-05-05 Donnie Prakoso

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-amazon-nova-premier-amazon-q-developer-amazon-q-cli-amazon-cloudfront-aws-outposts-and-more-may-5-2025/

Last week I went to Thailand to attend the AWS Summit Bangkok. It was an energizing and exciting event. We hosted the Developer Lounge, where developers can meet, discuss ideas, enjoy lightning talks, win SWAGs at AWS Builder ID Prize Wheel, take a challenge at Amazon Q Developer Coding Challenge, or learn Generative AI at Learn Amazon Bedrock booth.

Here’s a quick look:

Thank you to AWS Heroes, AWS Community Builders, AWS User Group leaders and developers for your collaboration.

Coming up next in ASEAN is AWS Summit Singapore—make sure you don’t miss it by registering now.

Last Week’s Launches
Here are some launches last week that caught my attention:

Amazon Nova Premier Now Generally Available — Amazon Nova Premier, our most capable model for complex tasks and teacher for model distillation, is now generally available in Amazon Bedrock. It excels at complex tasks requiring deep context understanding and multistep planning, while processing text, images, and videos with a 1M token context length. With Nova Premier and Amazon Bedrock Model Distillation, you can create highly capable, cost-effective, and low-latency versions of Nova Pro, Lite, and Micro, for your specific needs.

Amazon Q Developer elevates the IDE experience with new agentic coding experience — This new interactive, agentic coding experience for Visual Studio Code allows Q Developer to intelligently take actions on behalf of the developer. Amazon Q Developer introduces an interactive coding experience in Visual Studio Code, offering real-time collaboration for coding, documentation, and testing. It provides transparent reasoning, and supports automated or step-by-step changes in multiple languages.

New Foundation Models in Amazon Bedrock — Amazon Bedrock expands its model offerings with two significant additions:
- Writer’s Palmyra X5 and X4 models feature extensive context windows (1M and 128K tokens respectively) and excel in complex reasoning for enterprise applications. They support multistep tool-calling and adaptive thinking with high reliability standards.
- Meta’s Llama 4 Scout 17B and Maverick 17B models offer natively multimodal capabilities using mixture-of-experts architecture for enhanced reasoning and image understanding. They support multiple languages and extended context processing, with simplified integration through the Bedrock Converse API.
Second-Generation AWS Outposts Racks Released — AWS announces the general availability of second-generation Outposts racks with significant enhancements including the latest x86 EC2 instances, simplified networking, and accelerated networking options. These improvements deliver doubled vCPU, memory, and network bandwidth, 40% better performance, and support for ultra-low latency workloads, making them ideal for demanding on-premises deployments.

Amazon CloudFront SaaS Manager Launches — Amazon CloudFront SaaS Manager helps SaaS providers and web hosting platforms efficiently manage content delivery across multiple customer domains. The service dramatically reduces operational complexity while providing high-performance content delivery and enterprise-grade security for every customer domain.

Extend the Amazon Q Developer CLI with Model Context Protocol (MCP) for Richer Context | Amazon Web Services — Amazon Q Developer CLI now supports Model Context Protocol (MCP), enabling integration with external data sources for context-aware responses. This give developers the ability to connect pre-built integrations or MCP Servers supporting stdio, enhancing code accuracy, data understanding, and query execution. The feature streamlines development tasks and will be extended to Amazon Q Developer IDE plugins soon.

Amazon Aurora Now Supports PostgreSQL 17 — Amazon Aurora now supports PostgreSQL 17.4, offering community improvements and Aurora-specific enhancements like optimized memory management and faster failovers. The release includes new features for Babelfish, security fixes, and updated extensions, available in all AWS Regions.
CloudWatch Introduces Tiered Pricing for Lambda Logs — Amazon CloudWatch launches tiered pricing for AWS Lambda logs and new delivery destinations. Pricing in US East starts at $0.50/GB for CloudWatch and $0.25/GB for S3 and Firehose, both tiering down to $0.05/GB. This update enhances flexibility in log management across all supporting Regions.
RDS for MySQL Updates Minor Versions — Amazon RDS for MySQL now supports minor versions 8.0.42 and 8.4.5, delivering security fixes, bug fixes, and performance improvements. Users can upgrade automatically during maintenance windows or use Blue/Green deployments for safer updates.
Amazon Bedrock Model Distillation Generally Available — Amazon Bedrock Model Distillation is now generally available, supporting new models like Amazon Nova and Claude 3.5. It enables smaller models to accurately predict function calling for Agents, delivering up to 500% faster responses and 75% lower costs with minimal accuracy loss for RAG use cases. The service includes automated workflows for data synthesis and student model training.
AI Search Flow Builder for Amazon OpenSearch Service — Amazon OpenSearch Service now offers an AI search flow builder for OpenSearch 2.19+ domains. This low-code designer enables creation of sophisticated AI-enhanced search flows using AWS and third-party services, supporting use cases like RAG, query rewriting, and semantic encoding.

From Community.AWS
Here’s my personal favorites posts from community.aws:

How to Generate AWS Architecture Diagrams Using Amazon Q CLI and MCP — Omshree Butani demonstrates how to quickly generate AWS Architecture Diagrams using Amazon Q CLI and Model Context Protocol (MCP), streamlining the architecture design process.
Implementing Nova Act MCP Server on ECS Fargate — Vivek V details implementing an Amazon Nova Act Model Context Protocol (MCP) server on ECS Fargate for browser automation. The solution includes architecture designs, deployment strategies, server/client implementation, Streamlit UI, AWS CDK infrastructure, and VS Code integration.
Leveraging Crossplane to build single-tenant SaaS control planes on top of Kubernetes — Yehuda Cohen explores leveraging Crossplane to build single-tenant SaaS control planes on Kubernetes. The article highlights how Crossplane extends Kubernetes’ declarative model to manage non-Kubernetes resources, enabling automated tenant provisioning and scalable cloud resource management.
How to Securely Display Objects from an S3 Bucket in a Browser — Osabutey-Anikon Theeophilus Lloyd shares techniques for securely displaying objects from Amazon S3 buckets in web browsers, focusing on proper security measures for browser-based access.

Upcoming AWS events
Check your calendars and sign up for these upcoming AWS events:

AWS Summit — Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Poland (6 May), Bengaluru (May 7 – 8), Hong Kong (May 8), Seoul (May 14-15), Singapore (May 29), and Sydney (June 4–5).
AWS re:Inforce – Mark your calendars for AWS re:Inforce (June 16–18) in Philadelphia, PA. AWS re:Inforce is a learning conference focused on AWS security solutions, cloud security, compliance, and identity. You can subscribe for event updates now!
AWS Partners Events – You’ll find a variety of AWS Partner events that will inspire and educate you, whether you are just getting started on your cloud journey or you are looking to solve new business challenges.
AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Yerevan, Armenia (May 24), Zurich, Switzerland (May 25), and Bengaluru, India (May 25).

You can browse all upcoming in-person and virtual events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

How is the News Blog doing? Take this 1 minute survey!

Amazon Q Developer in GitHub (in preview) accelerates code generation

2025-05-05 Matheus Guimaraes

Post Syndicated from Matheus Guimaraes original https://aws.amazon.com/blogs/aws/amazon-q-developer-in-github-now-in-preview-with-code-generation-review-and-legacy-transformation-capabilities/

Starting today, you can now use Amazon Q Developer in GitHub in preview! This is fantastic news for the millions of developers who use GitHub on a daily basis, whether at work or for personal projects. They can now use Amazon Q Developer for feature development, code reviews, and Java code migration directly within the GitHub interface.

To demonstrate, I’m going to use Amazon Q Developer to help me create an application from zero called StoryBook Teller. I want this to be an ASP.Core website using .NET 9 that takes three images from the user and uses Amazon Bedrock with Anthropic’s Claude to generate a story based on them.

Let me show you how this works.

Installation

The first thing you need to do is install the Amazon Q Developer application in GitHub, and you can begin using it immediately without connecting to an AWS account.

You’ll then be presented with a choice to add it to all your repositories or select specific ones. In this case, I want to add it to my storybook-teller-demo repo, so I choose Only selected repositories and type in the name to find it.

This is all you need to do to make the Amazon Q Developer app ready to use inside your selected repos. You can verify that the app is installed by navigating to your GitHub account Settings and the app should be listed in the Applications page.

You can choose Configure to view permissions and add Amazon Q Developer to repositories or remove it at any time.

Now let’s use Amazon Q Developer to help us build our application.

Feature development
When Amazon Q Developer is installed into a repository, you can assign GitHub issues to the Amazon Q development agent to develop features for you. It will then generate code using the whole codebase in your repository as context as well as the issue’s description. This is why it’s important to list your requirements as accurately and clearly as possible in your GitHub issues, the same way that you should always strive for anyway.

I have created five issues in my StoryBook Teller repository that cover all my requirements for this app, from creating a skeleton .NET 9 project to implementing frontend and backend.

Let’s use Amazon Q Developer to develop the application from scratch and help us implement all these features!

To begin with, I want Amazon Q Developer to help me create the .NET project. To do this, I open the first issue, and in the Labels section, I find and select Amazon Q development agent.

That’s all there is to it! The issue is now assigned to Amazon Q Developer. After the label is added, the Amazon Q development agent automatically starts working behind the scenes providing progress updates through the comments, starting with one saying, I'm working on it.

As you might expect, the amount of time it takes will depend on the complexity of the feature. When it’s done, it will automatically create a pull request with all the changes.

The next thing I want to do is make sure that the generated code works, so I’m going to download the code changes and run the app locally on my computer.

I go to my terminal and type git fetch origin pull/6/head:pr-6 to get the code for the pull request it created. I double-check the contents and I can see that I do indeed have an ASP.Core project generated using .NET 9, as I expected.

I then run dotnet run and open the app with the URL given in the output.

Brilliant, it works! Amazon Q Developer took care of implementing this one exactly as I wanted based on the requirements I provided in the GitHub issue. Now that I have tested that the app works, I want to review the code itself before I accept the changes.

Code review
I go back to GitHub and open the pull request. I immediately notice that Amazon Q Developer has performed some automatic checks on the generated code.

This is great! It has already done quite a bit of the work for me. However, I want to review it before I merge the pull request. To do that, I navigate to the Files changed tab.

I review the code, and I like what I see! However, looking at the contents of .gitignore, I notice something that I want to change. I can see that Amazon Q Developer made good assumptions and added exclusion rules for Visual Studio (VS) Code files. However, JetBrains Rider is my favorite integrated development environment (IDE) for .NET development, so I want to add rules for it, too.

You can ask Amazon Q Developer to reiterate and make changes by using the normal code review flow in the GitHub interface. In this case, I add a comment to the .gitignore code saying, add patterns to ignore Rider IDE files. I then choose Start a review, which will queue the change in the review.

I select Finish your review and Request changes.

Soon after I submit the review, I’m redirected to the Conversation tab. Amazon Q Developer starts working on it, resuming the same feedback loop and encouraging me to continue with the review process until I’m satisfied.

Every time Q Developer makes changes, it will run the automated checks on the generated code. In this case, the code was somewhat straightforward, so it was expected that the automatic code review wouldn’t raise any issues. But what happens if we have more complex code?

Let’s take another example and use Amazon Q Developer to implement the feature for enabling image uploads on the website. I use the same flow I described in the previous section. However, I notice that the automated checks on the pull request flagged a warning this time, stating that the API generated to support image uploads on the backend is missing authorization checks effectively allowing direct public access. It explains the security risk in detail and provides useful links.

It then automatically generates a suggested code fix.

When it’s done, you can review the code and choose to Commit changes if you’re happy with the changes.

After fixing this and testing it, I’m happy with the code for this issue and move on applying the same process to other ones. I assign the Amazon Q development agent to each one of my remaining issues, wait for it to generate the code, and go through the iterative review process asking it to fix any issues for me along the way. I then test my application at the end of that software cycle and am very pleased to see that Amazon Q Developer managed to handle all issues, from project setup, to boilerplate code, to more complex backend and frontend. A true full-stack developer!

I did notice some things that I wanted to change along the way. For example, it defaulted to using the Invoke API to send the uploaded images to Amazon Bedrock instead of the Converse API. However, because I didn’t state this in my requirements, it had no way of knowing. This highlights the importance of being as precise as possible in your issue’s titles and descriptions to give Q Developer the necessary context and make the development process as efficient as possible.

Having said that, it’s still straightforward to review the generated code on the pull requests, add comments, and let the Amazon Q Developer agent keep working on changes until you’re happy with the final result. Alternatively, you can accept the changes in the pull request and create separate issues that you can assign to Q Developer later when you’re ready to develop them.

Code transformation
You can also transform legacy Java codebases to modern versions with Q Developer. Currently, it can update applications from Java 8 or Java 11 to Java 17, with more options coming in future releases.

The process is very similar to the one I demonstrated earlier in this post, except for a few things.

First, you need to create an issue within a GitHub repository containing a Java 8 or Java 11 application. The title and description don’t really matter in this case. It might even be a short title such as “Migration,” leaving the description empty. Then, on Labels, you assign the Amazon Q transform agent label to the issue.

Much like before, Amazon Q Developer will start working immediately behind the scenes before generating the code on a pull request that you can review. This time, however, it’s the Amazon Q transform agent doing the work which is specialized in code migration and will take all the necessary steps to analyze and migrate the code from Java 8 to Java 17.

Notice that it also needs a workflow to be created, as per the documentation. If you don’t have it enabled yet, it will display clear instructions to help you get everything set up before trying again.

As expected, the amount of time needed to perform a migration depends on the size and complexity of your application.

Conclusion
Using Amazon Q Developer in GitHub is like having a full-stack developer that you can collaborate with to develop new features, accelerate the code review process, and rely on to enhance the security posture and quality of your code. You can also use it to automate migration from Java 8 and 11 applications to Java 17 making it much easier to get started on that migration project that you might have been postponing for a while. Best of all, you can do all this from the comfort of your own GitHub environment.

Now available
You can now start using Amazon Q Developer today for free in GitHub, no AWS account setup needed.

Amazon Q Developer in GitHub is currently in preview.

— Matheus Guimaraes | codingmatheus

How is the News Blog doing? Take this 1 minute survey!

Amazon Q Developer elevates the IDE experience with new agentic coding experience

2025-05-02 Elizabeth Fuentes

Post Syndicated from Elizabeth Fuentes original https://aws.amazon.com/blogs/aws/amazon-q-developer-elevates-the-ide-experience-with-new-agentic-coding-experience/

Today, Amazon Q Developer introduces a new, interactive, agentic coding experience that is now available in the integrated development environments (IDE) for Visual Studio Code. This experience brings interactive coding capabilities, building upon existing prompt-based features. You now have a natural, real-time collaborative partner working alongside you while writing code, creating documentation, running tests, and reviewing changes.

Amazon Q Developer transforms how you write and maintain code by providing transparent reasoning for its suggestions and giving you the choice between automated modifications or step-by-step confirmation of changes. As a daily user of Amazon Q Developer command line interface (CLI) agent, I’ve experienced firsthand how Amazon Q Developer chat interface makes software development a more efficient and intuitive process. Having an AI-powered assistant only a q chat away in CLI has streamlined my daily development workflow, enhancing the coding process.

The new agentic coding experience in Amazon Q Developer in the IDE seamlessly interacts with your local development environment. You can read and write files directly, execute bash commands, and engage in natural conversations about your code. Amazon Q Developer comprehends your codebase context and helps complete complex tasks through natural dialog, maintaining your workflow momentum while increasing development speed.

Let’s see it in action
To begin using Amazon Q Developer for the first time, follow the steps in the Getting Started with Amazon Q Developer guide to access Amazon Q Developer. When using Amazon Q Developer, you can choose between Amazon Q Developer Pro, a paid subscription service, or Amazon Q Developer Free tier with AWS Builder ID user authentication.

For existing users, update to the new version. Refer to Using Amazon Q Developer in the IDE for activation instructions.

To start, I select the Amazon Q icon in my IDE to open the chat interface. For this demonstration, I’ll create a web application that transforms Jupiter notebooks from the Amazon Nova sample repository into interactive applications.

I send the following prompt: In a new folder, create a web application for video and image generation that uses the notebooks from multimodal-generation/workshop-sample as examples to create the applications. Adapt the code in the notebooks to interact with models. Use existing model IDs

Amazon Q Developer then examines the ﬁles: the README file, notebooks, notes, and everything that is in the folder where the conversation is positioned. In our case it’s at the root of the repository.

After completing the repository analysis, Amazon Q Developer initiates the application creation process. Following the prompt requirements, it requests permission to execute the bash command for creating necessary folders and files.

With the folder structure in place, Amazon Q Developer proceeds to build the complete web application.

In a few minutes, the application is complete. Amazon Q Developer provides the application structure and deployment instructions, which can be converted into a README file upon request in the chat.

During my initial attempt to run the application, I encountered an error. I described it in Spanish using Amazon Q chat.

Amazon Q Developer responded in Spanish and gave me the solutions and code modifications in Spanish! I loved it!

After implementing the suggested fixes, the application ran successfully. Now I can create, modify, and analyze images and videos using Amazon Nova through this newly created interface.

The preceding images showcase my application’s output capabilities. Because I asked to modify the video generation code in Spanish, it gave me the message in Spanish.

Things to know
Chatting in natural languages – Amazon Q Developer IDE supports many languages, including English, Mandarin, French, German, Italian, Japanese, Spanish, Korean, Hindi, and Portuguese. For detailed information, visit the Amazon Q Developer User Guide page.

Collaboration and understanding – The system examines your repository structure, files, and documentation while giving you the flexibility to interact seamlessly through natural dialog with your local development environment. This deep comprehension allows for more accurate and contextual assistance during development tasks.

Control and transparency – Amazon Q Developer provides continuous status updates as it works through tasks and lets you choose between automated code modifications or step-by-step review, giving you complete control over the development process.

Availability – Amazon Q Developer interactive, agentic coding experience is now available in the IDE for Visual Studio Code.

Pricing – Amazon Q Developer agentic chat is available in the IDE at no additional cost to both Amazon Q Developer Pro Tier and Amazon Q Developer Free tier users. For detailed pricing information, visit the Amazon Q Developer pricing page.

To learn more about getting started visit the Amazon Q Developer product web page.

— Eli

How is the News Blog doing? Take this 1 minute survey!

Amazon Nova Premier: Our most capable model for complex tasks and teacher for model distillation

2025-05-01 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/amazon-nova-premier-our-most-capable-model-for-complex-tasks-and-teacher-for-model-distillation/

Today we’re expanding the Amazon Nova family of foundation models announced at AWS re:Invent with the general availability of Amazon Nova Premier, our most capable model for complex tasks and teacher for model distillation.

Nova Premier joins the existing Amazon Nova understanding models available in Amazon Bedrock. Similar to Nova Lite and Pro, Premier can process input text, images, and videos (excluding audio). With its advanced capabilities, Nova Premier excels at complex tasks that require deep understanding of context, multistep planning, and precise execution across multiple tools and data sources. With a context length of one million tokens, Nova Premier can process extremely long documents or large code bases.

With Nova Premier and Amazon Bedrock Model Distillation, you can create highly capable, cost-effective, and low-latency versions of Nova Pro, Lite, and Micro, for your specific needs. For example, we used Nova Premier to distill Nova Pro for complex tool selection and API calling. The distilled Nova Pro had a 20% higher accuracy for API invocations compared to the base model and consistently matched the performance of the teacher, with the speed and cost benefits of Nova Pro.

Amazon Nova Premier benchmark evaluation
We evaluated Nova Premier on a broad range of benchmarks across text intelligence, visual intelligence, and agentic workflows. Nova Premier is the most capable model in the Nova family as measured across 17 benchmarks as shown in the table below.

Nova Premier is also comparable to the best non-reasoning models in the industry and is equal or better on approximately half of these benchmarks when compared to other models in the same intelligence tier. Details of these evaluations are in the technical report.

Nova Premier is also the fastest and the most cost-effective model in Amazon Bedrock for its intelligence tier. For further details and comparison on pricing, please refer to the Bedrock pricing page.

Nova Premier can also be used as a teacher model for distillation, which means you can transfer its advanced capabilities for a specific use case into smaller, faster, and more efficient models like Nova Pro, Micro, and Lite for production deployments.

Using Amazon Nova Premier
To get started with Nova Premier, you first need to request access to the model in the Amazon Bedrock console. Navigate to Model access in the navigation pane, find Nova Premier, and toggle access.

Once you have access, you can use Nova Premier through the Amazon Bedrock Converse API providing in input a list of messages from the user and the assistant. Messages can include text, images, and videos. Here’s an example of a straightforward invocation using the AWS SDK for Python (Boto3):

import boto3
import json

AWS_REGION = "us-east-1"
MODEL_ID = "us.amazon.nova-premier-v1:0"

bedrock_runtime = boto3.client('bedrock-runtime', region_name=AWS_REGION)
messages = [
    {
        "role": "user",
        "content": [
            {
                "text": "Explain the differences between vector databases and traditional relational databases for AI applications."
            }
        ]
    }
]

response = bedrock_runtime.converse(
    modelId=MODEL_ID,
    messages=messages
)

response_text = response["output"]["message"]["content"][-1]["text"]

print(response_text)

This example shows how Nova Premier can provide detailed explanations for complex technical questions. But the real power of Premier comes with its ability to handle sophisticated workflows.

Multi-agent collaboration use case
Let’s explore a more complex scenario that showcases how Nova Premier works a multi-agent collaboration architecture for investment research.

The equity research process typically involves multiple stages: identifying relevant data sources for specific investments, retrieving required information from those sources, and synthesizing the data into actionable insights. This process becomes increasingly complex when dealing with different types of financial instruments like stock indices, individual equities, and currencies.

We can build this type of application using multi-agent collaboration in Amazon Bedrock, with Nova Premier powering the supervisor agent that orchestrates the entire workflow. The supervisor agent analyzes the initial query (for example, “What are the emerging trends in renewable energy investments?”), breaks it down into logical steps, determines which specialized subagents to engage, and synthesizes the final response.

For this scenario, I’ve created a system with the following components:

A supervisor agent powered by Nova Premier
Multiple specialized subagents powered by Nova Pro, each focusing on different financial data sources
Tools that connect to financial databases, market analysis tools, and other relevant information sources

When I submit a query about emerging trends in renewable energy investments, the supervisor agent powered by Nova Premier does the following:

Analyzes the query to determine the underlying topics and sources to cover
Selects the appropriate subagents specific to those topics and sources
Each subagent retrieves their relevant economic indicators, technical analysis, and market sentiment data
The supervisor agent synthesizes this information into a comprehensive report for review by a financial professional

Utilizing Nova Premier in a multi-agent collaboration architecture such as this streamlines the financial professional’s work and helps them formulate their investment analysis faster. The following video provides a visual description of this scenario.

The key advantage of using Nova Premier for the supervisor role is its accuracy in coordinating complex workflows, so that the right data sources are consulted in the optimal sequence and each subagent receives in input the correct information for their work, resulting in higher quality insights.

Multi-agent collaboration with model distillation
Although Nova Premier provides the highest level of accuracy of its family of models, you might want to optimize latency and cost in production environments. This is where the strength of Nova Premier as a teacher model for distillation becomes interesting. Using Amazon Bedrock Model Distillation, we can customize Nova Micro from the results of Nova Premier for this specific investment research use case.

Unlike traditional fine-tuning that requires human feedback and labeled examples, with model distillation you can generate high-quality training data by having a teacher model produce the desired outputs, streamlining the data acquisition process.

The process to distill a model involves:

Generating synthetic training data by capturing input and output from Nova Premier runs across multiple financial instruments
Using this data as a reference to train a customized version of Nova Micro through custom fine-tuning tools
Evaluating the difference in latency and performance of the customized Micro model
Deploying the customized Micro model as the supervisor agent in production

With Amazon Bedrock, you can further streamline the process and use invocation logs for data preparation. To do that, you need to set the model invocation logging on and set up an Amazon Simple Storage Service (Amazon S3) bucket as the destination for the logs.

Customer voices
Some of our customers had early access to Nova Premier. This is what they shared with us:

“Amazon Nova Premier has been outstanding in its ability to execute interactive analysis workflows, while still being faster and nearly half the cost compared to other leading models in our tests,” said Curtis Allen, Senior Staff Engineer at Slack, a company bringing conversations, apps, and customers together in one place.

“Implementing new solutions built on top of Amazon Nova has helped us with our mission of democratizing finance for all,” said Dev Tagare, Head of AI and Data at Robinhood Markets, a company on a mission to democratize finance for all. “We’re particularly excited about the ability to explore new avenues like complex multi-agent collaborations that are not just highly performing but also cost effective and fast. The intelligence of Nova Premier and what it can transfer to the other models like Nova Micro, Nova Lite, and Nova Pro unlocks multi-agent collaboration at a performance, price, and speed that will make it accessible to everyday customers.”

“Accelerating real-world AI deployments—not just prototypes—requires the ability to build models that are specialized for the unique needs of real world applications,” said Henry Ehrenberg, co-founder of Snorkel AI, a technology company that empowers data scientists and developers to quickly turn data into accurate and adaptable AI applications. “We’re excited to see AWS pushing efficient model customization forward with Amazon Bedrock Model Distillation and Amazon Nova Premier. These new model capabilities have the potential to accelerate our enterprise customers in building production AI applications, including Q&A applications with multimodal data and more.”

Things to know

Nova Premier is available in Amazon Bedrock in the US East (N. Virginia), US East (Ohio), and US West (Oregon) AWS Regions today via cross-Region inference. With Amazon Bedrock, you only pay for what you use. For more information, visit Amazon Bedrock pricing.

Customers in the US can also access Amazon Nova models at https://nova.amazon.com, a website to easily explore our FMs.

Nova Premier is our best teacher for distilling custom variants of Nova Pro, Micro, and Lite, which means you can capture the capabilities offered by Premier in smaller, faster models for production deployment.

Nova Premier includes built-in safety controls to promote responsible AI use, with content moderation capabilities that help maintain appropriate outputs across a wide range of applications.

To get started with Nova Premier, visit the Amazon Bedrock console today. For more information, see the Amazon Nova User Guide and send feedback to AWS re:Post for Amazon Bedrock. Explore the generative AI section of our community.aws site to see how our Builder communities are using Amazon Bedrock in their solutions.

– Danilo

How is the News Blog doing? Take this 1 minute survey!

Unified scheduling for visual ETL flows and query books in Amazon SageMaker Unified Studio

2025-05-01 Noritaka Sekiyama

Post Syndicated from Noritaka Sekiyama original https://aws.amazon.com/blogs/big-data/unified-scheduling-for-visual-etl-flows-and-query-books-in-amazon-sagemaker-unified-studio/

Data engineers and analysts often need to automate their data processing workflows and queries to maintain up-to-date data pipelines and reports. Amazon SageMaker Unified Studio provides a unified environment for data, analytics, machine learning (ML), and AI workloads. Amazon SageMaker Unified Studio provides powerful tools for visual extract, transform, and load (ETL) flows and query books. Until today, scheduling these workflows has required additional setup and infrastructure.

Today, we’re excited to introduce a new unified scheduling feature that simplifies this process. SageMaker Unified Studio allows you to create ETL flows using a visual interface and write SQL analytics queries using query books. This new unified scheduling feature allows you to schedule your visual ETL flows and query books directly from SageMaker Unified Studio within the same interface, eliminating the need for visiting other consoles or complex configurations. Using Amazon EventBridge Scheduler, this feature provides a seamless and easy-to-use scheduling experience.

In this post, we walk through how to schedule your visual ETL flows and query books with just a few clicks, explore the underlying architecture, and demonstrate how this feature can streamline your data workflow automation.

Feature overview

SageMaker Unified Studio unified scheduling is built on top of EventBridge Scheduler and Amazon SageMaker Training. When you configure a new schedule from SageMaker Unified Studio, a new EventBridge schedule is automatically created in your AWS account. The EventBridge schedule is configured with the SageMaker CreateTrainingJob API. The SageMaker Training job runs visual ETL flows or query books.

The following diagram illustrates how it works.

Prerequisites

To run the instruction, you must have the following prerequisites:

An AWS account
A SageMaker Unified Studio domain
A SageMaker Unified Studio project with a All capabilities profile. This profile includes Tooling blueprint in which Scheduling is enabled by default. If scheduling is disabled, you may need to update your project’s profile.

Schedule a visual ETL flow

Complete the following steps to configure a schedule on a visual ETL flow:

On the SageMaker Unified Studio console, on the top menu, choose Build.
Under DATA ANALYSIS & INTEGRATION, choose Visual ETL flows.
For Select or create project to continue, select your project, and choose Continue.
Choose your visual ETL flow. If you don’t have any visual ETL flows, refer to Author visual ETL flows on Amazon SageMaker Unified Studio to create a new visual ETL flow.
Choose the Schedule icon.
For Schedule name, enter a unique name (for example, everyday).
For Schedule Type, select Recurring.
For Value, enter 1.
For Unit, choose days.
For Timezone, choose your time zone.
Choose Create schedule.

You have successfully configured the schedule. Because Start date and time is not given, the visual ETL flow is triggered immediately and then it is triggered once a day after that.

Edit the schedule

You can view the configured schedules with the following steps:

On the SageMaker Unified Studio console, navigate to Visual ETL flows for your project.
Choose the Schedules tab.
Choose Edit schedule under Actions.
Edit with your preferences, then choose Save.

Pause or resume the schedule

If you want to pause the schedule, complete the following steps:

Choose Pause schedule under Actions.

On the same Schedule tab, Status of the schedule will be updated to Paused.

To resume the schedule, choose Activate schedule.

Delete the schedule

To delete the schedule, complete the following steps:

Choose Delete schedule under Actions.
Choose Delete schedule in the dialog.

On the same Schedule tab, you can verify that the deleted schedule disappears.

Schedule a query book flow

Complete the following steps to configure a schedule on a query book:

On the SageMaker Unified Studio console, on the top menu, choose Build.
Under DATA ANALYSIS & INTEGRATION, choose Query Editor.
On the data explorer, under Lakehouse, choose AwsDataCatalog.
Navigate to the table venue_event_agg. This table is created in the previous section.
On the options menu (three dots), choose Query with Athena.
On the Actions menu, choose Save to project.
Choose Save changes.
On the Actions menu, choose Create schedule.
For Schedule Type, choose Recurring.
For Value, enter 1.
For Unit, choose days.
For Timezone, choose your time zone.
Choose Create schedule.

You have successfully configured the schedule. Because Start date and time was not set, the query book is triggered immediately and then it is triggered once a day after that. You can optionally configure start and end times if you want to limit your schedule to run in a specific date range.

To view the configured schedules, in the navigation pane, choose Scheduled queries.

You can view the list of scheduled queries and edit, pause, resume, or delete them, as shown in the previous section.

Clean up

To avoid incurring future charges, clean up the resources you created during this walkthrough:

On the Schedule tab of Visual ETL flows, select the everyday schedule, and choose Delete schedule under Actions. The related EventBridge schedule is automatically deleted as well.
On the SageMaker AI console, choose Training jobs under Training, and delete all the SageMaker training jobs that start with everyday-.
(Optional) To delete the visual ETL flow, on the Flows tab of Visual ETL flows, select your visual ETL flow, and choose Delete flow under Actions.

Conclusion

The new unified scheduling experience in SageMaker Unified Studio simplifies workflow automation. With unified scheduling, you can seamlessly orchestrate your visual ETL flows and query books in one centralized location.

Whether you’re running daily data transformations, weekly analytical queries, or monthly reporting workflows, the unified scheduling experience provides a straightforward path to automation. This capability enables data teams to focus more on deriving insights from their data and less on managing infrastructure and scheduling configurations.

We encourage you to try out this new experience and share your feedback with us. For more information about SageMaker Unified Studio and its capabilities, visit our documentation or explore our other blog posts about visual ETL flows and query books.

About the Authors

Noritaka Sekiyama is a Principal Big Data Architect for AWS Analytics services with a strong focus on data engineering. He is responsible for building software artifacts to help customers. In his spare time, he enjoys cycling on his road bike.

Daniel Obi is a Frontend Engineer on the Amazon SageMaker Unified Studio team. He is dedicated to building intuitive and effective solutions that enhance user experience and technical functionality. Outside of his professional work, he enjoys watching and playing basketball.

Vasudevan Venkataramanan is a Senior Software Engineer on the Amazon SageMaker Unified Studio team. He is responsible for technical direction of scheduling and orchestration within SageMaker Unified Studio. Outside of his professional work, he enjoys spending time with his kid, and playing pickleball and cricket.

Yuhang Huang is a Software Development Manager on the Amazon SageMaker Unified Studio team. He leads the engineering team to design, build, and operate scheduling and orchestration capabilities in SageMaker Unified Studio. In his free time, he enjoys playing tennis.

Gal Heyne is a Senior Technical Product Manager for AWS Analytics services with a strong focus on AI/ML and data engineering. She is passionate about developing a deep understanding of customers’ business needs and collaborating with engineers to design simple-to-use data products.

Announcing second-generation AWS Outposts racks with breakthrough performance and scalability on-premises

2025-04-29 Micah Walter

Post Syndicated from Micah Walter original https://aws.amazon.com/blogs/aws/announcing-second-generation-aws-outposts-racks-with-breakthrough-performance-and-scalability-on-premises/

Today we’re announcing the general availability of second-generation AWS Outposts racks, which marks the latest innovation from AWS for edge computing. This new generation includes support for the latest x86-powered Amazon Elastic Compute Cloud (Amazon EC2) instances, new simplified network scaling and configuration, and accelerated networking instances designed specifically for ultra-low latency and high-throughput workloads. These enhancements deliver greater performance for a broad range of on-premises workloads, such as core trading systems of financial services and telecom 5G Core workloads.

Customers like athenahealth, FanDuel, First Abu Dhabi Bank, Mercado Libre, Liberty Latin America, Riot Games, Vector Limited, and Wiwynn are already using Outposts racks for workloads that need to stay on-premises. The second-generation Outposts rack can provide low latency, local data processing, or data residency needs, such as game servers for multi-player online games, customer transaction data, medical records, industrial and manufacturing control systems, telecom Business Support Systems (BSS), and edge inference of a variety of machine learning (ML) models. Customers can now take advantage of the latest generation of processors and more advanced configurations of Outposts racks to support faster processing, higher memory capacity, and increased network bandwidth.

Latest generation EC2 instances

We’re excited to announce local support for the latest generation (7th generation) of x86-powered Amazon EC2 instances on AWS Outposts racks, starting with C7i compute-optimized instances, M7i general-purpose instances, and R7i memory-optimized instances. These new instances deliver twice the vCPU, memory, and network bandwidth while providing up to 40% better performance compared to C5, M5, and R5 instances on previous generation Outposts racks. They are powered by 4th Gen Intel Xeon Scalable processors and are ideal for a broad range of on-premises workloads requiring enhanced performance such as larger databases, more memory-intensive applications, advanced real-time big data analytics, high-performance video encoding and streaming, and CPU-based edge inference with more sophisticated ML models. Support for more latest generation EC2 instances, including GPU-enabled instances, is coming soon.

Simplified network scaling and configuration

We’ve completely reimagined networking in our latest Outposts generation, making it simpler and more scalable than ever. At the heart of this upgrade is our new Outposts network rack, which acts as a central hub for all your compute and storage traffic.

This new design brings three major benefits to the table. First, you can now scale your compute resources independently from your networking infrastructure, giving you more flexibility and cost efficiency as your workloads grow. Second, we’ve built in network resilience from the ground up, with the network rack automatically handling device failures to keep your systems running smoothly. Third, connecting to your on-premises environment and AWS Regions is now a breeze – you can configure everything from IP addresses to VLAN and BGP settings through straightforward APIs or our updated console interface.

Specialized Amazon EC2 instances with accelerated networking

We’re introducing a new category of specialized Amazon EC2 instances on Outposts racks with accelerated networking. These instances are purpose built for the most latency-sensitive, compute-intensive, and throughput-intensive mission-critical workloads on-premises. To deliver the best possible performance, in addition to the Outpost logical network, these instances feature a secondary physical network with network accelerator cards connected to top-of-rack (TOR) switches.

First in this category are bmn-sf2e instances, designed for ultra-low latency with deterministic performance. The new instances run on Intel’s latest Sapphire Rapids processors (4th Gen Xeon Scalable), delivering 3.9 GHz sustained performance across all cores with generous memory allocation – 8GB of RAM for every CPU core. We’ve equipped bmn-sf2e instances with AMD Solarflare X2522 network cards that connect directly to top-of-rack switches.

For financial services customers, especially capital market firms, these instances offer deterministic networking through native Layer 2 (L2) multicast, precision time protocol (PTP), and equal cable lengths. This enables customers to meet regulatory requirements around fair trading and equal access while easily connecting to their existing trading infrastructure.

Instance Name	vCPUs	Memory (DDR5)	Network Bandwidth	NVMe SSD Storage	Accelerated Network Cards	Accelerated Bandwidth (Gbps)
bmn-sf2e.metal-16xl	64	512 GiB	25 Gbps	2 x 8 TB (16 TB)	2	100
bmn-sf2e.metal-32xl	128	1024 GiB	50 Gbps	4 x 8 TB (32 TB)	4	200

The second instance type, bmn-cx2, is optimized for high throughput and low latency. This instance features NVIDIA ConnectX-7 400G NICs physically connected to high-speed top-of-rack switches, delivering up to 800 Gbps bare metal network bandwidth operating at near line rate. With native Layer 2 (L2) multicast and hardware PTP support, this instance is ideal for high-throughput workloads like real-time market data distribution, risk analytics, and telecom 5G core network applications.

Instance Name	vCPUs	Memory (DDR5)	Network Bandwidth	NVMe SSD Storage	Accelerated Network Cards	Accelerated Bandwidth (Gbps)
bmn-cx2.metal-48xl	192	1536 GiB	50 Gbps	4 x 4 TB (16 TB)	2	800

Bottom line, the new generation of Outposts racks deliver enhanced performance, scalability, and resiliency for a broad range of on-premises workloads, even for mission-critical workloads with the most stringent latency and throughput requirements. You can make your selection and initiate your order from the AWS Management Console. The new instances maintain consistency with regional deployments by supporting the same APIs, AWS Management Console, automation, governance policies, and security controls in the cloud and on-premises, improving developer productivity and IT efficiency.

Things to know

At launch, second-generation Outposts racks can be shipped to US and Canada and be parented back to 6 AWS Regions including US East (N. Virginia and Ohio), US West (Oregon), EU West (London and France) and Asia Pacific (Singapore). Support for more countries and territories and AWS Regions is coming soon. At launch, second-generation Outposts racks locally support a subset of AWS services found in previous generation Outposts racks. Support for more EC2 instance types and more AWS services is coming soon.

To learn more, visit the AWS Outposts racks product page and user guide. You can also talk to an Outposts expert if you are ready to discuss your on-premises needs.

— Micah;

How is the News Blog doing? Take this 1 minute survey!

Difference between full precision and binary vectors

Hardware acceleration: AVX-512 and popcount instructions

Where do binary vector workloads spend the bulk of time?

Benchmarks and Results

Price-performance and TCO gains for OpenSearch users

Performance gains due to popcount instruction in AVX-512

Conclusion

About the Authors

Appendix 1

Appendix 2

Appendix 3

Feature overview

Prerequisites

Schedule a visual ETL flow

Edit the schedule

Pause or resume the schedule

Delete the schedule

Schedule a query book flow

Clean up

Conclusion

About the Authors

The collective thoughts of the interwebz