Tag Archives: Amazon EC2

Introducing Amazon EC2 X8aedz instances powered by 5th Gen AMD EPYC processors for memory-intensive workloads

Post Syndicated from Channy Yun (윤석찬) original https://aws.amazon.com/blogs/aws/introducing-amazon-ec2-x8aedz-instances-powered-by-5th-gen-amd-epyc-processors-for-memory-intensive-workloads/

Today, we’re announcing the availability of new memory-optimized, high-frequency Amazon Elastic Compute Cloud (Amazon EC2) X8aedz instances powered by a 5th Gen AMD EPYC processor. These instances offer the highest CPU frequency in the cloud, 5 GHz. They deliver up to two times higher compute performance and 31% better price-performance compared to previous generation X2iezn instances.

X8aedz instances are ideal for electronic design automation (EDA) workloads, such as physical layout and physical verification jobs, and relational databases that benefit from high single-threaded processor performance and a large memory footprint. The combination of 5 GHz processors and local NVMe storage enables faster processing of memory-intensive backend EDA workloads such as floor planning, logic placement, clock tree synthesis (CTS), routing, and power/signal integrity analysis. The high memory-to-vCPU ratio of 32:1 makes these instances particularly effective for applications with vCPU-based licensing models.

Let me explain the instance type naming: The “a” suffix indicates an AMD processor, “e” denotes extended memory in the memory-optimized instance family, “d” represents local NVMe-based SSDs physically connected to the host server, and “z” indicates high-frequency processors.

X8aedz instances
X8aedz instances are available in eight sizes ranging from 2–96 vCPUs with 64–3,072 GiB of memory, including two bare metal sizes. X8aedz instances feature up to 75 Gbps of network bandwidth with support for the Elastic Fabric Adapter (EFA), up to 60 Gbps of throughput to the Amazon Elastic Block Store (Amazon EBS), and up to 8 TB of local NVMe SSD storage.

Here are the specs for X8aedz instances:

Instance name vCPUs Memory (GiB) NVMe SSD storage (GB) Network bandwidth (Gbps) EBS bandwidth (Gbps)
x8aedz.large 2 64 158 Up to 18.75 Up to 15
x8aedz.xlarge 4 128 316 Up to 18.75 Up to 15
x8aedz.3xlarge 12 384 950 Up to 18.75 Up to 15
x8aedz.6xlarge 24 768 1,900 18.75 15
x8aedz.12xlarge 48 1,536 3,800 37.5 30
x8aedz.24xlarge 96 3,072 7,600 75 60
x8aedz.metal-12xl 48 1,536 3,800 37.5 30
x8aedz.metal-24xl 96 3,072 7,600 75 60
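
The 32:1 memory-to-vCPU ratio can be checked directly against the table above. The following is a minimal sketch rather than an official sizing tool: the rows are copied from the table, while the helper names (`memory_per_vcpu`, `smallest_size_for`) are made up for this illustration. The second helper shows why the ratio matters for vCPU-based licensing: it picks the size with the fewest vCPUs that still fits a given memory footprint.

```python
# (instance name, vCPUs, memory in GiB) copied from the spec table above.
X8AEDZ_SIZES = [
    ("x8aedz.large", 2, 64),
    ("x8aedz.xlarge", 4, 128),
    ("x8aedz.3xlarge", 12, 384),
    ("x8aedz.6xlarge", 24, 768),
    ("x8aedz.12xlarge", 48, 1536),
    ("x8aedz.24xlarge", 96, 3072),
    ("x8aedz.metal-12xl", 48, 1536),
    ("x8aedz.metal-24xl", 96, 3072),
]


def memory_per_vcpu(vcpus: int, memory_gib: int) -> float:
    """Return GiB of memory available per vCPU."""
    return memory_gib / vcpus


def smallest_size_for(memory_gib_needed: int) -> tuple:
    """Hypothetical helper: the row with the fewest vCPUs that fits the job."""
    candidates = [row for row in X8AEDZ_SIZES if row[2] >= memory_gib_needed]
    if not candidates:
        raise ValueError("no X8aedz size has that much memory")
    # min() keeps the first row on ties, so virtualized sizes win over metal.
    return min(candidates, key=lambda row: row[1])


for name, vcpus, memory_gib in X8AEDZ_SIZES:
    print(f"{name:<18} {memory_per_vcpu(vcpus, memory_gib):.0f} GiB per vCPU")

print(smallest_size_for(1000))
```

With per-vCPU licensing, a job that needs 1,000 GiB of memory would land on a 48-vCPU size in this sketch.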

With up to 60 Gbps of Amazon EBS bandwidth and up to 8 TB of local NVMe SSD storage, you can achieve faster database response times and reduced latency for EDA operations, ultimately accelerating time-to-market for chip designs. These instances also support the instance bandwidth configuration feature, which offers flexibility in allocating resources between network and EBS bandwidth. You can scale network or EBS bandwidth by 25% to improve database (read and write) performance, query processing, and logging speeds.

X8aedz instances use sixth-generation AWS Nitro cards, which offload CPU virtualization, storage, and networking functions to dedicated hardware and software, enhancing performance and security for your workloads.

Now available
Amazon EC2 X8aedz instances are now available in the US West (Oregon) and Asia Pacific (Tokyo) AWS Regions, with additional Regions coming soon. For Regional availability and the future roadmap, search for the instance type in the AWS CloudFormation resources tab of AWS Capabilities by Region.

You can purchase these instances as On-Demand Instances, Spot Instances, and Dedicated Instances, or through Savings Plans. To learn more, visit the Amazon EC2 Pricing page.

Give X8aedz instances a try in the Amazon EC2 console. To learn more, visit the Amazon EC2 X8aedz instances page and send feedback to AWS re:Post for EC2 or through your usual AWS Support contacts.

Channy

Amazon GuardDuty adds Extended Threat Detection for Amazon EC2 and Amazon ECS

Post Syndicated from Betty Zheng (郑予彬) original https://aws.amazon.com/blogs/aws/amazon-guardduty-adds-extended-threat-detection-for-amazon-ec2-and-amazon-ecs/

Today, we’re announcing new enhancements to Amazon GuardDuty Extended Threat Detection with the addition of two attack sequence findings for Amazon Elastic Compute Cloud (Amazon EC2) instances and Amazon Elastic Container Service (Amazon ECS) tasks. These new findings build on the existing Extended Threat Detection capabilities, which already combine sequences involving AWS Identity and Access Management (IAM) credential misuse, unusual Amazon Simple Storage Service (Amazon S3) bucket activity, and Amazon Elastic Kubernetes Service (Amazon EKS) cluster compromise. By adding coverage for EC2 instance groups and ECS clusters, this launch expands sequence-level visibility to virtual machine and container environments that support the same application. Together, these capabilities provide a more consistent and unified way to detect multistage activity across diverse Amazon Web Services (AWS) workloads.

Modern cloud environments are dynamic and distributed, often running virtual machines, containers, and serverless workloads at scale. Security teams strive to maintain visibility across these environments and connect related activities that might indicate complex, multistage attack sequences. These sequences can involve multiple steps, such as establishing initial access and persistence, providing missing credentials or performing unexpected data access, that unfold over time and across different sources. GuardDuty Extended Threat Detection automatically links these signals using AI and machine learning (ML) models trained at AWS scale to build a complete picture of the activity and surface high-confidence insights to help customers prioritize response actions. By combining evidence from diverse sources, this analysis produces high-fidelity, unified findings that would otherwise be difficult to infer from individual events.

How it works
Extended Threat Detection analyzes multiple types of security signals, including runtime activity, malware detections, VPC Flow Logs, DNS queries, and AWS CloudTrail events to identify patterns that represent a multistage attack across Amazon EC2 and Amazon ECS workloads. Detection works with the GuardDuty foundational plan, and turning on Runtime Monitoring for EC2 or ECS adds deeper process and network-level telemetry that strengthens signal analysis and increases the completeness of each attack sequence.

The new attack sequence findings combine runtime and other observed behaviors across the environment into a single critical-severity sequence. Each sequence includes an incident summary, a timeline of observed events, mapped MITRE ATT&CK® tactics and techniques, and remediation guidance to help you understand how the activity unfolded and which resources were affected.

EC2 instances and ECS tasks are often created and replaced automatically through Auto Scaling groups, shared launch templates, Amazon Machine Images (AMIs), IAM instance profiles, or cluster-level deployments. Because these resources commonly operate as part of the same application, activity observed across them might originate from a single underlying compromise. The new EC2 and ECS findings analyze these shared attributes and consolidate related signals into one sequence when GuardDuty detects a pattern affecting the group.
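
To make the grouping idea concrete, here is a toy sketch of consolidating signals by a shared attribute. It is not GuardDuty's detection logic, which the post does not describe in detail. The signal records, field names, and the "two or more signal types" threshold are all invented for this illustration.

```python
from collections import defaultdict

# Hypothetical signals; every field name and value here is made up.
SIGNALS = [
    {"resource": "i-0aaa", "auto_scaling_group": "web-asg", "type": "runtime"},
    {"resource": "i-0bbb", "auto_scaling_group": "web-asg", "type": "malware"},
    {"resource": "i-0ccc", "auto_scaling_group": "web-asg", "type": "network"},
    {"resource": "i-0ddd", "auto_scaling_group": "batch-asg", "type": "runtime"},
]


def group_by_shared_attribute(signals, attribute):
    """Bucket signals by a shared attribute, falling back to the resource ID."""
    groups = defaultdict(list)
    for signal in signals:
        groups[signal.get(attribute, signal["resource"])].append(signal)
    return dict(groups)


def sequence_candidates(groups, min_signal_types=2):
    """Keep groups where several different kinds of signal line up."""
    return {
        key: members
        for key, members in groups.items()
        if len({member["type"] for member in members}) >= min_signal_types
    }


groups = group_by_shared_attribute(SIGNALS, "auto_scaling_group")
print(sorted(sequence_candidates(groups)))
```

In this toy data, three instances in the same Auto Scaling group each raise a different signal, so they are reported as one candidate instead of three unrelated findings.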

When a sequence is detected, the GuardDuty console highlights any critical-severity sequence findings on the Summary page, with the affected EC2 instance group or ECS cluster already identified. Selecting a finding opens a consolidated view that shows how the resources are connected, which signals contributed to the sequence, and how the activity progressed over time, helping you quickly understand the scope of impact across virtual machine and container workloads.

In addition to viewing sequences in the console, you can also see these findings in AWS Security Hub, where they appear on the new exposure dashboards alongside other GuardDuty findings to help you understand your overall security risk in one place. This detailed view establishes the context for interpreting how the analysis brings related signals together into a broader attack sequence.

Together, the analysis model and grouping logic give you a clearer, consolidated view of activity across virtual machine and container workloads, helping you focus on the events that matter instead of investigating numerous individual findings. By unifying related behaviors into a single sequence, Extended Threat Detection helps you assess the full context of an attack path and prioritize the most urgent remediation actions.

Now available
Amazon GuardDuty Extended Threat Detection with expanded coverage for EC2 instances and ECS tasks is now available in all AWS Regions where GuardDuty is offered. You can start using this capability today to detect coordinated, multistage activity across virtual machine and container workloads by combining signals from runtime activity, malware execution, and AWS API activity.

This expansion complements the existing Extended Threat Detection capabilities for Amazon EKS, providing unified visibility into coordinated, multistage activity across your AWS compute environment. To learn more, visit the Amazon GuardDuty product page.

Betty

Performance benefits of new Amazon EC2 R8a memory-optimized instances

Post Syndicated from Tyler Jones original https://aws.amazon.com/blogs/compute/performance-benefits-of-new-amazon-ec2-r8a-memory-optimized-instances/

Recently we announced the availability of Amazon Elastic Compute Cloud (Amazon EC2) R8a instances, the latest addition to the AMD memory-optimized instance family. These instances are powered by the 5th Generation AMD EPYC (codename Turin) processors with a maximum frequency of 4.5 GHz. In this post I take these instances for a spin and benchmark MySQL later on, but first I discuss the top things you should know about these instances.

Notable characteristics of R8a instances

Each vCPU on an R8a instance corresponds to a physical CPU core (something we started on 7th generation AMD instances), which means that there is no simultaneous multi-threading (SMT). Because each vCPU is mapped to a dedicated physical core, you get more predictable and consistent performance: there’s no resource sharing or potential interference between threads, which is particularly important for performance-sensitive workloads where consistent latency is essential. When evaluating and adopting R8a instances, re-evaluate your thresholds for CPU usage. You can likely squeeze more out of each instance’s CPU without impacting any of your workload’s SLA metrics.

R8a instances feature sizes of up to 192 vCPU with 1,536 GiB RAM. The following table shows the detailed specs:

Instance size vCPU Memory (GiB) Instance storage Network bandwidth (Gbps) EBS bandwidth (Gbps)
r8a.medium 1 8 EBS Only Up to 12.5 Up to 10
r8a.large 2 16 EBS Only Up to 12.5 Up to 10
r8a.xlarge 4 32 EBS Only Up to 12.5 Up to 10
r8a.2xlarge 8 64 EBS Only Up to 15 Up to 10
r8a.4xlarge 16 128 EBS Only Up to 15 Up to 10
r8a.8xlarge 32 256 EBS Only 15 10
r8a.12xlarge 48 384 EBS Only 22.5 15
r8a.16xlarge 64 512 EBS Only 30 20
r8a.24xlarge 96 768 EBS Only 40 30
r8a.48xlarge 192 1,536 EBS Only 75 60
r8a.metal-24xl 96 768 EBS Only 40 30
r8a.metal-48xl 192 1,536 EBS Only 75 60

Testing MySQL performance using HammerDB

R8a instances are a great choice for MySQL databases, so I thought that would be a great place to showcase some of these instances’ capabilities. To test MySQL, I used a series of scripts written by my colleagues to track MySQL performance across software versions and different EC2 instances. These scripts are stored in the repro-collection repository, which is an open source, extensible framework for performance testing that addresses real-world workloads rather than micro-benchmarks. It is built to provide a performance measurement reference usable across multiple organizations, and it’s currently centered on MySQL and actively used in discussions with Linux Kernel developers and maintainers. Furthermore, it helps track any performance impacts created by code changes to MySQL. The scripts contained in this repository set up a MySQL database to be tested, and a load generator running the HammerDB benchmark.

For this benchmark, I used an r6a.24xlarge instance for the load generator and r6a.xlarge, r7a.xlarge, and r8a.xlarge instances for the MySQL database servers, all deployed in the same AWS Availability Zone (AZ). I chose a single AZ setup to minimize any latency variability from crossing multiple AZs. This is not meant to be a production-like setup, and I highly recommend using multiple AZs for production workloads. Each MySQL instance was tested separately using the same HammerDB load generator. Each test was run three times, and the results were averaged across the three runs. A diagram of the architecture is shown in the following figure:

Performance testing architecture showing r6a/r7a/r8a instance types with HammerDB load generator executing 9 test runs

HammerDB overall results

R8a instances show great results in the HammerDB benchmark for MySQL databases. For HammerDB’s overall score category, R8a instances outscored R7a instances by 55% and outscored R6a instances by 74%.

Performance comparison chart showing r6a, r7a, and r8a instance scores

HammerDB transactions per minute test

R8a instances also showed a notable improvement in this category. R8a outperformed previous generation R7a instances by 32% and R6a instances by 63%.

Performance comparison showing r6a (91,105), r7a (112,686), and r8a (148,478) transactions per minute

HammerDB P99 latency results

R8a instances also improved P99 latency, reflecting the efficiency gains driven by the new 5th Generation AMD EPYC CPUs and higher memory bandwidth. R8a shows a 14% latency reduction when compared to R7a, and a 25% latency reduction when compared to R6a.

P99 latency comparison showing decrease from 39.93ms (r6a) to 30.02ms (r8a) across instance generations
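
The throughput and latency percentages quoted above can be reproduced from the numbers in the two figure captions. This is a small arithmetic check, not part of the original benchmark scripts, and the function names are chosen for this illustration. The R7a P99 latency is not stated in the caption, so only the R6a comparison is checked for latency.

```python
# Transactions per minute, from the throughput figure caption.
TPM = {"r6a": 91_105, "r7a": 112_686, "r8a": 148_478}

# P99 latency in milliseconds, from the latency figure caption.
P99_MS = {"r6a": 39.93, "r8a": 30.02}


def percent_gain(new: float, old: float) -> float:
    """Higher-is-better metric: how much larger `new` is than `old`."""
    return (new / old - 1) * 100


def percent_reduction(new: float, old: float) -> float:
    """Lower-is-better metric: how much smaller `new` is than `old`."""
    return (1 - new / old) * 100


print(f"TPM, R8a vs R7a: +{percent_gain(TPM['r8a'], TPM['r7a']):.1f}%")
print(f"TPM, R8a vs R6a: +{percent_gain(TPM['r8a'], TPM['r6a']):.1f}%")
print(f"P99, R8a vs R6a: -{percent_reduction(P99_MS['r8a'], P99_MS['r6a']):.1f}%")
```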

Conclusion

Built on the AWS Nitro System using sixth-generation Nitro Cards, R8a instances are ideal for high-performance, memory-intensive workloads, such as SQL and NoSQL databases, as demonstrated by the benchmarking shown in this post, as well as distributed web-scale in-memory caches, in-memory databases, real-time big data analytics, and Electronic Design Automation (EDA) applications. R8a instances offer 12 sizes, including 2 bare metal sizes. Amazon EC2 R8a instances are SAP-certified and provide 38% more SAPS when compared to R7a instances. If you’re still running 6th generation R6a instances, then I highly encourage you to migrate to the 8th generation instances to take advantage of their clear price performance benefits. Staying on modern infrastructure is a great way to drive down costs and provide more features for your customers, and there are clear gains to be had based on the testing shown in this post.

Start optimizing your high-performance, memory-intensive workloads today by migrating to R8a instances. Visit the Amazon EC2 R8a instances page to learn more and get started on your upgrade to take advantage of the improved price performance of R8a instances.

Optimize unused capacity with Amazon EC2 interruptible capacity reservations

Post Syndicated from Shubham Sarin original https://aws.amazon.com/blogs/compute/optimize-unused-capacity-with-amazon-ec2-interruptible-capacity-reservations/

Organizations running critical workloads on Amazon Elastic Compute Cloud (Amazon EC2) reserve compute capacity using On-Demand Capacity Reservations (ODCR) to ensure availability when needed. However, reserved capacity can intermittently sit idle during off-peak periods, between deployments, or when workloads scale down. This unused capacity represents a missed opportunity for cost optimization and resource efficiency across the organization.

Amazon EC2 now offers interruptible ODCRs, a new capability that lets you make unused compute capacity temporarily available to other workloads while maintaining control to reclaim it. This feature helps you optimize reservations and reduce costs across your AWS organization.

In this post, we explore how this works through a practical customer scenario.

Customer scenario: Maximizing reservation utilization across

Consider a financial services company where the trading platform team maintains a large fleet of r7i.4xlarge instances reserved around-the-clock for critical blue/green deployments. During off-peak trading hours and weekends, a significant portion of this reserved capacity sits idle. Meanwhile, the data analytics team regularly runs batch processing jobs for risk modeling—workloads that could benefit from additional compute capacity but don’t require the same availability guarantees as the trading platform.

Previously, sharing this capacity meant losing control over when it could be reclaimed, creating operational challenges when the trading platform needed to scale up quickly during market volatility. Interruptible ODCRs solve this problem by giving the reservation owner control to reclaim capacity when needed for critical operations.

In the following sections, we walk through the key steps to configure capacity sharing, launch instances, and reclaim capacity. The high-level steps are:

  1. Set up capacity sharing
  2. Discover available capacity and launch instances
  3. Reclaim capacity and handle interruptions

Step 1: Set up capacity sharing

The trading platform team begins by identifying unused capacity patterns through Amazon EC2 Capacity Manager. They determine that approximately 60% of their reserved r7i.4xlarge capacity remains unused during overnight hours and weekends.

Create interruptible ODCR

For prerequisites, see Interruptible Capacity Reservations for capacity owners in the Amazon EC2 User Guide.

To repurpose this idle capacity, the trading platform team creates an interruptible ODCR using the AWS Management Console, SDK, or AWS Command Line Interface (AWS CLI). To use the console, they complete the following steps:

  1. On the Amazon EC2 console, choose Capacity Reservations in the navigation pane.
  2. Select the source ODCR and choose Create interruptible reservation.
  3. For Instances to allocate, enter how many instances to allocate (out of a 100-instance reservation). For this example, we allocate 60 instances.
  4. Choose Create interruptible reservation.

This configuration withdraws 60 instances from their original reservation and creates a new ODCR with interruptible configuration. The original reservation now shows 40 instances, and the new interruptible reservation shows 60.
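
The bookkeeping behind this split is simple enough to model in a few lines. The sketch below is only an illustration of the counts described above (100 becomes 40 plus 60) and of reclaiming them later. The `ReservationSplit` class is hypothetical and does not call any AWS API.

```python
from dataclasses import dataclass


@dataclass
class ReservationSplit:
    """Toy model of a source ODCR and its interruptible allocation."""

    source_instances: int
    interruptible_instances: int = 0

    def allocate(self, count: int) -> None:
        """Move instances from the source reservation to the interruptible one."""
        if not 0 <= count <= self.source_instances:
            raise ValueError("cannot allocate more than the source holds")
        self.source_instances -= count
        self.interruptible_instances += count

    def reclaim(self, count: int) -> None:
        """Return instances from the interruptible reservation to the source."""
        if not 0 <= count <= self.interruptible_instances:
            raise ValueError("cannot reclaim more than was allocated")
        self.interruptible_instances -= count
        self.source_instances += count


split = ReservationSplit(source_instances=100)
split.allocate(60)
print(split)  # the source now holds 40 and the interruptible reservation 60
```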

Create interruptible on-demand capacity reservation

Share resources across the organization

With the interruptible reservation created, the reservation-owning team uses AWS Resource Access Manager (AWS RAM), a service that helps you securely share AWS resources across accounts and organizations, to share the newly created ODCR with additional accounts in their organization. When sharing your ODCR, you specify which consumer account IDs in your organization will get access to the interruptible ODCR. Alternatively, you can share the ODCR with your entire AWS Organization or Organizational Unit (OU). When it’s complete, the specified accounts get access to the interruptible ODCR capacity and establish a setup like the one illustrated in the following diagram.

Share resources across the organization

Sharing with all accounts (at once) within the organization requires organization-wide sharing to be enabled in your AWS Organizations setup. If organization-wide sharing is not enabled, you can still share with individual accounts by listing each account.
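
For teams that script this step, the sharing request can be assembled as plain data first. The sketch below only builds the arguments and makes no AWS call. The parameter names follow the AWS RAM `CreateResourceShare` API as exposed by boto3 (`name`, `resourceArns`, `principals`, `allowExternalPrincipals`), while the helper functions, share name, Region, account IDs, and reservation ID are placeholders for this illustration.

```python
def capacity_reservation_arn(region: str, account_id: str, reservation_id: str) -> str:
    """Build the ARN of a Capacity Reservation owned by `account_id`."""
    return f"arn:aws:ec2:{region}:{account_id}:capacity-reservation/{reservation_id}"


def build_share_request(name: str, reservation_arn: str, principals: list) -> dict:
    """Assemble keyword arguments for a RAM resource share.

    `principals` may hold account IDs, or the ARN of an organization or OU
    when organization-wide sharing is enabled.
    """
    if not principals:
        raise ValueError("at least one principal is required")
    return {
        "name": name,
        "resourceArns": [reservation_arn],
        "principals": list(principals),
        # Keep the share inside the organization.
        "allowExternalPrincipals": False,
    }


# Placeholder identifiers, not real resources.
arn = capacity_reservation_arn("us-east-1", "111122223333", "cr-0123456789abcdef0")
request = build_share_request("interruptible-odcr-share", arn, ["444455556666"])
print(request)

# With boto3 installed and credentials configured, the request could be sent as:
#   boto3.client("ram").create_resource_share(**request)
```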

Step 2: Discover available capacity and launch instances

After the reservation owner (the trading platform team) shares their reservation, the capacity consumer (data analytics team) needs to find the capacity in their account and launch into it. In this section, we walk through the interruptible ODCR discovery and launch process.

Discover available capacity

The data analytics team, running batch processing jobs in a separate AWS account, can now find the shared interruptible capacity in their account using the console, SDK, or AWS CLI. To use the console, they complete the following steps:

  1. On the Amazon EC2 console, choose Capacity Reservations in the navigation pane.
  2. Choose the ODCR to view its details page.

The interruptible reservation appears with a clear indication that it’s interruptible, showing the instance type (r7i.4xlarge), Availability Zone, and available capacity.

Discover available capacity
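
The same discovery can be scripted by filtering the output of the EC2 `DescribeCapacityReservations` API. The sketch below works on a hand-written sample shaped like that response, so it runs without AWS access. The reservation IDs and counts are invented. The post does not name the response field that marks a reservation as interruptible, so this example filters only on state, instance type, and available count.

```python
# Invented sample shaped like a DescribeCapacityReservations response.
SAMPLE_RESPONSE = {
    "CapacityReservations": [
        {
            "CapacityReservationId": "cr-0123456789abcdef0",
            "InstanceType": "r7i.4xlarge",
            "AvailabilityZone": "us-east-1a",
            "TotalInstanceCount": 60,
            "AvailableInstanceCount": 60,
            "State": "active",
        },
        {
            "CapacityReservationId": "cr-0fedcba9876543210",
            "InstanceType": "m7i.large",
            "AvailabilityZone": "us-east-1a",
            "TotalInstanceCount": 4,
            "AvailableInstanceCount": 0,
            "State": "active",
        },
    ]
}


def usable_reservations(response: dict, instance_type: str, min_available: int = 1) -> list:
    """Return active reservations of the given type with free capacity."""
    return [
        reservation
        for reservation in response.get("CapacityReservations", [])
        if reservation.get("State") == "active"
        and reservation.get("InstanceType") == instance_type
        and reservation.get("AvailableInstanceCount", 0) >= min_available
    ]


for reservation in usable_reservations(SAMPLE_RESPONSE, "r7i.4xlarge"):
    print(reservation["CapacityReservationId"], reservation["AvailableInstanceCount"])
```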

Configure Auto Scaling groups for interruptible capacity

To use this capacity for their batch processing workloads, the analytics team creates a new launch template specifically designed for interruptible capacity. The key configuration element is setting the new market-type parameter and targeting the interruptible ODCR.

In the launch template, specify the following:

  • Instance type: r7i.4xlarge (matching the shared capacity)
  • Capacity reservation specification: Targeted
  • Capacity reservation ID: Enter the ID of the shared interruptible ODCR
  • Market type: Use the type interruptible-capacity-reservation
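
Expressed as data, the settings above might look like the following. This is a sketch under assumptions: the nesting follows the existing EC2 launch template structure (`CapacityReservationSpecification`, `InstanceMarketOptions`), and the market type string is the one given in the list above. Whether the SDK accepts that value in exactly this position should be confirmed against the current EC2 API reference. Other required fields, such as the AMI ID, are omitted.

```python
def build_launch_template_data(reservation_id: str, instance_type: str = "r7i.4xlarge") -> dict:
    """Assemble launch template data that targets one interruptible ODCR."""
    if not reservation_id.startswith("cr-"):
        raise ValueError("expected a Capacity Reservation ID such as cr-...")
    return {
        "InstanceType": instance_type,
        # Target a specific reservation rather than any open one.
        "CapacityReservationSpecification": {
            "CapacityReservationTarget": {"CapacityReservationId": reservation_id}
        },
        # Market type value taken from the post; its placement here is an assumption.
        "InstanceMarketOptions": {"MarketType": "interruptible-capacity-reservation"},
    }


# Placeholder reservation ID, not a real resource.
data = build_launch_template_data("cr-0123456789abcdef0")
print(data)

# With boto3 installed, the data could be passed to the EC2 client as:
#   ec2.create_launch_template(LaunchTemplateName="...", LaunchTemplateData=data)
```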

Next, create an Auto Scaling group that uses this launch template. The group is configured as follows:

  • Minimum size: 0 (to avoid unnecessary costs when capacity isn’t needed)
  • Maximum size: 40 (within the available shared capacity)
  • Desired capacity: Set based on job queue length
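
Setting desired capacity from the job queue length can be as simple as a clamped division. The sketch below assumes a hypothetical `jobs_per_instance` figure, which is not given in the post, and keeps the result within the minimum and maximum sizes listed above.

```python
import math


def desired_capacity(queued_jobs: int, jobs_per_instance: int,
                     min_size: int = 0, max_size: int = 40) -> int:
    """Instances needed for the queue, clamped to the group's size limits."""
    if jobs_per_instance <= 0:
        raise ValueError("jobs_per_instance must be positive")
    wanted = math.ceil(max(queued_jobs, 0) / jobs_per_instance)
    return max(min_size, min(max_size, wanted))


# jobs_per_instance=5 is an illustrative value only.
for queued in (0, 23, 1000):
    print(queued, "queued jobs ->", desired_capacity(queued, jobs_per_instance=5), "instances")
```

An empty queue scales the group to zero, and a large backlog is capped at the maximum size so the group never asks for more than its configured limit.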

Launch instances into interruptible capacity

When the analytics team’s batch processing jobs trigger scaling events, the Auto Scaling group launches instances that automatically target the shared interruptible ODCR. These instances launch immediately if capacity is available, providing the team with access to reserved capacity for their fault-tolerant workloads. The instances appear on the Amazon EC2 console with their instance lifecycle as interruptible-capacity-reservation and the ODCR ID in which they’re running. This provides clear indication that they’re running on interruptible capacity, helping with monitoring and cost allocation.

Step 3: Reclaim capacity and handle interruptions

In this section, we review how the capacity owner (the trading platform team) can reclaim their capacity when needed for their critical operations and how the capacity consumer can gracefully handle such interruptions.

Trigger reclamation

When market volatility increases and the trading platform needs to scale up quickly, the platform team initiates capacity reclamation through the console, SDK, or AWS CLI. To use the console, they complete the following steps:

  1. On the Amazon EC2 console, choose Capacity Reservations in the navigation pane.
  2. Choose the ODCR to view its details page.
  3. Choose Edit Interruptible Allocation.
  4. Specify how many instances are needed back (in this case, all 60 instances for maximum trading capacity).
  5. Choose Update, then choose Confirm.

Edit interruptible allocation

The reclamation process can also be automated using AWS Lambda functions triggered by Amazon CloudWatch alarms or scheduled events, providing proactive capacity management based on predictable usage patterns.

Consumer notification and graceful shutdown

After the owner triggers capacity reclamation, consuming instances receive a 2-minute instance interruption warning notice through Amazon EventBridge. The analytics team has configured their batch processing applications to listen for these events. Their applications receive this 2-minute warning and immediately begin checkpointing their current work, saving intermediate results to Amazon Simple Storage Service (Amazon S3), and gracefully shutting down. For EventBridge notification details, refer to the Monitor interruptible Capacity Reservations with EventBridge section in the EC2 Capacity Reservations User Guide.
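
One way to structure a batch worker for this kind of warning is to check an interruption flag between work items and checkpoint as soon as it is set. The sketch below is a generic pattern, not the analytics team's actual code. The `BatchWorker` class is hypothetical, the EventBridge event format is not modeled, and `save_checkpoint` stands in for whatever writes intermediate results to Amazon S3.

```python
import threading


class BatchWorker:
    """Processes items one at a time and checkpoints when interrupted."""

    def __init__(self, items):
        self.pending = list(items)
        self.done = []
        self.interrupted = threading.Event()

    def on_interruption_warning(self, event=None):
        """Call this from whatever receives the 2-minute warning."""
        self.interrupted.set()

    def run(self, process, save_checkpoint):
        while self.pending:
            if self.interrupted.is_set():
                # Stop taking new work and persist progress for a later restart.
                save_checkpoint({"done": list(self.done), "pending": list(self.pending)})
                return "checkpointed"
            self.done.append(process(self.pending.pop(0)))
        return "finished"


checkpoints = []
worker = BatchWorker(["job-1", "job-2", "job-3"])
worker.on_interruption_warning()  # simulate a warning arriving before work starts
print(worker.run(process=str.upper, save_checkpoint=checkpoints.append), checkpoints)
```

Keeping each work item short relative to the 2-minute window matters in this design, because the flag is only checked between items.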

Automatic capacity restoration to source ODCR

After the 2-minute notice period, Amazon EC2 starts shutting down the consuming instances. After the instances are successfully shut down, Amazon EC2 restores the capacity to the trading platform’s original ODCR. The trading platform can then launch their critical workloads into the same ODCR, resulting in minimal delay for their scaling requirements. The reservation owner can track their capacity reclamation status through the console or API. On the Amazon EC2 console, the ODCR details page shows the current instance allocation, target instance allocation, and request status. When current and target counts match, the status changes to Active, confirming completion.

After the reservation owner requests their capacity back, the capacity reclamation process can take a few minutes, so reservation owners should account for this delay when planning critical activities. This is because Amazon EC2 provides a 2-minute warning to the consumer instances, followed by the instance shutdown period.
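
Because reclamation is not instantaneous, an owner-side script might poll until the current and target allocation counts match before starting critical work. The sketch below shows that loop with the status lookup injected as a function, so it runs without AWS access. How `get_counts` obtains the two numbers, and the timeout and polling interval, are assumptions for this illustration.

```python
import time


def wait_for_reclaim(get_counts, timeout_seconds=600, poll_seconds=15,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll until current allocation equals target; False if time runs out.

    `get_counts` must return a (current, target) tuple of instance counts.
    """
    deadline = clock() + timeout_seconds
    while True:
        current, target = get_counts()
        if current == target:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_seconds)


# Simulated progress of reclaiming all 60 instances (invented numbers).
progress = iter([(60, 0), (25, 0), (0, 0)])
print(wait_for_reclaim(lambda: next(progress), sleep=lambda seconds: None))
```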

Billing and cost considerations

The billing model for interruptible ODCRs follows a clear usage-based approach that aligns costs with consumption:

  • Reservation owner (trading platform team) – Pays EC2 On-Demand rates for unused capacity in the interruptible ODCR, just like any standard ODCR. For example, when the analytics team uses 30 out of 60 available instances, the trading platform pays for the remaining 30 unused instances.
  • Consumer (analytics team) – Pays EC2 On-Demand rates only for the instances they actually launch and use. For example, when they use 30 instances for 4 hours, they’re charged for 30 × 4 = 120 instance-hours at the standard r7i.4xlarge On-Demand rate.
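
The two billing examples above reduce to simple arithmetic. The sketch below reproduces them; the function names are chosen for this illustration, and the hourly rate is a parameter because the post does not quote a price.

```python
def owner_billed_instances(interruptible_instances: int, in_use_by_consumer: int) -> int:
    """Unused instances in the interruptible ODCR, billed to the owner."""
    return interruptible_instances - in_use_by_consumer


def consumer_instance_hours(instances: int, hours: float) -> float:
    """Instance-hours billed to the consumer."""
    return instances * hours


def charge(instance_hours: float, hourly_rate: float) -> float:
    """Cost at a given On-Demand hourly rate (rate supplied by the caller)."""
    return instance_hours * hourly_rate


print(owner_billed_instances(60, 30))   # owner pays for the unused instances
print(consumer_instance_hours(30, 4))   # consumer pays for these instance-hours
```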

Conclusion

Amazon EC2 interruptible ODCR helps organizations optimize compute spending while maintaining operational control. Through capacity reclamation mechanisms, teams can achieve better resource utilization without compromising availability guarantees. In this post, we showed how this capability addresses real operational challenges through an example use case—enabling a trading platform to maintain their critical capacity guarantees while helping other teams access high-quality compute resources for their workloads. The predictable interruption model creates a sustainable approach to capacity sharing that benefits the entire organization.

To get started with interruptible capacity reservations, refer to the EC2 Capacity Reservations User Guide. To learn more about using EC2 Auto Scaling, refer to the Interruptible Capacity Reservations with EC2 Auto Scaling guide. Refer to the AWS RAM User Guide to learn more about sharing resources across your organization and the Amazon EventBridge User Guide to learn more about handling interruption notifications in your applications.

How potential performance upside with AWS Graviton helps reduce your costs further

Post Syndicated from Markus Adhiwiyogo original https://aws.amazon.com/blogs/compute/how-potential-performance-upside-with-aws-graviton-helps-reduce-your-costs-further/

Amazon Web Services (AWS) provides many mechanisms to optimize the price performance of workloads running on Amazon Elastic Compute Cloud (Amazon EC2), and the selection of the optimal infrastructure to run on can be one of the most impactful levers. When we started building the AWS Graviton processor, our goal was to optimize AWS Graviton features and capabilities to deliver a processor that provides the best price performance across a broad array of cloud workloads running on Amazon EC2. That goal continues to be our guiding principle, and today customers who adopt AWS Graviton-based EC2 instances see up to 40% better price performance on their cloud workloads when compared to equivalent non-Graviton EC2 instances. The price performance improvement is the result of both the performance improvement and the lower price in using AWS Graviton-based instances.

Price performance blends the cost of infrastructure with the amount of work you can achieve with infrastructure usage. After talking to many AWS Graviton customers, we’ve learned that the cost savings go beyond the lower AWS Graviton-based instances price. Many AWS Graviton customers told us that the performance increase from AWS Graviton allows them to consume fewer computing hours than comparable non-Graviton instances for equivalent workload throughput. In turn, this leads to further cost reduction.

The following are some examples from our customers:

  • Pinterest achieved 47% cost savings and 38% savings on compute resources while reducing carbon emissions by 62% for its web API workload.
  • SAP powers its SAP HANA Cloud with AWS Graviton to enhance its price performance by 35% while lowering carbon impact by 45%.
  • Sprinklr improved their machine learning (ML) inference workloads’ throughput by up to 20% while reducing costs by up to 25%.

You can find more customer examples in the AWS Graviton testimonials page.

To help organizations capture similar benefits, we’ve enhanced the AWS Graviton Savings Dashboard (GSD) with new features that account for both pricing and performance improvements. In the following section we explore these new capabilities and how they can help optimize your infrastructure costs.

Understanding performance-driven cost optimization in the GSD

The GSD helps organizations identify ideal workloads for AWS Graviton migration through automated resource matching and data-driven visualizations. You can learn more about the GSD and its setup in this AWS compute post.

Although the dashboard has traditionally focused on calculating direct cost savings from the AWS Graviton pricing advantages, we’ve observed that customers often experience more benefits when their applications perform more efficiently on AWS Graviton processors, leading to decreased compute resource usage. To better reflect these real-world scenarios, we’ve enhanced the dashboard with new features highlighting Normalized Instance Hours (NIH) analysis capabilities so that you can model potential savings based on both pricing benefits and compute hour reductions. Although this tool helps estimate potential savings, actual performance improvements can only be determined by testing your specific workloads on AWS Graviton instances. Performance is always workload and use case specific, so we encourage you to test your AWS Graviton-based workloads using the Optimization and Performance Runbook to help you determine the actual possible NIH percent reduction.

Key dashboard components

This section outlines the following three key dashboard components: NIH reduction analysis, enhanced cost analysis visualizations, and detailed savings analysis.

NIH reduction analysis

The dashboard now features a new slider that lets you model potential cost savings by inputting the percentage reduction in NIH. Many organizations have found it challenging to calculate their total possible savings since the benefits come from two sources: the lower instance pricing of AWS Graviton and the reduced compute hours.

You can use the slider to model different cost scenarios by adjusting a theoretical NIH reduction between 0% and 40%. You can use this slider to input NIH reductions validated through your workload testing, model the combined impact of both pricing benefits and reduced compute hours, and explore different scenarios to help prioritize which workloads to test first.

Figure 1: NIH slider location


Assume that your testing shows that your workload runs just as effectively with 15% fewer normalized instance hours on AWS Graviton. You can now plug that exact number into the slider to see your modeled savings combining both pricing differences and compute hour reductions. Although we’ve heard success stories of significant reductions from customers, we recommend starting your initial estimate with a conservative 10% baseline and adjusting based on your own testing results.

Enhanced cost analysis visualizations

The dashboard presents key visualizations that demonstrate the direct relationship between NIH reduction and cost savings. First, you see the Potential Graviton Base Savings from pricing differences alone. In the following diagram, we can observe an example of $61.54K of cost savings from migrating to equivalent AWS Graviton instances. Next, the Estimated Additional Savings Due to Performance in the same diagram shows $42.40K in savings if your performance testing confirms a 15% NIH reduction in your workload. Finally, the dashboard sums these two values into the Total Potential Graviton Savings of $103.94K. The Total Potential Graviton Savings helps visualize how both pricing benefits and any validated compute hour reductions could contribute to your overall savings.
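The relationship between the three values can be sketched in a few lines of shell. This is an illustration only: the function name model_graviton_savings and the input figures are hypothetical, and the assumption that performance-based savings equal the NIH reduction applied to the cost remaining after the price discount is ours, not the dashboard's documented formula.

```shell
#!/bin/bash
# Hypothetical what-if model; NOT the dashboard's documented formula.
# Assumption: additional savings = NIH reduction applied to the cost that
# remains after the Graviton price discount.
# Args: current_cost  price_discount_pct  nih_reduction_pct
model_graviton_savings() {
    awk -v cost="$1" -v price="$2" -v nih="$3" 'BEGIN {
        base  = cost * price / 100          # savings from pricing alone
        extra = (cost - base) * nih / 100   # savings from fewer instance hours
        printf "%.2f %.2f %.2f\n", base, extra, base + extra
    }'
}

# Example: 1,000 current cost, 20% price advantage, 15% NIH reduction
model_graviton_savings 1000 20 15
```

The function prints the base savings, the additional performance-based savings, and their total (200.00 120.00 320.00 for the sample inputs), mirroring the three tiles in the dashboard.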

Figure 2: Visualization with relationship between NIH reduction and cost savings

The Amortized Cost Breakdown and Normalized Instance Hrs Breakdown charts in the following figure show 6-month historical trends, helping you spot patterns such as seasonal spikes or high-usage periods. These patterns can help you identify where even small efficiency improvements might yield significant savings, for example, workloads with consistently high usage or predictable peak periods that would be good candidates for testing.

Figure 3: Amortized Cost, NIH, and Total Potential Savings Breakdown charts

Detailed savings analysis

Building on our commitment to help customers optimize cloud costs, we’ve enhanced the Potential Graviton Savings Details table with two columns focused on performance-based savings modeling. The Estimated Additional Savings Due to Performance column shows the modeled savings based on your chosen NIH reduction percentage, while Total Potential Graviton Savings combines this with the base pricing benefits.

Figure 4: Potential Graviton Savings Details table

As you examine your current instance family, you can observe both baseline AWS Graviton savings and these added saving opportunities clearly laid out in a comprehensive breakdown. The analysis presents your total savings potential in both dollar amounts and percentages. This allows you to build a compelling business case for migration. Although this detailed breakdown provides valuable planning insights, remember that actual savings may vary depending on your specific workload patterns, implementation approaches, and operational considerations.

Conclusion

The Graviton Savings Dashboard (GSD) serves as a powerful analytics tool that streamlines your journey to cost-effective cloud computing. The GSD provides clear visualizations and interactive features to help you understand and maximize potential savings when migrating to AWS Graviton-based instances. To further explore the new features, navigate to the GSD interactive demo, where you can model an example of potential savings using the NIH reduction slider and detailed cost breakdowns.

Ready to explore how AWS Graviton can transform your infrastructure costs? Visit the GSD page to deploy or update your GSD dashboard. Access implementation guides, such as the CFM Technical Implementation Playbook (CFM TIPs), and start optimizing your cloud spend today with the enhanced capabilities of the GSD.

Over 85,000 AWS customers have discovered the benefits of AWS Graviton, with many completing their adoptions in just hours. We have created this resource guide so that you can accelerate your AWS Graviton adoption with minimal effort and enjoy significant price performance benefits.

“What I always tell customers is one week, one application, one engineer, and see what you can do. They always are pleasantly surprised by how much progress they can make. If you’re out there and you haven’t yet moved to AWS Graviton, what are you waiting for? Let’s make it happen!”

Dave Brown, VP, AWS Compute & ML Services

Important note about performance testing
The GSD does not attempt to estimate the potential NIH percent reduction or your workload’s performance when transitioned to AWS Graviton. You can use it to perform what-if analysis of your potential savings for a projected NIH percent reduction. In the absence of this variable, GSD only considers the price delta between instance types and misses an important contributor to the overall savings potential of AWS Graviton from the performance upside. Compute performance is always workload and use case specific, so we encourage you to test your AWS Graviton-based workloads using the Optimization and Performance Runbook to help you determine the actual possible NIH percent reduction.

Optimize latency-sensitive workloads with Amazon EC2 detailed NVMe statistics

Post Syndicated from Sanjeev Malladi original https://aws.amazon.com/blogs/compute/optimize-latency-sensitive-workloads-with-amazon-ec2-detailed-nvme-statistics/

Amazon Elastic Compute Cloud (Amazon EC2) instances with locally attached NVMe storage can provide the performance needed for workloads demanding ultra-low latency and high I/O throughput. High-performance workloads, from high-frequency trading applications and in-memory databases to real-time analytics engines and AI/ML inference, need comprehensive performance tracking. Operating system tools like iostat and sar provide valuable system-level insights, and Amazon CloudWatch offers important disk IOPS and throughput measurements, but high-performance workloads can benefit from even more detailed visibility into instance store performance.

For latency-sensitive applications where every millisecond counts, enhanced performance monitoring tools provide deep visibility into storage systems, so your teams can track and analyze behavior at 1-second granularity. This detailed insight can help your organization detect bottlenecks quickly, fine-tune application performance, and deliver reliable service.

In this post, we discuss how you can use Amazon EC2 detailed performance statistics for instance store NVMe volumes, a set of new metrics that provide per-second granularity, to provide real-time visibility into your locally attached storage performance. These statistics are similar to the Amazon EBS detailed performance statistics, providing a consistent monitoring experience across both storage types. You can access these statistics directly from your NVMe devices attached to the Amazon EC2 instance using nvme-cli or using CloudWatch agent to monitor I/O performance at the storage level. We also provide examples of how to use these statistics to identify performance bottlenecks.

Feature overview

Amazon EC2 Nitro-based instances with locally attached NVMe instance storage now offer 11 comprehensive metrics at per-second granularity. These metrics, similar to EBS volume metrics, include queue length measurements, IOPS, throughput data, and IO latency histograms for the locally attached NVMe instance storage. They also include IO size-specific latency histograms to provide even more detailed insights into performance patterns of the local NVMe instance storage. These metrics are collected and presented separately for each individual NVMe volume available on an instance.

The statistics are presented in three main formats:

    1. Cumulative counters that track IO operations, throughput, and read/write times
    2. Real-time queue length, displaying the current value at the time of your query
    3. Latency histograms visualizing the distribution of IO operations across different latency ranges by displaying both cumulative view and IO size-specific distributions

Prerequisites

To access detailed performance statistics for local instance storage, complete the following steps:

    1. Launch a new Amazon EC2 Nitro instance or use an existing one, then connect to it using SSH or your preferred connection method.
    2. Identify the NVMe device associated with the local storage to query for the performance statistics. For example, you can run the nvme-cli command in the CLI to output all NVMe devices on the instance.
      $ sudo nvme list

      The following is an example output of the list command that lists the NVMe devices on the instance and their volume Serial Numbers (SN; masked in the below output for privacy). In this demonstration, consider that the local storage used by your application is /dev/nvme1n1.

      Terminal output showing five NVMe devices: one EBS volume and four EC2 instance storage volumes with 3.75TB capacity each

    3. If you are using Amazon Linux 2023 version 2023.8.20250915 (or later) or Amazon Linux 2 2.0.20251014.0 (or later) you can proceed to Step 4 because nvme-cli will use the latest version. If you are using an earlier Amazon Linux version, update the nvme-cli using the following command, where 2023.8.20250915 can be replaced with the latest Amazon Linux 2023 version:
      $ sudo dnf upgrade --releasever=2023.8.20250915
    4. Run the nvme-cli, with the correct permissions, and pass the device as a parameter. You can use --help to get details on the command usage:
      $ sudo nvme amzn stats --help

      Example output:
      Command help output for 'nvme amzn stats' showing usage syntax and format options
      If you prefer output in a JSON format, you can provide the -o json parameter to the command.

      $ sudo nvme amzn stats /dev/nvme1n1 -o json

      The following output (without the -o json parameter) shows cumulative read/write operations, read/write bytes, total processing time (read and write in microseconds), and duration (in microseconds) when the application attempted to exceed the instance’s IOPS/throughput limits.
      Storage performance metrics showing read operations count, total bytes, and timing statistics for an EC2 NVMe volume
      It also displays read/write I/O latency histograms, with each row representing completed I/O operations within a specific bin of time (in microseconds).
      Read latency distribution histogram showing operation counts across different microsecond ranges, with peak activity in 2048-4096 range
      Write latency distribution histogram showing zero operations across all time ranges, indicating no write activity
      If you want to view the latency histograms across 5 different IO bands: (0, 512 Byte], (512B, 4KiB], (4KiB, 8KiB], (8KiB, 32KiB], (32KiB, MAX], you can provide the --details or -d parameter to the command:

      $ sudo nvme amzn stats -d /dev/nvme1n1

      The following image is an excerpt of the above command’s output, showing the additional latency histograms (read and write) of the 5 different IO bands.
      Dual read/write I/O latency histogram analyzing small block operations from 0-512 bytes with peak at 4096-8192 range
      Performance analysis histogram showing I/O patterns for 512-4K blocks with significant activity in 512-1024 range
      Dual histogram showing I/O latency patterns for 4K-8K block operations with concentrated activity at 4096-8192
      Performance analysis histogram displaying I/O patterns for 8K-32K blocks with peak activity in 4096-8192 range
      Comprehensive I/O latency histogram analyzing largest block sizes from 32K to maximum with concentrated activity in 4096-8192

You can run the stats command at a per second granularity. You can also write scripts to pull the stats at a desired interval (every second or any other duration) with each subsequent output reflecting the updated cumulative totals for the metrics. Calculating the difference in the statistics across the last two outputs allows you to derive insight into the instance storage profile during the interval. Below is a sample script you can use to pull the stats at a default interval of 1 second or at your desired interval.

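#!/bin/bash
# Print cumulative NVMe statistics at a fixed interval.
# Usage: ./nvme_stats.sh [interval_seconds]   (default: 1 second)
INTERVAL=${1:-1}
while true; do
	echo "=== $(date) ==="
	# Stop the loop if the stats command fails (for example, wrong device name)
	sudo nvme amzn stats /dev/nvme1n1 || break
	echo
	sleep "$INTERVAL"
done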

You can save this script, make it executable and run it at either the default 1-second interval or provide a custom interval when executing the script. For example, if you saved the script as nvme_stats.sh, you could use the following commands to make it executable and run to get the output at the default 1-second interval (assuming you are in the same directory as that of nvme_stats.sh).

chmod +x nvme_stats.sh
./nvme_stats.sh

If, for instance, you want to get the output every 5 seconds, you can use the following command (after making the script executable):

./nvme_stats.sh 5
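Because the counters are cumulative, the activity during an interval is the difference between two consecutive readings. The helper below is a minimal sketch of that subtraction. The name counter_rate is hypothetical and the sample readings are made up; it assumes you have already extracted the counter values (for example, total read operations) from two outputs.

```shell
#!/bin/bash
# Hypothetical helper: per-second rate from two cumulative counter readings.
# Args: previous_value  current_value  interval_seconds
counter_rate() {
    awk -v prev="$1" -v curr="$2" -v secs="$3" 'BEGIN {
        if (secs <= 0) { print "interval must be > 0" > "/dev/stderr"; exit 1 }
        printf "%.1f\n", (curr - prev) / secs
    }'
}

# Made-up readings of a cumulative operations counter taken 5 seconds apart
counter_rate 120000 170000 5
```

With these sample readings, 50,000 operations completed in 5 seconds, so the helper reports 10000.0 operations per second. The same subtraction applies to the byte counters to derive throughput.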

You can also integrate with CloudWatch using CloudWatch agent to collect and publish these statistics for historical tracking, trend visualization through dashboards, and performance-based alerts to correlate with application metrics and automated notifications for performance issues.

Deriving insights from the Amazon EC2 instance store NVMe detailed performance statistics

Similar to EBS detailed performance statistics, you can use Amazon EC2 instance store NVMe statistics to troubleshoot various workload performance issues. As mentioned in the preceding section, you can also use the detailed statistics to view I/O latency histograms to observe the spread of I/O latency within the period. You can use the read/write operations and time spent statistics to calculate the average latency. The detailed statistics show the average latency at per-second granularity.
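The average latency calculation described above can be sketched as follows. The function name avg_latency_us and the sample readings are hypothetical; the inputs are two readings each of the cumulative operations counter and the cumulative time-spent counter (in microseconds), which you would extract from two stats outputs yourself.

```shell
#!/bin/bash
# Hypothetical helper: average latency (microseconds per I/O) over an interval.
# Args: prev_ops  curr_ops  prev_time_us  curr_time_us
avg_latency_us() {
    awk -v o1="$1" -v o2="$2" -v t1="$3" -v t2="$4" 'BEGIN {
        ops = o2 - o1
        if (ops <= 0) { print "n/a"; exit }   # no I/O completed in the interval
        printf "%.1f\n", (t2 - t1) / ops
    }'
}

# Made-up readings: 4,000 reads completed and took 180,000 us in total
avg_latency_us 10000 14000 500000 680000
```

Dividing the change in time spent by the change in completed operations gives 45.0 microseconds per read for these sample values. Run it separately for the read and write counters.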

The next two example scenarios demonstrate key performance analysis using the statistics. In Scenario 1, we will use the EC2 Instance Local Storage Performance Exceeded (us) metric to check if I/O demands exceed instance storage capabilities, helping with instance right-sizing for sufficient I/O application performance. In Scenario 2, we will use IO-size specific histograms (using --details) to diagnose how large block writes affect subsequent read performance – an issue typically hidden by traditional monitoring tools’ aggregated metrics across all IO sizes.

Scenario 1: Identifying when applications exceed instance storage performance limits

Understanding whether your application’s I/O demands exceed your instance store volumes’ capabilities is important for performance troubleshooting. When applications generate I/O workloads that consistently attempt to exceed the IOPS and throughput limits of specific Amazon EC2 instance types, you’ll experience increased latency and degraded performance. The EC2 Instance Local Storage Performance Exceeded (us) metric helps identify these scenarios by showing the duration (in microseconds) when workloads exceeded supported instance performance. A non-zero value or increasing count between snapshots indicates your current instance size or type may not provide sufficient I/O performance for your application.

The following section shows how to identify if an application is sending more IOPS than the instance’s local storage can support.

The example scenario: An application on an i3en.xlarge instance shows elevated write latency of >1ms. You want to determine if the application’s workload is exceeding the instance’s NVMe volume supported performance.

    1. Select the Instance Storage NVMe device you want to analyze – Identify the instance you want to analyze for the application experiencing elevated latency.
    2. Identify the NVMe device – Use the following nvme-cli command, and identify the NVMe device associated with that instance storage.
      $ sudo nvme list

      Example scenario: We ran the list command and identified /dev/nvme1n1 as the NVMe device associated with the i3en.xlarge instance running the application. The application is currently seeing elevated write latency of >1ms (while read latency is <50us, as under normal conditions), so this is the device we want to analyze.

    3. Collect statistics for the device at a single point in time or at desired intervals – Collect the detailed performance statistics using the nvme-cli command or use the sample script provided in previous section to capture statistics at the desired intervals, if needed.
      $ sudo nvme amzn stats /dev/nvme1n1

      Example scenario: We choose to collect the statistics only once after noticing elevated write latency of the application.

    4. Analyze the statistics to check if the application demands more than the supported performance of the instance storage – Confirm existence of overall I/O latency degradation by comparing two sets of read/write I/O latency histograms taken some time apart.

      Example scenario: The following output shows Read IO histogram of the NVMe local instance storage taken 40 seconds apart with no read IO latency issues (as normal read latency for this workload is < 50 us).

      Metric captured at time T:
      AWS EC2 storage performance histogram showing read latency distribution, peak at 16-32 microsecond bucket
      Metric captured at time T+40s:
      AWS EC2 storage performance data showing increased read latency concentration in 16-32 microsecond bucket
      The following output shows Write IO histogram taken 40 seconds apart. We can discern that many write IOs fall into the 1ms – 2ms latency range, which is not expected for this application.
      Metric captured at time T:
      AWS EC2 storage write performance data showing majority of operations between 1-2ms latency
      Metric captured at time T+40s:
      AWS EC2 storage performance metrics showing increased write operations clustered in 1-2ms latency range

    5. Analyze the EC2 Instance Local Storage Performance Exceeded (us) metric which shows total time (in microseconds) IOPS requests exceed volume limits. Ideally, the incremental count of this metric between two snapshot times should be minimal, as any value above 0 indicates that the workload demanded more IOPS than the volume could deliver.

      Example scenario: Comparing metrics 40 seconds apart shows that for more than 34 seconds, the application’s IOPS demands surpassed the IOPS supported by the local instance storage. This explains elevated write latency: excess IOPS above what the underlying storage can physically handle queue the operations, increasing wait times. This indicates that the i3en.xlarge instance chosen to run this application cannot meet the application’s performance requirements, suggesting either upgrading to a larger instance size or re-evaluating the instance type itself.
      Metric captured at time T:
      EC2 Instance Local Storage Performance exceeded output of nvme-cli for the described scenario at time T
      Metric captured at time T+40s:
      EC2 Instance Local Storage Performance exceeded output of nvme-cli for the described scenario at time T+40 with increased count of metric
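The comparison in step 5 amounts to dividing the growth of the counter by the length of the interval. The sketch below shows that calculation. The name exceeded_pct and the sample readings are hypothetical; the two inputs are the EC2 Instance Local Storage Performance Exceeded (us) values you read from two stats outputs.

```shell
#!/bin/bash
# Hypothetical helper: share of an interval during which I/O demand exceeded
# the instance store limits, from two readings of the
# "EC2 Instance Local Storage Performance Exceeded (us)" counter.
# Args: prev_exceeded_us  curr_exceeded_us  interval_seconds
exceeded_pct() {
    awk -v prev="$1" -v curr="$2" -v secs="$3" 'BEGIN {
        printf "%.1f\n", (curr - prev) / (secs * 1000000) * 100
    }'
}

# Made-up readings 40 seconds apart: the counter grew by 34 seconds' worth
exceeded_pct 2000000 36000000 40
```

For these sample readings the counter grew by 34,000,000 microseconds over a 40-second window, so the helper reports 85.0 percent. A value near zero suggests the instance store limits are not the bottleneck.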

It’s important to have the right instance size to avoid performance bottlenecks to your application. Refer to the Amazon EC2 instance documentation for more information on the different instances and their storage size.

Scenario 2: Identifying the block size causing elevated latency in your applications

Many storage performance issues arise from complex interactions between read and write operations with different I/O sizes, which traditional system-level monitoring tools like iostat or sar cannot effectively diagnose because they aggregate metrics across all I/O sizes. EC2 instance store NVMe detailed performance statistics address this by providing I/O-size specific latency histograms through the --details option in NVMe CLI. These histograms show latency data for different I/O size ranges: (0, 512 Byte], (512B, 4KiB], (4KiB, 8KiB], (8KiB, 32KiB], (32KiB, MAX], enabling a more precise correlation between application workload patterns and I/O size-specific latency metrics for targeted optimizations.

In this example scenario, your application performs small reads (typically <=4KiB, like metadata read) followed by large writes (>=32KiB) and shows unexpectedly high read latency. This common issue occurs when large writes impact subsequent read operations’ performance, creating a cascading effect on overall I/O performance.

    1. Gather read and write IO latency by size ranges – Use the NVMe CLI with the --details option to gather read and write IO latency by size ranges:
      $ sudo nvme amzn stats /dev/nvme1n1 --details

    2. Confirm existence of overall IO latency degradation – In the example scenario, examining overall IO latency, both read (left) and write (right) operations are showing higher than expected latency.
      NVMe storage read latency histogram highlighting concentrated IO operations in 4K-16K microsecond range
      NVMe storage write latency histogram highlighting concentrated IO operations in 8-32K microsecond range
    3. Examine the output for patterns across different IO size bands – Analyzing latency by operation sizes shows small read operations (512 bytes to 4K), typically fast, are experiencing unexpected latency spikes while large writes (32K+) show significant delays. Small reads should theoretically maintain good performance regardless of other I/O activities.
      NVMe storage read/write latency histogram highlighting concentrated IO operations in 8-16K microsecond range for IO band of 512 - 4K
      NVMe storage read/write latency histogram highlighting concentrated IO operations in 8-16K microsecond range in IO band 32K and above
      The observed pattern indicates that the backed-up large write operations create system-wide congestion, affecting I/O operations of all types and sizes. Despite the storage system’s capability to handle small reads efficiently, the queued large writes slow down both read and write operations at the application level.

Based on this analysis, we can implement several targeted optimizations to the application, like using smaller block sizes for write operations when possible, or batching smaller writes instead of performing large single writes.
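To compare IO size bands before and after such a change, it can help to reduce each histogram to a single number, such as the share of operations that landed in slow bins. The following is a rough sketch under stated assumptions. The name slow_io_pct is hypothetical, and the three-column input (lower bound, upper bound, count) is an assumed intermediate format that you would produce yourself, not the literal nvme-cli output layout.

```shell
#!/bin/bash
# Hypothetical helper: percentage of I/Os whose latency bin starts at or above
# a threshold. Reads "lower_us upper_us count" rows on stdin; this layout is an
# assumed intermediate format, not literal nvme-cli output.
# Args: threshold_us
slow_io_pct() {
    awk -v thr="$1" '
        NF == 3 { total += $3; if ($1 >= thr) slow += $3 }
        END {
            if (total == 0) { print "n/a"; exit }
            printf "%.1f\n", slow / total * 100
        }'
}

# Made-up histogram rows for the (512B, 4KiB] read band
slow_io_pct 8192 <<'EOF'
64 128 100
128 256 300
8192 16384 600
EOF
```

In this made-up sample, 600 of 1,000 small reads fall in bins starting at 8,192 microseconds or higher, so the helper reports 60.0. Tracking this figure per band makes it easier to see whether a change to write sizes actually relieves the small reads.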

Clean up

If you created an Amazon EC2 instance with NVMe volume for this exercise, then terminate and delete the appropriate instance to avoid future costs.

Conclusion

Amazon EC2 detailed performance statistics for instance store NVMe volumes provide real-time, sub-minute storage performance monitoring, similar to the detailed performance statistics available for Amazon EBS volumes. This offers a consistent monitoring experience across both storage types, with additional IO-size based latency histograms for instance storage for better optimization of I/O patterns and more effective troubleshooting.

To learn more about Amazon EC2 instance store NVMe volumes, optimization techniques for latency-sensitive workloads or other Amazon EC2 related topics, visit the Amazon EC2 documentation page or explore our other AWS Storage Blog posts on performance optimization.

We’d love to hear how you’re using these statistics to enhance your workloads, or if you have any questions, in the comments section below.

Accelerate large-scale AI applications with the new Amazon EC2 P6-B300 instances

Post Syndicated from Veliswa Boya original https://aws.amazon.com/blogs/aws/accelerate-large-scale-ai-applications-with-the-new-amazon-ec2-p6-b300-instances/

Today, we’re announcing the general availability of Amazon Elastic Compute Cloud (Amazon EC2) P6-B300 instances, our next-generation GPU platform accelerated by NVIDIA Blackwell Ultra GPUs. These instances deliver 2 times more networking bandwidth, and 1.5 times more GPU memory compared to previous generation instances, creating a balanced platform for large-scale AI applications.

With these improvements, P6-B300 instances are ideal for training and serving large-scale AI models, particularly those employing sophisticated techniques such as Mixture of Experts (MoE) and multimodal processing. For organizations working with trillion-parameter models and requiring distributed training across thousands of GPUs, these instances provide the perfect balance of compute, memory, and networking capabilities.

Improvements made compared to predecessors
The P6-B300 instances deliver 6.4Tbps Elastic Fabric Adapter (EFA) networking bandwidth, supporting efficient communication across large GPU clusters. These instances feature 2.1TB of GPU memory, allowing large models to reside within a single NVLink domain, which significantly reduces model sharding and communication overhead. When combined with EFA networking and the advanced virtualization and security capabilities of AWS Nitro System, these instances provide unprecedented speed, scale, and security for AI workloads.

The specs for the EC2 P6-B300 instances are as follows.

Instance size | vCPUs | System memory | GPUs | GPU memory | GPU-GPU interconnect | EFA network bandwidth | ENA bandwidth | EBS bandwidth | Local storage
P6-B300.48xlarge | 192 | 4TB | 8x B300 GPU | 2144GB HBM3e | 1800 GB/s | 6.4 Tbps | 300 Gbps | 100 Gbps | 8x 3.84TB

Good to know
In terms of persistent storage, AI workloads primarily use a combination of high performance persistent storage options such as Amazon FSx for Lustre, Amazon S3 Express One Zone, and Amazon Elastic Block Store (Amazon EBS), depending on price performance considerations. For illustration, the dedicated 300Gbps Elastic Network Adapter (ENA) networking on P6-B300 enables high-throughput hot storage access with S3 Express One Zone, supporting large-scale training workloads. If you’re using FSx for Lustre, you can now use EFA with GPUDirect Storage (GDS) to achieve up to 1.2Tbps of throughput to the Lustre file system on the P6-B300 instances to quickly load your models.

Available now
The P6-B300 instances are now available through Amazon EC2 Capacity Blocks for ML and Savings Plans in the US West (Oregon) AWS Region.
For on-demand reservation of P6-B300 instances, please reach out to your account manager. As usual with Amazon EC2, you pay only for what you use. For more information, refer to Amazon EC2 Pricing. Check out the full collection of accelerated computing instances to help you start migrating your applications.

To learn more, visit our Amazon EC2 P6-B300 instances page. Send feedback to AWS re:Post for EC2 or through your usual AWS Support contacts.

– Veliswa

How to automate Session Manager preferences across your organization

Post Syndicated from Nima Fotouhi original https://aws.amazon.com/blogs/security/how-to-automate-session-manager-preferences-across-your-organization/

AWS Systems Manager Session Manager is a fully managed service that provides secure, interactive, one-click access to your Amazon Elastic Compute Cloud (Amazon EC2) instances, edge devices, and virtual machines (VMs) through a browser-based shell or AWS Command Line Interface (AWS CLI), without requiring open inbound ports, bastion hosts, or SSH keys. Session Manager helps you maintain security compliance and controlled access while providing users with access to managed nodes. When starting a session, you must specify a preferences document (known as the Session Manager preferences document) to set the session parameters.

Managing these preferences consistently across multiple AWS Regions and accounts in a large organization can be challenging. Organizations often need to maintain standardized security settings, logging configurations, and session controls across their entire AWS footprint. Manual configuration of these preferences in each Region and account is not only time-consuming but also prone to human error and can lead to security gaps or compliance violations. Additionally, tracking and maintaining these configurations becomes increasingly complex as the organization scales.

You can use Session Manager to control various session options including data encryption for session data in transit and session logs at rest, session duration, and logging. For example, you can specify whether to store session log data in an Amazon Simple Storage Service (Amazon S3) bucket or Amazon CloudWatch Logs log group. In this post, I demonstrate how to manage Session Manager preferences across your organization using AWS CloudFormation StackSets. You can use CloudFormation StackSets to manage resources and configurations, such as Session Manager preferences, across different AWS accounts and Regions using standardized templates to maintain consistent security and compliance standards across your entire AWS infrastructure.

Prerequisites

You need to meet the following prerequisites to deploy the solution in this post:

  • Basic understanding of CloudFormation
  • Trusted access enabled between CloudFormation StackSets and AWS Organizations
  • Access to an AWS management account or StackSet delegated admin account
  • Appropriate AWS Identity and Access Management (IAM) permissions to create and manage StackSets

The Session Manager environment has some additional prerequisites:

  • For EC2 instances with internet access, allow HTTPS (port 443) outbound traffic to:
    • ec2messages.<region>.amazonaws.com
    • ssm.<region>.amazonaws.com
    • ssmmessages.<region>.amazonaws.com

    Note: <region> represents the actual Region where you are deploying your instances.

  • Additional endpoints required for specific features:
    • For CloudWatch Logs integration: logs.<region>.amazonaws.com
    • For Amazon S3 log storage: s3.<region>.amazonaws.com
    • For session data encryption: kms.<region>.amazonaws.com

    Note: For EC2 instances without internet access, you must configure virtual private cloud (VPC) endpoints to maintain connectivity with Systems Manager and related services.

  • SSM Agent requirements:
    • Minimum version 2.3.68.0 for basic session connectivity
    • Version 3.0.222.0 or later for port forwarding and SSH sessions

    Note: Many AWS-provided and trusted third-party Amazon Machine Images (AMIs) come with the SSM Agent pre-installed. For more information, see Find AMIs with the SSM Agent preinstalled.

For a complete list of requirements, see Setting up Session Manager.
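As a quick illustration of the SSM Agent version requirements, the sketch below classifies a version string against the two minimums listed in the prerequisites. The function names version_at_least and check_agent are hypothetical, the sample version is made up, and the comparison relies on GNU sort -V being available on your host. How you obtain the installed agent version is left to you.

```shell
#!/bin/bash
# Hypothetical helper: succeed if version $1 is at least version $2.
# Relies on GNU sort -V (version sort), which is an assumption about your host.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Minimum versions quoted in the prerequisites
MIN_BASIC="2.3.68.0"
MIN_PORT_FWD="3.0.222.0"

# Report which Session Manager features a given agent version supports.
check_agent() {
    local v="$1"
    if version_at_least "$v" "$MIN_PORT_FWD"; then
        echo "$v: sessions, port forwarding, and SSH supported"
    elif version_at_least "$v" "$MIN_BASIC"; then
        echo "$v: basic sessions only"
    else
        echo "$v: too old for Session Manager"
    fi
}

# Made-up version string for illustration
check_agent "3.1.1732.0"
```

A plain string comparison would mis-order versions such as 2.3.68.0 and 2.3.1205.0, which is why the sketch uses a version-aware sort.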

Solution overview

This solution, shown in Figure 1, automatically configures the SSM-SessionManagerRunShell document with customizable preferences that govern how Session Manager behaves across your AWS accounts. It creates resources for logging, encryption, and session controls, and updates the SSM-SessionManagerRunShell document with these preferences. The document is updated by an AWS Lambda function that helps make sure that the preferences are correctly applied. It transforms the default Session Manager preferences document to meet your enterprise compliance requirements. Changes are deployed using the CloudFormation template provided in the GitHub repository. The solution supports multiple logging destinations, encryption options, and session controls to meet various security and compliance requirements.

Figure 1: Solution overview

Walkthrough

To deploy the solution, complete the following steps.

Step 1: Download or clone the repository

The first step is to download or clone the GitHub repository.

To download the repository:

  1. Go to the main page of the repository on GitHub.
  2. Choose Code and then choose Download ZIP.

To clone the repository:

  1. Make sure that you have Git installed.
  2. Run the following command in your terminal:
    git clone https://github.com/aws-samples/<repo-link>

Step 2: Create the CloudFormation StackSet

In this step, you deploy the solution’s resources by creating a CloudFormation StackSet using the provided CloudFormation template. Sign in to your management account or StackSet delegated admin account. To create the stack, follow the steps in Get started with StackSets using a sample template. Create the StackSet in each of the accounts and Regions where you plan to implement the solution. Note that you need to provide values for the parameters defined in the template to deploy the stack. The following table lists the parameters that you need to provide.

| Parameter | Description |
| --- | --- |
| S3Logging | Enables storing session logs in an S3 bucket. |
| S3BucketName | Name of the S3 bucket for session logs. The bucket must exist or the deployment will fail. |
| S3KeyPrefix | Key prefix for session logs. The account ID and Region are appended to the prefix. |
| S3EncryptionEnabled | If set to true, the S3 bucket you specified in the S3BucketName parameter must be encrypted. |
| CreateCWLogGroup | If set to true, the stack creates a CloudWatch log group. Otherwise, the existing log group named in CWLogGroupName is used. |
| CWLogGroupName | The name of the CloudWatch log group you want to send session logs to. |
| CWEncryptionEnabled | If set to true, the CloudWatch log group you specified in the CWLogGroupName parameter must be encrypted. |
| CWStreamingEnabled | If set to true, a continual stream of session data logs is sent to the log group. |
| SessionDataEncryption | If set to true, session data is encrypted with a key created by the stack. |
| RunAsEnabled | If set to true, sessions run as a user other than ssm-user. The Run As feature is only supported for connecting to Linux and macOS managed nodes. |
| RunAsDefaultUser | The name of the user account to start sessions with on Linux and macOS managed nodes when the RunAsEnabled parameter is set to true. |
| IdleSessionTimeout | The amount of inactivity, in minutes, that you want to allow before a session ends. |
| MaxSessionDuration | The maximum amount of time, in minutes, that you want to allow before a session ends. |
| WinShellProfile | The shell preferences, environment variables, working directories, and commands you specify for sessions on Windows Server managed nodes. |
| LinuxShellProfile | The shell preferences, environment variables, working directories, and commands you specify for sessions on Linux and macOS managed nodes. |
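To make the relationship between these parameters and the preferences document more concrete, the following is a minimal sketch of how they could map onto the inputs of an SSM-SessionManagerRunShell document. The field names under inputs follow the Session Manager preferences document schema. The function name build_preferences_document, the kms_key_id argument, and the mapping rules are assumptions for illustration and are not taken from the solution's Lambda code.

```python
def build_preferences_document(params, kms_key_id=""):
    """Map template-style parameters onto Session Manager preference inputs.

    Hypothetical helper: the real solution may apply different rules.
    """
    s3_on = bool(params.get("S3Logging"))
    inputs = {
        "s3BucketName": params.get("S3BucketName", "") if s3_on else "",
        "s3KeyPrefix": params.get("S3KeyPrefix", "") if s3_on else "",
        "s3EncryptionEnabled": bool(params.get("S3EncryptionEnabled", False)),
        "cloudWatchLogGroupName": params.get("CWLogGroupName", ""),
        "cloudWatchEncryptionEnabled": bool(params.get("CWEncryptionEnabled", False)),
        "cloudWatchStreamingEnabled": bool(params.get("CWStreamingEnabled", False)),
        # The key ID is only set when session data encryption is requested.
        "kmsKeyId": kms_key_id if params.get("SessionDataEncryption") else "",
        "runAsEnabled": bool(params.get("RunAsEnabled", False)),
        "runAsDefaultUser": params.get("RunAsDefaultUser", ""),
        # Timeouts are stored as strings of minutes in the document.
        "idleSessionTimeout": str(params.get("IdleSessionTimeout", "")),
        "maxSessionDuration": str(params.get("MaxSessionDuration", "")),
        "shellProfile": {
            "windows": params.get("WinShellProfile", ""),
            "linux": params.get("LinuxShellProfile", ""),
        },
    }
    return {
        "schemaVersion": "1.0",
        "description": "Document to hold regional settings for Session Manager",
        "sessionType": "Standard_Stream",
        "inputs": inputs,
    }
```

Keeping the mapping in one pure function makes it straightforward to unit test the resulting document before anything is written to Systems Manager.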

Step 3: Update your EC2 instance profiles with proper permissions

Depending on the parameter values you pass when deploying the template, you need to update your EC2 instance profiles with proper permissions. For example, if you have enabled session data and session log encryption, you need to add the following policy to your instance profiles.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "logs:DescribeLogGroups"
            ],
            "Resource": "*",
            "Effect": "Allow"
        },
        {
            "Action": [
                "logs:DescribeLogStreams"
            ],
            "Resource": "<arn:aws:logs:*:123456789012:log-group:ssm-sessionmanager-logs>",
            "Effect": "Allow"
        },
        {
            "Action": [
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": "arn:aws:logs:*:123456789012:log-group:ssm-sessionmanager-logs:log-stream:*",
            "Effect": "Allow"
        },
        {
            "Condition": {
                "Null": {
                    "kms:ResourceAliases": "false"
                },
                "ForAllValues:StringLike": {
                    "kms:ResourceAliases": [
                        "alias/session-manager/data"
                    ]
                }
            },
            "Action": [
                "kms:Decrypt"
            ],
            "Resource": "arn:aws:kms:us-east-1:123456789012:key/*",
            "Effect": "Allow"
        }
    ]
}

Note: If you enable S3 logging, you need to add the required permissions for that as well. See the Configure a central S3 bucket for Session Manager logging article on AWS re:Post for more information about how to properly configure your S3 bucket and EC2 instance profile for centralized logging. Same-account logging follows a similar pattern.
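Because the required permissions depend on which features you enable, it can help to assemble the policy programmatically. The following is a minimal sketch that reuses the statements from the preceding policy. The function name build_instance_profile_policy and its arguments are hypothetical, and the S3 statements (s3:PutObject and s3:GetEncryptionConfiguration) are an assumption based on the permissions Session Manager commonly needs for S3 logging, so check them against the re:Post article before use.

```python
def build_instance_profile_policy(account_id, region, log_group=None,
                                  s3_bucket=None, session_data_encryption=False):
    """Assemble instance profile statements for the enabled features only."""
    statements = []
    if log_group:
        group_arn = f"arn:aws:logs:*:{account_id}:log-group:{log_group}"
        statements += [
            {"Effect": "Allow", "Action": ["logs:DescribeLogGroups"], "Resource": "*"},
            {"Effect": "Allow", "Action": ["logs:DescribeLogStreams"], "Resource": group_arn},
            {"Effect": "Allow",
             "Action": ["logs:CreateLogStream", "logs:PutLogEvents"],
             "Resource": f"{group_arn}:log-stream:*"},
        ]
    if s3_bucket:
        # Assumed S3 logging permissions; not part of the policy shown above.
        statements += [
            {"Effect": "Allow", "Action": ["s3:PutObject"],
             "Resource": f"arn:aws:s3:::{s3_bucket}/*"},
            {"Effect": "Allow", "Action": ["s3:GetEncryptionConfiguration"], "Resource": "*"},
        ]
    if session_data_encryption:
        statements.append({
            "Effect": "Allow",
            "Action": ["kms:Decrypt"],
            "Resource": f"arn:aws:kms:{region}:{account_id}:key/*",
            # Restrict decryption to the key behind the solution's alias.
            "Condition": {
                "Null": {"kms:ResourceAliases": "false"},
                "ForAllValues:StringLike": {
                    "kms:ResourceAliases": ["alias/session-manager/data"]
                },
            },
        })
    return {"Version": "2012-10-17", "Statement": statements}
```

You could serialize the result with json.dumps and attach it to the instance role as an inline or managed policy.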

Step 4: Verify the solution implementation

You can verify that the Session Manager preferences are correctly configured across your environment. Here’s a systematic approach to validation:

Verify preference configuration

Through the AWS Management Console, navigate to AWS Systems Manager Session Manager, choose Preferences and review the configured Session Manager preferences. Alternatively, verify the configuration through AWS CLI using:

aws ssm get-document --name "SSM-SessionManagerRunShell" --document-version \$LATEST
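If you want to automate this check across accounts, the following sketch compares the Content returned by get-document with the values you expect. The helper names find_mismatches and fetch_content are hypothetical, and ssm_client is assumed to be a Boto3 Systems Manager client that you create yourself.

```python
import json


def find_mismatches(document_content, expected_inputs):
    """Return {input name: (actual, expected)} for every input that differs."""
    inputs = json.loads(document_content).get("inputs", {})
    return {
        name: (inputs.get(name), expected)
        for name, expected in expected_inputs.items()
        if inputs.get(name) != expected
    }


def fetch_content(ssm_client, name="SSM-SessionManagerRunShell"):
    """Read the latest version of the preferences document."""
    response = ssm_client.get_document(Name=name, DocumentVersion="$LATEST")
    return response["Content"]
```

An empty result from find_mismatches means the document matches the expected preferences. Anything else lists the actual and expected value for each differing input.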

Validate session functionality

Start a new session following the AWS Systems Manager User Guide and perform the following validations:

  1. Verify the encryption configuration by starting a new session. If session data encryption is enabled, you should see the message This session is encrypted using AWS KMS when the session begins.
  2. For CloudWatch logging verification, navigate to the CloudWatch console and access the Log groups section. Confirm that your specified log group exists and KMS encryption is enabled if you configured it during deployment. Execute some commands in your session and observe the real-time log streaming to your configured log group.
  3. To verify S3 logging, establish a session and execute several commands. Terminate the session and check your configured S3 bucket for the session logs. Remember that S3 logs are only generated after the session is terminated.
  4. If you enabled the RunAsEnabled option, verify the configuration by executing the whoami command in your session. The output should match your configured RunAs user.

Resources

The following is a list of resources created by this solution:

AWS::Lambda::Function (UpdateSessionManagerFunction)
This resource creates a Lambda function that:

  • Updates the SSM-SessionManagerRunShell document with the specified preferences
  • Handles CloudFormation create, update, and delete events
  • Performs deep comparison of document contents to avoid unnecessary updates
  • Includes error handling and retry logic
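The deep comparison step can be illustrated with a short sketch. It parses the current document and compares it with the desired preferences as nested dictionaries, so differences in key order or whitespace do not trigger an update. The function names are hypothetical and this is not the solution's actual Lambda code. ssm_client is assumed to be a Boto3 Systems Manager client.

```python
import json


def needs_update(current_content, desired):
    """Return True when the stored document differs from the desired one."""
    try:
        current = json.loads(current_content)
    except (TypeError, ValueError):
        # Unparseable content is treated as different.
        return True
    # Dictionary equality in Python is deep and ignores key order.
    return current != desired


def apply_preferences(ssm_client, desired, name="SSM-SessionManagerRunShell"):
    """Update the preferences document only when its content changed."""
    current_content = ssm_client.get_document(Name=name)["Content"]
    if not needs_update(current_content, desired):
        return "unchanged"
    ssm_client.update_document(
        Name=name,
        Content=json.dumps(desired),
        DocumentVersion="$LATEST",
        DocumentFormat="JSON",
    )
    return "updated"
```

Skipping the call when nothing changed avoids creating unnecessary document versions on repeated stack updates.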

AWS::IAM::Role (LambdaExecutionRole)
This resource creates an IAM role that allows the Lambda function to:

  • Execute with basic Lambda permissions
  • Access and modify the SSM-SessionManagerRunShell document
  • Access the SSM parameter that stores the session data encryption key ID

AWS::KMS::Key (SessionDataKMSKey)
This conditional resource creates a KMS key for encrypting session data when the SessionDataEncryption parameter is set to enabled. The key has a policy allowing key management with IAM.

AWS::KMS::Alias (SessionDataKeyAlias)
This conditional resource creates a friendly alias (alias/session-manager/data) for the session data encryption key. This value cannot be changed.

AWS::SSM::Parameter (SessionKeyID)
This conditional resource creates a Systems Manager parameter to store the KMS key ID for session data encryption, making it accessible to other components.

Note: The session data KMS key ID is stored in a Systems Manager parameter to decouple components and help prevent circular dependencies and failures due to race conditions.

AWS::KMS::Key (SessionLogsKMSKey)
This conditional resource creates a KMS key for encrypting CloudWatch logs when the CWEncryptionEnabled parameter is set to enabled. The key has a policy allowing the CloudWatch Logs service to use it.

Note: SessionLogsKMSKey is used to encrypt logs at rest and is not used by the SSM Agent, so your instance profile does not need permission to use this key. Logs are encrypted in transit and are encrypted by the CloudWatch service after they are received.

AWS::KMS::Alias (SessionLogsKeyAlias)
This conditional resource creates a friendly alias (alias/session-manager/logs) for the CloudWatch Logs encryption key.

AWS::Logs::LogGroup (SessionManagerLogGroup)
This conditional resource creates a CloudWatch Logs log group for session logs when the CreateCWLogGroup parameter is set to enabled. The log group:

  • Uses the specified name (controlled by the CWLogGroupName parameter, and defaults to ssm-sessionmanager-logs)
  • Sets a 90-day retention period
  • Uses KMS encryption if enabled

Custom::UpdateSessionManager (UpdateSessionManagerCustomResource)
This custom resource invokes the Lambda function to update the SSM-SessionManagerRunShell document with the specified preferences.

Parameter groups

The following template parameters are available for customizing Session Manager behavior:

| Parameter group | Parameters | Description |
| --- | --- | --- |
| S3 logging | S3Logging, S3BucketName, S3KeyPrefix, S3EncryptionEnabled | Controls logging to Amazon S3 |
| CloudWatch logging | CreateCWLogGroup, CWLogGroupName, CWEncryptionEnabled, CWStreamingEnabled | Controls logging to CloudWatch Logs |
| Encryption | SessionDataEncryption | Controls encryption of session data |
| Session controls | RunAsEnabled, RunAsDefaultUser, IdleSessionTimeout, MaxSessionDuration | Controls session behavior |
| Shell profiles | WinShellProfile, LinuxShellProfile | Controls shell environment |
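When you create the StackSet through the API instead of the console, these parameters are passed as a list of ParameterKey and ParameterValue pairs. The following sketch converts a plain dictionary into that shape and rejects names that are not in the preceding groups. The helper name to_stackset_parameters is hypothetical, and rendering Booleans as the strings true and false is an assumption about the values the template accepts.

```python
PARAMETER_GROUPS = {
    "S3 logging": ["S3Logging", "S3BucketName", "S3KeyPrefix", "S3EncryptionEnabled"],
    "CloudWatch logging": ["CreateCWLogGroup", "CWLogGroupName",
                           "CWEncryptionEnabled", "CWStreamingEnabled"],
    "Encryption": ["SessionDataEncryption"],
    "Session controls": ["RunAsEnabled", "RunAsDefaultUser",
                         "IdleSessionTimeout", "MaxSessionDuration"],
    "Shell profiles": ["WinShellProfile", "LinuxShellProfile"],
}


def to_stackset_parameters(values):
    """Convert {name: value} into CloudFormation-style parameter entries."""
    known = {name for names in PARAMETER_GROUPS.values() for name in names}
    unknown = sorted(set(values) - known)
    if unknown:
        raise ValueError(f"Unknown parameters: {', '.join(unknown)}")
    parameters = []
    for key, value in values.items():
        if isinstance(value, bool):
            # Assumed convention: Booleans are passed as lowercase strings.
            value = "true" if value else "false"
        parameters.append({"ParameterKey": key, "ParameterValue": str(value)})
    return parameters
```

The returned list can be passed as the Parameters argument of the Boto3 CloudFormation create_stack_set call.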

Conclusion

In this post, we explored how to implement and manage Session Manager preferences across your organization using CloudFormation StackSets. This solution enables centralized management of Session Manager configurations across multiple accounts and Regions from a single account, significantly simplifying the administration of remote access to your compute resources. Through automated deployment of security controls including session encryption, logging, and access restrictions, the solution helps facilitate consistent compliance with organizational security requirements while reducing manual configuration efforts and the risk of human error. As your organization grows, this solution scales seamlessly to accommodate new accounts and Regions while maintaining uniform security standards across your infrastructure.

Remember to regularly review and update your Session Manager preferences to align with evolving security requirements and organizational needs. For more information about AWS Systems Manager Session Manager, visit the official AWS documentation.

If you have feedback about this post, submit comments in the Comments section below.

Nima Fotouhi

Nima is a Security Consultant at AWS. He’s a builder with a passion for infrastructure as code (IaC) and policy as code (PaC) and helps customers build secure infrastructure on AWS. In his spare time, he loves to hit the slopes and go snowboarding.

AWS Weekly Roundup: Kiro waitlist, EBS Volume Clones, EC2 Capacity Manager, and more (October 20, 2025)

Post Syndicated from Veliswa Boya original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-kiro-waitlist-ebs-volume-clones-ec2-capacity-manager-and-more-october-20-2025/

I’ve been inspired by all the activities that tech communities around the world have been hosting and participating in throughout the year. Here in the southern hemisphere we’re starting to dream about our upcoming summer breaks and closing out on some of the activities we’ve initiated this year. The tech community in South Africa is participating in Amazon Q Developer coding challenges that my colleagues and I are hosting throughout this month as a fun way to wind down activities for the year. The first one was hosted in Johannesburg last Friday with Durban and Cape Town coming up next.

Last week’s launches
These are the launches from last week that caught my attention:

Additional updates
I thought these projects, blog posts, and news items were also interesting:

Upcoming AWS events
Keep a look out and be sure to sign up for these upcoming events:

AWS re:Invent 2025 (December 1-5, 2025, Las Vegas) — AWS flagship annual conference offering collaborative innovation through peer-to-peer learning, expert-led discussions, and invaluable networking opportunities.

Join the AWS Builder Center to learn, build, and connect with builders in the AWS community. Browse here for upcoming in-person and virtual developer-focused events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

Veliswa.

Monitor, analyze, and manage capacity usage from a single interface with Amazon EC2 Capacity Manager

Post Syndicated from Esra Kayabali original https://aws.amazon.com/blogs/aws/monitor-analyze-and-manage-capacity-usage-from-a-single-interface-with-amazon-ec2-capacity-manager/

Today, I’m happy to announce Amazon EC2 Capacity Manager, a centralized solution to monitor, analyze, and manage capacity usage across all accounts and AWS Regions from a single interface. This service aggregates capacity information with hourly refresh rates and provides prioritized optimization opportunities, streamlining capacity management workflows that previously required custom automation or manual data collection from multiple AWS services.

Organizations using Amazon Elastic Compute Cloud (Amazon EC2) at scale operate hundreds of instance types across multiple Availability Zones and accounts, using On-Demand Instances, Spot Instances, and Capacity Reservations. This complexity means customers currently access capacity data through various AWS services including the AWS Management Console, Cost and Usage Reports, Amazon CloudWatch, and EC2 describe APIs. This distributed approach can create operational overhead through manual data collection, context switching between tools, and the need for custom automation to aggregate information for capacity optimization analysis.

EC2 Capacity Manager helps you overcome these operational complexities by consolidating all capacity data into a unified dashboard. You can now view cross-account and cross-Region capacity metrics for On-Demand Instances, Spot Instances, and Capacity Reservations across all commercial AWS Regions from a single location, eliminating the need to build custom data collection tools or navigate between multiple AWS services.

This consolidated visibility can help you discover cost savings by highlighting underutilized Capacity Reservations, analyzing usage patterns across instance types, and providing insights into Spot Instance interruption patterns. By having access to comprehensive capacity data in one place, you can make more informed decisions about rightsizing your infrastructure and optimizing your EC2 spending.

Let me show you the capabilities of EC2 Capacity Manager in detail.

Getting started with EC2 Capacity Manager
On the AWS Management Console, I navigate to Amazon EC2 and select Capacity Manager from the navigation pane. I enable EC2 Capacity Manager through the service settings. The service aggregates historical data from the previous 14 days during initial setup.

The main Dashboard displays capacity utilization across all instance types through a comprehensive overview section that presents key metrics at a glance. The capacity overview cards for Reservations, Usage, and Spot show trend indicators and percentage changes to help you identify capacity patterns quickly. You can apply filtering through the date filter controls, which include date range selection, time zone configuration, and interval settings.

You can select different units to analyze data by vCPUs, instance counts, or estimated costs to understand resource consumption patterns. Estimated costs are based on published On-Demand rates and do not include Savings Plans or other discounts. This pricing reference helps you compare the relative impact of underutilized capacity across different instance types—for example, 100 vCPU hours of unused p5 reservations represents a larger cost impact than 100 vCPU hours of unused t3 reservations.
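The comparison can be worked through with a few lines of arithmetic. The rates in this sketch are hypothetical placeholders chosen only to show the calculation. They are not published On-Demand prices, and the function name is an assumption for illustration.

```python
# Hypothetical On-Demand rates per vCPU hour, for illustration only.
HYPOTHETICAL_RATE_PER_VCPU_HOUR = {"p5": 0.50, "t3": 0.05}


def estimated_unused_cost(instance_family, unused_vcpu_hours,
                          rates=HYPOTHETICAL_RATE_PER_VCPU_HOUR):
    """Estimate the cost of unused reserved capacity for one instance family."""
    return unused_vcpu_hours * rates[instance_family]


p5_cost = estimated_unused_cost("p5", 100)
t3_cost = estimated_unused_cost("t3", 100)
# The same number of unused vCPU hours has a larger impact on the pricier family.
print(f"p5: {p5_cost:.2f}  t3: {t3_cost:.2f}")
```

With real rates substituted, the same calculation shows why sorting unused capacity by estimated cost can surface different priorities than sorting by vCPU hours.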

The dashboard includes detailed Usage metrics with both total usage visualization and usage over time charts. The total usage section shows the breakdown between reserved usage, unreserved usage, and Spot usage. The usage over time chart provides visualization that tracks capacity trends over time, helping you identify usage patterns and peak demand periods.

Under Reservation metrics, Reserved capacity trends visualizes used and unused reserved capacity across the selected period, showing the proportion of reserved vCPU hours that remain unutilized compared with those actively consumed, helping you track reservation efficiency patterns and identify periods of consistent low utilization. This visibility can help you reduce costs by identifying underutilized reservations and helping you to make informed decisions about capacity adjustments.

The Unused capacity section lists underutilized capacity reservations by instance type and Availability Zone combinations, displaying specific utilization percentages and instance types across different Availability Zones. This prioritized list helps you identify potential savings with direct visibility into unused capacity costs.

The Usage tab provides detailed historical trends and usage statistics across all AWS Regions for Spot Instances, On-Demand Instances, Capacity Reservations, Reserved Instances, and Savings Plans. Dedicated Hosts usage is not included. The Dimension filter helps you group by and filter capacity data by Account ID, Region, Instance Family, Availability Zone, and Instance Type, creating custom views that reveal usage patterns across your accounts and AWS Organizations. This helps you analyze specific configurations and compare performance across accounts or Regions.

The Aggregations section provides a comprehensive usage table across EC2 and Spot Instances. You can select different units to analyze data by vCPUs, instance counts, or estimated costs to understand resource consumption patterns. The table shows instance family breakdowns with total usage statistics, reserved usage hours, unreserved usage hours, and Spot usage data. Each row includes a View breakdown action for a detailed analysis.

The Capacity usage or estimated cost trends section visualizes usage trends, reserved usage, unreserved usage, and Spot usage. You can filter the displayed data and adjust the unit of measurement to view historical patterns. These filtering and analysis tools help you identify usage trends, compare costs across dimensions, and make informed decisions for capacity planning and optimization.

When you choose View breakdown from the Aggregations table, you access detailed Usage breakdown based on the dimension filters you selected. This breakdown view shows usage patterns for individual instance types within the selected family and Availability Zone combinations, helping you identify specific optimization opportunities.

The Reservations tab displays capacity reservation utilization with automated analysis capabilities that generate prioritized lists of optimization opportunities. Similar to the Usage tab, you can apply dimension filters by Account ID, Region, Instance Family, Availability Zone, and Instance Type along with additional options related to the reservation details. On each of the tabs you can drill down to see data for individual line items. For reservations specifically, you can view specific reservations and access detailed information about On-Demand Capacity Reservations (ODCRs), including utilization history, configuration parameters, and current status. When the ODCR exists in the same account as Capacity Manager, you can modify reservation parameters directly from this interface, eliminating the need to navigate to separate EC2 console sections for reservation management.

The Statistics section provides summary metrics, including total reservations count, overall utilization percentage, reserved capacity totals, used and unused capacity volumes, average scheduled reservations, and counts of accounts, instance families, and Regions with reservations.

This consolidated view helps you understand reservation distribution and utilization patterns across your infrastructure. For example, you might discover that your development accounts consistently show 30% reservation utilization while production accounts exceed 95%, indicating an opportunity to redistribute or modify reservations. Similarly, you could identify that specific instance families in certain Regions have sustained low utilization rates, suggesting candidates for reservation adjustments or workload optimization. These insights help you make data-driven decisions about reservation purchases, modifications, or cancellations to better align your reserved capacity with actual usage patterns.
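If you export capacity data for offline analysis, the same comparison can be reproduced with a short script. The record layout in this sketch (account, reserved_vcpu_hours, used_vcpu_hours) is a hypothetical shape chosen for illustration and does not describe the actual export schema.

```python
from collections import defaultdict


def utilization_by_account(records):
    """Return {account: used / reserved} aggregated over all records."""
    totals = defaultdict(lambda: [0.0, 0.0])  # account -> [used, reserved]
    for record in records:
        totals[record["account"]][0] += record["used_vcpu_hours"]
        totals[record["account"]][1] += record["reserved_vcpu_hours"]
    return {
        account: (used / reserved if reserved else 0.0)
        for account, (used, reserved) in totals.items()
    }


def underutilized_accounts(records, threshold=0.5):
    """List accounts whose reservation utilization is below the threshold."""
    utilization = utilization_by_account(records)
    return sorted(a for a, value in utilization.items() if value < threshold)


sample = [
    {"account": "dev", "reserved_vcpu_hours": 100, "used_vcpu_hours": 30},
    {"account": "prod", "reserved_vcpu_hours": 200, "used_vcpu_hours": 190},
]
print(underutilized_accounts(sample))
```

In this sample, the dev account is at 30% utilization and the prod account is at 95%, so only dev falls below the 50% threshold.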

The Spot tab focuses on Spot Instance usage and displays the amount of time your Spot instances run before being interrupted. This analysis of Spot Instance usage patterns helps you identify optimization opportunities for Spot Instance workloads. You can use Spot placement score recommendations to improve workload flexibility.

For organizations requiring data export capabilities, Capacity Manager includes data exports to Amazon Simple Storage Service (Amazon S3) buckets for capacity analysis. You can view and manage your data exports through the Data exports tab, which helps you create new exports, monitor delivery status, and configure export schedules to analyze capacity data outside the AWS Management Console.

Data exports extend your analytical capabilities by storing capacity data beyond the 90-day retention period available through the console and APIs. This extended retention enables long-term trend analysis and historical capacity planning. You can also integrate exported data with existing analytics workflows, business intelligence tools, or custom reporting systems to incorporate EC2 capacity metrics into broader infrastructure analysis and decision-making processes.

The Settings section provides configuration options for AWS Organizations integration, enabling centralized capacity management across multiple accounts. Organization administrators can enable enterprise-wide capacity visibility or delegate access to specific accounts while maintaining appropriate permissions and access controls.

Now available
EC2 Capacity Manager eliminates the operational overhead of collecting and analyzing capacity data from multiple sources. The service provides automated optimization opportunities, centralized multi-account visibility, and direct access to capacity management tools. You can reduce manual analysis time while improving capacity utilization and cost optimization across your EC2 infrastructure.

Amazon EC2 Capacity Manager is available at no additional cost. To begin using Amazon EC2 Capacity Manager, visit the Amazon EC2 console or access the service APIs. The service is available in all commercial AWS Regions.

To learn more, visit the EC2 Capacity Manager documentation.

— Esra

Migrate encrypted Amazon EC2 instances across AWS Regions without sharing AWS KMS keys

Post Syndicated from Rakesh Mannepalli original https://aws.amazon.com/blogs/compute/migrate-encrypted-amazon-ec2-instances-across-aws-regions-without-sharing-aws-kms-keys/

At AWS, we’ve designed our global infrastructure with isolated AWS Regions to help you achieve high fault tolerance and stability for your applications. These AWS Regions are organized into partitions, each with distinct network and security boundaries.

As your business evolves, you might need to migrate workloads between AWS Regions. Perhaps you’re looking to reduce latency for users in new geographic areas, meet Region-specific compliance requirements, or you’re an ISV expanding your product’s availability. Whatever your motivation, cross-Region migration needs careful planning, especially when dealing with encrypted resources.

When migrating Amazon Elastic Compute Cloud (Amazon EC2) instances with encrypted Amazon Elastic Block Store (Amazon EBS) volumes across AWS Regions within the same account or to a different account, you face a particular challenge: AWS Key Management Service (AWS KMS) keys are AWS Region-specific and cannot be shared across AWS Regions. This post provides a step-by-step approach to successfully migrate your encrypted EC2 instances without compromising your security posture by sharing your KMS keys.

Solution overview

The following diagram and steps are an overview of how an EC2 instance can be migrated to a different Region in a different account without sharing the KMS keys.

Figure 1: Design to migrate EC2 between two accounts

Prerequisites

The following prerequisites are necessary to complete this solution:

  • Create an S3 bucket in both the source and target Region.
  • Configure the target account Amazon S3 bucket with the following policy to copy the Amazon Machine Image (AMI) file between two accounts:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Statement1",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:sts::1234567891:assumed-role/<rolename>"
            },
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:PutObjectAcl"
            ],
            "Resource": "arn:aws:s3:::<target-bucket-name>/*"
        }
    ]
}

Implementation steps

Based on the preceding architecture, complete the following steps to move your EC2 instance from the source account to the target account:

  1. Create an AMI of the server that you want to move to a different Region in the same account or different account.
    1. Choose the server
    2. Choose Actions, Image and templates, and Create image.

      Figure 2: steps to create an AMI

    3. Fill in the details and choose Create image.

      Figure 3: Confirming AMI creation attributes

  2. Check the status of the AMI by choosing the AMI ID, or by choosing AMIs in the left navigation pane, and wait until the status shows as Available.

    Figure 4: AMI availability status

  3. Run the following command using AWS CloudShell from the AWS console.
    aws ec2 create-store-image-task --image-id ami-xxxxxxxxx --bucket <bucket_name>

  4. You can check the status of the job using the following command to make sure it is completed.
    aws ec2 describe-store-image-tasks

    Figure 5: CloudShell command execution

  5. Confirm that the AMI .bin file now appears in the source S3 bucket.

    Figure 6: .bin file in the source S3 bucket

  6. Copy the AMI .bin file to the target S3 bucket by running the following command from CloudShell in the source account.
    aws s3 cp s3://<source_bucket>/ami-000xxxxxxxxx.bin s3://<target_bucket>/

  7. When the copy job is completed, validate the AMI’s availability in .bin format in the target AWS account S3 bucket.

    Figure 7: .bin file in the target S3 bucket

  8. Now restore the .bin file as an AMI in the target account by running the following command in the target account CloudShell.
    aws ec2 create-restore-image-task --object-key ami-xxxx.bin --bucket <target_bucket> --name "<AMI_name>"

    Figure 8: CloudShell command execution

  9. Check the availability of the AMI under the EC2 section in the target account. You should find a new AMI ID along with the source Region information.

    Figure 9: AMI created in the target account

  10. Launch the instance using the migrated AMI in the target Region.

    Figure 10: Launched EC2 instance in the target account
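The store, copy, and restore steps above can also be scripted. The following is a minimal sketch using Boto3 client calls that correspond to the CLI commands in this walkthrough. The function name migrate_ami and all arguments are hypothetical. It assumes you pass in EC2 and S3 clients that are already configured with the right credentials and Regions, with the copy performed under source account credentials as in step 6.

```python
import time


def migrate_ami(src_ec2, dst_ec2, src_s3, dst_s3, image_id,
                src_bucket, dst_bucket, ami_name,
                poll_seconds=15, max_polls=240):
    """Store an AMI in S3, copy the object, and restore it as a new AMI."""
    # Step 3: store the AMI as a .bin object in the source bucket.
    key = src_ec2.create_store_image_task(
        ImageId=image_id, Bucket=src_bucket)["ObjectKey"]

    # Step 4: poll until the store task completes or fails.
    for _ in range(max_polls):
        tasks = src_ec2.describe_store_image_tasks(
            ImageIds=[image_id])["StoreImageTaskResults"]
        state = tasks[0]["StoreTaskState"] if tasks else "InProgress"
        if state == "Completed":
            break
        if state == "Failed":
            raise RuntimeError(
                tasks[0].get("StoreTaskFailureReason", "store task failed"))
        time.sleep(poll_seconds)
    else:
        raise TimeoutError("store task did not complete in time")

    # Step 6: managed copy, which uses multipart copy for large objects.
    dst_s3.copy({"Bucket": src_bucket, "Key": key}, dst_bucket, key,
                SourceClient=src_s3)

    # Step 8: restore the object as an AMI in the target account and Region.
    return dst_ec2.create_restore_image_task(
        Bucket=dst_bucket, ObjectKey=key, Name=ami_name)["ImageId"]
```

The function returns the ID of the restored AMI, which you can then use to launch the instance in the target Region.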

Limitations

The following limitations apply to this process:

  • To store an AMI, your AWS account must either own the AMI and its snapshots, or the AMI and its snapshots must be shared directly with your account. You can’t store an AMI if it is only publicly shared.
  • Only Amazon EBS-backed AMIs can be stored using these APIs.
  • Paravirtual (PV) AMIs are not supported.
  • The size of an AMI (before compression) that can be stored is limited to 5,000 GB.
  • Quota on store image requests: 1,200 GB of storage work (snapshot data) in progress.
  • Quota on restore image requests: 600 GB of restore work (snapshot data) in progress.
  • For the duration of the store task, the snapshots must not be deleted and the AWS Identity and Access Management (IAM) principal doing the store must have access to the snapshots, otherwise the store process fails.
  • You can’t create multiple copies of an AMI in the same S3 bucket.
  • An AMI that is stored in an S3 bucket can’t be restored with its original AMI ID. You can mitigate this by using AMI aliasing.
  • Currently the store and restore APIs are only supported by using the AWS Command Line Interface (AWS CLI), AWS SDKs, and Amazon EC2 API. You can’t store and restore an AMI using the Amazon EC2 console.

Clean up resources

When you have successfully deployed the server in the target Region, you can delete the S3 buckets that were created for this migration. To avoid additional cost, you can also terminate the EC2 instances and delete the associated EBS volumes and snapshots if you no longer need them.

Conclusion

In this post, we showed you how to securely migrate an Amazon EC2 instance to another Region in a different account without sharing any AWS KMS keys.

Securing applications with AWS Nitro Enclaves: TLS termination, TAP networking, and IMDSv2

Post Syndicated from David-Paul Dornseifer original https://aws.amazon.com/blogs/compute/securing-applications-with-aws-nitro-enclaves-tls-termination-tap-networking-and-imdsv2/

AWS Nitro Enclaves provide isolated environments that keep critical operations such as decryption and cryptographic key management secure from both root users and external threats.

Many customers have applications that require end-to-end authentication using Transport Layer Security (TLS) and control over TLS termination.

TLS termination refers to the process where encrypted TLS traffic is decrypted using the server’s private key, converting the secure encrypted communication back to plaintext for processing. TLS termination can be done directly within an enclave, helping to ensure that encrypted traffic is not exposed outside the trusted boundary.

This is particularly valuable for public-facing services such as anonymization proxies and Model Context Protocol (MCP) servers, where clients demand assurance that their communications are protected and that the application’s integrity can be independently and remotely verified using cryptographic attestation.

This post covers critical design and implementation decisions from the Build multi-party crypto wallets with AWS Nitro Enclaves workshop and the associated public GitHub repository.

Specifically, in this post, we explore patterns for how:

  1. you can build applications that are remotely verifiable by clients, including enclave-based TLS termination using Nitriding, an open-source framework built by Brave and AWS Nitro Enclaves.
  2. you can configure TAP networking devices for AWS Nitro Enclaves using gvproxy.
  3. your enclaves can access EC2 instance metadata service (IMDSv2) and fetch temporary AWS credentials.
  4. you can decrypt secrets via AWS Key Management Service (KMS) using cryptographic attestation and the Python Boto3 SDK.

Prerequisites and Deployment

This post builds on our workshop “Build multi-party crypto wallets with AWS Nitro Enclaves”, which demonstrates a Shamir Secret Sharing (SSS) application. The SSS app securely splits cryptographic private keys into multiple shards and requires a threshold number of shards to reconstruct the original key. This makes it a good fit for Nitro Enclaves, because no single party can access the complete key while operational functionality is maintained.
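To illustrate the threshold idea, the following is a minimal, self-contained sketch of Shamir Secret Sharing over a prime field. It is a textbook construction for illustration only, not the workshop's implementation, and the function names, the choice of prime, and the integer encoding of the secret are all assumptions.

```python
import secrets

# A Mersenne prime used as the field modulus; secrets must be smaller than it.
PRIME = 2**127 - 1


def split_secret(secret, threshold, num_shares):
    """Split an integer secret into shares; any `threshold` shares recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be in [0, PRIME)")
    if not 1 <= threshold <= num_shares:
        raise ValueError("threshold must be between 1 and num_shares")
    # Random polynomial of degree threshold - 1 with the secret as constant term.
    coefficients = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for coefficient in reversed(coefficients):  # Horner's rule
            y = (y * x + coefficient) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        numerator, denominator = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                numerator = (numerator * -xj) % PRIME
                denominator = (denominator * (xi - xj)) % PRIME
        # pow(..., -1, PRIME) computes the modular inverse (Python 3.8+).
        secret = (secret + yi * numerator * pow(denominator, -1, PRIME)) % PRIME
    return secret
```

For example, splitting a key with a threshold of 3 out of 5 shares means any three shards reconstruct the key, while fewer than three reveal nothing about it.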

To follow along hands-on, you’ll need to deploy the provided AWS Cloud Development Kit (CDK) stack from the workshop repository on GitHub. However, you can understand the concepts and architecture discussed in this post without deploying the solution yourself.

Solution architecture

The following diagram depicts the high-level architecture of the solution.

Comprehensive AWS architecture showing VPC networking, container deployment, security services, and managed database integration

Before we dive deep into the application design, let’s introduce the high-level components included in the AWS Cloud Development Kit (AWS CDK) stack:

  • A dedicated virtual private cloud (VPC) and private subnets are created. Internet access is only possible through a NAT gateway, avoiding public exposure of the Amazon Elastic Compute Cloud (EC2) instances.
  • EC2 instances are placed in several private subnets and in different Availability Zones (AZ) using the auto-scaling group (ASG) to provide high availability. Network Load Balancer (NLB) is used to distribute the requests between different EC2 instances in the ASG. Each EC2 instance has one AWS Nitro enclave associated.
  • AWS Key Management Service (AWS KMS) manages the symmetric key required for secure private key management using AWS Nitro Enclaves.
  • Amazon DynamoDB is used to store the key shards for the Shamir Secret Sharing (SSS) solution.

Application design

During the AWS CDK deployment process (shown in the following figure), the following application will be built and deployed to the EC2 instance and the associated enclave. You can review the Python source code for the different components in the public GitHub repository.

Detailed AWS Nitro Enclave security architecture illustrating attestation process, TLS certification, and DynamoDB integration

EC2 instance (left side)

  • gvproxy: Proxy component that manages outbound and inbound TCP to vsock connections.
  • watchdog: Systemd service that starts the enclave and makes sure it stays up and healthy.
  • imds proxy: Systemd service that forwards Instance Metadata calls originating from vsock to 169.254.169.254. This allows the enclave to request fresh IMDSv2 credentials.

Enclave (right side)

  • TAP interface: gvproxy counterpart. A fully routed network interface created by nitriding-daemon that allows inbound and outbound traffic routing in the enclave.
  • imds proxy: IMDS proxy counterpart that allows the enclave to request credentials from its parent instance metadata service.
  • nitriding-daemon: HTTPS service that terminates incoming HTTPS connections, responds to attestation requests, and forwards all /app* HTTP requests to the sss app HTTP listener.
  • SSS application: An SSS application that interacts with all AWS services such as AWS KMS or DynamoDB through Boto3 and provides key management and signing capabilities.
  • Nitro Secure Module: Enclave internal /dev/nsm device that provides attestation and random number generator capabilities. Attestation private/public keys are managed by AWS.

Enclave based TLS termination and Remote Validation

Let’s now see how we can achieve TLS termination inside the enclave and allow remote clients to verify the enclave’s code.

To do so, we are using Nitriding, a Go toolkit that simplifies running web applications inside AWS Nitro Enclaves without requiring networking stack changes. It uses gvproxy to create a tap0 interface, enabling controlled inbound and outbound traffic for the application inside the enclave.

Let’s have a look at the most important features nitriding offers.

TLS Termination: Nitriding generates an ephemeral private/public key pair on first launch, issuing a self-signed certificate for TLS. Furthermore, it supports Let’s Encrypt certificates for production use.

Application integration: Nitriding terminates TLS and forwards all /app* HTTP requests to the HTTP listener of the configured application. In the workshop these requests are forwarded to the SSS application.

Attestation endpoint: By default, nitriding exposes an /attestation endpoint that accepts a nonce value and returns a signed cryptographic attestation document.

This cryptographic attestation document includes hash measurements, also referred to as platform configuration registers (PCR), such as the hash of the enclave images (PCR0) or details about the parent EC2 instance (PCR4). For details on these measurements, refer to Where to get an enclave’s measurements.

The attestation document supports optional, customizable fields, namely nonce, public_key, and user_data, which can be set individually for every attestation document. For more information on the Nitro Enclaves attestation process and document structure, refer to Nitro Enclaves Attestation Process or check out the workshop sections about Customizing Attestation or document Validation.

Nitriding adds the nonce to the attestation document as a measure of freshness. Furthermore, the fingerprint (hash) of the TLS certificate used by the enclave is added to the user_data field, as shown in the following sequence diagram.

This binds the certificate to the specific enclave instance.

Detailed AWS Nitro Enclave attestation sequence showing vsock communication and system calls

By comparing the TLS certificate fingerprint presented during the HTTPS connection with the fingerprint in the attestation document, you can prove the following aspects:

  • The private key for TLS termination resides securely inside the enclave (in a trusted AWS environment).
  • The enclave is running trusted code, as verified by the attestation’s PCR (Platform Configuration Register) measurements.
  • The identity of the enclave is validated, whether the code is open source (allowing deterministic measurement through reproducible builds) or closed source (with measurements distributed by the provider). For more information on deterministic and reproducible builds, refer to Establishing verifiable security: Reproducible builds and AWS Nitro Enclaves.
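The client-side comparison can be sketched as follows. This is a hedged illustration, not nitriding's client code: it assumes the attestation document has already been decoded and its signature chain validated, that the decoded fields are available as a dictionary with the keys nonce, user_data, and pcrs, and that user_data carries the raw SHA-256 digest of the DER-encoded certificate. The exact encoding nitriding uses may differ, and the function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def new_nonce(length=20):
    """Random nonce the client sends to the attestation endpoint (length is an assumption)."""
    return secrets.token_bytes(length)


def certificate_fingerprint(cert_der):
    """SHA-256 digest of the DER-encoded certificate seen in the TLS handshake."""
    return hashlib.sha256(cert_der).digest()


def attestation_matches(cert_der, sent_nonce, attested, expected_pcr0):
    """Check freshness, certificate binding, and enclave image measurement.

    `attested` is a hypothetical dict of already-verified attestation fields.
    """
    checks = (
        # Freshness: the document must echo the nonce we just sent.
        hmac.compare_digest(attested["nonce"], sent_nonce),
        # Binding: the attested fingerprint must match the presented certificate.
        hmac.compare_digest(attested["user_data"], certificate_fingerprint(cert_der)),
        # Identity: PCR0 must equal the measurement of the image we trust.
        hmac.compare_digest(attested["pcrs"][0], expected_pcr0),
    )
    return all(checks)
```

Constant-time comparison via hmac.compare_digest is used out of habit for comparing digests; all three checks must pass before the client treats the connection as terminating inside the expected enclave.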

Horizontal scaling

Let’s now look at the scaling properties of an AWS Nitro Enclaves-based nitriding application and learn how to increase the processing capacity of our application by scaling out horizontally.

The provided CDK stack, by default, provisions a single EC2 instance with its associated enclave. As depicted in the preceding sequence diagram, nitriding generates a self-signed certificate at startup and uses it to terminate TLS connections. This approach is limited to a single worker because load balancing requests over several workers would introduce non-identical TLS certificates. Non-identical TLS certificates behind an NLB can cause certificate mismatch errors and TLS handshake failures when clients are routed to different backend servers whose certificates don’t match the expected domain name or have different validation properties.

There are different ways you can address this issue besides implementing your own cryptographic attestation-based method:

  • Create a symmetric KMS key and associate it with your enclaves using AWS KMS condition keys for AWS Nitro Enclaves. Use AWS Certificate Manager (ACM) to create an exportable TLS certificate. Alternatively, generate a custom TLS certificate in a trusted environment. Encrypt all sensitive key material via AWS KMS and store the ciphertext in a database such as DynamoDB. Provide the encrypted TLS certificate to each enclave that requires access and use cryptographic attestation to decrypt the TLS certificate or key.
  • Nitriding provides an enclave key synchronization mechanism based on AWS Nitro Enclaves cryptographic attestation. This mechanism supports Let’s Encrypt certificates out of the box so organizations can avoid all the operational and security challenges associated with self-signed certificates, particularly in context of web browsers.
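For the first option, the association between the KMS key and the enclave is expressed in the key policy. The following sketch builds one such policy statement using the kms:RecipientAttestation:ImageSha384 condition key. The role ARN and image hash in the usage example are placeholders, and the helper name is hypothetical; this is an illustration rather than the policy used in the workshop.

```python
import json


def enclave_decrypt_statement(role_arn, image_sha384):
    """Key policy statement that allows kms:Decrypt only for an attested enclave image."""
    return {
        "Sid": "AllowDecryptFromAttestedEnclave",
        "Effect": "Allow",
        "Principal": {"AWS": role_arn},
        "Action": "kms:Decrypt",
        "Resource": "*",
        "Condition": {
            "StringEqualsIgnoreCase": {
                # Must equal the enclave image measurement (PCR0) from the build output.
                "kms:RecipientAttestation:ImageSha384": image_sha384
            }
        },
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    statement = enclave_decrypt_statement(
        "arn:aws:iam::111122223333:role/example-enclave-instance-role",
        "0" * 96,
    )
    print(json.dumps(statement, indent=2))
```

With a statement like this in place, a decrypt request succeeds only when it carries an attestation document whose image measurement matches, so every enclave built from the same image can decrypt the shared TLS key material.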

Virtual Networking for Enclaves with Tap Interface

Now let’s deep dive into how nitriding provides tap0-based networking to the enclave and learn how we can use tap0 networking without nitriding.

As mentioned previously, nitriding uses the gvisor-tap-vsock package to provide tap0-based networking to the enclave.

gvisor-tap-vsock delivers a user-mode network stack for virtual machines (VMs) and containers, enabling secure, flexible connectivity between AWS Nitro Enclaves and external networks.

You can use gvisor-tap-vsock independently from nitriding if you only require tap0 networking without TLS termination and HTTP forwarding capabilities. The setup remains the same as in the workshop; however, instead of the nitriding binary, you need to include the gvforwarder binary in the enclave Dockerfile. The build instructions can be found in the Makefile.

After adding the binary to your enclave image, use a command similar to the following in your enclave start.sh file to activate DNS resolution and start gvforwarder:

echo "nameserver 192.168.127.1" > /run/resolvconf/resolv.conf
./app/gvforwarder -url vsock://3:1024/connect &

After you have started your enclave with gvforwarder you can manage port forwarding using the gvproxy process running on EC2 parent instances as done in the workshop.

IMDSv2 access from inside Enclaves

This section explores why applications need access to the EC2 Instance Metadata Service Version 2 (IMDSv2) from inside an enclave and discusses different ways to provide that access.

Applications inside AWS Nitro Enclaves often need access to IMDSv2 to obtain temporary AWS credentials to interact with AWS services such as AWS KMS for decrypt operations. IMDSv2 is only accessible from within the associated EC2 instance and can be accessed at 169.254.169.254. You can enable IMDSv2 access for enclaves using one of the following two approaches:

Dedicated vsock proxy route (as done in the workshop)

Run a vsock proxy on the EC2 parent instance and one inside the enclave to provide access to IMDSv2 from inside the enclave. Apply the following configuration to your enclave to map 169.254.169.254 from inside the enclave to the endpoint on the parent instance:

ip addr add 169.254.169.254/16 dev lo
IN_ADDRS=169.254.169.254:80 OUT_ADDRS=3:8002 ./app/proxy &

This method is suitable if you do not need a tap interface in the enclave and want to tightly control outbound communication.

TAP interface with gvisor-tap-vsock

If your enclave uses a tap interface via gvisor, pass the -ec2-instance-metadata flag in the gvisor start command on the parent EC2 instance. This allows the host process to forward IMDSv2 traffic from the enclave (via tap0) to the metadata service. Ensure you are using gvisor-tap-vsock version v0.8.7 or newer for this feature.

Any of the EC2 parent instance or enclave related changes described in this section can be applied to an existing workshop CDK stack by rerunning the cdk deploy command as described here: Deploy the CDK application.
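Once either route is in place, code inside the enclave talks to IMDSv2 the same way it would on the parent instance: a PUT request obtains a session token, and subsequent GET requests present that token. The following sketch uses only the Python standard library; the helper names are hypothetical, and the role name passed in is whatever IAM role is attached to your instance.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254"


def token_request(ttl_seconds=21600):
    """Build the IMDSv2 session token request (PUT with a TTL header)."""
    return urllib.request.Request(
        f"{IMDS_BASE}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def role_credentials_request(token, role_name):
    """Build the request for the temporary credentials of the attached IAM role."""
    return urllib.request.Request(
        f"{IMDS_BASE}/latest/meta-data/iam/security-credentials/{role_name}",
        headers={"X-aws-ec2-metadata-token": token},
    )


def fetch(request, timeout=2.0):
    """Send a request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Usage inside the enclave (requires one of the IMDS routes described above):
#   token = fetch(token_request())
#   credentials_json = fetch(role_credentials_request(token, "my-instance-role"))
```

Building the requests separately from sending them keeps the sketch easy to inspect; in practice, the AWS SDK performs the same exchange automatically when it resolves credentials.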

Encrypting and decrypting secrets inside AWS Nitro Enclaves using Python and Cryptographic Attestation

In this section we will go in depth on how KMS based decryption can be implemented inside enclaves in Python using AWS SDK for Python (Boto3).

Decryption leverages the enclave’s cryptographic attestation feature, which is not directly available on standard EC2 instances, and enhances security by verifying the enclave’s integrity. Encryption inside an enclave using the Boto3 SDK, however, mirrors the process outside the enclave, so it’s not detailed here.

High-Level Decryption Flow

The process for decrypting content inside a Nitro Enclave follows these streamlined steps:

  1. Ensure that the enclave has outbound networking configured.
  2. Generate an ephemeral RSA key pair.
  3. Request an attestation document that includes the public key.
  4. Create a KMS decrypt request with the ciphertext and attached attestation document.
  5. Receive and parse the resulting ciphertext_for_recipient in Cryptographic Message Syntax (CMS) format.

This flow enables secure decryption in Python, aligning with workshop examples for practical implementation.

Make sure that the tap0 network interface is up and running and DNS has been configured

The Python code example discussed uses Boto3 SDK. Boto3 requires a fully routed network interface such as tap0 as described previously and access to AWS credentials. The credentials can be managed manually as done in the workshop or managed automatically by the SDK. See the previous section about managing AWS credentials.
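If you manage credentials manually, the JSON document returned by the instance metadata service has to be mapped to the keyword arguments that Boto3 expects. The following is a minimal sketch of that mapping; the helper name and the sample values are hypothetical, and the Region is an assumption you would replace with your own.

```python
import json


def client_kwargs_from_imds(credentials_json, region_name):
    """Map an IMDS security-credentials document to boto3.client keyword arguments."""
    doc = json.loads(credentials_json)
    return {
        "region_name": region_name,
        "aws_access_key_id": doc["AccessKeyId"],
        "aws_secret_access_key": doc["SecretAccessKey"],
        "aws_session_token": doc["Token"],
    }


# Usage (assumes boto3 is installed in the enclave image):
#   import boto3
#   kms_client = boto3.client("kms", **client_kwargs_from_imds(body, "us-east-1"))
```

Because these credentials are temporary, a long-running enclave application needs to repeat this step before the Expiration timestamp in the document is reached.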

Generate an ephemeral RSA key pair inside the enclave

Generate a fresh RSA private/public key pair for each session. This key is only used for the re-encryption scheme and does not need to be persisted.

from cryptography.hazmat.primitives.asymmetric import rsa
private_key = rsa.generate_private_key(
    public_exponent=65537,
    key_size=2048,
)
public_key = private_key.public_key() 

Request an attestation document that includes the public key

Use the Nitro Secure Module (NSM) to generate an attestation document that cryptographically proves enclave identity and includes the ephemeral public key.

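import base64
import aws_nsm_interface_verifiably
from cryptography.hazmat.primitives import serialization
# DER-encode the ephemeral public key (SubjectPublicKeyInfo format assumed here)
public_key_raw = public_key.public_bytes(
    encoding=serialization.Encoding.DER,
    format=serialization.PublicFormat.SubjectPublicKeyInfo,
)
file_desc = aws_nsm_interface_verifiably.open_nsm_device()
attestation_doc = aws_nsm_interface_verifiably.get_attestation_doc(
    file_desc, public_key=public_key_raw)["document"]
attestation_doc_b64 = base64.b64encode(attestation_doc).decode("utf-8")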

AWS Nitro Enclaves SDK for C can be used along with Python to interact with the NSM device as done in the Validate a Nitro Enclave Attestation Document sample code repository.

Create an AWS KMS decrypt request including the ciphertext and attestation document

Send the attestation document as part of the Recipient parameter in the AWS KMS decrypt API call. AWS KMS will verify the attestation and encrypt the response for your enclave’s public key.

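response = kms_client.decrypt(
    KeyId=ssm_params["KMSKeyID"],
    CiphertextBlob=base64.standard_b64decode(ciphertext_blob_b64),
    Recipient={
        "KeyEncryptionAlgorithm": "RSAES_OAEP_SHA_256",
        "AttestationDocument": base64.standard_b64decode(attestation_doc_b64),
    },
)
# With Recipient set, the result is returned in CiphertextForRecipient
ciphertext_for_recipient = response["CiphertextForRecipient"]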

Receive and parse the ciphertext_for_recipient CMS document

AWS KMS returns a Cryptographic Message Syntax (CMS) structure containing the encrypted symmetric key and ciphertext. To decrypt, use the following steps:

  1. Load the private key from Step 2

from cryptography.hazmat.primitives import serialization
# Read the DER-encoded ephemeral private key (private_key_file is its path)
with open(private_key_file, "rb") as f:
    private_key_raw = f.read()
private_key = serialization.load_der_private_key(private_key_raw,
                                   password=None)

  2. Parse the CMS structure

Use a library such as asn1crypto to extract the encrypted key, initialization vector (IV), and encrypted content.

from asn1crypto import cms
# ciphertext_for_recipient is the CiphertextForRecipient field of the KMS response
content_info = cms.ContentInfo.load(ciphertext_for_recipient)
enveloped_data = content_info["content"]
recipient_infos = enveloped_data["recipient_infos"][0].chosen
encrypted_key = recipient_infos["encrypted_key"].native
encrypted_content_info = enveloped_data["encrypted_content_info"]
content_encryption_algorithm = encrypted_content_info["content_encryption_algorithm"]
iv = content_encryption_algorithm["parameters"].native
encrypted_content = encrypted_content_info["encrypted_content"].native

  3. Decrypt the symmetric key

CMS uses private/public key cryptography to encrypt a symmetric key that is used for the payload. Use the enclave’s RSA private key to decrypt the symmetric key with OAEP padding.

from cryptography.hazmat.primitives.asymmetric import padding
from cryptography.hazmat.primitives import hashes
decrypted_sym_key = private_key.decrypt(
    encrypted_key,
    padding.OAEP(
        mgf=padding.MGF1(algorithm=hashes.SHA256()),
        algorithm=hashes.SHA256(),
        label=None,
    ),
)

  4. Decrypt the content with Advanced Encryption Standard (AES)

Use the decrypted symmetric key and IV to decrypt the content (typically using AES-CBC), then remove the PKCS7 padding.

from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives import padding as sym_padding
cipher = Cipher(
    algorithms.AES(decrypted_sym_key), modes.CBC(iv), backend=default_backend()
)
decryptor = cipher.decryptor()
decrypted_padded = decryptor.update(encrypted_content) + decryptor.finalize()
# AES has a 128-bit block size, so the PKCS7 unpadder uses 128
unpadder = sym_padding.PKCS7(128).unpadder()
decrypted_content = unpadder.update(decrypted_padded) + unpadder.finalize()

  5. Encode the content for transport

Encode the decrypted content as base64 for safe transport or further processing.

import base64
result = base64.b64encode(decrypted_content).decode("utf-8")

Cleanup

To avoid incurring future charges, delete the resources following the steps described in the workshop Cleanup section.

Conclusion

In this post, you learned how to use AWS Nitro Enclaves to build secure, public-facing applications using TLS termination, cryptographic attestation, and TAP networking. The implementation includes practical examples using gvisor-tap-vsock TAP networking, secure IMDSv2 access patterns, and Python-based CMS decryption.

Ready to enhance your application security? Visit our GitHub repository and workshop to start building with AWS Nitro Enclaves today.

New general-purpose Amazon EC2 M8a instances are now available

Post Syndicated from Betty Zheng (郑予彬) original https://aws.amazon.com/blogs/aws/new-general-purpose-amazon-ec2-m8a-instances-are-now-available/

Today, we’re announcing the availability of Amazon Elastic Compute Cloud (Amazon EC2) M8a instances, the latest addition to the general-purpose M instance family. These instances are powered by the 5th Generation AMD EPYC (codename Turin) processors with a maximum frequency of 4.5GHz. Customers can expect up to 30% higher performance and up to 19% better price performance compared to M7a instances. They also provide higher memory bandwidth, improved networking and storage throughput, and flexible configuration options for a broad set of general-purpose workloads.

Improvements in M8a
M8a instances deliver up to 30% better performance per vCPU compared to M7a instances, making them ideal for applications that benefit from high performance and high throughput such as financial applications, gaming, rendering, application servers, simulation modeling, midsize data stores, application development environments, and caching fleets.

They provide 45% more memory bandwidth compared to M7a instances, accelerating in-memory databases, distributed caches, and real-time analytics.

For workloads with high I/O requirements, M8a instances provide up to 75 Gbps of networking bandwidth and 60 Gbps of Amazon Elastic Block Store (Amazon EBS) bandwidth, a 50% improvement over the previous generation. These enhancements support modern applications that rely on rapid data transfer and low-latency network communication.

Each vCPU on an M8a instance corresponds to a physical CPU core, meaning there is no simultaneous multithreading (SMT). In application benchmarks, M8a instances delivered up to 60% faster performance for GroovyJVM and up to 39% faster performance for Cassandra compared to M7a instances.

M8a instances support instance bandwidth configuration (IBC), which lets customers allocate resources between networking and EBS bandwidth. With IBC, customers can scale network or EBS bandwidth by up to 25% to improve database performance, query processing, and logging speeds.

M8a is available in ten virtualized sizes and two bare metal options (metal-24xl and metal-48xl), providing deployment choices that scale from small applications to large enterprise workloads. All of these improvements are built on the AWS Nitro System, which delivers low virtualization overhead, consistent performance, and advanced security across all instance sizes. These instances are built using the latest sixth generation AWS Nitro Cards, which offload and accelerate I/O for functions, increasing overall system performance.

M8a instances feature sizes of up to 192 vCPUs with 768 GiB of RAM. Here are the detailed specs:

M8a instance size | vCPUs | Memory (GiB) | Network bandwidth (Gbps) | EBS bandwidth (Gbps)
medium | 1 | 4 | Up to 12.5 | Up to 10
large | 2 | 8 | Up to 12.5 | Up to 10
xlarge | 4 | 16 | Up to 12.5 | Up to 10
2xlarge | 8 | 32 | Up to 15 | Up to 10
4xlarge | 16 | 64 | Up to 15 | Up to 10
8xlarge | 32 | 128 | 15 | 10
12xlarge | 48 | 192 | 22.5 | 15
16xlarge | 64 | 256 | 30 | 20
24xlarge | 96 | 384 | 40 | 30
48xlarge | 192 | 768 | 75 | 60
metal-24xl | 96 | 384 | 40 | 30
metal-48xl | 192 | 768 | 75 | 60

For a complete list of instance sizes and specifications, refer to the Amazon EC2 M8a instances page.

When to use M8a instances
M8a is a strong fit for general-purpose applications that need a balance of compute, memory, and networking. M8a instances are ideal for web and application hosting, microservices architectures, and databases where predictable performance and efficient scaling are important.

These instances are SAP certified and also well suited for enterprise workloads such as financial applications and enterprise resource planning (ERP) systems. They’re equally effective for in-memory caching and customer relationship management (CRM), in addition to development and test environments that require cost efficiency and flexibility. With this versatility, M8a supports a wide spectrum of workloads while helping customers improve price performance.

Now available
Amazon EC2 M8a instances are available today in the US East (Ohio), US West (Oregon), and Europe (Spain) AWS Regions. M8a instances can be purchased as On-Demand, Savings Plans, and Spot Instances. M8a instances are also available on Dedicated Hosts. To learn more, visit the Amazon EC2 Pricing page.

To learn more, visit the Amazon EC2 M8a instances page and send feedback to AWS re:Post for EC2 or through your usual AWS support contacts.

Betty

Introducing new compute-optimized Amazon EC2 C8i and C8i-flex instances

Post Syndicated from Channy Yun (윤석찬) original https://aws.amazon.com/blogs/aws/introducing-new-compute-optimized-amazon-ec2-c8i-and-c8i-flex-instances/

After launching Amazon Elastic Compute Cloud (Amazon EC2) memory-optimized R8i and R8i-flex instances and general-purpose M8i and M8i-flex instances, I am happy to announce the general availability of compute-optimized C8i and C8i-flex instances. They are powered by custom Intel Xeon 6 processors, available only on AWS, with a sustained all-core turbo frequency of 3.9 GHz, and feature a 2:1 ratio of memory to vCPU. These instances deliver the highest performance and fastest memory bandwidth among comparable Intel processors in the cloud.

The C8i and C8i-flex instances offer up to 15 percent better price-performance, and 2.5 times more memory bandwidth compared to C7i and C7i-flex instances. The C8i and C8i-flex instances are up to 60 percent faster for NGINX web applications, up to 40 percent faster for AI deep learning recommendation models, and 35 percent faster for Memcached stores compared to C7i and C7i-flex instances.

C8i and C8i-flex instances are ideal for running compute-intensive workloads, such as web servers, caching, Apache Kafka, Elasticsearch, batch processing, distributed analytics, high performance computing (HPC), ad serving, highly scalable multiplayer gaming, and video encoding.

Like other 8th generation instances, these instances use the new sixth generation AWS Nitro Cards, delivering up to two times more network and Amazon Elastic Block Store (Amazon EBS) bandwidth compared to the previous generation instances. They also support bandwidth configuration with 25 percent allocation adjustments between network and Amazon EBS bandwidth, enabling better database performance, query processing, and logging speeds.

C8i instances
C8i instances provide up to 384 vCPUs and 768 GiB of memory, including bare metal instances that provide dedicated access to the underlying physical hardware. These instances help you run compute-intensive workloads, such as CPU-based inference and video streaming, that need the largest instance sizes or continuously high CPU usage.

Here are the specs for C8i instances:

Instance size | vCPUs | Memory (GiB) | Network bandwidth (Gbps) | EBS bandwidth (Gbps)
c8i.large | 2 | 4 | Up to 12.5 | Up to 10
c8i.xlarge | 4 | 8 | Up to 12.5 | Up to 10
c8i.2xlarge | 8 | 16 | Up to 15 | Up to 10
c8i.4xlarge | 16 | 32 | Up to 15 | Up to 10
c8i.8xlarge | 32 | 64 | 15 | 10
c8i.12xlarge | 48 | 96 | 22.5 | 15
c8i.16xlarge | 64 | 128 | 30 | 20
c8i.24xlarge | 96 | 192 | 40 | 30
c8i.32xlarge | 128 | 256 | 50 | 40
c8i.48xlarge | 192 | 384 | 75 | 60
c8i.96xlarge | 384 | 768 | 100 | 80
c8i.metal-48xl | 192 | 384 | 75 | 60
c8i.metal-96xl | 384 | 768 | 100 | 80

C8i-flex instances
C8i-flex instances are a lower-cost variant of the C8i instances, with 5 percent better price performance at 5 percent lower prices. These instances are designed for workloads that benefit from the latest generation performance but don’t fully utilize all compute resources. These instances can reach up to the full CPU performance 95 percent of the time.

Here are the specs for the C8i-flex instances:

Instance size | vCPUs | Memory (GiB) | Network bandwidth (Gbps) | EBS bandwidth (Gbps)
c8i-flex.large | 2 | 4 | Up to 12.5 | Up to 10
c8i-flex.xlarge | 4 | 8 | Up to 12.5 | Up to 10
c8i-flex.2xlarge | 8 | 16 | Up to 15 | Up to 10
c8i-flex.4xlarge | 16 | 32 | Up to 15 | Up to 10
c8i-flex.8xlarge | 32 | 64 | Up to 15 | Up to 10
c8i-flex.12xlarge | 48 | 96 | Up to 22.5 | Up to 15
c8i-flex.16xlarge | 64 | 128 | Up to 30 | Up to 20
If you’re currently using earlier generations of compute-optimized instances, you can adopt C8i-flex instances without having to make changes to your application or your workload.

Now available
Amazon EC2 C8i and C8i-flex instances are available today in the US East (N. Virginia), US East (Ohio), US West (Oregon), and Europe (Spain) AWS Regions. C8i and C8i-flex instances can be purchased as On-Demand, Savings Plans, and Spot Instances. C8i instances are also available in Dedicated Instances and Dedicated Hosts. To learn more, visit the Amazon EC2 Pricing page.

Give C8i and C8i-flex instances a try in the Amazon EC2 console. To learn more, visit the Amazon EC2 C8i instances page and send feedback to AWS re:Post for EC2 or through your usual AWS Support contacts.

Channy

Announcing Amazon ECS Managed Instances for containerized applications

Post Syndicated from Micah Walter original https://aws.amazon.com/blogs/aws/announcing-amazon-ecs-managed-instances-for-containerized-applications/

Today, we’re announcing Amazon ECS Managed Instances, a new compute option for Amazon Elastic Container Service (Amazon ECS) that enables developers to use the full range of Amazon Elastic Compute Cloud (Amazon EC2) capabilities while offloading infrastructure management responsibilities to Amazon Web Services (AWS). This new offering combines the operational simplicity of offloading infrastructure with the flexibility and control of Amazon EC2, which means customers can focus on building applications that drive innovation, while reducing total cost of ownership (TCO) and maintaining AWS best practices.

Customers running containerized workloads told us they want to combine the simplicity of serverless with the flexibility of self-managed EC2 instances. Although serverless options provide an excellent general-purpose solution, some applications require specific compute capabilities, such as GPU acceleration, particular CPU architectures, or enhanced networking performance. Additionally, customers with existing Amazon EC2 capacity investments through EC2 pricing options couldn’t fully use these commitments with serverless offerings.

Amazon ECS Managed Instances provides a fully managed container compute environment that supports a broad range of EC2 instance types and deep integration with AWS services. By default, it automatically selects the most cost-optimized EC2 instances for your workloads, but you can specify particular instance attributes or types when needed. AWS handles all aspects of infrastructure management, including provisioning, scaling, security patching, and cost optimization, enabling you to concentrate on building and running your applications.

Let’s try it out

Looking at the AWS Management Console experience for creating a new Amazon ECS cluster, I can see the new option for using ECS Managed Instances. Let’s take a quick tour of all the new options.

Creating an ECS cluster with Managed Instances

After I’ve selected Fargate and Managed Instances, I’m presented with two options. If I select Use ECS default, Amazon ECS will choose general purpose instance types based on grouping together pending Tasks, and picking the optimum instance type based on cost and resilience metrics. This is the most straightforward and recommended way to get started. Selecting Use custom – advanced opens up additional configuration parameters, where I can fine-tune the attributes of instances Amazon ECS will use.

Creating an ECS cluster with Managed Instances

By default, I see CPU and Memory as attributes, but I can select from 20 additional attributes to continue to filter the list of available instance types Amazon ECS can access.

Creating an ECS cluster with Managed Instances

After I’ve made my attribute selections, I see a list of all the instance types that match my choices.

Creating an ECS cluster with Managed Instances

From here, I can create my ECS cluster as usual and Amazon ECS will provision instances on my behalf based on the attributes and criteria I’ve defined in the previous steps.

Key features of Amazon ECS Managed Instances

With Amazon ECS Managed Instances, AWS takes full responsibility for infrastructure management, handling all aspects of instance provisioning, scaling, and maintenance. This includes implementing regular security patches initiated every 14 days (due to instance connection draining, the actual lifetime of the instance may be longer), with the ability to schedule maintenance windows using Amazon EC2 event windows to minimize disruption to your applications.

The service provides exceptional flexibility in instance type selection. Although it automatically selects cost-optimized instance types by default, you maintain the power to specify desired instance attributes when your workloads require specific capabilities. This includes options for GPU acceleration, CPU architecture, and network performance requirements, giving you precise control over your compute environment.

To help optimize costs, Amazon ECS Managed Instances intelligently manages resource utilization by automatically placing multiple tasks on larger instances when appropriate. The service continually monitors and optimizes task placement, consolidating workloads onto fewer instances so that idle (empty) instances can be drained and terminated, providing both high availability and cost efficiency for your containerized applications.

Integration with existing AWS services is seamless, particularly with Amazon EC2 features such as EC2 pricing options. This deep integration means that you can maximize existing capacity investments while maintaining the operational simplicity of a fully managed service.

Security remains a top priority with Amazon ECS Managed Instances. The service runs on Bottlerocket, a purpose-built container operating system, and maintains your security posture through automated security patches and updates. You can see all the updates and patches applied to the Bottlerocket OS image on the Bottlerocket website. This comprehensive approach to security keeps your containerized applications running in a secure, maintained environment.

Available now

Amazon ECS Managed Instances is available today in the US East (N. Virginia), US West (Oregon), Europe (Ireland), Africa (Cape Town), Asia Pacific (Singapore), and Asia Pacific (Tokyo) AWS Regions. You can start using Managed Instances through the AWS Management Console, AWS Command Line Interface (AWS CLI), or infrastructure as code (IaC) tools such as AWS Cloud Development Kit (AWS CDK) and AWS CloudFormation. You pay for the EC2 instances you use plus a management fee for the service.

To learn more about Amazon ECS Managed Instances, visit the documentation and get started simplifying your container infrastructure today.

Automate and orchestrate Amazon EMR jobs using AWS Step Functions and Amazon EventBridge

Post Syndicated from Senthil Kamala Rathinam original https://aws.amazon.com/blogs/big-data/automate-and-orchestrate-amazon-emr-jobs-using-aws-step-functions-and-amazon-eventbridge/

Many enterprises are adopting Apache Spark for scalable data processing tasks such as extract, transform, and load (ETL), batch analytics, and data enrichment. As data pipelines evolve, so does the need for flexible and cost-efficient execution environments that support automation, governance, and performance at scale. Amazon EMR provides a powerful environment to run Spark workloads, and depending on workload characteristics and compliance requirements, teams can choose between fully managed options like Amazon EMR Serverless or more customizable configurations using Amazon EMR on Amazon Elastic Compute Cloud (Amazon EC2).

In use cases where infrastructure control, data locality, or strict security postures are essential, such as in financial services, healthcare, or government, running transient EMR on EC2 clusters becomes a preferred choice. However, orchestrating the full lifecycle of these clusters, from provisioning to job submission and eventual teardown, can introduce operational overhead and risk if done manually.

To streamline this process, the AWS Cloud offers built-in orchestration capabilities using AWS Step Functions and Amazon EventBridge. Together, these services help you automate and schedule the entire EMR job lifecycle, reducing manual intervention while optimizing cost and compliance. Step Functions provides the workflow logic to manage cluster creation, Spark job execution, and cluster termination, and EventBridge schedules these workflows based on business or operational needs.

In this post, we discuss how to build a fully automated, scheduled Spark processing pipeline using Amazon EMR on EC2, orchestrated with Step Functions and triggered by EventBridge. We walk through how to deploy this solution using AWS CloudFormation, process data from the COVID-19 public dataset stored in Amazon Simple Storage Service (Amazon S3), and store the aggregated results in Amazon S3. This architecture is ideal for periodic or scheduled batch processing scenarios where infrastructure control, auditability, and cost-efficiency are critical.

Solution overview

This solution uses the publicly available COVID-19 dataset to illustrate how to build a modular, scheduled architecture for scalable and cost-efficient batch processing for time-bound data workloads. The solution follows these steps:

  1. Raw COVID-19 data in CSV format is stored in an S3 input bucket.
  2. A scheduled rule in EventBridge triggers a Step Functions workflow.
  3. The Step Functions workflow provisions a transient Amazon EMR cluster using EC2 instances.
  4. A PySpark job is submitted to the cluster to process COVID-19 hospital utilization data and compute monthly state-level averages of inpatient bed utilization, ICU bed utilization, and COVID-19 patient percentages.
  5. The processed results are written back to an S3 output bucket.
  6. After successful job completion, the EMR cluster is automatically deleted.
  7. Logs are persisted to Amazon S3 for observability and troubleshooting.
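The create, run, and terminate sequence in the steps above can be sketched as an Amazon States Language definition built from a Python dictionary. This is a minimal illustration, not the template shipped in the sample repository: the state names, release label, instance types, role names, and the function `build_state_machine` are assumptions, and the `Catch` that tears the cluster down after a failed step is a design choice added here. The `LogUri` and `OutputS3Location` fields match the execution input used later in this walkthrough.

```python
import json


def build_state_machine(script_uri, input_uri):
    """Return a hypothetical state machine definition as a Python dict."""
    # States.Array lets static arguments sit next to a value taken from the
    # execution input ($.OutputS3Location).
    spark_args = (
        "States.Array('spark-submit', '--deploy-mode', 'cluster', "
        f"'{script_uri}', '--input', '{input_uri}', "
        "'--output', $.OutputS3Location)"
    )
    return {
        "Comment": "Transient EMR cluster: create, run one Spark step, terminate",
        "StartAt": "CreateCluster",
        "States": {
            "CreateCluster": {
                "Type": "Task",
                "Resource": "arn:aws:states:::elasticmapreduce:createCluster.sync",
                "Parameters": {
                    "Name": "Covid19Cluster",
                    "ReleaseLabel": "emr-7.1.0",           # assumed release
                    "Applications": [{"Name": "Spark"}],
                    "ServiceRole": "EMR_DefaultRole",      # placeholder roles
                    "JobFlowRole": "EMR_EC2_DefaultRole",
                    "LogUri.$": "$.LogUri",
                    "Instances": {
                        "KeepJobFlowAliveWhenNoSteps": True,
                        "InstanceGroups": [
                            {"InstanceRole": "MASTER",
                             "InstanceType": "m5.xlarge", "InstanceCount": 1},
                            {"InstanceRole": "CORE",
                             "InstanceType": "m5.xlarge", "InstanceCount": 2},
                        ],
                    },
                },
                "ResultPath": "$.Cluster",   # keep the original input intact
                "Next": "RunSparkStep",
            },
            "RunSparkStep": {
                "Type": "Task",
                "Resource": "arn:aws:states:::elasticmapreduce:addStep.sync",
                "Parameters": {
                    "ClusterId.$": "$.Cluster.ClusterId",
                    "Step": {
                        "Name": "Covid19Processor",
                        "ActionOnFailure": "CONTINUE",
                        "HadoopJarStep": {
                            "Jar": "command-runner.jar",
                            "Args.$": spark_args,
                        },
                    },
                },
                "ResultPath": "$.StepResult",
                # Assumption: also terminate the cluster when the step fails.
                "Catch": [{"ErrorEquals": ["States.ALL"],
                           "ResultPath": "$.Error",
                           "Next": "TerminateCluster"}],
                "Next": "TerminateCluster",
            },
            "TerminateCluster": {
                "Type": "Task",
                "Resource": "arn:aws:states:::elasticmapreduce:terminateCluster",
                "Parameters": {"ClusterId.$": "$.Cluster.ClusterId"},
                "End": True,
            },
        },
    }


definition = build_state_machine(
    "s3://example-input-bucket/scripts/covid19_processor.py",
    "s3://example-input-bucket/raw/",
)
print(json.dumps(definition, indent=2))
```

Because the failure path in this sketch ends at TerminateCluster, an execution would finish as succeeded even when the Spark step fails. A production definition would likely add a Fail state after the teardown.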

By automating this workflow, you alleviate the need to manually manage EMR clusters while gaining cost-efficiency by running compute only when needed. This architecture is ideal for periodic Spark jobs such as ETL pipelines, regulatory reporting, and batch analytics, especially when control, compliance, and customization are required. The following diagram illustrates the architecture for this use case.

The infrastructure is deployed using AWS CloudFormation to provide consistency and repeatability. AWS Identity and Access Management (IAM) roles grant least‑privilege access to Step Functions, Amazon EMR, EC2 instances, and S3 buckets, and optional AWS Key Management Service (AWS KMS) encryption can secure data at rest in Amazon S3 and Amazon CloudWatch Logs. By combining a scheduled trigger, stateful orchestration, and centralized logging, this solution delivers a fully automated, cost‑optimized, and secure way to run transient Spark workloads in production.

Prerequisites

Before you get started, make sure you have the following prerequisites:

Set up resources with AWS CloudFormation

To provision the required resources using a single CloudFormation template, complete the following steps:

  1. Sign in to the AWS Management Console as an admin user.
  2. Clone the sample repository to your local machine or AWS CloudShell and navigate into the project directory.
    git clone https://github.com/aws-samples/sample-emr-transient-cluster-step-functions-eventbridge.git
    cd sample-emr-transient-cluster-step-functions-eventbridge

  3. Set an environment variable for the AWS Region where you plan to deploy the resources. Replace the placeholder with your Region code, for example, us-east-1.
    export AWS_REGION=<YOUR AWS REGION>

  4. Deploy the stack using the following command. Update the stack name if needed. In this example, the stack is created with the name covid19-analysis.
    aws cloudformation deploy \
    --template-file emr_transient_cluster_step_functions_eventbridge.yaml \
    --stack-name covid19-analysis \
    --capabilities CAPABILITY_IAM \
    --region $AWS_REGION 

You can monitor the stack creation progress on the AWS CloudFormation console on the Events tab. The deployment typically completes in under 5 minutes.

After the stack is successfully created, go to the Outputs tab on the AWS CloudFormation console and note the following values for use in later steps:

  • InputBucketName
  • OutputBucketName
  • LogBucketName

Set up the COVID-19 dataset

With your infrastructure in place, complete the following steps to set up the input data:

  1. Download the COVID-19 data CSV file from HealthData.gov to your local machine.
  2. Rename the downloaded file to covid19-dataset.csv.
  3. Upload the renamed file to your S3 input bucket under the raw/ folder path.

Set up the PySpark Script

Complete the following steps to set up the PySpark script:

  1. Open AWS CloudShell from the console.
  2. Confirm that you are working inside the sample-emr-transient-cluster-step-functions-eventbridge directory before running the next command.
  3. Copy the PySpark script needed for this walkthrough into your input bucket:
    aws s3 cp covid19_processor.py s3://<InputBucketName>/scripts/

This script processes COVID-19 hospital utilization data stored as CSV files in your S3 input bucket. When running the job, provide the following command-line arguments:

  • --input – The S3 path to the input CSV files
  • --output – The S3 path to store the processed results

The script reads the raw dataset, standardizes various date formats, and filters out records with invalid or missing dates. It then extracts key utilization metrics such as inpatient bed usage, ICU bed usage, and the percentage of beds occupied by COVID-19 patients and calculates monthly averages grouped by state. The aggregated output is saved as timestamped CSV files in the specified S3 location.
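The cleaning and aggregation logic described above can be illustrated in plain Python. This sketch deliberately uses only the standard library instead of PySpark so that it runs anywhere; it is not the covid19_processor.py script from the repository. The accepted date formats, the column names (`state`, `date`, `inpatient_beds_utilization`), and both function names are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime

# Assumed input formats; the real script may accept a different set.
DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d", "%m/%d/%Y")


def parse_date(text):
    """Return a datetime for the first matching format, or None if invalid."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except (ValueError, AttributeError):
            continue
    return None


def monthly_state_averages(rows, metric):
    """Average `metric` per (state, YYYY-MM), skipping bad dates and values."""
    totals = defaultdict(lambda: [0.0, 0])  # key -> [running sum, row count]
    for row in rows:
        day = parse_date(row.get("date"))
        if day is None:
            continue  # drop records with missing or unparseable dates
        try:
            value = float(row[metric])
        except (KeyError, TypeError, ValueError):
            continue  # drop records without a usable metric value
        bucket = totals[(row["state"], day.strftime("%Y-%m"))]
        bucket[0] += value
        bucket[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


sample_rows = [
    {"state": "TX", "date": "2021-01-05", "inpatient_beds_utilization": "0.5"},
    {"state": "TX", "date": "01/20/2021", "inpatient_beds_utilization": "0.75"},
    {"state": "TX", "date": "not-a-date", "inpatient_beds_utilization": "0.99"},
    {"state": "CA", "date": "2021/02/01", "inpatient_beds_utilization": ""},
]
print(monthly_state_averages(sample_rows, "inpatient_beds_utilization"))
```

In PySpark, the same idea would typically be expressed with a date-parsing column expression followed by a group-by on state and month, which lets Spark distribute the work across the cluster.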

This example demonstrates how you can use PySpark to efficiently clean, transform, and analyze large-scale healthcare data to gain actionable insights on hospital capacity trends during the pandemic.

Configure a schedule in EventBridge

The Step Functions state machine is by default scheduled to run on December 31, 2025, as a one-time execution. You can update the schedule for recurring or one-time execution as needed. Complete the following steps:

  1. On the EventBridge console, choose Schedules under Scheduler in the navigation pane.
  2. Select the schedule named <StackName>-covid19-analysis and choose Edit.
  3. Set your preferred schedule pattern.
    1. If you want to run the schedule one time, select One-time schedule for Occurrence and enter a date and time.
    2. If you want to run this on a recurring basis, select Recurring schedule. Specify the schedule type as either Cron-based schedule or Rate-based schedule as needed.
  4. Choose Next twice and choose Save schedule.
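The schedule patterns offered on the console correspond to the three schedule expression forms that EventBridge Scheduler accepts: `at(...)`, `rate(...)`, and `cron(...)`. The helper functions below are hypothetical names introduced for illustration; they only build the expression strings and do not call any AWS API.

```python
from datetime import datetime


def one_time_expression(when):
    """Build an at(yyyy-mm-ddThh:mm:ss) expression for a one-time schedule."""
    return "at({})".format(when.strftime("%Y-%m-%dT%H:%M:%S"))


def rate_expression(value, unit):
    """Build a rate(value unit) expression for a rate-based schedule."""
    if unit not in ("minutes", "hours", "days") or value < 1:
        raise ValueError("value must be positive; unit is minutes, hours, or days")
    return "rate({} {})".format(value, unit)


def cron_expression(minute, hour, day_of_month="*", month="*",
                    day_of_week="?", year="*"):
    """Build a six-field cron expression.

    Either day_of_month or day_of_week must be "?", because the two
    fields cannot both be constrained in the same expression.
    """
    return "cron({} {} {} {} {} {})".format(
        minute, hour, day_of_month, month, day_of_week, year)


print(one_time_expression(datetime(2025, 12, 31, 0, 0, 0)))
print(rate_expression(12, "hours"))
print(cron_expression(0, 2))  # every day at 02:00 in the schedule's time zone
```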

Start the workflow in Step Functions

Based on your EventBridge schedule, the Step Functions workflow will run automatically. For this walkthrough, complete the following steps to trigger it manually:

  1. On the Step Functions console, choose State machines in the navigation pane.
  2. Choose the state machine that begins with Covid19AnalysisStateMachine-*.
  3. Choose Start execution.
  4. In the Input section, provide the following JSON (provide the log bucket and output bucket names with the appropriate values captured earlier):
    {
      "LogUri": "s3://<LogBucketName>/logs/",
      "OutputS3Location": "s3://<OutputBucketName>/processed/"
    }

  5. Choose Start execution to initiate the workflow.
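If you prefer to generate the execution input rather than type it, a small helper can fill in the bucket names and catch placeholders that were left unreplaced. The function name `build_execution_input` is an assumption for this sketch; the two keys and the `logs/` and `processed/` prefixes come from the JSON shown in the preceding steps.

```python
import json


def build_execution_input(log_bucket, output_bucket):
    """Return the JSON input for the state machine as a string."""
    for name in (log_bucket, output_bucket):
        # Reject empty names, leftover <Placeholder> text, and full URIs.
        if not name or "<" in name or name.startswith("s3://"):
            raise ValueError(f"expected a bare bucket name, got {name!r}")
    return json.dumps(
        {
            "LogUri": f"s3://{log_bucket}/logs/",
            "OutputS3Location": f"s3://{output_bucket}/processed/",
        },
        indent=2,
    )


print(build_execution_input("example-log-bucket", "example-output-bucket"))
```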

Monitor the EMR job and workflow execution

After you start the workflow, you can track both the Step Functions state transitions and the EMR job progress in real time on the console.

Monitor the Step Functions state machine

Complete the following steps to monitor the Step Functions state machine:

  1. On the Step Functions console, choose State machines in the navigation pane.
  2. Choose the state machine that begins with Covid19AnalysisStateMachine-*.
  3. Choose the running execution to view the visual workflow.

    Each state node will update as it progresses—green for success, red for failure.

  4. To explore a step, choose its node and inspect the input, output, and error details in the side pane.

The following screenshot shows an example of a successfully executed workflow.

Monitor the EMR cluster and EMR step

Complete the following steps to monitor the EMR cluster and EMR step status:

  1. While the cluster is active, open the Amazon EMR console and choose Clusters in the navigation pane.
  2. Locate the Covid19Cluster transient EMR cluster.
    Initially, it will be in Starting status.

    On the Steps tab, you can see your Spark submit step listed. As the job progresses, the step status changes from Pending to Running to finally Completed or Failed.

  3. Choose the Applications tab to view the application UIs, in which you can access the Spark History Server and YARN Timeline Server for monitoring and troubleshooting.

Monitor CloudWatch logs

To enable CloudWatch logging and enhanced monitoring for your EMR on EC2 cluster, refer to Amazon EMR on EC2 – Enhanced Monitoring with CloudWatch using custom metrics and logs. This guide explains how to install and configure the CloudWatch agent using a bootstrap action, so you can stream system-level metrics (such as CPU, memory, and disk usage) and application logs from EMR nodes directly to CloudWatch. With this setup, you can gain real-time visibility into cluster health and performance, simplify troubleshooting, and retain critical logs even after the cluster is terminated.

For this walkthrough, check the logs in the S3 log output location.

Confirm cluster deletion

When the Spark step is complete, Step Functions will automatically delete the Amazon EMR cluster. Refresh the Clusters page on the Amazon EMR console. You should see your cluster status change from Terminating to Terminated within a minute.

By following these steps, you gain full end-to-end visibility into your workflow from the moment the Step Functions state machine is triggered to the automatic shutdown of the EMR cluster. You can monitor execution progress, troubleshoot issues, confirm job success, and continuously optimize your transient Spark workloads.

Verify job output in Amazon S3

After the job finishes, complete the following steps to check the processed results in the S3 output bucket:

  1. On the Amazon S3 console, choose Buckets in the navigation pane.
  2. Open the output S3 bucket you noted earlier.
  3. Open the processed folder.
  4. Navigate into the timestamped subfolder to view the CSV output file.
  5. Download the CSV file to view the processed results, as shown in the following screenshot.

Monitoring and troubleshooting

To monitor the progress of your Spark job running on a transient EMR on EC2 cluster, use the Step Functions console. It provides real-time visibility into each state transition in your workflow, from cluster creation and job submission to cluster deletion. This makes it straightforward to track execution flow and identify where issues might occur.

During job execution, you can use the Amazon EMR console to access cluster-level monitoring. This includes YARN application statuses, step-level logs, and overall cluster health. If CloudWatch logging is enabled in your job configuration, driver and executor logs stream in near real time, so you can quickly detect and diagnose errors, resource constraints, or data skew within your Spark application.

After the workflow is complete, regardless of whether it succeeds or fails, you can perform a detailed post-execution analysis by reviewing the logs stored in the S3 bucket specified in the LogUri parameter. This log directory includes standard output and error logs, along with Spark history files, offering insights into execution behavior and performance metrics.

For continued access to the Spark UI during and after job execution, you can use persistent application UIs on the EMR console. These links remain accessible even after the cluster is terminated, enabling deeper root-cause analysis and performance tuning for future runs.

This visibility into both workflow orchestration and job execution can help teams optimize their Spark workloads, reduce troubleshooting time, and build confidence in their EMR automation pipelines.

Clean up

To avoid incurring ongoing charges, clean up the resources provisioned during this walkthrough:

  1. Empty the S3 buckets:
    1. On the Amazon S3 console, choose Buckets in the navigation pane.
    2. Select the input, output, and log buckets used in this tutorial.
    3. Choose Empty to remove all objects before deleting the buckets (optional).
  2. Delete the CloudFormation stack:
    1. On the AWS CloudFormation console, choose Stacks in the navigation pane.
    2. Select the stack you created for this solution and choose Delete.
    3. Confirm the deletion to remove associated resources.

Conclusion

In this post, we showed how to build a fully automated and cost-effective Spark processing pipeline using Step Functions, EventBridge, and Amazon EMR on EC2. The workflow provisions a transient EMR cluster, runs a Spark job to process data, and stops the cluster after the job completes. This approach helps reduce costs while giving you full control over the process. This solution is ideal for scheduled data processing tasks such as ETL jobs, log analytics, or batch reporting, especially when you need detailed control over infrastructure, security, and compliance settings.

To get started, deploy the solution in your environment using the CloudFormation stack provided and adjust it to fit your data processing needs. Check out the Step Functions Developer Guide and Amazon EMR Management Guide to explore further.

Share your feedback and ideas in the comments or connect with your AWS Solutions Architect to fine-tune this pattern for your use case.


About the authors

Senthil Kamala Rathinam

Senthil is a Solutions Architect at Amazon Web Services, specializing in Data and Analytics for banking customers across North America. With deep expertise in Data and Analytics, AI/ML, and Generative AI, he helps organizations unlock business value through data-driven transformation. Beyond work, Senthil enjoys spending time with his family and playing badminton.

Shashi Makkapati

Shashi is a Senior Solutions Architect serving banking customers across North America. He specializes in data analytics, AI/ML, and generative AI, focusing on innovative solutions that transform financial organizations. Shashi is passionate about leveraging technology to solve complex business challenges in the banking sector. Outside of work, he enjoys traveling and spending quality time with his family.

Tuning guide for AMD Amazon EC2 instances

Post Syndicated from Suyash Nadkarni original https://aws.amazon.com/blogs/compute/tuning-guide-for-amd-amazon-ec2-instances/

As organizations migrate more mission-critical workloads to the cloud, optimizing for price-performance becomes a key consideration. Amazon Elastic Compute Cloud (Amazon EC2) instances powered by AMD EPYC processors deliver high core density, large memory bandwidth, and hardware-enabled security features, making them a strong option for a wide range of compute, memory, and I/O-intensive workloads. In this post, we explain how to choose the right AMD-based Amazon EC2 instance types and describe tuning techniques that can help users improve workload efficiency. Whether you’re running simulations, large-scale analytics, or inference workloads, this post provides practical guidance for optimizing AMD-powered Amazon EC2 instances.

Amazon EC2 offers AMD-based instances built on multiple generations of AMD EPYC processors. This post focuses on optimization strategies for the 3rd and 4th generation families, which provide enhanced capabilities for compute and memory-intensive workloads.

  • 3rd generation (M6a, R6a, C6a, Hpc6a): Balances compute, memory, and storage—well-suited for analytics, web servers, and high-performance computing.
  • 4th generation (M7a, R7a, C7a, Hpc7a): Delivers up to 50% better performance than earlier AMD generations. These instances introduce AVX-512 support and DDR5 memory, and they run with Simultaneous Multithreading (SMT) turned off. SMT is a technology that allows a single physical core to run multiple threads concurrently; with SMT disabled, each virtual CPU (vCPU) maps directly to a physical core, which can improve workload isolation and consistency.

Choosing the right AMD EPYC powered Amazon EC2 instance type

Selecting the right AMD EPYC powered Amazon EC2 instance type starts with understanding how your application uses compute, memory, storage, and networking resources. Each instance family is optimized for specific workload characteristics.

Compute-intensive workloads

These workloads involve large-scale calculations, simulations, or encoding tasks, and they often need high CPU throughput and advanced instruction set support.

Recommended instances: C7a, Hpc7a, C6a, Hpc6a
Use cases: Scientific computing, financial modelling, media transcoding, encryption, machine learning (ML) inference

Big data and analytics

Applications that process and analyze large datasets benefit from high memory bandwidth and a balanced compute-to-memory ratio.

Recommended instances: R7a, M7a, R6a, M6a
Use cases: Stream processing, real-time analytics, business intelligence tools, distributed caching

Database workloads

Database workloads typically need consistent memory performance and high I/O throughput for read/write operations.

Recommended instances: R7a, M7a, R6a, M6a
Use cases: Relational databases (MySQL, PostgreSQL), NoSQL databases (MongoDB, Cassandra), in-memory databases (Redis)

Web and application servers

These applications handle variable request loads and benefit from balanced compute, memory, and network performance.

Recommended instances: C7a, M7a, C6a, M6a
Use cases: Web servers, content management systems, e-commerce platforms, API endpoints

AI/ML on CPU

ML tasks that do not need GPUs—such as inference or preprocessing—can run efficiently on CPU-based instances.

Recommended instances: M7a, R7a, C7a
Use cases: Model inference, natural language processing, computer vision, recommendation engines

High Performance Computing (HPC)

These workloads need high core counts, memory bandwidth, and low-latency networking for tightly coupled computations.

Recommended instances: Hpc7a, Hpc6a, R7a, M7a
Use cases: Computational fluid dynamics, genomics, seismic analysis, engineering simulations

Aligning your instance type with the needs of your workload helps provide predictable performance and cost efficiency. Services such as Amazon EC2 Auto Scaling and AWS Compute Optimizer can assist with ongoing instance selection and scaling decisions.

Optimizing AMD EPYC powered Amazon EC2 instances

Amazon EC2 instances powered by 4th generation AMD EPYC processors use a modular chiplet architecture, as shown in the following figure. Each processor includes multiple Core Complex Dies (CCDs), and each CCD contains one or more core complexes (CCXs). A CCX groups up to eight physical cores, with each core having 1 MB of dedicated L2 cache and all eight cores sharing a 32 MB L3 cache. These CCDs are connected to a central I/O die, which manages memory and interconnects across the chip.

Figure 1: Layout of the ‘Zen 4’ CPU die with 8 cores per die

The modular architecture of 4th generation AMD EPYC processors enables Amazon EC2 instances such as m7a.24xlarge and m7a.48xlarge to support high core counts, up to 96 physical cores per socket. For example:

  • m7a.24xlarge provides 96 physical cores from a single socket.
  • m7a.48xlarge spans two sockets, offering 192 physical cores.

Understanding how Amazon EC2 instance sizes map to physical processor layouts can help you optimize for performance and cache locality. Workloads that involve shared memory access or thread synchronization, such as high-performance computing or in-memory databases, can benefit from selecting instance sizes that minimize cross-socket communication and make efficient use of shared L3 cache, as shown in the following figure.

Figure 2: Layout of the ‘EPYC Chiplet’ CPU

Amazon EC2 instances powered by 4th generation AMD EPYC processors operate with SMT turned off. In this configuration, each vCPU maps directly to a physical core, eliminating resource sharing such as execution units and cache between sibling threads. This design can reduce intra-core interference and help provide more consistent performance under certain workloads. Users can isolate threads at the core level and observe lower variability and more stable throughput for workloads, such as high-performance computing, ML inference, and transactional databases.

CPU optimizations

Tools such as htop can help identify CPU usage patterns, system load averages, and per-process resource consumption. CPU usage should be evaluated in the context of your workload and performance requirements. If usage consistently reaches 100%, then it may indicate that the workload is CPU-bound and not optimally balanced. Before modifying the instance size, enabling Auto Scaling, or switching instance families, evaluate tuning opportunities that could improve performance without changing infrastructure. Load averages that regularly exceed the number of vCPUs can also signal compute saturation and may warrant further optimization.
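The load-average check described above can be scripted with the Python standard library. This is a rough sketch: the function names and the threshold of 1.0 load per vCPU are assumptions chosen to mirror the rule of thumb in the text, and a single reading says little without the trend over time.

```python
import os


def saturation_ratio(load_average, vcpus):
    """Return load average divided by vCPU count."""
    if vcpus < 1:
        raise ValueError("vcpus must be at least 1")
    return load_average / vcpus


def describe_load(load_averages, vcpus):
    """Label the 1, 5, and 15 minute load averages as 'ok' or 'saturated'."""
    labels = ("1m", "5m", "15m")
    return {
        label: "saturated" if saturation_ratio(value, vcpus) > 1.0 else "ok"
        for label, value in zip(labels, load_averages)
    }


if __name__ == "__main__":
    # os.getloadavg() is available on Linux and other Unix-like systems.
    print(describe_load(os.getloadavg(), os.cpu_count() or 1))
```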

L3 cache usage

The L3 cache is a shared, high-speed memory layer used by a group of CPU cores. On AMD-based Amazon EC2, cores are organized into L3 cache slices, each shared by a subset of cores on the same socket. Threads scheduled within the same slice can access shared data more efficiently, reducing memory latency. On 4th generation AMD instances such as m7a.2xlarge or r7a.2xlarge, all vCPUs typically map to cores within a single L3 slice, which ensures consistent cache locality. For larger sizes (for example m7a.8xlarge and above), thread pinning—assigning threads to specific physical cores—can help maintain this locality. Thread pinning can reduce performance variability in workloads with shared-memory access patterns.

You can pin threads using the taskset command:

taskset -c 0-3 ./your_application

This example pins your application to CPU cores 0 through 3. To determine which cores share the same L3 cache region, use tools such as lscpu or lstopo to inspect the system’s CPU topology. Grouping related threads on cores that share an L3 cache can improve performance consistency for workloads with frequent shared-memory access.
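As a programmatic alternative to reading lscpu output by hand, the sketch below groups CPUs by shared L3 cache using the Linux sysfs cache topology files and then pins the current process to one group. The function names are hypothetical, the code assumes a Linux guest that exposes `/sys/devices/system/cpu/cpu*/cache/index*/`, and `os.sched_setaffinity` is Linux-only.

```python
import glob
import os


def parse_cpu_list(text):
    """Expand a kernel CPU list such as '0-3,8-11' into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def l3_groups(sysfs_root="/sys/devices/system/cpu"):
    """Return sorted lists of CPU ids, one list per shared L3 cache."""
    groups = set()
    pattern = os.path.join(sysfs_root, "cpu[0-9]*", "cache", "index*", "level")
    for level_path in glob.glob(pattern):
        with open(level_path) as handle:
            if handle.read().strip() != "3":
                continue  # only level 3 caches are of interest here
        shared_path = os.path.join(os.path.dirname(level_path), "shared_cpu_list")
        with open(shared_path) as handle:
            cpus = parse_cpu_list(handle.read())
        if cpus:
            groups.add(frozenset(cpus))
    return sorted(sorted(group) for group in groups)


def pin_to_first_l3_group():
    """Pin the current process to the first L3 group, if one was found."""
    groups = l3_groups()
    if groups and hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, groups[0])  # pid 0 means the calling process
    return groups[0] if groups else []


if __name__ == "__main__":
    print(l3_groups())
```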

Docker container optimization

In containerized environments running on AMD-based Amazon EC2 instances, tuning CPU-related settings can improve workload consistency and efficiency—particularly for compute-intensive or latency-sensitive applications. Although default configurations work for many general-purpose scenarios, certain workloads may benefit from more explicit control over how CPU resources are allocated. By default, container runtimes such as Docker allow the operating system to schedule containers across any available CPU cores. This flexible scheduling can lead to variability in performance when containers move across cores that don’t share cache. To reduce this variability and improve cache efficiency, containers can be pinned to specific cores using the --cpuset-cpus flag.

docker run --cpuset-cpus="1,3" my-container

This setting restricts the container to use only the specified cores. In this example, cores 1 and 3 are used for demonstration. The actual core selection should be based on CPU topology to ensure cache-efficient scheduling. Pinning containers to cores that share L3 cache can reduce scheduling overhead and improve consistency for workloads with shared-memory access patterns.
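When the core list comes from a topology query rather than being typed by hand, it helps to collapse it into the compact list-and-range syntax that `--cpuset-cpus` accepts. The helpers below are hypothetical names for illustration; they only build the argument list and do not invoke Docker.

```python
def format_cpuset(cpus):
    """Collapse CPU ids into cpuset syntax, for example [0, 1, 2, 3, 8] -> '0-3,8'."""
    ordered = sorted(set(cpus))
    if not ordered:
        raise ValueError("at least one CPU id is required")
    ranges = []
    start = previous = ordered[0]
    for cpu in ordered[1:]:
        if cpu == previous + 1:
            previous = cpu  # extend the current contiguous run
            continue
        ranges.append((start, previous))
        start = previous = cpu
    ranges.append((start, previous))
    return ",".join(
        str(first) if first == last else f"{first}-{last}"
        for first, last in ranges
    )


def docker_run_args(image, cpus):
    """Return a docker run argument list pinned to the given CPUs."""
    return ["docker", "run", f"--cpuset-cpus={format_cpuset(cpus)}", image]


print(" ".join(docker_run_args("my-container", [0, 1, 2, 3])))
```

The resulting list can be passed to `subprocess.run` without shell quoting concerns.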

CPU frequency governor settings

Some operating systems adjust CPU frequency dynamically to save power. This is typically controlled by a setting called the CPU frequency governor. Although this behavior is efficient for general-purpose workloads, it may introduce latency or performance variability in compute-sensitive environments. For workloads that need consistently high CPU performance—such as high-throughput data processing, simulations, or real-time applications—we recommend setting the CPU governor to performance mode. This makes sure that the CPU runs at its maximum frequency under load, avoiding time spent ramping up from lower power states.

You can apply this setting on bare metal instances or Amazon EC2 Dedicated Hosts using the following command:

sudo cpupower frequency-set -g performance

Before applying, consider benchmarking workload performance with other CPU frequency governors (such as ondemand or schedutil) to make sure that the performance setting provides measurable benefits without unnecessary energy trade-offs.
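Before changing the governor, it can be useful to check what each CPU currently reports. The sketch below reads the standard Linux cpufreq sysfs files; the function names are assumptions. On instances where the hypervisor does not expose frequency control, the files are absent and the result is simply empty.

```python
import glob
import os


def read_governors(sysfs_root="/sys/devices/system/cpu"):
    """Map each CPU directory name (for example 'cpu0') to its governor."""
    governors = {}
    pattern = os.path.join(sysfs_root, "cpu[0-9]*", "cpufreq", "scaling_governor")
    for path in glob.glob(pattern):
        # .../cpu0/cpufreq/scaling_governor -> 'cpu0'
        cpu = os.path.basename(os.path.dirname(os.path.dirname(path)))
        with open(path) as handle:
            governors[cpu] = handle.read().strip()
    return governors


def cpus_not_in_performance(governors):
    """Return the CPUs whose governor is anything other than 'performance'."""
    return sorted(cpu for cpu, name in governors.items() if name != "performance")


if __name__ == "__main__":
    current = read_governors()
    print(current or "cpufreq is not exposed on this system")
    print(cpus_not_in_performance(current))
```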

Use architecture-specific compiler flags

When compiling performance-sensitive C or C++ applications, architecture-specific flags such as -march=znverX can unlock AMD EPYC–specific optimizations, including improved vectorization and floating-point performance. Although this is beneficial for compute-heavy workloads, it may reduce portability across architectures. To balance performance and flexibility, consider implementing runtime feature detection and dispatching, an approach used by many optimized libraries to adapt behavior based on the underlying CPU.

Before using these flags, verify that your compiler version supports them and make sure that the target EC2 instance architecture matches the specified flag. For example, a binary compiled with -march=znver4 may fail with an illegal instruction error (SIGILL) if run on earlier-generation instances such as M5a.

The following table outlines the appropriate flags and minimum supported compiler versions for each AMD EPYC generation:

AMD EPYC generation | -march flag | Minimum GCC version | Minimum LLVM/Clang version
4th generation (for example M7a) | znver4 | GCC 12 | Clang 15
3rd generation (for example M6a) | znver3 | GCC 11 | Clang 13
2nd generation (for example C5a) | znver2 | GCC 9 | Clang 11

Pass the flag that matches the processor generation of your target instances, using a compiler that meets the minimum version listed for that generation:

# 4th Gen EPYC (M7a, R7a, C7a, Hpc7a)
-march=znver4

# 3rd Gen EPYC (M6a, R6a, C6a)
-march=znver3

# 2nd Gen EPYC (C5a)
-march=znver2

When to enable AVX-512 and VNNI instructions

4th generation AMD EPYC powered Amazon EC2 instances support advanced single instruction, multiple data (SIMD) instruction sets such as AVX2, AVX-512, and VNNI. These can improve throughput for vector-heavy workloads such as ML inference, image processing, or scientific simulations. However, AVX-512 and VNNI support is generation-specific: running binaries compiled with AVX-512 on instances that lack it (for example 3rd generation M6a) may result in runtime errors such as illegal instruction (SIGILL).

When compiling C or C++ code:

gcc -mavx2 -mavx512f -O2 your_program.c -o your_program

To better understand which optimizations are applied, use the following:

-fopt-info-vec-optimized -fopt-info-vec-missed

These GCC options report which loops were vectorized and which were not. Only enable these optimizations if your workload benefits and you’ve validated compatibility with the instance generation in use. Avoid applying AVX flags indiscriminately, because it may reduce portability and increase binary complexity.
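One way to avoid illegal instruction errors is to check at startup which SIMD extensions the CPU reports and select a matching build of your application. The sketch below parses the `flags` line of `/proc/cpuinfo` on x86 Linux. The variant names and function names are hypothetical, and `avx512_vnni` is the flag name the Linux kernel uses for AVX-512 VNNI.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the CPU feature flags from /proc/cpuinfo text as a set."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def pick_variant(flags):
    """Choose the most capable build variant the CPU can run."""
    if {"avx512f", "avx512_vnni"} <= flags:
        return "avx512-vnni"
    if "avx512f" in flags:
        return "avx512"
    if "avx2" in flags:
        return "avx2"
    return "baseline"


def detect_variant(path="/proc/cpuinfo"):
    """Read the flags from the running system, falling back to 'baseline'."""
    try:
        with open(path) as handle:
            return pick_variant(parse_cpu_flags(handle.read()))
    except OSError:
        return "baseline"


if __name__ == "__main__":
    print(detect_variant())
```

A launcher script could use the returned name to choose between binaries compiled with different -march or -mavx settings.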

AMD Optimizing CPU Libraries

The AMD Optimizing CPU Libraries (AOCL) provide performance-tuned math libraries specifically designed for AMD EPYC processors. These libraries include optimized implementations of commonly used functions in scientific computing, engineering, and ML workloads. You can link your applications against AOCL to use processor-specific optimizations without rewriting your code. AOCL includes libraries for vector and scalar math, random number generation, FFT, BLAS, and LAPACK, among others.

Setting up AOCL

  • Set the AOCL_ROOT environment variable to point to the installation directory:
    export AOCL_ROOT=/path/to/aocl

  • Compile your application with the appropriate include and library paths:
    gcc -I$AOCL_ROOT/include your_program.c -o your_program -L$AOCL_ROOT/lib -lamdlibm -lm

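  • Vector and scalar math optimization: AMD’s AOCC compiler, which is based on Clang, provides additional flags that route vectorized or scalar math calls to the AOCL math library for specific workloads. GCC does not recognize -fveclib or -fsclrlib, so use the AOCC clang driver for these builds:
    # Vector math optimization
    clang -fveclib=AMDLIBM your_program.c -o your_program -L$AOCL_ROOT/lib -lamdlibm -lm

    # Faster scalar math
    clang -fsclrlib=AMDLIBM your_program.c -o your_program -L$AOCL_ROOT/lib -lamdlibmfast -lamdlibm -lm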

  • AOCL runtime profiling: AOCL supports runtime profiling, which helps developers identify which mathematical operations dominate execution time. To enable profiling, run the following:
    export AOCL_PROFILE=1
    ./your_program

After running this, a report file named aocl_profile_report.txt is generated. It provides a function-level breakdown of call counts, execution time, and thread usage. Developers can use this to focus optimization efforts on high-impact operations.

Conclusion

This post explored how to select AMD-based Amazon EC2 instance types that align with specific workload characteristics, and how to apply tuning techniques focused on CPU usage, thread placement, cache efficiency, and math library optimization. These approaches are especially relevant for compute-bound or latency-sensitive workloads where consistent performance is critical.

Ready to get started? Sign in to the AWS Management Console and launch AMD EPYC powered Amazon EC2 instances to begin optimizing your workloads today.

AWS Weekly Roundup: Amazon EC2, Amazon Q Developer, IPv6 updates, and more (September 1, 2025)

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-amazon-ec2-amazon-q-developer-ipv6-updates-and-more-september-1-2025/

My LinkedIn feed was absolutely packed this week with pictures from the AWS Heroes Summit event in Seattle. It was heartwarming to see so many familiar faces and new Heroes coming together.

AWS Heroes Summit 2025

For those not familiar with the AWS Heroes program, it’s a global community recognition initiative that honors individuals who make outstanding contributions to the AWS community. These Heroes share their deep AWS knowledge through content creation, speaking at events, organizing community gatherings, and contributing to open-source projects.

The AWS Heroes Summit brings these exceptional community leaders together, providing a unique platform for knowledge exchange, networking, and collaboration. As someone who regularly interacts with Heroes through our AWS initiatives, I always find these summits invaluable – they offer deep technical discussions, early access to AWS roadmaps, and opportunities to provide direct feedback to AWS service teams. The insights and connections made at these events often translate into better resources and guidance for the broader AWS community.

Last week’s launches

In addition to this inspiring community, here are some AWS launches that caught my attention:

  • AWS expands Internet Protocol v6 (IPv6) support to AWS App Runner, AWS Client VPN, and RDS Data API — Three more AWS services now support IPv6 connectivity, helping you meet compliance requirements and removing the need to handle address translation between IPv4 and IPv6. AWS App Runner now supports IPv6-based inbound and outbound traffic on both public and private App Runner service endpoints. AWS Client VPN announced support for remote access to IPv6 workloads, allowing you to establish secure VPN connections to your IPv6-enabled VPC resources. Finally, RDS Data API now supports IPv6, enabling dual-stack (IPv4 and IPv6) connectivity for your Aurora databases.
  • We launched two new instance families this week: the new storage-optimized I8ge and the general-purpose M8i instances — Our I8ge instances, powered by AWS Graviton4 processors, deliver up to 60% better compute performance compared to their Graviton2-based predecessors. These instances feature third-generation AWS Nitro SSDs, providing up to 55% better real-time storage performance per TB and significantly lower I/O latency. With 120 TB of storage and sizes up to 48xlarge (including two metal options), they offer the highest storage density among AWS Graviton-based storage-optimized instances. We also launched M8i and M8i-flex instances with custom Intel Xeon 6 processors. These instances deliver up to 15% better price-performance and 2.5x more memory bandwidth than their predecessors. M8i-flex instances are ideal for general-purpose workloads, available from large to 16xlarge. For demanding applications, you can choose from our SAP-certified M8i instances in 13 sizes, including 2 bare metal options and a new 96xlarge size.
  • Amazon EC2 Mac Dedicated Hosts now support Host Recovery and Reboot-based Host Maintenance — You can enable two new capabilities for your EC2 Mac Dedicated Hosts: Host Recovery and Reboot-based Host Maintenance. Host Recovery automatically detects potential hardware issues on Mac Dedicated Hosts and seamlessly migrates Mac instances to a new replacement host, minimizing disruption to workloads. Reboot-based Host Maintenance automatically stops and restarts instances on replacement hosts when scheduled maintenance events occur, eliminating the need for manual intervention during planned maintenance windows.
  • Amazon Q Developer now supports MCP admin control — Administrators can now enable or disable MCP functionality for all Q Developer clients in their organization. When an administrator disables the functionality, users cannot add any MCP servers, and previously defined servers are not initialized.

Other AWS news

Here are some additional projects and blog posts that you might find interesting:

  • Mastering Amazon Q Developer with Rules — I read an interesting article about Amazon Q Developer’s rules feature this weekend that I want to share with you. What caught my attention is how it solves a pain point I often encounter when working with AI assistants – having to repeatedly explain my coding preferences and standards. With rules, you define your preferences once in Markdown files, and Amazon Q Developer automatically follows them for every interaction. I particularly like how transparent the system is, showing which rules it’s following, and how it helps maintain consistency across teams. Since implementing rules in my projects, I’ve seen more consistent code quality, all while reducing the cognitive load of having to repeatedly explain our standards.
  • Strategies for excelling across all four exam domains of the AWS Certified Machine Learning – Specialty certification. The AWS Training & Certification team, where I spent my first three years at AWS, shared how to prepare for the AWS Certified Machine Learning – Specialty certification, whether you’re starting from scratch or building upon existing AWS Certifications. They share the prerequisites and guidance to help you get ready for this certification and demonstrate your expertise in building ML solutions with AWS.
  • As is now our tradition after Prime Day, we shared the impressive metrics showing how AWS services scaled to support one of the world’s largest shopping events. Amazon Prime Day 2025 was the biggest ever, setting records for both sales volume and total items sold during the 4-day event. This year was particularly special as we saw a significant transformation in the Prime Day experience through advancements in our generative AI offerings, with customers using Alexa+, Rufus, and AI Shopping Guides to discover deals and get product information. The numbers are staggering – Amazon DynamoDB handled tens of trillions of API calls while maintaining high availability, delivering single-digit millisecond responses and peaking at 151 million requests per second. Amazon API Gateway processed over 1 trillion internal service requests—a 30 percent increase in requests on average per day compared to Prime Day 2024.

Upcoming AWS events
Check your calendars and sign up for these upcoming AWS events:

  • AWS Summits — Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Toronto (September 4), Los Angeles (September 17), and Bogotá (October 9).
  • AWS re:Invent 2025 — This flagship annual conference is coming to Las Vegas from December 1–5. The event catalog is now available. Mark your calendars for this not-to-be-missed gathering of the AWS community.
  • AWS Community Days — Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Adria (September 5), Baltic (September 10), Aotearoa (September 18), South Africa (September 20), Bolivia (September 20), Portugal (September 27).

Join the AWS Builder Center to learn, build, and connect with builders in the AWS community. Browse here for upcoming in-person and virtual developer-focused events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— seb

New general-purpose Amazon EC2 M8i and M8i Flex instances are now available

Post Syndicated from Channy Yun (윤석찬) original https://aws.amazon.com/blogs/aws/new-general-purpose-amazon-ec2-m8i-and-m8i-flex-instances-are-now-available/

Today, we’re announcing the general availability of Amazon Elastic Compute Cloud (Amazon EC2) general-purpose M8i and M8i-Flex instances. They are powered by custom Intel Xeon 6 processors, available only on AWS, with a sustained all-core turbo frequency of 3.9 GHz. These instances deliver the highest performance and fastest memory bandwidth among comparable Intel processors in the cloud. They also deliver up to 15 percent better price performance, up to 20 percent higher performance, and 2.5 times more memory bandwidth compared to previous generation M7i and M7i-Flex instances.

M8i and M8i-Flex instances are ideal for running general-purpose workloads such as web application servers, virtual desktops, batch processing, microservices, databases, and enterprise applications. Specifically, these instances are up to 60 percent faster for NGINX web applications, up to 30 percent faster for PostgreSQL database workloads, and up to 40 percent faster for AI deep learning recommendation models compared to M7i and M7i-Flex instances.

Like R8i and R8i-Flex instances, these instances use the new sixth-generation AWS Nitro Cards, delivering up to two times more network and Amazon Elastic Block Store (Amazon EBS) bandwidth compared to the previous generation instances. This greatly improves network throughput for workloads handling small packets, such as web, application, and gaming servers. The instances also support bandwidth configuration with 25 percent allocation adjustments between network and Amazon EBS bandwidth, enabling better database performance, query processing, and logging speeds.
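To make the 25 percent adjustment concrete, here is a minimal arithmetic sketch. It rests on an assumption that this post does not spell out: the favored side gains 25 percent of its own baseline, and the other side gives up the same absolute amount, so the combined bandwidth stays constant. The function name `shift_bandwidth` and the `favor` values are hypothetical and are not part of any AWS API.

```python
def shift_bandwidth(network_gbps, ebs_gbps, favor="default"):
    """Return (network, ebs) bandwidth in Gbps after an illustrative adjustment.

    Assumption (not stated in the post): the favored side gains 25 percent
    of its own baseline and the other side loses the same absolute amount,
    so the combined total is unchanged.
    """
    if favor == "default":
        return float(network_gbps), float(ebs_gbps)
    if favor == "network":
        delta = 0.25 * network_gbps
        return network_gbps + delta, ebs_gbps - delta
    if favor == "ebs":
        delta = 0.25 * ebs_gbps
        return network_gbps - delta, ebs_gbps + delta
    raise ValueError(f"unknown favor value: {favor!r}")


# Example with the m8i.48xlarge baselines from the spec table (75 / 60 Gbps).
print(shift_bandwidth(75, 60, favor="ebs"))
print(shift_bandwidth(75, 60, favor="network"))
```

Under that assumption, favoring Amazon EBS on a 75/60 Gbps size would trade 15 Gbps of network bandwidth for 15 Gbps of EBS bandwidth. Check the Amazon EC2 documentation for the exact behavior before relying on these numbers.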

M8i instances
M8i instances provide up to 384 vCPUs and 1.5 TB of memory, including bare metal instances that provide dedicated access to the underlying physical hardware. These SAP-certified instances help you run workloads such as large application servers and databases, gaming servers, CPU-based inference, and video streaming that need the largest instance sizes or continuous high CPU usage.

Here are the specs for M8i instances:

Instance size | vCPUs | Memory (GiB) | Network bandwidth (Gbps) | EBS bandwidth (Gbps)
m8i.large | 2 | 8 | Up to 12.5 | Up to 10
m8i.xlarge | 4 | 16 | Up to 12.5 | Up to 10
m8i.2xlarge | 8 | 32 | Up to 15 | Up to 10
m8i.4xlarge | 16 | 64 | Up to 15 | Up to 10
m8i.8xlarge | 32 | 128 | 15 | 10
m8i.12xlarge | 48 | 192 | 22.5 | 15
m8i.16xlarge | 64 | 256 | 30 | 20
m8i.24xlarge | 96 | 384 | 40 | 30
m8i.32xlarge | 128 | 512 | 50 | 40
m8i.48xlarge | 192 | 768 | 75 | 60
m8i.96xlarge | 384 | 1536 | 100 | 80
m8i.metal-48xl | 192 | 768 | 75 | 60
m8i.metal-96xl | 384 | 1536 | 100 | 80
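As a rough illustration of how the spec table can be used for sizing, the following sketch picks the smallest virtualized M8i size that satisfies a vCPU and memory requirement. The data is copied from the table above. The names `M8I_SIZES` and `smallest_m8i` are hypothetical helpers, not an AWS API, and the sketch ignores bandwidth, price, and bare metal sizes.

```python
# (instance size, vCPUs, memory in GiB), copied from the spec table above.
# Bare metal sizes are left out; the list is ordered from smallest to largest.
M8I_SIZES = [
    ("m8i.large", 2, 8),
    ("m8i.xlarge", 4, 16),
    ("m8i.2xlarge", 8, 32),
    ("m8i.4xlarge", 16, 64),
    ("m8i.8xlarge", 32, 128),
    ("m8i.12xlarge", 48, 192),
    ("m8i.16xlarge", 64, 256),
    ("m8i.24xlarge", 96, 384),
    ("m8i.32xlarge", 128, 512),
    ("m8i.48xlarge", 192, 768),
    ("m8i.96xlarge", 384, 1536),
]


def smallest_m8i(vcpus_needed, memory_gib_needed):
    """Return the first (smallest) size meeting both requirements, or None."""
    for name, vcpus, memory_gib in M8I_SIZES:
        if vcpus >= vcpus_needed and memory_gib >= memory_gib_needed:
            return name
    return None


# A workload needing 10 vCPUs and 100 GiB is memory-bound at the 4 GiB-per-vCPU
# ratio these sizes share, so the memory requirement decides the size.
print(smallest_m8i(10, 100))
```

Because every size in the table has 4 GiB of memory per vCPU, a memory-heavy requirement pushes the choice to a larger size than the vCPU count alone would suggest.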

M8i-Flex instances
M8i-Flex instances are a lower-cost variant of the M8i instances, with 5 percent better price performance at 5 percent lower prices. They’re designed for workloads that benefit from the latest generation performance but don’t fully utilize all compute resources. These instances can scale up to full CPU performance 95 percent of the time.

Here are the specs for the M8i-Flex instances:

Instance size | vCPUs | Memory (GiB) | Network bandwidth (Gbps) | EBS bandwidth (Gbps)
m8i-flex.large | 2 | 8 | Up to 12.5 | Up to 10
m8i-flex.xlarge | 4 | 16 | Up to 12.5 | Up to 10
m8i-flex.2xlarge | 8 | 32 | Up to 15 | Up to 10
m8i-flex.4xlarge | 16 | 64 | Up to 15 | Up to 10
m8i-flex.8xlarge | 32 | 128 | Up to 15 | Up to 10
m8i-flex.12xlarge | 48 | 192 | Up to 22.5 | Up to 15
m8i-flex.16xlarge | 64 | 256 | Up to 30 | Up to 20

If you’re currently using earlier generations of general-purpose instances, you can adopt M8i-Flex instances without having to make changes to your application or your workload.

Now available
Amazon EC2 M8i and M8i-Flex instances are available today in the US East (N. Virginia), US East (Ohio), US West (Oregon), and Europe (Spain) AWS Regions. M8i and M8i-Flex instances can be purchased as On-Demand, Savings Plan, and Spot instances. M8i instances are also available in Dedicated Instances and Dedicated Hosts. To learn more, visit the Amazon EC2 Pricing page.

Give M8i and M8i-Flex instances a try in the Amazon EC2 console. To learn more, visit the Amazon EC2 M8i instances page and send feedback to AWS re:Post for EC2 or through your usual AWS Support contacts.

Channy