Tag Archives: Graviton

How potential performance upside with AWS Graviton helps reduce your costs further

2025-11-24 Markus Adhiwiyogo

Post Syndicated from Markus Adhiwiyogo original https://aws.amazon.com/blogs/compute/how-potential-performance-upside-with-aws-graviton-helps-reduce-your-costs-further/

Amazon Web Services (AWS) provides many mechanisms to optimize the price performance of workloads running on Amazon Elastic Compute Cloud (Amazon EC2), and the selection of the optimal infrastructure to run on can be one of the most impactful levers. When we started building the AWS Graviton processor, our goal was to optimize AWS Graviton features and capabilities to deliver a processor that provides the best price performance across a broad array of cloud workloads running on Amazon EC2. That goal continues to be our guiding principle, and today customers who adopt AWS Graviton-based EC2 instances see up to 40% better price performance on their cloud workloads when compared to equivalent non-Graviton EC2 instances. The price performance improvement is the result of both the performance improvement and the lower price in using AWS Graviton-based instances.

Price performance blends the cost of infrastructure with the amount of work you can achieve with infrastructure usage. After talking to many AWS Graviton customers, we’ve learned that the cost savings go beyond the lower AWS Graviton-based instances price. Many AWS Graviton customers told us that the performance increase from AWS Graviton allows them to consume fewer computing hours than comparable non-Graviton instances for equivalent workload throughput. In turn, this leads to further cost reduction.

The following are some of examples from our customers:

Pinterest achieved 47% cost savings and 38% savings on compute resources while reducing carbon emissions by 62% for its web API workload.
SAP powers its SAP HANA Cloud with AWS Graviton to enhance its price performance by 35% while lowering carbon impact by 45%.
Sprinklr improved their machine learning (ML) inference workloads’ throughput by up to 20% while reducing costs by up to 25%.

You can find more customer examples in the AWS Graviton testimonials page.

To help organizations capture similar benefits, we’ve enhanced the AWS Graviton Savings Dashboard (GSD) with new features that account for both pricing and performance improvements. In the following section we explore these new capabilities and how they can help optimize your infrastructure costs.

Understanding performance-driven cost optimization in the GSD

The GSD helps organizations identify ideal workloads for AWS Graviton migration through automated resource matching and data-driven visualizations. You can learn the GSD details and setup in this AWS compute post.

Although the dashboard has traditionally focused on calculating direct cost savings from the AWS Graviton pricing advantages, we’ve observed that customers often experience more benefits when their applications perform more efficiently on AWS Graviton processors, leading to decreased compute resource usage. To better reflect these real-world scenarios, we’ve enhanced the dashboard with new features highlighting Normalized Instance Hours (NIH) analysis capabilities so that you can model potential savings based on both pricing benefits and compute hour reductions. Although this tool helps estimate potential savings, actual performance improvements can only be determined by testing your specific workloads on AWS Graviton instances. Performance is always workload and use case specific, so we encourage you to test your AWS Graviton-based workloads using the Optimization and Performance Runbook to help you determine the actual possible NIH percent reduction.

Key dashboard components

This section outlines the following three key dashboard components: NIH reduction analysis, enhanced cost analysis visualizations, and detailed savings analysis.

NIH reduction analysis

The dashboard now features a new slider that lets you model potential cost savings by inputting the percentage reduction in NIH. Many organizations have found it challenging to calculate their total possible savings since the benefits come from two sources: the lower instance pricing of AWS Graviton and the reduced compute hours.

You can use the slider to model different cost scenarios by adjusting a theoretical NIH reduction between 0% and 40%. You can use this slider to input NIH reductions validated through your workload testing, model the combined impact of both pricing benefits and reduced compute hours, and explore different scenarios to help prioritize which workloads to test first.

Figure 1: NIH slider location

Assume that your testing shows that your workload runs just as effectively with 15% fewer normalized instance hours on AWS Graviton. You can now plug that exact number into the slider to see your modeled savings combining both pricing differences and compute hour reductions. Although we’ve heard success stories of significant reductions from customers, we recommend starting your initial estimate with a conservative 10% baseline and adjusting based on your own testing results.

Enhanced cost analysis visualizations

The dashboard presents key visualizations that demonstrate the direct relationship between NIH reduction and cost savings. First, you see the Potential Graviton Base Savings from pricing differences alone. In the following diagram, we can observe an example of $61.54K of cost savings from migrating to equivalent AWS Graviton instances. Next, the Estimated Additional Savings Due to Performance in the same diagram shows $42.40K in savings if your performance testing confirms a 15% NIH reduction in your workload. Finally, the dashboard sums these two values into the Total Potential Graviton Savings of $103.94K. The Total Potential Graviton Savings helps visualize how both pricing benefits and any validated compute hour reductions could contribute to your overall savings.

Figure 2: Visualization with relationship between NIH reduction and cost savings

The Amortized Cost Breakdown and Normalized Instance Hrs Breakdown charts in the following figure show 6-month historical trends, helping you spot patterns such as seasonal spikes or high-usage periods. These patterns can help you identify where even small efficiency improvements might yield significant savings, for example, workloads with consistently high usage or predictable peak periods that would be good candidates for testing.

Figure 3: Amortized Cost, NIH, and Total Potential Savings Breakdown charts

Detailed savings analysis

Building on our commitment to help customers optimize cloud costs, we’ve enhanced the Potential Graviton Savings Details table with two columns focused on performance-based savings modeling. The Estimated Additional Savings Due to Performance column shows the modeled savings based on your chosen NIH reduction percentage, while Total Potential Graviton Savings combines this with the base pricing benefits.

Figure 4: Potential Graviton Savings Details table

As you examine your current instance family, you can observe both baseline AWS Graviton savings and these added saving opportunities clearly laid out in a comprehensive breakdown. The analysis presents your total savings potential in both dollar amounts and percentages. This allows you to build a compelling business case for migration. Although this detailed breakdown provides valuable planning insights, remember that actual savings may vary depending on your specific workload patterns, implementation approaches, and operational considerations.

Conclusion

The Graviton Savings Dashboard (GSD) serves as a powerful analytics tool that streamlines your journey to cost-effective cloud computing. The GSD provides clear visualizations and interactive features to help you understand and maximize potential savings when migrating to AWS Graviton-based instances. To further explore the new features, navigate to the GSD interactive demo, where you can model an example of potential savings using the NIH reduction slider and detailed cost breakdowns.

Ready to explore how AWS Graviton can transform your infrastructure costs? Visit the GSD page to deploy or update your GSD dashboard. Access implementation guides, such as the CFM Technical Implementation Playbook (CFM TIPs), and start optimizing your cloud spend today with the enhanced capabilities of the GSD.

Over 85,000 AWS customers have discovered the benefits of AWS Graviton, with many completing their adoptions in just hours. We have created this resource guide so that you can accelerate your AWS Graviton adoption with minimal effort and enjoy significant price performance benefits.

“What I always tell customers is one week, one application, one engineer, and see what you can do. They always are pleasantly surprised by how much progress they can make. If you’re out there and you haven’t yet moved to AWS Graviton, what are you waiting for? Let’s make it happen!”

Dave Brown, VP, AWS Compute & ML Services

Important note about performance testing
The GSD does not attempt to estimate the potential NIH percent reduction or your workload’s performance when transitioned to AWS Graviton. You can use it to perform what-if analysis of your potential savings for a projected NIH percent reduction. In the absence of this variable, GSD only considers the price delta between instance types and misses an important contributor to the overall savings potential of AWS Graviton from the performance upside. Compute performance is always workload and use case specific, so we encourage you to test your AWS Graviton-based workloads using the Optimization and Performance Runbook to help you determine the actual possible NIH percent reduction.

Implementing advanced AWS Graviton adoption strategies across AWS Regions

2025-08-26 Matt Howard

Post Syndicated from Matt Howard original https://aws.amazon.com/blogs/compute/implementing-advanced-aws-graviton-adoption-strategies-across-aws-regions/

AWS Graviton Processors can offer cost savings, improved performance, and reduce your carbon footprint when using Amazon Elastic Compute Cloud (Amazon EC2) instances. When expanding your Graviton deployment across multiple AWS Regions, careful planning helps you navigate considerations around regional instance type availability and capacity optimization. This post shows how to implement advanced configuration strategies for Graviton-enabled EC2 Auto Scaling groups across multiple Regions, helping you maximize instance availability, reduce costs, and maintain consistent application performance even in AWS Regions with limited Graviton instance type availability.

Instance type flexibility strategies

One of the most effective strategies for maximizing Graviton availability is to be flexible across multiple instance types and families. Instance families (such as m7g, c7g, and r7g) group similar instances with different sizes, where each size offers proportionally more vCPUs and memory. When configuring EC2 Auto Scaling groups, aim for at least 10 instance types rather than limiting to just one or two specific types. EC2 Auto Scaling supports this flexibility through the mixed instances group, which allows you to specify multiple instance types in a single group. Consider this example AWS CloudFormation template snippet for an EC2 Auto Scaling group MixedInstancesPolicy that only specifies two Graviton instance types across two different families:

"MixedInstancesPolicy": {
  "Overrides": [
    {
      "InstanceType": "m7g.large"
    },
    {
      "InstanceType": "c7g.xlarge"
    }
  ]
}

This limited selection significantly reduces your ability to access available capacity pools. Assuming this workload needs a minimum of 2 vCPU and 8 GiB of memory, you can add these additional eight Graviton instance types: m6g.large, m8g.large, m6gd.large, m7gd.large, m8gd.large, c6g.xlarge, c6gd.xlarge, and c8g.xlarge. These allow you to meet the recommendation of being flexible across 10 instance types. While some of these instance types may have different price points, you can manage these cost implications through allocation strategies discussed later in this post.

To efficiently identify all compatible Graviton instance types available for your workload, you can use the GetInstanceTypesFromInstanceRequirements Amazon EC2 API. This approach removes the manual effort of researching and choosing individual instance types.

aws ec2 get-instance-types-from-instance-requirements \
--architecture-types arm64 \
--virtualization-types hvm \
--instance-requirements '{"VCpuCount": {"Min": 2,"Max":8}, "MemoryMiB": {"Min": 8000}, "InstanceGenerations":["current"]}' \
--region us-east-1

This example command returns dozens of compatible Graviton instance types across multiple families (c7g, c7gd, c7gn, m7g, m7gd, etc.), thus expanding your capacity options. An EC2 Auto Scaling group’s mixed instance policy can allow up to 40 instance types, thus you have more room for even greater flexibility.

After expanding your instance type selection, you need to configure how EC2 Auto Scaling chooses between the available instance types. The OnDemandAllocationStrategy CloudFormation property controls this behavior, offering two approaches: “lowest-price” and “prioritized”. With the “lowest-price” strategy, EC2 Auto Scaling launches instances from the lowest-priced capacity pool available:

"OnDemandAllocationStrategy": "lowest-price"

This strategy helps manage costs when you’ve included a variety of instance types. Even with expanded instance type flexibility, your workloads will automatically select the most cost-effective option from available capacity pools. Alternatively, you can use the “prioritized” strategy when you want more control over which instance types are chosen first:

"OnDemandAllocationStrategy": "prioritized"

Regional adaptation techniques

Not all AWS Regions have the same Graviton instance types available. Regional variation in instance type availability creates a challenge when deploying applications consistently across multiple AWS Regions. To handle these differences, expand your instance type flexibility beyond the minimum 10 types to make sure of sufficient options in each AWS Region where you operate.

To implement this flexibility across AWS Regions, you must determine which Graviton instance types are available in each target AWS Region. AWS provides several methods to access this information: check the Amazon EC2 Instance Types by Region documentation for a comprehensive list, use the DescribeInstanceTypeOfferings Amazon EC2 API to programmatically identify available types, or visit the EC2 Instance Types page in the AWS Management Console.

You can also run the GetInstanceTypesFromInstanceRequirements API across different AWS Regions to understand Regional differences. For example, running identical queries in the US East (N. Virginia) and Asia Pacific (Taipei) Regions reveals significant variations: over 70 compatible instance types in the US East (N. Virginia) and 27 in Asia Pacific (Taipei) Regions.

# Query for US East (N. Virginia)
aws ec2 get-instance-types-from-instance-requirements \
--architecture-types arm64 \
--virtualization-types hvm \
--instance-requirements '{"VCpuCount": {"Min": 2,"Max":8}, "MemoryMiB": {"Min": 8000}, "InstanceGenerations":["current"]}' \
--region us-east-1

# Query for Asia Pacific (Taipei)
aws ec2 get-instance-types-from-instance-requirements \
--architecture-types arm64 \
--virtualization-types hvm \
--instance-requirements '{"VCpuCount": {"Min": 2,"Max":8}, "MemoryMiB": {"Min": 8000}, "InstanceGenerations":["current"]}' \
--region ap-east-2

When operating across multiple AWS Regions, design a single mixed instance policy that works everywhere by including instance types available in all AWS Regions where you operate. Based on the previous query results, you might include these 10 instance types that are available in both AWS Regions: m6g.large, m7g.large, m6gd.large, m7gd.large, c6g.xlarge, c7g.xlarge, m6g.xlarge, m7g.xlarge, c6gn.xlarge, and m6gd.xlarge.

You should also span your EC2 Auto Scaling group across multiple Availability Zones (AZs) for greater resiliency and access to deeper capacity pools. To determine available AZs in your AWS Region, refer to the Availability Zones documentation or check your Amazon Virtual Private Cloud (Amazon VPC) to identify which AZs its subnets use through the DescribeSubnets Amazon EC2 API. Configure your EC2 Auto Scaling group to use all available AZs using the CloudFormation AWS::AutoScaling::AutoScalingGroup AvailabilityZones parameter:

"AvailabilityZones": [
  "us-west-2a",
  "us-west-2b",
  "us-west-2c",
  "us-west-2d",
]

Best practices for EC2 Spot Instances usage with Graviton-based instances

Although optimizing for regional availability and AZ distribution provides a strong foundation, further enhancing your Graviton deployment strategy with proper Amazon EC2 Spot Instances configuration can significantly improve cost efficiency without sacrificing reliability. When using Spot Instances with Graviton, you should implement strategies that maximize your chances of obtaining and maintaining capacity.

First, the Spot Instance Advisor provides valuable information about the interruption frequency of different instance types across AWS Regions. Use this tool to identify Graviton instance types with lower interruption rates in your target AWS Regions. Then, expand your mixed instance group to include these other instance types. Especially for Spot Instance workloads, maximize your instance type flexibility by specifying up to the full limit of 40 instance types for EC2 Auto Scaling groups mixed instance policies. This broad selection increases your chances of finding available Spot Instances capacity.

Beyond instance type selection, the allocation strategy you choose significantly impacts your ability to maintain Spot Instances capacity. Configure your Spot allocation strategy using the AWS::AutoScaling::AutoScalingGroup InstancesDistribution property with the SpotAllocationStrategy parameter set to price-capacity-optimized to choose Spot pools with the lowest interruption risk while still considering price:

"InstancesDistribution": {
  "SpotAllocationStrategy": "price-capacity-optimized"
}

For workloads that can benefit from more time beyond the standard two-minute Spot interruption notice, enable Capacity Rebalancing. This feature, configured using the AWS::AutoScaling::AutoScalingGroup CapacityRebalanceproperty, enables EC2 Auto Scaling to proactively respond to rebalance recommendations by launching a new Spot Instance before a running instance receives the two-minute Spot Instance interruption notice, which provides more time for graceful transitions:

"CapacityRebalance": true

For maximum flexibility and capacity access, consider mixing x86 and ARM architectures in your launch templates. Although the Graviton capacity pools are newer and sometimes smaller than their x86 counterparts, a mixed architecture approach makes sure that you can still launch instances even when one architecture has limited availability. For detailed instructions, refer to the AWS post: Supporting AWS Graviton2 and x86 instance types in the same Auto Scaling group.

Attribute-based instance type selection

Although mixed instance policies with explicit instance type lists provide excellent flexibility, AWS offers an even more powerful approach for dynamic instance selection: attribute-based instance type selection. This streamlines management by allowing you to specify the attributes your application needs rather than specific instance types, automatically adapting to new instance types and handling Regional differences in availability.

Implement attribute-based instance type selection in your EC2 Launch Template through the AWS::EC2::LaunchTemplate InstanceRequirements property:

{
  "InstanceRequirements": {
    "AcceleratorCount": {
      "Max": 0
    },
    "BareMetal": "excluded",
    "BaselinePerformanceFactors": {
      "Cpu": {
        "References": [
          {
            "InstanceFamily": "c7g"
          }
        ]
      }
    },
    "InstanceGenerations": [
      "current"
    ],
    "MemoryMiB": {
      "Min": 8000
    },
    "VCpuCount": {
      "Min": 4
    }
  }
}

The BaselinePerformanceFactors parameter of the AWS::EC2::LaunchTemplate InstanceRequirements property enables performance protection. This feature makes sure that your EC2 Auto Scaling group uses instance types that meet or exceed a specified performance baseline. When you specify an instance family such as “c7g” as the baseline reference, Amazon EC2 automatically excludes instance types that fall below this performance level, even if they match your other specified attributes. For Graviton deployments, specifying “c7g” makes sure that only instance types with performance like or better than Graviton3 processors are chosen.

Attribute-based instance type selection also allows you to specify instance types in your template that may not yet be available in an AWS Region by using the AllowedInstanceTypes parameter:

{
  "AllowedInstanceTypes": [
    "m6g.large",
    "m7g.large",
    "m8g.large"
  ]
}

This approach allows your EC2 Auto Scaling group to use newer instance types where available and automatically deploy them in other AWS Regions as soon as they become available. This single template approach simplifies the deployment and management of your EC2 instance selection in EC2 Auto Scaling groups across many regions.

Special considerations

The following special considerations should be taken into account.

Performance testing with multiple instance types

When implementing instance type flexibility, a common concern is the need to test all instance types with your application. Testing 40 different instance types isn’t practical for most organizations. Instead, consider these streamlined approaches to reduce testing overhead while maintaining performance confidence. First, Graviton instance families within the same generation (for example, c7g, m7g, and r7g) use the same processor, providing similar performance profiles across families. Therefore, you can include multiple instance types from the same generation after testing a representative instance. Second, you should also consider including variants within families (such as c7gd with NVMe storage), because these provide specialized capabilities without changing the fundamental CPU architecture. Third, for maximum flexibility, include multiple instance generations. If your application runs well on Graviton3, then it likely performs even better on Graviton4, allowing you to specify both in your EC2 Auto Scaling group.

Reserving specific Graviton instance types

If your workload needs a specific Graviton instance type, then we recommend that you use EC2 Capacity Reservations, which allow you to reserve compute capacity for EC2 instances in a specific AZ for any duration. On-Demand Capacity Reservations (ODCR) are for immediate use and come with no term commitment. Alternatively, Future-dated Capacity Reservations allow you to specify when you need the capacity to become available along with a commitment duration.

Amazon EMR workloads

Although Amazon EMR clusters must exist in only one AZ, you can use Amazon EMR instance fleets to choose multiple subnets across different AZs. Then, when launching a cluster, Amazon EMR searches across these subnets to find specified instances and purchasing options, thus providing access to a deeper capacity pool. For Instance Fleets you can specify up to 30 EC2 instance types for each primary, core, and task node group, which significantly improves instance flexibility and availability. For more information go to the Responding to Amazon EMR cluster insufficient instance capacity events documentation.

Conclusion

In this post, we covered advanced strategies for maximizing AWS Graviton adoption across multiple AWS Regions. You can use the AWS CloudFormation examples provided in this post as templates for your own implementations. Following these approaches allows you to maintain consistent application performance and maximize Graviton instance availability across all AWS Regions where you operate, even as Graviton availability continues to expand across the AWS global infrastructure. For comprehensive guidance on maximizing your Graviton deployment, explore the AWS Graviton Technical Guide.

Improve RabbitMQ performance on Amazon MQ with AWS Graviton3-based M7g instances

2025-07-23 Vignesh Selvam

Post Syndicated from Vignesh Selvam original https://aws.amazon.com/blogs/big-data/improve-rabbitmq-performance-on-amazon-mq-with-aws-graviton3-based-m7g-instances/

Amazon MQ is a fully managed service for open-source message brokers such as RabbitMQ and Apache ActiveMQ. Today, we are announcing the availability of AWS Graviton3-based Rabbit MQ brokers on Amazon MQ, which runs on Amazon EC2 M7g instances. AWS Graviton processors are custom designed server processors developed by AWS to provide the best price performance for cloud workloads running on Amazon EC2. It uses the Arm (arm64) instruction set. For example, when running an Amazon MQ for RabbitMQ cluster broker using M7g.4xlarge instances, you can achieve up to 50% higher workload capacity and up to 85% higher throughput compared to M5.4xlarge instances. Additionally, M7g brokers on Amazon MQ offer optimized disk sizes for clusters, providing reduction in storage cost savings over M5 brokers depending on the instance size chosen. To learn more, refer to Amazon EC2 M7g instances.

Amazon MQ helps you reduce the operational overhead of using open source message brokers like RabbitMQ while providing security, high availability, and durability. Many organizations use Amazon MQ to decouple applications, asynchronously process messages, and build event-driven architectures. We tested and validated M7g instances for RabbitMQ version 3.13, so you can run your critical messaging workloads on Amazon MQ brokers with improved performance characteristics, while also saving on costs. Amazon MQ supports M7g instances in a wide variety of sizes, ranging from medium to 16xlarge sizes, to suit your different messaging workloads. M7g instances support Amazon MQ for RabbitMQ features, making it straightforward for you to run your existing RabbitMQ workloads with minimal changes. You can get started by provisioning new brokers or upgrading your existing RabbitMQ brokers using Amazon EC2 M5 instances to Graviton3-based M7g instances as the broker type using the AWS Management Console, APIs using the AWS SDK, and the AWS Command Line Interface (AWS CLI).

The following table lists the specific characteristics of M7g instances on Amazon MQ.

M7g specs for Amazon MQ
*Instance Name (MQ.m7g.)**	vCPUs	Memory (GiB)	Network Bandwidth
medium	1	4	Up to 12.5 Gb
large	2	8	Up to 12.5 Gb
xlarge	4	16	Up to 12.5 Gb
2xlarge	8	32	Up to 15 Gb
4xlarge	16	64	Up to 15 Gb
8xlarge	32	128	15 Gb
12xlarge	48	192	22.5 Gb
16xlarge	64	256	30 Gb

M7g instances vs. M5 instances on Amazon MQ

Customers can see both performance improvements and cost savings for their RabbitMQ workloads when moving from M5 instances to M7g instances. In terms of performance, you can size your RabbitMQ brokers for workloads by measuring the workload capacity and throughput. Amazon MQ has improved the performance of RabbitMQ on both workload capacity and throughput for M7g instances. In terms of cost, you pay for the instance per hour, disk usage per Gb-month, and data transfer. Amazon MQ has optimized disk sizes to offer cost savings for customers on disk usage. Let’s first examine the performance improvements.

Workload capacity improvements

Workload capacity represents the total number of connections, channels, and queues that you can use without running into memory alarm. The actual usage of these resources is limited by the high memory watermark value. Every resource (for example, a queue) on creation uses up a small amount of memory, but when these resources are used, the memory used increases depending on the number and size of messages processed up until a memory threshold. The RabbitMQ broker goes into memory alarm when the memory used on a node reaches this pre-defined threshold known as high memory watermark. When a broker raises a memory alarm, it will block all connections that are publishing messages. After the memory alarm has cleared (for example, due to delivering some messages to clients that consume and acknowledge the deliveries), normal service resumes. The open source community guidance for RabbitMQ 3.13 is to configure the memory threshold at 40% of the available memory per node. M5 brokers have the memory threshold set at 40% on Amazon MQ.

We evaluated this recommendation across M7g instances and determined that the memory threshold can be increased for instances on Amazon MQ to more than 40% due to the operational improvements by the service, as illustrated in the following figure. This increase in available memory translates to a higher use of resources like queues, channels, and connections within the resource limits of the broker. The change in available memory results in up to 50% improvement in workload capacity for customers when compared to M5 brokers today.

Throughput improvements

The throughput of a broker varies widely with the queue type and usage pattern of customers. Amazon MQ evaluated the throughput capacity of a RabbitMQ three-node cluster broker by measuring the publish throughput in messages per second for 10 quorum queues with a message size of 1 KB and a ratio of 1:20 for connection to channels. We arrived at this benchmark test after evaluating multiple scenarios with the goal of providing you a simple way to estimate the average throughput you can expect from a RabbitMQ broker when following best practices. You can see up to 85% higher throughput compared to equivalent M5 brokers on Amazon MQ, as illustrated in the following figure.

The performance of a RabbitMQ broker depends on the version, queue type, and usage pattern in addition to the infrastructure used. You might see different performance improvements based on your specific usage patterns and resources used. We recommend using the Amazon MQ sizing guidance to size your broker and benchmarking the performance for your specific workload using M7g instances.

Cost savings on cluster disk usage

Customers using M7g brokers in cluster deployment mode are provisioned with a disk volume per node that varies in size depending on the instance size. For M5 brokers, the RabbitMQ brokers were provisioned with a fixed disk volume of 200 GB per node. The open source guidance around disk sizes is to use a size higher than twice the memory threshold. We tested various disk sizes and identified optimal disk sizes that would provide a better operational posture. With this change, customers using M7g cluster brokers on Amazon MQ will get cost savings due to the smaller disk size provisioned per node as compared to equivalent M5 brokers, as shown in the following table. Single-instance M7g brokers will continue to be provisioned with 200 GB of disk size.

Instance size	Disk Volume M5 cluster(GB)	Disk Volume M7g Cluster(GB)	Cost savings for customersM5 vs. M7g (%)
medium	–	15	–
large	600	45	92.50%
xlarge	600	75	87.50%
2xlarge	600	135	77.50%
4xlarge	600	270	55.00%
8xlarge	–	525	–
12xlarge	–	780	–
16xlarge	–	1035	–

Pricing and Regional availability

M7g instances are available in AWS Regions where Amazon MQ is available at the time of writing except Africa (Cape Town), Canada West (Calgary), and Europe (Milan) Regions. Refer to Amazon MQ Pricing to learn about the availability of specific instance sizes by Region and the pricing for M7g instances.

Summary

In this post, we discussed the performance gains and cost savings achieved while using Graviton-based M7g instances. These instances can provide significant improvement in throughput and workload capacity compared to similar sized M5 instances for Amazon MQ workloads. To get started, create a new broker with M7g brokers using the console, and refer to the Amazon MQ Developer Guide for more information.

About the authors

Vignesh Selvam is the Principal Product Manager for Amazon MQ at AWS. He works with customers to solve their messaging needs and with the open-source communities for innovating with message brokers. Prior to joining AWS, he built products for security and analytics.

Samuel Massé is a Software Development Engineer at AWS. He has been leading the engineering effort to support M7g on the RabbitMQ team. In his free time he enjoys coding unfinished side projects.

Vinodh Kannan Sadayamuthu is a Senior Specialist Solutions Architect at Amazon Web Services (AWS). His expertise centers on AWS messaging and streaming services, where he provides architectural best practices consultation to AWS customers.

New Amazon EC2 C8gn instances powered by AWS Graviton4 offering up to 600Gbps network bandwidth

2025-06-30 Channy Yun (윤석찬)

Post Syndicated from Channy Yun (윤석찬) original https://aws.amazon.com/blogs/aws/new-amazon-ec2-c8gn-instances-powered-by-aws-graviton4-offering-up-to-600gbps-network-bandwidth/

Today, we’re announcing the general availability of Amazon Elastic Compute Cloud (Amazon EC2) C8gn network optimized instances powered by AWS Graviton4 processors and the latest 6th generation AWS Nitro Card. EC2 C8gn instances deliver up to 600Gbps network bandwidth, the highest bandwidth among EC2 network optimized instances.

You can use C8gn instances to run the most demanding network intensive workloads, such as security and network virtual appliances (virtual ﬁrewalls, routers, load balancers, proxy servers, DDoS appliances), data analytics, and tightly-coupled cluster computing jobs.

EC2 C8gn instances specifications
C8gn instances provide up to 192 vCPUs and 384 GiB memory, and offer up to 30 percent higher compute performance compared Graviton3-based EC2 C7gn instances.

Here are the specs for C8gn instances:

Instance Name	vCPUs	Memory (GiB)	Network Bandwidth (Gbps)	EBS Bandwidth (Gbps)
c8gn.medium	1	2	Up to 25	Up to 10
c8gn.large	2	4	Up to 30	Up to 10
c8gn.xlarge	4	8	Up to 40	Up to 10
c8gn.2xlarge	8	16	Up to 50	Up to 10
c8gn.4xlarge	16	32	50	10
c8gn.8xlarge	32	64	100	20
c8gn.12xlarge	48	96	150	30
c8gn.16xlarge	64	128	200	40
c8gn.24xlarge	96	192	300	60
c8gn.metal-24xl	96	192	300	60
c8gn.48xlarge	192	384	600	60
c8gn.metal-48xl	192	384	600	60

You can launch C8gn instances through the AWS Management Console, AWS Command Line Interface (AWS CLI), or AWS SDKs.

If you’re using C7gn instances now, you will have straightforward experience migrating network intensive workloads to C8gn instances because the new instances offer similar vCPU and memory ratios. To learn more, check out the collection of Graviton resources to help you start migrating your applications to Graviton instance types.

You can also visit the Level up your compute with AWS Graviton page to begin your Graviton adoption journey.

Now available
Amazon EC2 C8gn instances are available today in US East (N. Virginia) and US West (Oregon) Regions. Two metal instance sizes are only available in US East (N. Virginia) Region. These instances can be purchased as On-Demand, Savings Plan, Spot instances, or as Dedicated instances and Dedicated hosts.

Give C8gn instances a try in the Amazon EC2 console. To learn more, refer to the Amazon EC2 C8g instance page and send feedback to AWS re:Post for EC2 or through your usual AWS Support contacts.

— Channy

AWS Weekly Roundup: New AWS Heroes, Amazon Q Developer, EC2 GPU price reduction, and more (June 9, 2025)

2025-06-09 Betty Zheng (郑予彬)

Post Syndicated from Betty Zheng (郑予彬) original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-new-aws-heroes-amazon-q-developer-ec2-gpu-price-reduction-and-more-june-9-2025/

The AWS Heroes program recognizes a vibrant, worldwide group of AWS experts whose enthusiasm for knowledge-sharing has a real impact within the community. Heroes go above and beyond to share knowledge in a variety of ways in developer community. We introduce our newest AWS Heroes in the second quarter of 2025.

To find and connect with more AWS Heroes near you, visit the categories in which they specialize Community Heroes, Container Heroes, Data Heroes, DevTools Heroes, Machine Learning Heroes, Security Heroes, and Serverless Heroes.

Last week’s launches
In addition to the inspiring celebrations, here are some AWS launches that caught my attention.

Agentic capabilities for Amazon Q Developer Chat in the AWS Management Console and chat applications – In the AWS Management Console, Microsoft Teams, and Slack, Amazon Q Developer can now answer more complex queries by automatically identifying appropriate tools across more than 200 AWS services, breaking queries into executable steps, and showing its reasoning process as it works.
Claude Sonnet 4 model on Amazon Q Developer CLI – Amazon Q Developer CLI now supports Anthropic’s Claude Sonnet 4 and allows developers to switch between premium models (Sonnet 4, 3.7, and 3.5) using simple commands like /model or q chat --model.
Amazon API Gateway now supports dynamic request routing for REST APIs – Amazon API Gateway now supports routing rules for REST APIs using custom domain names, allowing traffic distribution based on HTTP headers, URL base paths, or both.
Amazon Athena announces managed query results to streamline analysis workflows – Managed query results is a new feature that automatically stores, encrypts, and manages the lifecycle of query results for customers at no additional cost.
A new monitoring dashboard in the AWS Network Firewall console – The new dashboard provides enhanced visibility into network traffic patterns, including top flows, TLS Server Name Indication (SNI), HTTP Host headers, long-lived TCP connections, and failed TCP handshakes.

For a full list of AWS announcements, be sure to keep an eye on What’s New at AWS.

Other AWS news
Here are some additional projects, blog posts that you might find interesting:

Up to 45 percent price reduction for Amazon EC2 NVIDIA GPU-accelerated instances – AWS is reducing the price of NVIDIA GPU-accelerated Amazon EC2 instances (P4d, P4de, P5, and P5en) by up to 45 percent for On-Demand and Savings Plan usage. We are also making the very new P6-B200 instances available through Savings Plans to support large-scale deployments.
Introducing public AWS API models – AWS now provides daily updates of Smithy API models on GitHub, enabling developers to build custom SDK clients, understand AWS API behaviors, and create developer tools for better AWS service integration.
The AWS Asia Pacific (Taipei) Region is now open – The new Region provides customers with data residency requirements to securely store data in Taiwan while providing even lower latency. Customers across industries can benefit from the secure, scalable, and reliable cloud infrastructure to drive digital transformation and innovation.
Amazon EC2 has simplified the AMI cleanup workflow – Amazon EC2 now supports automatically deleting underlying Amazon Elastic Block Store (Amazon EBS) snapshots when deregistering Amazon Machine Images (AMIs).
The Lab where AWS designs custom chips – Visit Annapurna Labs in Austin, Texas—a combination of offices, workshops, and even a mini data center—where Amazon Web Services (AWS) engineers are designing the future of computing.

Upcoming AWS events
Check your calendars and sign up for these upcoming AWS events.

Join re:Inforce from anywhere – If you aren’t able to make it to Philadelphia (June 16–18), tune in remotely. Get free access to the re:Inforce keynote and innovation talks live as they happen.
AWS Summits – Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Shanghai (June 19 – 20), Milano (June 18), Mumbai (June 19) and Japan (June 25 – 26).
AWS re:Invent – Mark your calendars for AWS re:Invent (December 1 – 5) in Las Vegas. Registration is now open
AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Mexico (June 14), Nairobi, Kenya (June 14) and Colombia (June 28)

That’s all for this week. Check back next Monday for another Weekly Roundup!

– Betty

Accelerate your AWS Graviton adoption with the AWS Graviton Savings Dashboard

2024-12-11 aostan

Post Syndicated from aostan original https://aws.amazon.com/blogs/compute/accelerate-your-aws-graviton-adoption-with-the-aws-graviton-savings-dashboard/

This post is written by Rajani Guptan, Rosa Corley and Shankar Gopalan.

Are you looking to optimize your AWS infrastructure costs while maintaining high performance? AWS Graviton is a custom-built CPU developed by Amazon Web Services (AWS), and it is designed to deliver the best price performance for a broad range of cloud workloads running on Amazon Elastic Compute Cloud (Amazon EC2). Graviton-based instances provide up to 40% better price performance while using up to 60% less energy than comparable EC2 instances.

AWS users recognize that migrating existing workloads to Graviton-based instances results in better price performance. However, migrating to Graviton necessitates identifying comparable instance types, understanding the performance impacts, and estimating the savings opportunities. Furthermore, prioritizing and tracking migration efforts at scale across a diverse set of services such as Amazon EC2, Amazon Relational Database Service (Amazon RDS), Amazon ElastiCache, and Amazon OpenSearch can be challenging. Therefore, AWS has developed the AWS Graviton Savings Dashboard to help users address these complexities and accelerate their Graviton migration.

In this post, we walk you through the dashboard architecture, deployment steps, features, and capabilities. Whether you are an Executive, FinOps Practitioner, Product Owner, or in Engineering, you can use the dashboard to get the following:

Centralized visibility across accounts/workloads: The dashboard consolidates and tracks Graviton adoption across multiple management accounts, member accounts, and AWS Regions in a single view.
Graviton support across key AWS services: There are dedicated tabs allowing users to review current Graviton usage and potential savings across AWS compute and managed services.
Granular resource-level visibility for managed services: The dashboard provides granular resource level visibility for managed services such as Amazon RDS, ElastiCache, and OpenSearch.
Accurate savings and unit cost estimations: The dashboard provides accurate cost estimations for existing and comparable Graviton-based instance types by using the existing AWS Cost and Usage Report (CUR) data with the AWS public pricing API.
Categorization of migration effort: The dashboard categorizes Graviton migration opportunities into two main groups: Typically Easy and Requires Additional Planning, for EC2 instances. It also identifies Graviton-eligible resources for managed services, which may need version or database upgrades. This categorization helps users prioritize their engineering efforts for migration.

Architecture overview

The solution integrates AWS CUR, the AWS SDK, and AWS Public pricing API to generate comprehensive data on the usage, cost, and resource inventory. This data is stored in Amazon S3 and analyzed using Amazon Athena, providing deep insights into potential cost savings. Then, the results are visualized through Amazon QuickSight, enabling stakeholders to collaborate effectively and make informed, data-driven decisions, as shown in the following figure.

Figure 1: Graviton Savings Dashboard architecture diagram

Although the solution typically costs between $50–$100 per month, the potential return on investment is substantial. The dashboard often identifies measurable cost savings that significantly outweigh its operational expenses. Moreover, it offers additional productivity benefits by streamlining the process of adopting Graviton, saving valuable time and effort for your team. For a detailed breakdown of the dashboard’s cost structure, we invite you to explore our comprehensive Graviton Savings Dashboard Cost Breakdown guide.

Deployment

The Graviton Savings Dashboard is part of the Cloud Intelligence Dashboards framework. You can deploy it using AWS CloudFormation Templates and a ‘cid-cmd’ command line tool. Prior to deploying the dashboard, make sure that you’ve met the prerequisites. These include the following:

Setting up your AWS CUR: We highly recommend that you complete Steps 1 and 2 from the Cloud Intelligence Dashboard Deployment Guide. This makes sure that your CUR is set up with settings that allow for easy installation and troubleshooting if necessary.
Setting up the Inventory Collector Module of the Optimization Data Collection lab: This provides automation to collect metadata and pricing for Amazon RDS, ElastiCache, and OpenSearch for all accounts in your AWS Organizations and AWS Regions.
Preparing QuickSight: If you’re an existing QuickSight user, then you can skip this step. If not, then you must complete Step 3.1 to Prepare QuickSight.

When the prerequisites are in place, you can deploy the dashboard by running three simple commands (shown as follows) using a terminal application with permissions to run API requests in your AWS account.

python3 -m ensurepip --upgrade

pip3 install --upgrade cid-cmd

cid-cmd deploy --dashboard-id graviton-savings

For detailed instructions about the deployment and prerequisites, refer to the AWS Well-Architected Cost Optimization lab.

Examining the results: unlocking insights from your Graviton Savings Dashboard

Now that you’ve successfully deployed the dashboard, we can explore its powerful features and uncover valuable insights. As you read through this section, we encourage you to interact with your dashboard to familiarize yourself with the dashboard’s intuitive interface and functionality.

The Graviton Savings Dashboard is organized into service-specific tabs, each containing two key sections:

Current Graviton Usage and Savings (top section): This section highlights the tangible benefits you’ve already achieved by migrating workloads to Graviton. You can explore the following:

- Monthly Graviton adoption trends
- Usage distribution across different accounts, Regions, and processor types
- Popular Graviton instance families
- Unit cost trends
- Realized Graviton savings

These metrics are calculated by comparing your Graviton usage to comparable non-Graviton instances, which provides a clear picture of your cost optimization efforts, as shown in the following figure.

Figure 2: Current Amazon EC2 Graviton Usage and Savings

Potential Graviton Savings Opportunities (bottom section): This section identifies areas where you can further optimize costs by adopting Graviton instances. It provides the following:

Actionable migration insights
Estimated implementation effort
Potential savings breakdowns by account, instance family/type, OS, and purchase option

These insights compare potential Graviton savings across various attributes, enabling targeted decision-making for future Graviton migrations and cost optimizations, as shown in the following figure.

Figure 3: Amazon EC2 Graviton Opportunity

Using dashboard insights: a FinOps team use case

In this section we explore a use case where you, as the lead of the Cloud Center of Excellence Team, use the insights from this dashboard to address concerns raised by your Chief Technology Officer (CTO).

Your CTO at your company approaches you with the following questions:

Is our organization using the price-performance benefits of Graviton-based EC2 instances?
How does our Graviton usage and spend and savings compare to other processor types within our overall EC2 compute spend?

Step 1: Initial analysis

You begin generating summary reports from the Current Graviton Usage (Figure 2) and Graviton Opportunity (Figure 3) sections of the dashboard. After reviewing these reports, the CTO asks you to engage with the Engineering team to evaluate potential opportunities for increasing Graviton coverage.

Step 2: Engaging with Engineering on Graviton Migration

When presenting the summary reports to the engineering manager, they expressed interest in understanding the effort level required for this project. This information can help them allocate resources and prioritize workloads, thus identifying what can be started in the short-term and what needs additional planning.

Step 3: Detailed analysis

As shown in the following figure, the Engineering team can focus on identifying candidate workloads with the most significant savings impact by segmenting the dashboard data by:

Implementation efforts
Linked accounts
Regions
Instance types
Operating systems

Figure 4: Amazon EC2 Graviton opportunity breakdown

Furthermore, the team can use the dashboard to determine comparable Graviton-based instances for migration and their potential savings, as shown in the following figure.

Figure 5: Potential graviton Savings Details

Step 4: Tracking progress

Over time, the FinOps team and Engineering team can showcase the Graviton migration successes by highlighting the increasing Graviton coverage and realized savings using the dashboard’s charts (Figure 2).

Broader application:

Although this post primarily focuses on EC2 instance migration, the dashboard also provides similar insights for AWS managed services such as Amazon RDS, ElastiCache, and OpenSearch. Individual tabs with visualizations guide your Graviton adoption across these services, as shown in the following figure.

Figure 6: Graviton Savings Dashboard

As demonstrated by this use case, the Graviton Savings Dashboard enables various stakeholders in an organization to collaborate effectively, which leads to efficient outcomes and potential cost savings.

Conclusion

In summary, we showed how the Graviton Savings Dashboard provides clear insights into suitable workloads for Graviton migration, offers easy-to-understand visualizations for monitoring adoption, and automates resource matching and savings calculations. Streamlining the process of identifying and implementing cost-saving opportunities with Graviton-based instances means that the dashboard enables more informed decision-making about your AWS infrastructure. To learn more and get started with the Graviton Savings Dashboard, visit the Graviton Savings Dashboard page and take the first step toward more efficient and cost-effective cloud computing.

Introducing storage optimized Amazon EC2 I8g instances powered by AWS Graviton4 processors and 3rd gen AWS Nitro SSDs

2024-12-01 Channy Yun (윤석찬)

Post Syndicated from Channy Yun (윤석찬) original https://aws.amazon.com/blogs/aws/introducing-storage-optimized-amazon-ec2-i8g-instances-powered-by-aws-graviton4-processors-and-3rd-gen-aws-nitro-ssds/

Today, we’re announcing the general availability of Amazon EC2 I8g instances, a new storage optimized instance type to provide the highest real-time storage performance among storage-optimized EC2 instances with the third generation of AWS Nitro SSDs and AWS Graviton4 processors.

AWS Graviton4 is the most powerful and energy efficient processor we have ever designed for a broad range of workloads running on EC2 instances using a 64-bit ARM instruction set architecture. AWS Nitro System SSDs are custom built by AWS and offer high I/O performance, low latency, minimal latency variability, and security with always-on encryption.

EC2 I8g instances are the first instance type to use third-generation AWS Nitro SSDs. These instances offer up to 22.5 TB local NVME SSD storage with up to 65 percent better real-time storage performance per TB and 60 percent lower latency variability compared to the previous generation I4g instances. Based on the AWS Graviton4 processors, I8g instances deliver up to 60 percent better compute performance and two times larger caches compared to I4g.

I8g instances offer up to 96 vCPUs, 768 GiB of memory, and 22.5 TB of storage to deliver more compute and storage choices compared with I4g instances.

Instance name	vCPUs	Memory (Gib)	Storage (GB)	Network bandwidth (Gbps)	EBS bandwidth (Gbps)
I8g.large	2	16	468	up to 10	up to 10 Gbps
I8g.xlarge	4	32	937	up to 10	up to 10 Gbps
I8g.2xlarge	8	64	1,875	up to 12	up to 10 Gbps
I8g.4xlarge	16	128	3,750	up to 25	up to 10 Gbps
I8g.8xlarge	32	256	7,500 (2 x 3,750)	up to 25	10 Gbps
I8g.12xlarge	48	384	11,520 (3 x 3,750)	up to 28.125	15 Gbps
I8g.16xlarge	64	512	15,000 (4 x 3,750)	up to 37.5	20 Gbps
I8g.24xlarge	96	768	22,500 (6 x 3,750)	up to 56.25	20 Gbps
I8g.metal-24×1	96	768	22,500 (6 x 3,750)	up to 56.25	30 Gbps

You can use I8g instances for I/O intensive workloads that require low latency access to data such as transactional databases (MySQL and PostgreSQL), real-time databases, NoSQL databases, (Aerospike, Apache Druid, MongoDB) and real-time analytics such as Apache Spark.

Additionally, I8g instances are built on the AWS Nitro System, which offloads CPU virtualization, storage, and networking functions to dedicated hardware and software to enhance the performance and security of your workloads. The Graviton4 processors offer you enhanced security by fully encrypting all high-speed physical hardware interfaces.

Things to know
Here are some things that you should know about EC2 I8g instances:

Operating system – EC2 I8g instances support Amazon Linux 2023, Amazon Linux 2, CentOS Stream 8 or newer, Ubuntu 18.04 or newer, SUSE 15 SP2 or newer, Debian 11 or newer, Red Hat Enterprise 8.2 or newer, CentOS 8.2 or newer, FreeBSD 13 or newer, Rocky Linux 8.4 or newer, Alma Linux 8.4 or newer, and Alpine Linux 3.12.7 or newer.
Networking – You can use I8g instances in storage intensive workloads that typically have burst network usage patterns. All I8g instance sizes have burst network bandwidth and can burst more than 60 minutes, depending on the instance sizes, to support the majority of the workloads requiring instance storage data hydration, backup, and snapshot over the network.
Migration – If you’re using I4g instances now, you will have straightforward experience migrating storage intensive workloads to I8g instances because these instances offer similar memory and storage ratios to existing I4g instances.

Now available
Amazon EC2 I8g instances are now available in the US East (N. Virginia) and US West (Oregon) AWS Regions through On-Demand instances, Savings Plans, Spot Instances, Dedicated Instances, or Dedicated Hosts.

Give EC2 I8g instances a try in the Amazon EC2 console. To learn more, visit the EC2 I8g instances page and send feedback to AWS re:Post for EC2 or through your usual AWS Support contacts.

— Channy

Now available: Graviton4-powered memory-optimized Amazon EC2 X8g instances

2024-09-18 Jeff Barr

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/now-available-graviton4-powered-memory-optimized-amazon-ec2-x8g-instances/

Graviton-4-powered, memory-optimized X8g instances are now available in ten virtual sizes and two bare metal sizes, with up to 3 TiB of DDR5 memory and up to 192 vCPUs. The X8g instances are our most energy efficient to date, with the best price performance and scale-up capability of any comparable EC2 Graviton instance to date. With a 16 to 1 ratio of memory to vCPU, these instances are designed for Electronic Design Automation, in-memory databases & caches, relational databases, real-time analytics, and memory-constrained microservices. The instances fully encrypt all high-speed physical hardware interfaces and also include additional AWS Nitro System and Graviton4 security features.

Over 50K AWS customers already make use of the existing roster of over 150 Graviton-powered instances. They run a wide variety of applications including Valkey, Redis, Apache Spark, Apache Hadoop, PostgreSQL, MariaDB, MySQL, and SAP HANA Cloud. Because they are available in twelve sizes, the new X8g instances are an even better host for these applications by allowing you to choose between scaling up (using a bigger instance) and scaling out (using more instances), while also providing additional flexibility for existing memory-bound workloads that are currently running on distinct instances.

The Instances
When compared to the previous generation (X2gd) instances, the X8g instances offer 3x more memory, 3x more vCPUs, more than twice as much EBS bandwidth (40 Gbps vs 19 Gbps), and twice as much network bandwidth (50 Gbps vs 25 Gbps).

The Graviton4 processors inside the X8g instances have twice as much L2 cache per core as the Graviton2 processors in the X2gd instances (2 MiB vs 1 MiB) along with 160% higher memory bandwidth, and can deliver up to 60% better compute performance.

The X8g instances are built using the 5th generation of AWS Nitro System and Graviton4 processors, which incorporates additional security features including Branch Target Identification (BTI) which provides protection against low-level attacks that attempt to disrupt control flow at the instruction level. To learn more about this and Graviton4’s other security features, read How Amazon’s New CPU Fights Cybersecurity Threats and watch the re:Invent 2023 AWS Graviton session.

Here are the specs:

Instance Name	vCPUs	Memory (DDR5)	EBS Bandwidth	Network Bandwidth
x8g.medium	1	16 GiB	Up to 10 Gbps	Up to 12.5 Gbps
x8g.large	2	32 GiB	Up to 10 Gbps	Up to 12.5 Gbps
x8g.xlarge	4	64 GiB	Up to 10 Gbps	Up to 12.5 Gbps
x8g.2xlarge	8	128 GiB	Up to 10 Gbps	Up to 15 Gbps
x8g.4xlarge	16	256 GiB	Up to 10 Gbps	Up to 15 Gbps
x8g.8xlarge	32	512 GiB	10 Gbps	15 Gbps
x8g.12xlarge	48	768 GiB	15 Gbps	22.5 Gbps
x8g.16xlarge	64	1,024 GiB	20 Gbps	30 Gbps
x8g.24xlarge	96	1,536 GiB	30 Gbps	40 Gbps
x8g.48xlarge	192	3,072 GiB	40 Gbps	50 Gbps
x8g.metal-24xl	96	1,536 GiB	30 Gbps	40 Gbps
x8g.metal-48xl	192	3,072 GiB	40 Gbps	50 Gbps

The instances support ENA, ENA Express, and EFA Enhanced Networking. As you can see from the table above they provide a generous amount of EBS bandwidth, and support all EBS volume types including io2 Block Express, EBS General Purpose SSD, and EBS Provisioned IOPS SSD.

X8g Instances in Action
Let’s take a look at some applications and use cases that can make use of 16 GiB of memory per vCPU and/or up to 3 TiB per instance:

Databases – X8g instances allow SAP HANA and SAP Data Analytics Cloud to handle larger and more ambitious workloads than before. Running on Graviton4 powered instances, SAP has measured up to 25% better performance for analytical workloads and up to 40% better performance for transactional workloads in comparison to the same workloads running on Graviton3 instances. X8g instances allow SAP to expand their Graviton-based usage to even larger memory bound solutions.

Electronic Design Automation – EDA workloads are central to the process of designing, testing, verifying, and taping out new generations of chips, including Graviton, Trainium, Inferentia, and those that form the building blocks for the Nitro System. AWS and many other chip makers have adopted the AWS Cloud for these workloads, taking advantage of scale and elasticity to supply each phase of the design process with the appropriate amount of compute power. This allows engineers to innovate faster because they are not waiting for results. Here’s a long-term snapshot from one of the clusters that was used to support development of Graviton4 in late 2022 and early 2023. As you can see this cluster runs at massive scale, with peaks as high as 5x normal usage:

You can see bursts of daily and weekly activity, and then a jump in overall usage during the tape-out phase. The instances in the cluster are on the large end of the size spectrum so the peaks represent several hundred thousand cores running concurrently. This ability to spin up compute when we need it and down when we don’t gives us access to unprecedented scale without a dedicated investment in hardware.

The new X8g instances will allow us and our EDA customers to run even more workloads on Graviton processors, reducing costs and decreasing energy consumption, while also helping to get new products to market faster than ever.

Available Now
X8g instances are available today in the US East (N. Virginia), US West (Oregon), and Europe (Frankfurt) AWS Regions in On Demand, Spot, Reserved Instance, Savings Plan, Dedicated Instance, and Dedicated Host form. To learn more, visit the X8g page.

Using Amazon APerf to go from 50% below to 36% above performance target

2024-08-07 Macey Neff

Post Syndicated from Macey Neff original https://aws.amazon.com/blogs/compute/using-amazon-aperf-to-go-from-50-below-to-36-above-performance-target/

This post is written by Tyler Jones, Senior Solutions Architect – Graviton, AWS.

Performance tuning the Renaissance Finagle-http benchmark

Sometimes software doesn’t perform the way it’s expected to across different systems. This can be due to a configuration error, code bug, or differences in hardware performance. Amazon APerf is a powerful tool designed to help identify and address performance issues on AWS instances and other computers. APerf captures comprehensive system metrics simultaneously and then visualizes them in an interactive report. The report allows users to analyze metrics such as CPU usage, interrupts, memory usage, and CPU core performance counters (PMU) together. APerf is particularly useful for performance tuning workloads across different instance types, as it can generate side-by-side reports for easy comparison. APerf is valuable for developers, system administrators, and performance engineers who need to optimize application performance on AWS. From here on we use the Renaissance benchmark as an example to demonstrate how APerf is used to debug and find performance bottlenecks.

The example

The Renaissance finagle-http benchmark was unexpectedly found to run 50% slower on a c7g.16xl Graviton3 than on a reference instance, both initially using the Linux-5.15.60 kernel. This is unexpected behavior.Graviton3 should be performing as good or better than our reference instance as it does for other Java based workloads. It’s likely there is a configuration problem somewhere. The Renaissance finagle-http benchmark is written in Scala but produces Java byte code, so our investigation will focus on the Java JVM as well as system-level configurations.

Overview

System performance tuning is an iterative process that is conducted in two main phases, the first focuses on overall system issues, and the second focuses on CPU core bottlenecks. APerf is used to assist in both phases.

APerf can render several instances’ data in one report, side by side, typically a reference system and the system to be tuned. The reference system provides values to compare against. In isolation, metrics are harder to evaluate. A metric may be acceptable in general but the comparison to the reference system makes it easier to spot room for improvement.

APerf helps to identify unusual system behavior, such as high interrupt load, excessive I/O wait, unusual network layer patterns, and other such issues in the first phase. After adjustments are made to address these issues, for example by modifying JVM flags, the second phase starts. Using the system tuning of the first phase, fresh APerf data is collected and evaluated with a focus on CPU core performance metrics.

Any inferior metric of the SUT CPU core, as compared to the reference system, holds potential for improvement. In the following section we discuss the two phases in detail.
For more background on system tuning, refer to the Performance Runbook in the AWS Graviton Getting Started guide.

Initial data collection

Here is an example for how 240 seconds of system data is collected with APerf:

#enable PMU access
echo 0 | sudo tee /proc/sys/kernel/perf_event_paranoid
#APerf has to open more than the default limit of files.
ulimit -n 65536
#usually aperf would be run in another terminal. 
#For illustration purposes it is send to the background here
./aperf record --run-name finagle_1 --period 240 &
#With 64 CPUs it takes APerf a little less than 15s to report readiness 
#for data collection.
sleep 15
java -jar renaissance-gpl-0.14.2.jar -r 8 finagle-http

The APerf report is generated as follows:

./aperf report --run finagle_1

Then, the report can be viewed with a web browser:

firefox aperf_report_finagle_1/index.html

The APerf report can render several data sets in the same report, side-by-side.

./aperf report --run finagle_1_c7g --run finagle_1_reference --run ...

Note that it is crucial to examine the CPU usage over time shown in APerf. Some data may be gathered while the system is idle. The metrics during idle times have no significant value.

First phase: system level

In this phase we look for differences on the system level. To do this APerf collects data during runs of finagle-http on c7g.16xl and the reference. The reference system provides the target numbers to compare against. Any large difference warrants closer inspection.

The first differences can be seen in the following figure.

The APerf CPU usage plot shows larger drops on Graviton3 (highlighted in red) at the beginning of each run than on the reference instance.

Figure 1: CPU usage. c7g.16xl on the left, reference on the right.

The log messages about the GC runtime hint at a possible reason as they coincide with the dips in CPU usage.

====== finagle-http (web) [default], iteration 0 started ======
GC before operation: completed in 32.692 ms, heap usage 87.588 MB → 31.411 MB.
====== finagle-http (web) [default], iteration 0 completed (6534.533 ms) ======

The JVM tends to spend a significant amount of time in garbage collection, during which time it has to suspend all threads, and choosing a different GC may have a positive impact.

The default GC on OpenJDK17 is G1GC. Using parallelGC is an alternative given that the instances have 64 CPUs, and thus GC can be performed highly parallel. The Graviton Getting Started guide also recommends checking the GC log when working on Java performance issues. A cross check using the JVM’s -Xlog:gc option confirms the reduced GC time with parallelGC.

The second difference is evident in the following figure, the CPU-to-CPU interrupts (IPI). There is more than 10x higher activity on Graviton 3, which means additional IRQ work c7g.16xl on which the reference system does not have to spend CPU cycles.

Figure 2: IPI0/RES Interrupts. c7g.16xl on the left, reference on the right.

Grepping through kernel commit messages can help find patches that address a particular issue, such as IPI inefficiencies.

This scheduling patch improves performance by 19%. Another IPI patch provides an additional 1% improvement. Switching to Linux-5.15.61 allows us to use these IPI improvements. The following figure shows the effect in APerf.

Figure 3: IPI0 Interrupts (c7g.16xl)

Second phase: CPU core level

Now that the system level issues are mitigated, the focus is on the CPU cores. The data collected by APerf shows PMU data where the reference instance and Graviton3 differ significantly.

PMU metric	Graviton3	Reference system
Branch Prediction Misses/1000 Instructions	16	4
Instruction Translation Lookaside Buffer (TLB) Misses/1000 Instructions	8.3	4
Instructions Per Clock cycle	0.6	0.6*

*Note that reference has a 20% higher CPU clock than c7g.16xl. As a rule of thumb, instructions per clock (IPC) multiplied by clock rate equals work done by a CPU.

Addressing CPU Core bottlenecks

The first improvement in PMU metrics stems from the parallelGC option. Although the intention was to increase CPU usage, the following figure shows lowered branch miss counts as well. Limiting the JIT tiered compilation to only use C2 mode helps branch prediction by reducing branch indirection and increasing the locality of executed code. Finally adding Transparent Huge Pages helps branch prediction logic and avoids lengthy address translation look-ups in DDR memory. The following graphs show the effects of the chosen JVM options.

Branch misses/1000 instructions (c7g.16xl)

Figure 4: Branch misses per 1k instructions

JVM Options from left to right:

-XX:+UseParallelGC
-XX:+UseParallelGC -XX:-TieredCompilation
-XX:+UseParallelGC -XX:-TieredCompilation -XX:+UseTransparentHugePages

With the options listed under the preceding figure, APerf shows the branch prediction miss rate decreasing from the initial 16 to 11. Branch mis-predictions incur significant performance penalties as they result in wasted cycles spent computing results that ultimately need to be discarded. Furthermore, these mis-predictions cause the prefetching and cache subsystems to fail to load the necessary subsequent instructions into cache. Consequently, costly pipeline stalls and frontend stalls occur, preventing the CPU from executing instructions.

Code sparsity

Figure 5: Code sparsity

JVM Options from left to right:

-XX:+UseParallelGC
-XX:+UseParallelGC -XX:-TieredCompilation
-XX:+UseParallelGC -XX:-TieredCompilation -XX:+UseTransparentHugePages

Code sparsity is a measure of how compact the instruction code is packed and how closely related code is placed. This is where turning off tiered compilation shows its effect. Lower sparsity helps branch prediction and the cache subsystem.

Instruction TLB misses/1000 instructions (c7g.16xl)

Figure 6: Instruction TLB misses per 1k Instructions

JVM Options from left to right:

-XX:+UseParallelGC
-XX:+UseParallelGC -XX:-TieredCompilation
-XX:+UseParallelGC -XX:-TieredCompilation -XX:+UseTransparentHugePages

The big decrease in TLB misses is caused by the use of transparent huge pages, which increase the likelihood that a virtual address translation is present in the TLB, since fewer entries are needed. Translation table walks are avoided that otherwise need to traverse entries in DDR memory that cost hundreds of CPU cycles to read.

Instructions per clock cycle (IPC)

Figure 7: Instructions per clock cycle

JVM Options from left to right:

-XX:+UseParallelGC
-XX:+UseParallelGC -XX:-TieredCompilation
-XX:+UseParallelGC -XX:-TieredCompilation -XX:+UseTransparentHugePages

The preceding figure shows IPC increasing from 0.58 to 0.71 as JVM flags are added.

Results

This table summarizes the measures taken for performance improvement and their results.

JVM option set	baseline	1	2	3
IPC	0.6	0.6	0.63	0.71
Branch misses/1000 instructions	16	14	12	11
ITLB misses/1000 instructions	8.3	7.7	8.8	1.1
Benchmark runtime [ms]	6000	3922	3843	3512
execution time improvement vs baseline

+parallelGC
+parallelGC -tieredCompilation
+parallelGC -tieredCompilation +UseTransparentHugePages

Re-examining the setup of our testing enviornment: Changing where the load-generator lives

With the preceding efforts, c7g.16xl is within 91% of the reference system. The c7g.16xl branch prediction miss rate is still higher at 11 than the references at 4. As shown in the preceding figure, reduced branch prediction misses have a strong positive effect on performance. What follows is an experiment to achieve parity or better with the reference system based on the reduction of branch prediction misses.

Finagle-http serves HTTP requests generated by wrk2, which is a load generator implemented in C. The expectation is that the c7g.16xl branch predictor works better with the native wrk2 binary, unlike the Renaissance load generator, which is executing on the JVM. The wrk2 load generator and the finagle-http are assigned through taskset to separate sets of CPUs: 16 CPUs for wrk2 and 48 CPUs for finagle-http. The idea here is to have the branch predictors on these CPU sets focus on a limited code set. The following diagram illustrates the difference between the Renaissance and the experimental setup.

With this CPU performance tuned setup, c7g.16xl can now handle a 36% higher request load than the reference using the same configuration, at an average latency limit of 1.5ms. This illustrates the impact that system tuning with APerf can have. The same system that scored 50% lower than the comparison system now exceeds it by 36%.
The following APerf data shows the improvement of key PMU metrics that lead to the performance jump.

Branch prediction misses/1000 instructions

Figure 8: Branch misses per 1k instructions

Left chart: Optimized Java-only setup. Right chart: Finagle-http with wrk2 load generator

The branch prediction miss rate is reduced to 1.5 from 11 with the Java-only setup.

IPC

Figure 9: Instructions Per Clock Cycle

Left chart: Optimized Java-only setup. Right chart: Finagle-http with wrk2 load generator

The IPC steps up from 0.7 to 1.5 due to the improvement in branch prediction.

Code sparsity

Figure 10: Code Sparsity

Left chart: Optimized Java-only setup. Right chart: Finagle-http with wrk2 load generator

The code sparsity decreases to 0.014 from 0.21, a factor of 15.

Conclusion

AWS created the APerf tool to aid in root cause analysis and help address performance issues for any workload. APerf is a standalone binary that captures relevant data simultaneously as a time series, such as CPU usage, interrupt frequency, memory usage, and CPU core metrics (PMU counters). APerf can generate reports for multiple data captures, making it easy to spot differences between instance types. We were able to use this data to analyze why Graviton3 was underperforming and to also see the impact of our changes in terms of performance. Using APerf we were able to successfully adjust configuration parameters and go from 50% below our performance target to 36% more performant than our reference system and associated performance target. Without Aperf, collecting these metrics and visualizing them is a non-trival task. With Aperf you can capture and visualize these metrics with two short commands, saving you time and effort so you can focus on what matters most: getting the most performance from your application.

AWS Graviton4-based Amazon EC2 R8g instances: best price performance in Amazon EC2

2024-07-09 Esra Kayabali

Post Syndicated from Esra Kayabali original https://aws.amazon.com/blogs/aws/aws-graviton4-based-amazon-ec2-r8g-instances-best-price-performance-in-amazon-ec2/

Today, I am very excited to announce that the new AWS Graviton4-based Amazon Elastic Compute Cloud (Amazon EC2) R8g instances, that have been available in preview since re:Invent 2023, are now generally available to all. AWS offers more than 150 different AWS Graviton-powered Amazon EC2 instance types globally at scale, has built more than 2 million Graviton processors, and has more than 50,000 customers using AWS Graviton-based instances to achieve the best price performance for their applications.

AWS Graviton4 is the most powerful and energy efficient processor we have ever designed for a broad range of workloads running on Amazon EC2. Like all the other AWS Graviton processors, AWS Graviton4 uses a 64-bit Arm instruction set architecture. AWS Graviton4-based Amazon EC2 R8g instances deliver up to 30% better performance than AWS Graviton3-based Amazon EC2 R7g instances. This helps you to improve performance of your most demanding workloads such as high-performance databases, in-memory caches, and real time big data analytics.

Since the preview announcement at re:Invent 2023, over 100 customers, including Epic Games, SmugMug, Honeycomb, SAP, and ClickHouse have tested their workloads on AWS Graviton4-based R8g instances and observed significant performance improvement over comparable instances. SmugMug achieved 20-40% performance improvements using AWS Graviton4-based instances compared to AWS Graviton3-based instances for their image and data compression operations. Epic Games found AWS Graviton4 instances to be the fastest EC2 instances they have ever tested and Honeycomb.io achieved more than double the throughput per vCPU compared to the non-Graviton based instances that they used four years ago.

Let’s look at some of the improvements that we have made available in our new instances. R8g instances offer larger instance sizes with up to 3x more vCPUs (up to 48xl), 3x the memory (up to 1.5TB), 75% more memory bandwidth, and 2x more L2 cache over R7g instances. This helps you to process larger amounts of data, scale up your workloads, improve time to results, and lower your TCO. R8g instances also offer up to 50 Gbps network bandwidth and up to 40 Gbps EBS bandwidth compared to up to 30 Gbps network bandwidth and up to 20 Gbps EBS bandwidth on Graviton3-based instances.

R8g instances are the first Graviton instances to offer two bare metal sizes (metal-24xl and metal-48xl). You can right size your instances and deploy workloads that benefit from direct access to physical resources. Here are the specs for R8g instances:

Instance Size	vCPUs	Memory	Network Bandwidth	EBS Bandwidth
r8g.medium	1	8 GiB	up to 12.5 Gbps	up to 10 Gbps
r8g.large	2	16 GiB	up to 12.5 Gbps	up to 10 Gbps
r8g.xlarge	4	32 GiB	up to 12.5 Gbps	up to 10 Gbps
r8g.2xlarge	8	64 GiB	up to 15 Gbps	up to 10 Gbps
r8g.4xlarge	16	128 GiB	up to 15 Gbps	up to 10 Gbps
r8g.8xlarge	32	256 GiB	15 Gbps	10 Gbps
r8g.12xlarge	48	384 GiB	22.5 Gbps	15 Gbps
r8g.16xlarge	64	512 GiB	30 Gbps	20 Gbps
r8g.24xlarge	96	768 GiB	40 Gbps	30 Gbps
r8g.48xlarge	192	1,536 GiB	50 Gbps	40 Gbps
r8g.metal-24xl	96	768 GiB	40 Gbps	30 Gbps
r8g.metal-48xl	192	1,536 GiB	50 Gbps	40 Gbps

If you are looking for more energy-efficient compute options to help you reduce your carbon footprint and achieve your sustainability goals, R8g instances provide the best energy efficiency for memory-intensive workloads in EC2. Additionally, these instances are built on the AWS Nitro System, which offloads CPU virtualization, storage, and networking functions to dedicated hardware and software to enhance the performance and security of your workloads. The Graviton4 processors offer you enhanced security by fully encrypting all high-speed physical hardware interfaces.

R8g instances are ideal for all Linux-based workloads including containerized and micro-services-based applications built using Amazon Elastic Kubernetes Service (Amazon EKS), Amazon Elastic Container Service (Amazon ECS), Amazon Elastic Container Registry (Amazon ECR), Kubernetes, and Docker, and as well as applications written in popular programming languages such as C/C++, Rust, Go, Java, Python, .NET Core, Node.js, Ruby, and PHP. AWS Graviton4 processors are up to 30% faster for web applications, 40% faster for databases, and 45% faster for large Java applications than AWS Graviton3 processors. To learn more, visit the AWS Graviton Technical Guide.

Check out the collection of Graviton resources to help you start migrating your applications to Graviton instance types. You can also visit the AWS Graviton Fast Start program to begin your Graviton adoption journey.

Now available
R8g instances are available today in the US East (N. Virginia), US East (Ohio), US West (Oregon), and Europe (Frankfurt) AWS Regions.

You can purchase R8g instances as Reserved Instances, On-Demand, Spot Instances, and via Savings Plans. For further information, visit Amazon EC2 pricing.

To learn more about Graviton-based instances, visit AWS Graviton Processors or the Amazon EC2 R8g Instances.

— Esra

AWS Weekly Roundup — AWS Chips Taste Test, generative AI updates, Community Days, and more — April 1, 2024

2024-04-01 Channy Yun

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-aws-chips-taste-test-generative-ai-updates-community-days-and-more-april-1-2024/

Today is April Fool’s Day. About 10 years ago, some tech companies would joke about an idea that was thought to be fun and unfeasible on April 1st, to the delight of readers. Jeff Barr has also posted seemingly far-fetched ideas on this blog in the past, and some of these have surprisingly come true! Here are examples:

Year	Joke	Reality
2010	Introducing QC2 – the Quantum Compute Cloud, a production-ready quantum computer to solve certain types of math and logic problems with breathtaking speed.	In 2019, we launched Amazon Braket, a fully managed service that allows scientists, researchers, and developers to begin experimenting with computers from multiple quantum hardware providers in a single place.
2011	Announcing AWS $NAME, a scalable event service to find and automatically integrate with your systems on the cloud, on premises, and even your house and room.	In 2019, we introduced Amazon EventBridge to make it easy for you to integrate your own AWS applications with third-party applications. If you use AWS IoT Events, you can monitor and respond to events at scale from your IoT devices at home.
2012	New Amazon EC2 Fresh Servers to deliver a fresh (physical) EC2 server in 15 minutes using atmospheric delivery and communucation from a fleet of satellites.	In 2021, we launched AWS Outposts Server, 1U/2U physical servers with built-in AWS services. In 2023, Project Kuiper completed successful tests of an optical mesh network in low Earth orbit. Now, we only need to develop satellite warehouse and atmospheric re-entry technology to follow Amazon PrimeAir’s drone delivery.
2013	PC2 – The New Punched Card Cloud, a new mf (mainframe) instance family, Mainframe Machine Images (MMI), tape storage, and punched card interfaces for mainframe computers used from the 1970s to ’80s.	In 2022, we launched AWS Mainframe Modernization to help you modernize your mainframe applications and deploy them to AWS fully managed runtime environments.

Jeff returns! This year, we have AWS “Chips” Taste Test for him to indulge in, drawing unique parallels between chip flavors and silicon innovations. He compared the taste of “Golden Nacho Cheese,” “Al Chili Lime,” and “BBQ Training Wheels” with AWS Graviton, AWS Inferentia, and AWS Trainium chips.

What’s your favorite? Watch a fun video in the LinkedIn and X post of AWS social media channels.

Last week’s launches
If we stay curious, keep learning, and insist on high standards, we will continue to see more ideas turn into reality. The same goes for the generative artificial intelligence (generative AI) world. Here are some launches that utilize generative AI technology this week.

Knowledge Bases for Amazon Bedrock – Anthropic’s Claude 3 Sonnet foundation model (FM) is now generally available on Knowledge Bases for Amazon Bedrock to connect internal data sources for Retrieval Augmented Generation (RAG).

Knowledge Bases for Amazon Bedrock support metadata filtering, which improves retrieval accuracy by ensuring the documents are relevant to the query. You can narrow search results by specifying which documents to include or exclude from a query, resulting in more relevant responses generated by FMs such as Claude 3 Sonnet.

Finally, you can customize prompts and number of retrieval results in Knowledge Bases for Amazon Bedrock. With custom prompts, you can tailor the prompt instructions by adding context, user input, or output indicator(s), for the model to generate responses that more closely match your use case needs. You can now control the amount of information needed to generate a final response by adjusting the number of retrieved passages. To learn more these new features, visit Knowledge bases for Amazon Bedrock in the AWS documentation.

Amazon Connect Contact Lens – At AWS re:Invent 2023, we previewed a generative AI capability to summarize long customer conversations into succinct, coherent, and context-rich contact summaries to help improve contact quality and agent performance. These generative AI–powered post-contact summaries are now available in Amazon Connect Contact Lens.

Amazon DataZone – At AWS re:Invent 2023, we also previewed a generative AI–based capability to generate comprehensive business data descriptions and context and include recommendations on analytical use cases. These generative AI–powered recommendations for descriptions are now available in Amazon DataZone.

There are also other important launches you shouldn’t miss:

A new Local Zone in Miami, Florida – AWS Local Zones are an AWS infrastructure deployment that places compute, storage, database, and other select services closer to large populations, industry, and IT centers where no AWS Region exists. You can now use a new Local Zone in Miami, Florida, to run applications that require single-digit millisecond latency, such as real-time gaming, hybrid migrations, and live video streaming. Enable the new Local Zone in Miami (use1-mia2-az1) from the Zones tab in the Amazon EC2 console settings to get started.

New Amazon EC2 C7gn metal instance – You can use AWS Graviton based new C7gn bare metal instances to run applications that benefit from deep performance analysis tools, specialized workloads that require direct access to bare metal infrastructure, legacy workloads not supported in virtual environments, and licensing-restricted business-critical applications. The EC2 C7gn metal size comes with 64 vCPUs and 128 GiB of memory.

AWS Batch multi-container jobs – You can use multi-container jobs in AWS Batch, making it easier and faster to run large-scale simulations in areas like autonomous vehicles and robotics. With the ability to run multiple containers per job, you get the advanced scaling, scheduling, and cost optimization offered by AWS Batch, and you can use modular containers representing different components like 3D environments, robot sensors, or monitoring sidecars.

Amazon Guardduty EC2 Runtime Monitoring – We are announcing the general availability of Amazon GuardDuty EC2 Runtime Monitoring to expand threat detection coverage for EC2 instances at runtime and complement the anomaly detection that GuardDuty already provides by continuously monitoring VPC Flow Logs, DNS query logs, and AWS CloudTrail management events. You now have visibility into on-host, OS-level activities and container-level context into detected threats.

GitLab support for AWS CodeBuild – You can now use GitLab and GitLab self-managed as the source provider for your CodeBuild projects. You can initiate builds from changes in source code hosted in your GitLab repositories. To get started with CodeBuild’s new source providers, visit the AWS CodeBuild User Guide.

Retroactive support for AWS cost allocation tags – You can enable AWS cost allocation tags retroactively for up to 12 months. Previously, when you activated resource tags for cost allocation purposes, the tags only took effect prospectively. Submit a backfill request, specifying the duration of time you want the cost allocation tags to be backfilled. Once the backfill is complete, the cost and usage data from prior months will be tagged with the current cost allocation tags.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS News
Some other updates and news about generative AI that you might have missed:

Amazon and Anthropic’s AI investiment – Read the latest milestone in our strategic collaboration and investment with Anthropic. Now, Anthropic is using AWS as its primary cloud provider and will use AWS Trainium and Inferentia chips for mission-critical workloads, including safety research and future FM development. Earlier this month, we announced access to Anthropic’s most powerful FM, Claude 3, on Amazon Bedrock. We announced availability of Sonnet on March 4 and Haiku on March 13. To learn more, watch the video introducing Claude on Amazon Bedrock.

Virtual building assistant built on Amazon Bedrock – BrainBox AI announced the launch of ARIA (Artificial Responsive Intelligent Assistant) powered by Amazon Bedrock. ARIA is designed to enhance building efficiency by assimilating seamlessly into the day-to-day processes related to building management. To learn more, read the full customer story and watch the video on how to reduce a building’s CO2 footprint with ARIA.

Solar models available on Amazon SageMaker JumpStart – Upstage Solar is a large language model (LLM) 100 percent pre-trained with Amazon SageMaker that outperforms and uses its compact size and powerful track record to specialize in purpose training, making it versatile across languages, domains, and tasks. Now, Solar Mini is available on Amazon SageMaker JumpStart. To learn more, watch how to deploy Solar models in SageMaker JumpStart.

AWS open source news and updates – My colleague Ricardo writes this weekly open source newsletter in which he highlights new open source projects, tools, and demos from the AWS Community. Last week’s highlight was news that Linux Foundation launched Valkey community, an open source alternative to the Redis in-memory, NoSQL data store.

Upcoming AWS Events
Check your calendars and sign up for upcoming AWS events:

AWS Summits – Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Paris (April 3), Amsterdam (April 9), Sydney (April 10–11), London (April 24), Berlin (May 15–16), and Seoul (May 16–17), Hong Kong (May 22), Milan (May 23), Dubai (May 29), Stockholm (June 4), and Madrid (June 5).

AWS re:Inforce – Explore cloud security in the age of generative AI at AWS re:Inforce, June 10–12 in Pennsylvania for two-and-a-half days of immersive cloud security learning designed to help drive your business initiatives. Read the story from AWS Chief Information Security Officer (CISO) Chris Betz about a bit of what you can expect at re:Inforce.

AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Mumbai (April 6), Poland (April 11), Bay Area (April 12), Kenya (April 20), and Turkey (May 18).

You can browse all upcoming AWS led in-person and virtual events and developer-focused events such as AWS DevDay.

That’s all for this week. Check back next Monday for another Week in Review!

— Channy

This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS.

AWS Graviton4 is an Even Bigger Arm Server Processor and Tranium2 for AI

2023-11-28 Cliff Robinson

Post Syndicated from Cliff Robinson original https://www.servethehome.com/aws-graviton4-is-an-even-bigger-arm-server-processor-and-tranium2-for-ai-nvidia/

Today AWS made the much-anticipated announcement of Graviton4 which should be available in 2024. This is AWS’s latest Graviton processor and the fourth generation launched in the last five years. The company also announced its second-generation Tranium2 processor for AI workloads. AWS Graviton4 is an Even Bigger Arm Server Processor AWS is continuing on its […]

The post AWS Graviton4 is an Even Bigger Arm Server Processor and Tranium2 for AI appeared first on ServeTheHome.

Join the preview for new memory-optimized, AWS Graviton4-powered Amazon EC2 instances (R8g)

2023-11-28 Jeff Barr

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/join-the-preview-for-new-memory-optimized-aws-graviton4-powered-amazon-ec2-instances-r8g/

We are opening up a preview of the next generation of Amazon Elastic Compute Cloud (Amazon EC2) instances. Equipped with brand-new Graviton4 processors, the new R8g instances will deliver better price performance than any existing memory-optimized instance. The R8g instances are suitable for your most demanding memory-intensive workloads: big data analytics, high-performance databases, in-memory caches and so forth.

Graviton history
Let’s take a quick look back in time and recap the evolution of the Graviton processors:

November 2018 – The Graviton processor made its debut in the A1 instances, optimized for both performance and cost, and delivering cost reductions of up to 45% for scale-out workloads.

December 2019 – The Graviton2 processor debuted with the announcement of M6g, M6gd, C6g, C6gd, R6g, and R6gd instances with up to 40% better price performance than equivalent non-Graviton instances. The second-generation processor delivered up to 7x performance of the first one, including twice the floating point performance.

November 2021 – The Graviton3 processor made its debut with the announcement of the compute-optimized C7g instances. In addition to up to 25% better compute performance, this generation of processors once again doubled floating point and cryptographic performance when compared to the previous generation.

November 2022 – The Graviton 3E processor was announced, for use in the Hpc7g and C7gn instances, with up to 35% higher vector instruction processing performance than the Graviton3.

Today, every one of the top 100 Amazon Elastic Compute Cloud (EC2) customers makes use of Graviton, choosing between more than 150 Graviton-powered instances.

New Graviton4
I’m happy to be able to tell you about the latest in our series of innovative custom chip designs, the energy-efficient AWS Graviton4 processor.

96 Neoverse V2 cores, 2 MB of L2 cache per core, and 12 DDR5-5600 channels work together to make the Graviton4 up to 40% faster for databases, 30% faster for web applications, and 45% faster for large Java applications than the Graviton3.

Graviton4 processors also support all of the security features from the previous generations, and includes some important new ones including encrypted high-speed hardware interfaces and Branch Target Identification (BTI).

R8g instance sizes
The 8th generation R8g instances will be available in multiple sizes with up to triple the number of vCPUs and triple the amount of memory of the 7th generation (R7g) of memory-optimized, Graviton3-powered instances.

Join the preview
R8g instances with Graviton4 processors

— Jeff;

Amazon MSK now provides up to 29% more throughput and up to 24% lower costs with AWS Graviton3 support

2023-11-27 Sai Maddali

Post Syndicated from Sai Maddali original https://aws.amazon.com/blogs/big-data/amazon-msk-now-provides-up-to-29-more-throughput-and-up-to-24-lower-costs-with-aws-graviton3-support/

Amazon Managed Streaming for Apache Kafka (Amazon MSK) is a fully managed service that enables you to build and run applications that use Apache Kafka to process streaming data.

Today, we’re excited to bring the benefits of Graviton3 to Kafka workloads, with Amazon MSK now offering M7g instances for new MSK provisioned clusters. AWS Graviton processors are custom Arm-based processors built by AWS to deliver the best price-performance for your cloud workloads. For example, when running an MSK provisioned cluster using M7g.4xlarge instances, you can achieve up to 27% reduction in CPU usage and up to 29% higher write and read throughput compared to M5.4xlarge instances. These performance improvements, along with M7g’s lower prices provide up to 24% in compute cost savings over M5 instances.

In February 2023, AWS launched new Graviton3-based M7g instances. M7g instances are equipped with DDR5 memory, which provides up to 50% higher memory bandwidth than the DDR4 memory used in previous generations. M7g instances also deliver up to 25% higher storage throughput and up to 88% increase in network throughput compared to similar sized M5 instances to deliver price-performance benefits for Kafka workloads. You can read more about M7g features in New Graviton3-Based General Purpose (m7g) and Memory-Optimized (r7g) Amazon EC2 Instances.

Here are the specs for the M7g instances on MSK:

Name	vCPUs	Memory	Network Bandwidth	Storage Bandwidth
M7g.large	2	8 GiB	up to 12.5 Gbps	up to 10 Gbps
M7g.xlarge	4	16 GiB	up to 12.5 Gbps	up to 10 Gbps
M7g.2xlarge	8	32 GiB	up to 15 Gbps	up to 10 Gbps
M7g.4xlarge	16	64 GiB	up to 15 Gbps	up to 10 Gbps
M7g.8xlarge	32	128 GiB	15 Gbps	10 Gbps
M7g.12xlarge	48	192 GiB	22.5 Gbps	15 Gbps
M7g.16xlarge	64	256 GiB	30 Gbps	20 Gbps

M7g instances on Amazon MSK

Organizations are adopting Amazon MSK to capture and analyze data in real time, run machine learning (ML) workflows, and build event-driven architectures. Amazon MSK enables you to reduce operational overhead and run your applications with higher availability and durability. It also offers a consistent reduction in price-performance with capabilities such as Tiered Storage. With compute making up a large portion of Kafka costs, customers wanted a way to optimize them further and see Graviton instances providing them the quickest path. Amazon MSK has fully tested and validated M7g on all Kafka versions starting with version 2.8.2, making it to run critical workloads and benefit from Gravition3 cost savings.

You can get started by provisioning new clusters with the Graviton3-based M7g instances as the broker type using the AWS Management Console, APIs via the AWS SDK, and the AWS Command Line Interface (AWS CLI). M7g instances support all Amazon MSK and Kafka features, making it straightforward for you to run all your existing Kafka workloads with minimal changes. Amazon MSK supports Graviton3-based M7g instances from large through 16xlarge sizes to run all Kafka workloads.

Let’s take the M7g instances on MSK provisioned clusters for a test drive and see how it compares with Amazon MSK M5 instances.

M7g instances in action

Customers run a wide variety of workloads on Amazon MSK; some are latency sensitive, and some are throughput bound. In this post, we focus on M7g performance impact on throughput-bound workloads. M7g comes with an increase in network and storage throughput, providing a higher throughput per broker compared to an M5-based cluster.

To understand the implications, let’s look at how Kafka uses available throughput for writing or reading data. Every broker in the MSK cluster comes with a bounded storage and network throughput entitlement. Predominantly, writes in Kafka consume both storage and network throughput, whereas reads consume mostly network throughput. This is because a Kafka consumer is typically reading real-time data from a page cache and occasionally goes to disk to process old data. Therefore, the overall throughput gains also change based on the workload’s write to read throughput ratios.

Let’s look at the throughput gains based on an example. Our setup includes an MSK cluster with M7g.4xlarge instances and another with M5.4xlarge instances, with three nodes in three different Availability Zones. We also enabled TLS encryption, AWS Identity and Access Management (IAM) authentication, and a replication factor of 3 across both M7g and M5 MSK clusters. We also applied Amazon MSK best practices for broker configurations, including num.network.threads = 8 and num.io.threads = 16. On the client side for writes, we optimized the batch size with appropriate linger.ms and batch.size configurations. For the workload, we assumed 6 topics each with 64 partitions (384 per broker). For ingestion, we generated load with an average message size of 512 bytes and with one consumer group per topic. The amount of load sent to the clusters was identical.

As we ingest more data into the MSK cluster, the M7g.4xlarge instance supports higher throughput per broker, as shown in the following graph. After an hour of consistent writes, M7g.4xlarge brokers support up to 54 MB/s of write throughput vs. 40 MB/s with M5-based brokers, which represents a 29% increase.

We also see another important observation: M7g-based brokers consume much fewer CPU resources than M5s, even though they support 29% higher throughput. As seen in the following chart, CPU utilization of an M7g-based broker is on average 40%, whereas on an M5-based broker, it’s 47%.

As covered previously, customers may see different performance improvements based on the number of consumer group, batch sizes, and instance size. We recommend referring to MSK Sizing and Pricing to calculate M7g performance gains for your use case or creating a cluster based on M7g instances and benchmark the gains on your own.

Lower costs, with lesser operational burden, and higher resiliency

Since its launch, Amazon MSK has made it cost-effective to run your Kafka workloads, while still improving overall resiliency. Since day 1, you have been able to run brokers in multiple Availability Zones without worrying about additional networking costs. In October 2022, we launched Tiered Storage, which provides virtually unlimited storage at up to 50% lower costs. When you use Tiered Storage, you not only save on overall storage cost but also improve the overall availability and elasticity of your cluster.

Continuing down this path, we are now reducing compute costs for customers while still providing performance improvements. With M7g instances, Amazon MSK provides 24% savings on compute costs compared to similar sized M5 instances. When you move to Amazon MSK, you can not only lower your operational overhead using features such as Amazon MSK Connect, Amazon MSK Replicator, and automatic Kafka version upgrades, but also improve over resiliency and reduce their infrastructure costs.

Pricing and Regions

M7g instances on Amazon MSK are available today in the US (Ohio, N. Virginia, N. California, Oregon), Asia Pacific (Hyderabad, Mumbai, Seoul, Singapore, Sydney, Tokyo), Canada (Central), and EU (Ireland, London, Spain, Stockholm) Regions.

Refer to Amazon MSK pricing to learn about Graivton3-based instances with Amazon MSK pricing.

Summary

In this post, we discussed the performance gains achieved while using Graviton-based M7g instances. These instances can provide significant improvement in read and write throughput compared to similar sized M5 instances for Amazon MSK workloads. To get started, create a new cluster with M7g brokers using the AWS Management Console, and read our documentation for more information.

About the Authors

Sai Maddali is a Senior Manager Product Management at AWS who leads the product team for Amazon MSK. He is passionate about understanding customer needs, and using technology to deliver services that empowers customers to build innovative applications. Besides work, he enjoys traveling, cooking, and running.

Umesh is a Streaming Solutions Architect at AWS. He works with AWS customers to design and build real time data processing systems. He has 13 years of working experience in software engineering including architecting, designing, and developing data analytics systems.

Lanre Afod is a Solutions Architect focused with Global Financial Services at AWS, passionate about helping customers with deploying secure, scalable, high available and resilient architectures within the AWS Cloud.

The attendee’s guide to the AWS re:Invent 2023 Compute track

2023-11-17 Chris Munns

Post Syndicated from Chris Munns original https://aws.amazon.com/blogs/compute/the-attendees-guide-to-the-aws-reinvent-2023-compute-track/

This post by Art Baudo – Principal Product Marketing Manager – AWS EC2, and Pranaya Anshu – Product Marketing Manager – AWS EC2

We are just a few weeks away from AWS re:Invent 2023, AWS’s biggest cloud computing event of the year. This event will be a great opportunity for you to meet other cloud enthusiasts, find productive solutions that can transform your company, and learn new skills through 2000+ learning sessions.

Even if you are not able to join in person, you can catch-up with many of the sessions on-demand and even watch the keynote and innovation sessions live.

If you’re able to join us, just a reminder we offer several types of sessions which can help maximize your learning in a variety of AWS topics. Breakout sessions are lecture-style 60-minute informative sessions presented by AWS experts, customers, or partners. These sessions are recorded and uploaded a few days after to the AWS Events YouTube channel.

re:Invent attendees can also choose to attend chalk-talks, builder sessions, workshops, or code talk sessions. Each of these are live non-recorded interactive sessions.

Chalk-talk sessions: Attendees will interact with presenters, asking questions and using a whiteboard in session.
Builder Sessions: Attendees participate in a one-hour session and build something.
Workshops sessions: Attendees join a two-hour interactive session where they work in a small team to solve a real problem using AWS services.
Code talk sessions: Attendees participate in engaging code-focused sessions where an expert leads a live coding session.

To start planning your re:Invent week, check-out some of the Compute track sessions below. If you find a session you’re interested in, be sure to reserve your seat for it through the AWS attendee portal.

Explore the latest compute innovations

This year AWS compute services have launched numerous innovations: From the launch of over 100 new Amazon EC2 instances, to the general availability of Amazon EC2 Trn1n instances powered by AWS Trainium and Amazon EC2 Inf2 instances powered by AWS Inferentia2, to a new way to reserve GPU capacity with Amazon EC2 Capacity Blocks for ML. There’s a lot of exciting launches to take in.

Explore some of these latest and greatest innovations in the following sessions:

CMP102 | What’s new with Amazon EC2
Provides an overview on the latest Amazon EC2 innovations. Hear about recent Amazon EC2 launches, learn how about differences between Amazon EC2 instances families, and how you can use a mix of instances to deliver on your cost, performance, and sustainability goals.
CMP217 | Select and launch the right instance for your workload and budget
Learn how to select the right instance for your workload and budget. This session will focus on innovations including Amazon EC2 Flex instances and the new generation of Intel, AMD, and AWS Graviton instances.
CMP219-INT | Compute innovation for any application, anywhere
Provides you with an understanding of the breadth and depth of AWS compute offerings and innovation. Discover how you can run any application, including enterprise applications, HPC, generative artificial intelligence (AI), containers, databases, and games, on AWS.

Customer experiences and applications with machine learning

Machine learning (ML) has been evolving for decades and has an inflection point with generative AI applications capturing widespread attention and imagination. More customers, across a diverse set of industries, choose AWS compared to any other major cloud provider to build, train, and deploy their ML applications. Learn about the generative AI infrastructure at Amazon or get hands-on experience building ML applications through our ML focused sessions, such as the following:

CMP206 | Behind-the-scenes look at generative AI infrastructure at Amazon
Learn how to power performant generative AI applications while keeping costs under control. Get a behind-the-scenes look at how purpose-built infrastructure from AWS including AWS Trainium and AWS Inferentia2 are used by Amazon teams.
CMP329-R | PyTorch best practices for generative AI & LLM inference architectures
Understand best practices for using PyTorch to deploy Large Language Models (LLMs) on a cluster of Amazon EC2 Inf2 instances managed by Amazon Elastic Kubernetes Service (EKS).
CMP402 | Build a generative AI chatbot using your own data with Amazon Titan
Build and deploy a generative AI-powered chatbot on AWS using AWS Inferentia, AWS Trainium, Amazon SageMaker, Amazon Bedrock, Amazon S3, and Amazon OpenSearch Service.

Discover what powers AWS compute

AWS has invested years designing custom silicon optimized for the cloud to deliver the best price performance for a wide range of applications and workloads using AWS services. Learn more about the AWS Nitro System, processors at AWS, and ML chips.

CMP306 | Deep dive into the AWS Nitro System
Delve further into the design, architecture, and new innovations to the AWS Nitro platform in this session.
CMP309 | Compute innovations enabled by the AWS Nitro System
Deep dive in to the AWS Nitro System, the underlying platform for modern EC2 instances, the Nitro System, which has allowed AWS to innovate faster, further reduce your costs, and deliver increased security.
CMP313 | AWS Graviton: The best price performance for your AWS workloads
Learn more about AWS Graviton, hear customer stories about recent Graviton adoptions, how you can migrate your workloads to Graviton while saving cost, and gaining sustainability benefits.

Optimize your compute costs

At AWS, we focus on delivering the best possible cost structure for our customers. Frugality is one of our founding leadership principles. Cost effective design continues to shape everything we do, from how we develop products to how we run our operations. Come learn of new ways to optimize your compute costs through AWS services, tools, and optimization strategies in the following sessions:

CMP207 | Capacity, availability, cost efficiency: Pick three
Use AWS strategies such as using attribute-based instance type selection, prioritizing operational flexibility, and Amazon EC2 Flex instances to help you get compute resources, when you need them, while keeping cost under control.
CMP211 | Smart savings: Amazon EC2 cost-optimization strategies
Discover the basics of cost optimization from AWS Savings Plans and Amazon EC2 Auto Scaling to more advanced strategies like Amazon EC2 Spot Instances, AWS Graviton, automation, and more.
CMP406 | Reduce costs and improve sustainability with AWS Graviton
Learn how using AWS Graviton based instances can provide up to 40% better price performance over comparable current-generation instances for a wide variety of workloads.

Check out workload-specific sessions

Amazon EC2 offers the broadest and deepest compute platform to help you best match the needs of your workload. More SAP, high performance computing (HPC), ML, and Windows workloads run on AWS than any other cloud. Join sessions focused around your specific workload to learn about how you can leverage AWS solutions to accelerate your innovations.

CMP103 | Bring your small business to the cloud with Amazon Lightsail
Join to learn how you can quickly deploy and configure a WordPress blog, point-of-sale system, ecommerce site, or remote Windows desktop on AWS for a simple monthly price.
CMP203 | Scaling SAP HANA workloads on AWS
Discover how AWS SAP HANA customers benefit from the flexibility and reliability of the AWS Cloud. learn how you can migrate your mission-critical SAP HANA workloads.
CMP213 | Confidently run your production HPC workloads on AWS
Explore the HPC portfolio of services and products available in the AWS Cloud. Learn how customers can scale simulations and modeling and other memory- and data-intensive technical workloads.
CMP318 | Build a spatial data lake with Visual Asset Management System
Get a hands-on look at how to use the open source Visual Asset Management System (VAMS), built by AWS, to deploy a foundation for spatial data lakes, helping business units better collaborate and use 3D data while avoiding redundancies.
CMP333 | Accelerate Apple application development with Amazon EC2 Mac instances
Dive deep into best practices to automate the provisioning, configuration, and scaling of your Amazon EC2 Mac instances.

Hear from AWS customers

AWS serves millions of customers of all sizes across thousands of use cases, every industry, and around the world. Hear customers dive into how AWS compute solutions have helped them transform their businesses.

CMP212 | Sustainable compute: Reducing costs and carbon emissions with AWS
Learn how to build sustainable and cost-efficient cloud operations that align with your business goals and see how Adobe is leveraging AWS Graviton processors to do the same.
CMP214 | HPC on AWS for semiconductors and healthcare life sciences
See a demo of how organizations like Arm are using AWS for demanding simulation workloads.
CMP320 | Powering Peloton’s journey to personalization with AWS
Learn how Peloton’s solution architecture and technology stack uses Amazon EC2, Amazon S3, and Amazon EKS to build and serve individualized class recommendations in real time with low latency for a great user experience.

Ready to unlock new possibilities?

The AWS Compute team looks forward to seeing you in Las Vegas. Come meet us at the Compute Booth in the Expo. And if you’re looking for more session recommendations, check-out additional re:Invent attendee guides curated by experts.

GoDaddy benchmarking results in up to 24% better price-performance for their Spark workloads with AWS Graviton2 on Amazon EMR Serverless

2023-11-02 Mukul Sharma

Post Syndicated from Mukul Sharma original https://aws.amazon.com/blogs/big-data/godaddy-benchmarking-results-in-up-to-24-better-price-performance-for-their-spark-workloads-with-aws-graviton2-on-amazon-emr-serverless/

This is a guest post co-written with Mukul Sharma, Software Development Engineer, and Ozcan IIikhan, Director of Engineering from GoDaddy.

GoDaddy empowers everyday entrepreneurs by providing all the help and tools to succeed online. With more than 22 million customers worldwide, GoDaddy is the place people come to name their ideas, build a professional website, attract customers, and manage their work.

GoDaddy is a data-driven company, and getting meaningful insights from data helps us drive business decisions to delight our customers. At GoDaddy, we embarked on a journey to uncover the efficiency promises of AWS Graviton2 on Amazon EMR Serverless as part of our long-term vision for cost-effective intelligent computing.

In this post, we share the methodology and results of our benchmarking exercise comparing the cost-effectiveness of EMR Serverless on the arm64 (Graviton2) architecture against the traditional x86_64 architecture. EMR Serverless on Graviton2 demonstrated an advantage in cost-effectiveness, resulting in significant savings in total run costs. We achieved 23.85% improvement in price-performance for sample production Spark workloads—an outcome that holds tremendous potential for businesses striving to maximize their computing efficiency.

Solution overview

GoDaddy’s intelligent compute platform envisions simplification of compute operations for all personas, without limiting power users, to ensure out-of-box cost and performance optimization for data and ML workloads. As a part of this vision, GoDaddy’s Data & ML Platform team plans to use EMR Serverless as one of the compute solutions under the hood.

The following diagram shows a high-level illustration of the intelligent compute platform vision.

Benchmarking EMR Serverless for GoDaddy

EMR Serverless is a serverless option in Amazon EMR that eliminates the complexities of configuring, managing, and scaling clusters when running big data frameworks like Apache Spark and Apache Hive. With EMR Serverless, businesses can enjoy numerous benefits, including cost-effectiveness, faster provisioning, simplified developer experience, and improved resilience to Availability Zone failures.

At GoDaddy, we embarked on a comprehensive study to benchmark EMR Serverless using real production workflows at GoDaddy. The purpose of the study was to evaluate the performance and efficiency of EMR Serverless and develop a well-informed adoption plan. The results of the study have been extremely promising, showcasing the potential of EMR Serverless for our workloads.

Having achieved compelling results in favor of EMR Serverless for our workloads, our attention turned to evaluating the utilization of the Graviton2 (arm64) architecture on EMR Serverless. In this post, we focus on comparing the performance of Graviton2 (arm64) with the x86_64 architecture on EMR Serverless. By conducting this apples-to-apples comparative analysis, we aim to gain valuable insights into the benefits and considerations of using Graviton2 for our big data workloads.

By using EMR Serverless and exploring the performance of Graviton2, GoDaddy aims to optimize their big data workflows and make informed decisions regarding the most suitable architecture for their specific needs. The combination of EMR Serverless and Graviton2 presents an exciting opportunity to enhance the data processing capabilities and drive efficiency in our operations.

AWS Graviton2

The Graviton2 processors are specifically designed by AWS, utilizing powerful 64-bit Arm Neoverse cores. This custom-built architecture provides a remarkable boost in price-performance for various cloud workloads.

In terms of cost, Graviton2 offers an appealing advantage. As indicated in the following table, the pricing for Graviton2 is 20% lower compared to the x86 architecture option.

	x86_64	arm64 (Graviton2)
per vCPU per hour	$0.052624	$0.042094
per GB per hour	$0.0057785	$0.004628
per storage GB per hour*	$0.000111

*Ephemeral storage: 20 GB of ephemeral storage is available for all workers by default—you pay only for any additional storage that you configure per worker.

For specific pricing details and current information, refer to Amazon EMR pricing.

AWS benchmark

The AWS team performed benchmark tests on Spark workloads with Graviton2 on EMR Serverless using the TPC-DS 3 TB scale performance benchmarks. The summary of their analysis are as follows:

Graviton2 on EMR Serverless demonstrated an average improvement of 10% for Spark workloads in terms of runtime. This indicates that the runtime for Spark-based tasks was reduced by approximately 10% when utilizing Graviton2.
Although the majority of queries showcased improved performance, a small subset of queries experienced a regression of up to 7% on Graviton2. These specific queries showed a slight decrease in performance compared to the x86 architecture option.
In addition to the performance analysis, the AWS team considered the cost factor. Graviton2 is offered at a 20% lower cost than the x86 architecture option. Taking this cost advantage into account, the AWS benchmark set yielded an overall 27% better price-performance for workloads. This means that by using Graviton2, users can achieve a 27% improvement in performance per unit of cost compared to the x86 architecture option.

These findings highlight the significant benefits of using Graviton2 on EMR Serverless for Spark workloads, with improved performance and cost-efficiency. It showcases the potential of Graviton2 in delivering enhanced price-performance ratios, making it an attractive choice for organizations seeking to optimize their big data workloads.

GoDaddy benchmark

During our initial experimentation, we observed that arm64 on EMR Serverless consistently outperformed or performed on par with x86_64. One of the jobs showed a 7.51% increase in resource usage on arm64 compared to x86_64, but due to the lower price of arm64, it still resulted in a 13.48% cost reduction. In another instance, we achieved an impressive 43.7% reduction in run cost, attributed to both the lower price and reduced resource utilization. Overall, our initial tests indicated that arm64 on EMR Serverless delivered superior price-performance compared to x86_64. These promising findings motivated us to conduct a more comprehensive and rigorous study.

Benchmark results

To gain a deeper understanding of the value of Graviton2 on EMR Serverless, we conducted our study using real-life production workloads from GoDaddy, which are scheduled to run at a daily cadence. Without any exceptions, EMR Serverless on arm64 (Graviton2) is significantly more cost-effective compared to the same jobs run on EMR Serverless on the x86_64 architecture. In fact, we recorded an impressive 23.85% improvement in price-performance across the sample GoDaddy jobs using Graviton2.

Like the AWS benchmarks, we observed slight regressions of less than 5% in the total runtime of some jobs. However, given that these jobs will be migrated from Amazon EMR on EC2 to EMR Serverless, the overall total runtime will still be shorter due to the minimal provisioning time in EMR Serverless. Additionally, across all jobs, we observed an average speed up of 2.1% in addition to the cost savings achieved.

These benchmarking results provide compelling evidence of the value and effectiveness of Graviton2 on EMR Serverless. The combination of improved price-performance, shorter runtimes, and overall cost savings makes Graviton2 a highly attractive option for optimizing big data workloads.

Benchmarking methodology

As an extension of a larger benchmarking EMR Serverless for GoDaddy study, where we divided Spark jobs into brackets based on total runtime (quick-run, medium-run, long-run), we measured effect of architecture (arm64 vs. x86_64) on total cost and total runtime. All other parameters were kept the same to achieve an apples-to-apples comparison.

The team followed these steps:

Prepare the data and environment.
Choose two random production jobs from each job bracket.
Make necessary changes to avoid inference with actual production outputs.
Run tests to execute scripts over multiple iterations to collect accurate and consistent data points.
Validate input and output datasets, partitions, and row counts to ensure identical data processing.
Gather relevant metrics from the tests.
Analyze results to draw insights and conclusions.

The following table shows the summary of an example Spark job.

Metric	EMR Serverless (Average) – X86_64	EMR Serverless (Average) – Graviton	X86_64 vs Graviton (% Difference)
Total Run Cost	$2.76	$1.85	32.97%
Total Runtime (hh:mm:ss)	00:41:31	00:34:32	16.82%
EMR Release Label	emr-6.9.0
Job Type	Spark
Spark Version	Spark 3.3.0
Hadoop Distribution	Amazon 3.3.3
Hive/HCatalog Version	Hive 3.1.3, HCatalog 3.1.3

Summary of results

The following table presents a comparison of job performance between EMR Serverless on arm64 (Graviton2) and EMR Serverless on x86_64. For each architecture, every job was run at least three times to obtain the accurate average cost and runtime.

Job	Average x86_64 Cost	Average arm64 Cost	Average x86_64 Runtime (hh:mm:ss)	Average arm64 Runtime (hh:mm:ss)	Average Cost Savings %	Average Performance Gain %
1	$1.64	$1.25	00:08:43	00:09:01	23.89%	-3.24%
2	$10.00	$8.69	00:27:55	00:28:25	13.07%	-1.79%
3	$29.66	$24.15	00:50:49	00:53:17	18.56%	-4.85%
4	$34.42	$25.80	01:20:02	01:24:54	25.04%	-6.08%
5	$2.76	$1.85	00:41:31	00:34:32	32.97%	16.82%
6	$34.07	$24.00	00:57:58	00:51:09	29.57%	11.76%
Average					23.85%	2.10%

Note that the improvement calculations are based on higher-precision results for more accuracy.

Conclusion

Based on this study, GoDaddy observed a significant 23.85% improvement in price-performance for sample production Spark jobs utilizing the arm64 architecture compared to the x86_64 architecture. These compelling results have led us to strongly recommend internal teams to use arm64 (Graviton2) on EMR Serverless, except in cases where there are compatibility issues with third-party packages and libraries. By adopting an arm64 architecture, organizations can achieve enhanced cost-effectiveness and performance for their workloads, contributing to more efficient data processing and analytics.

About the Authors

Mukul Sharma is a Software Development Engineer on Data & Analytics (DnA) organization at GoDaddy. He is a polyglot programmer with experience in a wide array of technologies to rapidly deliver scalable solutions. He enjoys singing karaoke, playing various board games, and working on personal programming projects in his spare time.

Ozcan Ilikhan is a Director of Engineering on Data & Analytics (DnA) organization at GoDaddy. He is passionate about solving customer problems and increasing efficiency using data and ML/AI. In his spare time, he loves reading, hiking, gardening, and working on DIY projects.

Harsh Vardhan Singh Gaur is an AWS Solutions Architect, specializing in analytics. He has over 6 years of experience working in the field of big data and data science. He is passionate about helping customers adopt best practices and discover insights from their data.

Ramesh Kumar Venkatraman is a Senior Solutions Architect at AWS who is passionate about containers and databases. He works with AWS customers to design, deploy, and manage their AWS workloads and architectures. In his spare time, he loves to play with his two kids and follows cricket.

Let’s Architect! Cost-optimizing AWS workloads

2023-08-30 Luca Mezzalira

Post Syndicated from Luca Mezzalira original https://aws.amazon.com/blogs/architecture/lets-architect-cost-optimizing-aws-workloads/

Every software component built by engineers and architects is designed with a purpose: to offer particular functionalities and, ultimately, contribute to the generation of business value. We should consider fundamental factors, such as the scalability of the software and the ease of evolution during times of business changes. However, performance and cost are important factors as well since they can impact the business profitability.

This edition of Let’s Architect! follows a similar series post from 2022, which discusses optimizing the cost of an architecture. Today, we focus on architectural patterns, services, and best practices to design cost-optimized cloud workloads. We also want to identify solutions, such as the use of Graviton processors, for increased performance at lower price. Cost optimization is a continuous process that requires the identification of the right tools for each job, as well as the adoption of efficient designs for your system.

AWS re:Invent 2022 – Manage and control your AWS costs

Govern cloud usage and avoid cost surprises without slowing down innovation within your organization. In this re:Invent 2022 session, you can learn how to set up guardrails and operationalize cost control within your organizations using services, such as AWS Budgets and AWS Cost Anomaly Detection, and explore the latest enhancements in the AWS cost control space. Additionally, Mercado Libre shares how they automate their cloud cost control through central management and automated algorithms.

Take me to this re:Invent 2022 video!

Work backwards from team needs to define/deploy cloud governance in AWS environments

Compute optimization

When it comes to optimizing compute workloads, there are many tools available, such as AWS Compute Optimizer, Amazon EC2 Spot Instances, Amazon EC2 Reserved Instances, and Graviton instances. Modernizing your applications can also lead to cost savings, but you need to know how to use the right tools and techniques in an effective and efficient way.

For AWS Lambda functions, you can use the AWS Lambda Cost Optimization video to learn how to optimize your costs. The video covers topics, such as understanding and graphing performance versus cost, code optimization techniques, and avoiding idle wait time. If you are using Amazon Elastic Container Service (Amazon ECS) and AWS Fargate, you can watch a Twitch video on cost optimization using Amazon ECS and AWS Fargate to learn how to adjust your costs. The video covers topics like using spot instances, choosing the right instance type, and using Fargate Spot.

Finally, with Amazon Elastic Kubernetes Service (Amazon EKS), you can use Karpenter, an open-source Kubernetes cluster auto scaler to help optimize compute workloads. Karpenter can help you launch right-sized compute resources in response to changing application load, help you adopt spot and Graviton instances. To learn more about Karpenter, read the post How CoStar uses Karpenter to optimize their Amazon EKS Resources on the AWS Containers Blog.

Take me to Cost Optimization using Amazon ECS and AWS Fargate!
Take me to AWS Lambda Cost Optimization!
Take me to How CoStar uses Karpenter to optimize their Amazon EKS Resources!

Karpenter launches and terminates nodes to reduce infrastructure costs

AWS Lambda general guidance for cost optimization

AWS Graviton deep dive: The best price performance for AWS workloads

The choice of the hardware is a fundamental driver for performance, cost, as well as resource consumption of the systems we build. Graviton is a family of processors designed by AWS to support cloud-based workloads and give improvements in terms of performance and cost. This re:Invent 2022 presentation introduces Graviton and addresses the problems it can solve, how the underlying CPU architecture is designed, and how to get started with it. Furthermore, you can learn the journey to move different types of workloads to this architecture, such as containers, Java applications, and C applications.

Take me to this re:Invent 2022 video!

AWS Graviton processors are specifically designed by AWS for cloud workloads to deliver the best price performance

AWS Well-Architected Labs: Cost Optimization

The Cost Optimization section of the AWS Well Architected Workshop helps you learn how to optimize your AWS costs by using features, such as AWS Compute Optimizer, Spot Instances, and Reserved Instances. The workshop includes hands-on labs that walk you through the process of optimizing costs for different types of workloads and services, such as Amazon Elastic Compute Cloud, Amazon ECS, and Lambda.

Take me to this AWS Well-Architected lab!

Savings Plans is a flexible pricing model that can help reduce expenses compared with on-demand pricing

See you next time!

Thanks for joining us to discuss cost optimization! In 2 weeks, we’ll talk about in-memory databases and caching systems.

To find all the blogs from this series, visit the Let’s Architect! list of content on the AWS Architecture Blog.

New Amazon EC2 Instances (C7gd, M7gd, and R7gd) Powered by AWS Graviton3 Processor with Local NVMe-based SSD Storage

2023-07-28 Channy Yun

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/new-amazon-ec2-instances-c7gd-m7gd-and-r7gd-powered-by-aws-graviton3-processor-with-local-nvme-based-ssd-storage/

We launched Amazon EC2 C7g instances in May 2022 and M7g and R7g instances in February 2023. Powered by the latest AWS Graviton3 processors, the new instances deliver up to 25 percent higher performance, up to two times higher floating-point performance, and up to 2 times faster cryptographic workload performance compared to AWS Graviton2 processors.

Graviton3 processors deliver up to 3 times better performance compared to AWS Graviton2 processors for machine learning (ML) workloads, including support for bfloat16. They also support DDR5 memory that provides 50 percent more memory bandwidth compared to DDR4. Graviton3 also uses up to 60 percent less energy for the same performance as comparable EC2 instances, which helps you reduce your carbon footprint.

The C7g instances are well suited for compute-intensive workloads, such as high performance computing (HPC), batch processing, ad serving, video encoding, gaming, scientific modeling, distributed analytics, and CPU-based machine learning inference. The M7g instances are for general purpose workloads such as application servers, microservices, gaming servers, mid-sized data stores, and caching fleets. The R7g instances are a great fit for memory-intensive workloads such as open-source databases, in-memory caches, and real-time big data analytics.

Today, we’re adding a d variant to all three instance families. The new Amazon EC2 C7gd, M7gd, and R7gd instance types have NVM Express (NVMe) locally attached up to 2 x 1.9 TB SSD drives that are physically connected to the host server and provide block-level storage that is coupled to the lifetime of the instance. These instances have up to 45 percent better real-time NVMe storage performance than comparable Graviton2-based instances.

These are a great fit for applications that need access to high-speed, low-latency local storage, including those that need temporary storage of data for scratch space, temporary files, and caches. The data on an instance store volume persists only during the life of the associated EC2 instance.

Here are the specs for these instances:

Instance Size	vCPU	Memory (GiB)	Local NVMe Storage (GB)	Network Bandwidth (Gbps)	EBS Bandwidth (Gbps)
C7gd/M7gd/R7gd		C7gd/M7gd/R7gd	C7gd/M7gd/R7gd
medium	1	2/ 4 / 8	1 x 59	Up to 12.5	Up to 10
large	2	4 / 8 / 16	1 x 118	Up to 12.5	Up to 10
xlarge	4	8 / 16 / 32	1 x 237	Up to 12.5	Up to 10
2xlarge	8	16 / 32 / 64	1 x 474	Up to 15	Up to 10
4xlarge	16	32 / 64 / 128	1 x 950	Up to 15	Up to 10
8xlarge	32	64 / 128 / 256	1 x 1900	15	10
12xlarge	48	96 / 192/ 384	2 x 1425	22.5	15
16xlarge	64	128 / 256 / 512	2 x 1900	30	20

These instances are built on the AWS Nitro System, a combination of AWS-designed dedicated hardware and a lightweight hypervisor that allows the delivery of isolated multitenancy, private networking, and fast local storage. They provide up to 20 Gbps Amazon Elastic Block Store (Amazon EBS) bandwidth and up to 30 Gbps network bandwidth. The 16xlarge instances also support Elastic Fabric Adapter (EFA) for applications that need a high level of inter-node communication.

Now Available
Amazon EC2 C7gd, M7gd, and R7gd instances are now available in the following AWS Regions: US East (Ohio), US East (N. Virginia), US West (Oregon), and Europe (Ireland). As usual with Amazon EC2, you only pay for what you use. For more information, see the Amazon EC2 pricing page.

If you’re optimizing applications for Arm architecture, be sure to have a look at our Getting Started collection of resources or learn more about AWS Graviton3-based EC2 instances.

To learn more, visit our Amazon EC2 C7g instances, M7g instances or R7g instances page, and please send feedback to AWS re:Post for EC2 or through your usual AWS Support contacts.

— Channy

Mixing AWS Graviton with x86 CPUs to optimize cost and resiliency using Amazon EKS

2023-07-13 Macey Neff

Post Syndicated from Macey Neff original https://aws.amazon.com/blogs/compute/mixing-aws-graviton-with-x86-cpus-to-optimize-cost-and-resilience-using-amazon-eks/

This post is written by Yahav Biran, Principal SA, and Yuval Dovrat, Israel Head Compute SA.

This post shows you how to integrate AWS Graviton-based Amazon EC2 instances into an existing Amazon Elastic Kubernetes Service (Amazon EKS) environment running on x86-based Amazon EC2 instances. Customers use mixed-CPU architectures to enable their application to utilize a wide selection of Amazon EC2 instance types and improve overall application resilience. In order to successfully run a mixed-CPU application, it is strongly recommended that you test application performance in a test environment before running production applications on Graviton-based instances. You can follow AWS’ transition guide to learn more about porting your application to AWS Graviton.

This example shows how you can use KEDA for controlling application capacity across CPU types in EKS. KEDA will trigger a deployment based on the application’s response latency as measured by the Application Load Balancer (ALB). To simplify resource provisioning, Karpenter, an open-source Kubernetes node provisioning software, and AWS Load Balancer Controller, are shown as well.

Solution Overview

There are two solutions that this post covers to test a mixed-CPU application. The first configuration (shown in Figure 1 below) is the “A/B Configuration”. It uses an Application Load Balancer (ALB)-based Ingress to control traffic flowing to x86-based and Graviton-based node pools. You use this configuration to gradually migrate a live application from x86-based instances to Graviton-based instances, while validating the response time with Amazon CloudWatch.

Figure 1, config 1: A/B Configuration

In the second configuration, the “Karpenter Controlled Configuration” (shown in Figure 2 below as Config 2), Karpenter automatically controls the instance blend. Karpenter is configured to use weighted provisioners with values that prioritize AWS Graviton-based Amazon EC2 instances over x86-based Amazon EC2 instances.

Figure 2, config II: Karpenter Controlled Configuration, with Weighting provisioners topology

It is recommended that you start with the “A/B” configuration to measure the response time of live requests. Once your workload is validated on Graviton-based instances, you can build the second configuration to simplify the deployment configuration and increase resiliency. This enables your application to automatically utilize x86-based instances if needed, for example, during an unplanned large-scale event.

You can find the step-by-step guide on GitHub to help you to examine and try the example app deployment described in this post. The following provides an overview of the step-by-step guide.

Code Migration to AWS Graviton

The first step is migrating your code from x86-based instances to Graviton-based instances. AWS has multiple resources to help you migrate your code. These include AWS Graviton Fast Start Program, AWS Graviton Technical Guide GitHub Repository, AWS Graviton Transition Guide, and Porting Advisor for Graviton.

After making any required changes, you might need to recompile your application for the Arm64 architecture. This is necessary if your application is written in a language that compiles to machine code, such as Golang and C/C++, or if you need to rebuild native-code libraries for interpreted/JIT compiled languages such as the Python/C API or Java Native Interface (JNI).

To allow your containerized application to run on both x86 and Graviton-based nodes, you must build OCI images for both the x86 and Arm64 architectures, push them to your image repository (such as Amazon ECR), and stitch them together by creating and pushing an OCI multi-architecture manifest list. You can find an overview of these steps in this AWS blog post. You can also find the AWS Cloud Development Kit (CDK) construct on GitHub to help get you started.

To simplify the process, you can use a Linux distribution package manager that supports cross-platform packages and avoid platform-specific software package names in the Linux distribution wherever possible. For example, use:

RUN pip install httpd

instead of:

ARG ARCH=aarch64 or amd64
RUN yum install httpd.${ARCH}

This blog post shows you how to automate multi-arch OCI image building in greater depth.

Application Deployment

Config 1 – A/B controlled topology

This topology allows you to migrate to Graviton while validating the application’s response time (approximately 300ms) on both x86 and Graviton-based instances. As shown in Figure 1, this design has a single Listener that forwards incoming requests to two Target Groups. One Target Group is associated with Graviton-based instances, while the other Target Group is associated with x86-based instances. The traffic ratio associated with each target group is defined in the Ingress configuration.

Here are the steps to create Config 1:

Create two KEDA ScaledObjects that scale the number of pods based on the latency metric (AWS/ApplicationELB-TargetResponseTime) that matches the target group (triggers.metadata.dimensionValue). Declare the maximum acceptable latency in targetMetricValue:0.3.
Below is the Graviton deployment scaledObject (spec.scaleTargetRef), note the comments that denote the value of the x86 deployment scaledObject

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
…
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: armsimplemultiarchapp #amdsimplemultiarchapp
…
  triggers:                 
    - type: aws-cloudwatch
      metadata:
        namespace: "AWS/ApplicationELB"
        dimensionName: "LoadBalancer"
        dimensionValue: "app/simplemultiarchapp/xxxxxx"
        metricName: "TargetResponseTime"
        targetMetricValue: "0.3"

Once the topology has been created, add Amazon CloudWatch Container Insights to measure CPU, network throughput, and instance performance.
To simplify testing and control for potential performance differences in instance generations, create two dedicated Karpenter provisioners and Kubernetes Deployments (replica sets) and specify the instance generation, CPU count, and CPU architecture for each one. This example uses c7g (Graviton3) and c6i (Intel) . You will remove these constraints in the next topology to allow more allocation flexibility.

The x86-based instances Karpenter provisioner:

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: x86provisioner
spec:
  requirements:
  - key: karpenter.k8s.aws/instance-generation
    operator: In
    values:
    - "6"
  - key: karpenter.k8s.aws/instance-cpu
    operator: In 
    values:
    - "2"
  - key: kubernetes.io/arch
    operator: In
    values:
    - amd64

The Graviton-based instances Karpenter provisioner:

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: arm64provisioner
spec:
  requirements:
  - key: karpenter.k8s.aws/instance-generation
    operator: In
    values:
    - "7"
  - key: karpenter.k8s.aws/instance-cpu
    operator: In
    values:
    - "2"
  - key: kubernetes.io/arch
    operator: In
    values:
    - arm64

Create two Kubernetes Deployment resources—one per CPU architecture—that use nodeSelector to schedule one Deployment on Graviton-based instances, and another Deployment on x86-based instances. Similarly, create two NodePort Service resources, where each Service points to its architecture-specific ReplicaSet.
Create an Application Load Balancer using the AWS Load Balancer Controller to distribute incoming requests among the different pods. Control the traffic routing in the ingress by adding an ingress.kubernetes.io/actions.weighted-routing annotation. You can adjust the weight in the example below to meet your needs. This migration example started with a 100%-to-0% x86-to-Graviton ratio, adjusting over time by 10% increments until it reached a 0%-to-100% x86-to-Graviton ratio.

…
alb.ingress.kubernetes.io/actions.weighted-routing: | 
{
…
  "targetGroups":[
    {
      "serviceName":"armsimplemultiarchapp-svc",
      "servicePort":"80","weight":50
    },
    {
      "serviceName":"amdsimplemultiarchapp-svc",
      "servicePort":"80","weight":50}]
    }
 }

spec:
  ingressClassName: alb
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: weighted-routing

You can simulate live user requests to an example application ALB endpoint. Amazon CloudWatch populates ALB Target Group request/second metrics, dimensioned by HTTP response code, to help assess the application throughput and CPU usage.

During the simulation, you will need to verify the following:

Both Graviton-based instances and x86-based instances pods process a variable amount of traffic.
The application response time (p99) meets the performance requirements (300ms).

The orange (Graviton) and blue (x86) curves of HTTP 2xx responses (figure 4) show the application throughput (HTTP requests/seconds) for each CPU architecture during the migration.

Figure 3 HTTP 2XX per CPU architecture

Figure 4 shows an example of application response time during the transition from x86-based instances to Graviton-based instances. The latency associated with each instance family grows and shrinks as the live request simulation changes the load on the application. In this example, the latency on x86 instances (until 07:00) grew up to 300ms because most of the request load was directed at to x86-based pods. It began to converge at around 08:00 when more pods were powered by Graviton-based instances. Finally, after 15:00, the request load was processed by Graviton-based instances entirely.

Figure 4: Target Response Time p99

Config 2 – Karpenter Controlled Configuration

After fully testing the application on Graviton-based EC2 instances, you are ready to simplify the deployment topology with weighted provisioners while preserving the ability to launch x86-based instances as needed.

Here are the steps to create Config 2:

Reuse the CPU-based provisioners from the previous topology, but assign a higher .spec.weight to Graviton-based instances provisioner. The x86 provisioner is still deployed in case x86-based instances are required. The karpenter.k8s.aws/instance-family can be expanded beyond those set in Config 1 or excluded by switching the operator to NotIn.

The x86-based Amazon EC2 instances Karpenter provisioner:

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: x86provisioner
spec:
  requirements:
  - key: kubernetes.io/arch
    operator: In
    values: [amd64]

The Graviton-based Amazon EC2 instances Karpenter provisioner:

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: priority-arm64provisioner
spec:
  weight: 10
  requirements:
  - key: kubernetes.io/arch
    operator: In
    values: [arm64]

Next, merge the two Kubernetes deployments into one deployment similar to the original before migration (i.e., no specific nodeSelector that points to a CPU-specific provisioner).

The two services are also combined into a single Kubernetes service and the actions.weighted-routing annotation is removed from the ingress resources:

spec:
  rules:
    - http:
        paths:
          - path: /app
            pathType: Prefix
            backend:
              service:
                name: simplemultiarchapp-svc

Unite the two KEDA ScaledObject resources from the first configuration and point them to a single deployment, e.g., simplemultiarchapp. The new KEDA ScaledObject will be:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: simplemultiarchapp-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: simplemultiarchapp
…

Figure 5 Config 2 – Weighting provisioners results

A synthetic limit on Graviton CPU capacity is set to illustrate the scaling to x86_64 CPUs (Provisioner.limits.resources.cpu). The total application throughput (figure 6) is shown by aarch64_200 (blue) and x86_64_200 (orange). Mixing CPUs did not impact the target response time (Figure 6). Karpenter behaved as expected: prioritizing Graviton-based instances, and bursting to x86-based Amazon EC2 instances when CPU limits were crossed.

Figure 6 Config 2 -HTTP response time p99 with mixed-CPU provisioner

Conclusion

The use of a mixed-CPU architecture enables your application to utilize a wide selection of Amazon EC2 instance types and improves your applications’ resilience while meeting your service-level objectives. Application metrics can be used to control the migration with AWS ALB Ingress, Karpenter, and KEDA. Moreover, AWS Graviton-based Amazon EC2 instances can deliver up to 40% better price performance than x86-based Amazon EC2 instances. Learn more about this example on GitHub and more announcements about Gravtion.

New – Amazon EC2 Hpc7g Instances Powered by AWS Graviton3E Processors Optimized for High Performance Computing Workloads

2023-06-20 Channy Yun

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/new-amazon-ec2-hpc7g-instances-powered-by-aws-graviton3e-processors-optimized-for-high-performance-computing-workloads/

At AWS re:Invent 2022, Adam Selipsky, CEO of AWS, explained high performance computing (HPC) workloads typically can either be compute-intensive, compute- and networking-intensive, or data- and memory-intensive in his keynote.

Compute workloads include weather forecasting, computational fluid dynamics, and financial options pricing. To help with this, you have Amazon EC2 Hpc6a instances, which deliver up to 65 percent better price performance over comparable compute optimized x86-based instances.

Other HPC workloads require modeling the performance of complex structures—things like wind turbines, concrete buildings, and industrial equipment. Without enough data and memory, these models can take days or weeks to run in a cost-effective way. The Amazon EC2 Hpc6id instance is designed to deliver leading price performance for data and memory-intensive HPC workloads with higher memory bandwidth per core, faster local solid-state drive (SSD) storage, and enhanced networking with Elastic Fabric Adapter (EFA).

Announcing Amazon EC2 Hpc7g Instances
Compute-intensive HPC workloads such as weather forecasting, computational fluid dynamics, and financial options pricing also require more network performance, even better price performance, and greater energy efficiency.

Today we are announcing the general availability of Amazon EC2 Hpc7g instances, a new purpose-built instance type for tightly coupled compute and network-intensive HPC workloads.

Hpc7g instances are powered by AWS Graviton3E processors that provide up to two times better floating-point performance and 200 Gbps dedicated EFA bandwidth than EC2 C6gn instances powered by AWS Graviton2 processors and are up to 60 percent more energy efficient than comparable x86 instances.

Here’s a quick infographic that shows you how the Hpc7g instances and the Graviton3E processors compare to previous instances and processors:

Hpc7g instances feature sizes of up to 64 cores of the latest AWS custom Graviton3E CPUs with 128 GiB RAM. Here are the detailed specs:

Instance Name	CPUs	RAM (GiB)	EFA Network Bandwidth (Gbps)	Attached Storage
hpc7g.4xlarge	16	128	Up to 200	EBS Only
hpc7g.8xlarge	32	128	Up to 200	EBS Only
hpc7g.16xlarge	64	128	Up to 200	EBS Only

Hpc7g instances are the most cost-efficient option to scale your HPC clusters on AWS. If you are considering migrating your largest HPC workloads requiring tens of thousands of cores at scale to AWS, you can take advantage of up to 200 Gbps EFA bandwidth to reduce the latency and run message passing interface (MPI) applications on parallel computing architectures while ensuring minimized power consumption on Hpc7g instances.

You can choose to use smaller sizes of Hpc7g instances to pick a lower number of cores and evenly distribute memory and network resources across the remaining cores to increase per-core performance to help reduce software licensing costs.

You can also use Hpc7g instances with AWS ParallelCluster to offer a complete HPC run-time environment that spans both x86 and arm64 instance types, giving you the flexibility to run different workload types within the same HPC cluster. You can compare and contrast performance, thus making it easier to find out what’s best for you and enabling easier porting of your workload.

Customer Story
The Water Institute is an independent, non-profit applied research organization that works across disciplines to advance science and develop integrated methods used to solve complex environmental and societal challenges.

They benchmarked the Hpc7g instances with 200 Gbps EFA using the Advanced Circulation (ADCIRC) model. ADCIRC is deployed throughout many US government agencies to simulate the movement of water due to astronomic tides, riverine flows, and atmospheric forces, including hurricanes and it is often used for real-time forecasting applications and design studies.

The model run for this application is targeted at Southern Louisiana and is the basis for most of the analysis conducted there including levee design, planning studies, and real-time hurricane storm surge forecasting applications. The left graphic above shows the full extent of the domain, while to the right of that, the high-resolution area targeted at Southern Louisiana shows flooding around the levees in New Orleans during a simulation of Hurricane Katrina.

The model contains 1.6 million vertices and 3 million elements. It’s these parameters that affect the computational complexity of the simulations. The simulations depict 18 days of astronomic tide, river inflows, and atmospheric wind and pressure forcing.

The Water Institute benchmarked against many of the instance types that would be useful for their workload types at AWS, including c6gn.16xlarge, hpc7g.16xlarge, hpc6a.48xlarge, and hpc6id.36xlarge.

The Hpc7g instance shows more than 40 percent better performance than the C6gn instance and has comparable performance to other high performance x86 instance types but with a better price-to-performance ratio. With Hpc7g instances, the Water Institute can lower its costs while maintaining the performance levels they expect.

RIKEN, who has built the powerful supercomputer, FUGAKU using arm64, is collaborating with AWS to create a virtual Fugaku using Hpc7g with Graviton3E to support Japanese manufacturers’ increasing demand for compute power. RIKEN has already confirmed that multiple Fugaku applications provide excellent performance on the AWS Graviton3E processor in the AWS cloud environment.

Also, Siemens has optimized the scalability of Simcenter STAR-CCM+ across a broad range of CPU and GPU instances on AWS. This technology is supported on Linux and available through Arm-based EC2 instances or the Fugaku supercomputer.

To hear more voices of customers and partners such as Ansys, Arup, CERFACS, ESI, Jij, ParTec, Rescale, and TotalCAE, see the Hpc7g instances page.

Now Available
Amazon EC2 Hpc7g instances are now generally available in the US East (N. Virginia) Region for purchase in On-Demand, Reserved Instance, and Savings Plan form.

To learn more, see the Amazon EC2 Hpc7g instances page. Give it a try, and please send feedback to AWS re:Post for High Performance Compute or through your usual AWS support contacts.

– Channy

Understanding performance-driven cost optimization in the GSD

Key dashboard components

NIH reduction analysis

Enhanced cost analysis visualizations

Detailed savings analysis

Conclusion

Instance type flexibility strategies

Regional adaptation techniques

Best practices for EC2 Spot Instances usage with Graviton-based instances

Attribute-based instance type selection

Special considerations

Performance testing with multiple instance types

Reserving specific Graviton instance types

Amazon EMR workloads

Conclusion

M7g instances vs. M5 instances on Amazon MQ

Workload capacity improvements

Throughput improvements

Cost savings on cluster disk usage

Pricing and Regional availability

Summary

About the authors

Architecture overview

Deployment

Examining the results: unlocking insights from your Graviton Savings Dashboard

Using dashboard insights: a FinOps team use case

Conclusion

The example

Overview

Initial data collection

First phase: system level

Second phase: CPU core level

Addressing CPU Core bottlenecks

Code sparsity

Instruction TLB misses/1000 instructions (c7g.16xl)

Instructions per clock cycle (IPC)

Results

Re-examining the setup of our testing enviornment: Changing where the load-generator lives

Branch prediction misses/1000 instructions

IPC

Code sparsity

Conclusion

M7g instances on Amazon MSK

M7g instances in action

Lower costs, with lesser operational burden, and higher resiliency

Pricing and Regions

Summary

About the Authors

Explore the latest compute innovations

Customer experiences and applications with machine learning

Discover what powers AWS compute

Optimize your compute costs

Check out workload-specific sessions

Hear from AWS customers

Ready to unlock new possibilities?

Solution overview

Benchmarking EMR Serverless for GoDaddy

AWS Graviton2

AWS benchmark

GoDaddy benchmark

Benchmark results

Benchmarking methodology

Summary of results

Conclusion

About the Authors

Compute optimization

See you next time!

Solution Overview

Code Migration to AWS Graviton

Application Deployment

Config 1 – A/B controlled topology

Config 2 – Karpenter Controlled Configuration

Conclusion

The collective thoughts of the interwebz

Summary of results