Tag Archives: Amazon EC2

Automating multi-AZ high availability for WebLogic administration server

Post Syndicated from Jack Zhou original https://aws.amazon.com/blogs/architecture/automating-multi-az-high-availability-for-weblogic-administration-server/

Oracle WebLogic Server is used by enterprises to power production workloads, including Oracle E-Business Suite (EBS) and Oracle Fusion Middleware applications.

Customer applications are deployed to WebLogic Server instances (managed servers) and managed using an administration server (admin server) within a logical organization unit, called a domain. Clusters of managed servers provide application availability and horizontal scalability, while the single-instance admin server does not host applications.

There are various architectures detailing WebLogic-managed server high availability (HA). In this post, we demonstrate using Availability Zones (AZ) and a floating IP address to achieve a “stretch cluster” (Oracle’s terminology).

Figure 1. Overview of a WebLogic domain

Overview of problem

The WebLogic admin server is central to domain configuration, management, and monitoring of both application performance and system health. Historically, WebLogic was configured using IP addresses, with managed servers caching the admin server IP to reconnect if the connection was lost.

This causes issues in a dynamic cloud setup, because replacing the admin server from a template changes its IP address, creating two connectivity issues:

  1. Communication within the domain: the admin and managed servers communicate via the T3 protocol, which is based on Java RMI.
  2. Remote access to the admin server console: the console must remain reachable at a stable address. (Whether to allow internet access to the admin console, and what additional security controls that may require, is beyond the scope of this post.)

Here, we will explore how to minimize downtime and achieve HA for your admin server.

Solution overview

For this solution, there are three approaches customers tend to follow:

  1. Use a floating virtual IP to keep the address static. This solution is familiar to WebLogic administrators, as it replicates historical on-premises HA implementations. The remainder of this post dives into this practical implementation.
  2. Use DNS to resolve the admin server IP address. This is also a supported configuration.
  3. Run in a “headless configuration” and not (normally) run the admin server.
    • Use the WebLogic Scripting Tool (WLST) to issue commands
    • Collect and observe metrics through other tools

Running “headless” requires a high level of operational maturity, and it may not be compatible with certain vendor-packaged applications deployed to WebLogic.

Using a floating IP address for WebLogic admin server

Here, we discuss the reference WebLogic deployment architecture on AWS, as depicted in Figure 2.

Figure 2. Reference WebLogic deployment with multi-AZ admin HA capability

In this example, a WebLogic domain resides in a virtual private cloud’s (VPC) private subnet. The admin server is on its own Amazon Elastic Compute Cloud (Amazon EC2) instance. It’s bound to the private IP 10.0.11.8 that floats across AZs within the VPC. There are two ways to achieve this:

  1. Create a “dummy” subnet in the VPC (in any AZ), with the smallest allowed subnet size of /28. Excluding the first four and the last IP addresses of the subnet (they’re reserved), choose an address; for a 10.0.11.0/28 subnet, we use 10.0.11.8 and configure the WebLogic admin server to bind to it. (A CLI sketch follows this list.)
  2. Use an IP outside of the VPC. We discuss this second way and compare both processes in the later section “Alternate solution for multi-AZ floating IP”.
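
As a rough sketch of the first option, the dummy subnet can be created with the AWS CLI as follows (the VPC ID and AZ are placeholders):

# Create a /28 "dummy" subnet that reserves the floating IP range
aws ec2 create-subnet \
    --vpc-id vpc-0123456789abcdef0 \
    --cidr-block 10.0.11.0/28 \
    --availability-zone us-east-1a

# Nothing is ever launched into this subnet, so AWS never assigns 10.0.11.8;
# the launch template user data (shown later) binds it to the admin server's ENI.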

This example AWS stretch architecture uses one WebLogic domain and one admin server:

  • Create a VPC across two or more AZs, with one private subnet in each AZ for managed servers and an additional “dummy” subnet.
  • Create two EC2 instances, one for each of the WebLogic Managed Servers (distributed across the private subnets).
  • Use an Auto Scaling group to ensure a single admin server is always running.
    • Create an Amazon EC2 launch template for the admin server.
    • Associate the launch template with an Auto Scaling group that has minimum, maximum, and desired capacity of 1. The Auto Scaling group (ASG) detects EC2 and/or AZ degradation and launches a new instance in a different AZ if the current one fails.
    • Create an AWS Lambda function (example to follow) to be called by the Auto Scaling group lifecycle hook to update the route tables (the hook wiring is sketched after this list).
    • Update the user data commands (example to follow) of the launch template to:
      • Add the floating IP address to the network interface
      • Start the admin server using the floating IP
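
As a rough sketch of the ASG and lifecycle hook wiring (names and subnet IDs are placeholders; the EventBridge rule that routes the hook event to the Lambda function is not shown), the AWS CLI calls look like this:

# Create the single-instance ASG from the admin server launch template
aws autoscaling create-auto-scaling-group \
    --auto-scaling-group-name weblogic-admin-asg \
    --launch-template LaunchTemplateName=weblogic-admin-lt \
    --min-size 1 --max-size 1 --desired-capacity 1 \
    --vpc-zone-identifier "subnet-0123456789abcdef0,subnet-0fedcba9876543210"

# Attach a launch lifecycle hook; the Lambda function completes it after
# updating the route tables
aws autoscaling put-lifecycle-hook \
    --lifecycle-hook-name admin-route-update \
    --auto-scaling-group-name weblogic-admin-asg \
    --lifecycle-transition autoscaling:EC2_INSTANCE_LAUNCHING \
    --heartbeat-timeout 300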

To route traffic to the floating IP, we update route tables for both public and private subnets.

We create a Lambda function that is launched by the Auto Scaling group lifecycle hook (pending:InService) when a new admin instance is created. This Lambda code updates the routing rules in both route tables, mapping the dummy subnet CIDR (10.0.11.0/28) of the “floating” IP to the admin server’s EC2 instance. This updates routes in both the public and private subnets for the dynamically launched admin server, enabling managed servers to connect.

Enabling internet access to the admin server

If enabling internet access to the admin server, create an internet-facing Application Load Balancer (ALB) attached to the public subnets. With the route to the admin server, the ALB can forward traffic to it.

  • Create an IP-based target group that points to the floating IP.
  • Add a forwarding rule in the ALB to route WebLogic admin traffic to the admin server (a CLI sketch follows this list).
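
As an illustrative sketch (the VPC ID and target group ARN are placeholders, and port 7001 assumes the WebLogic default admin port), the target group can be created and the floating IP registered like this:

# Create an IP-based target group for the admin console
aws elbv2 create-target-group \
    --name weblogic-admin \
    --protocol HTTP \
    --port 7001 \
    --target-type ip \
    --vpc-id vpc-0123456789abcdef0

# Register the floating IP as the sole target
aws elbv2 register-targets \
    --target-group-arn arn:aws:elasticloadbalancing:us-east-1:111122223333:targetgroup/weblogic-admin/0123456789abcdef \
    --targets Id=10.0.11.8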

User data commands in the launch template to make the admin server accessible upon ASG scale-out

In the admin server EC2 launch template, add user data code to monitor the ASG lifecycle state. When the instance reaches the InService state, a Lambda function is invoked to update the route tables. Then, the script starts the WebLogic admin server Java process (and the associated NodeManager, if used).

The admin server instance’s SourceDestCheck attribute must be set to false so that it can bind to the logical IP. This change can also be made in the Lambda function.
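
For reference, the same change can be made outside the Lambda function with a single CLI call (the instance ID is a placeholder):

# Disable source/destination checking so the instance can receive traffic for 10.0.11.8
aws ec2 modify-instance-attribute \
    --instance-id i-0123456789abcdef0 \
    --no-source-dest-check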

When a user accesses the admin server from the internet:

  1. Traffic flows to the elastic IP address associated with the internet-facing ALB.
  2. The ALB forwards it to the configured target group.
  3. The ALB uses the updated routes to reach 10.0.11.8 (the admin server).

When managed servers communicate with the admin server, they use the updated route table to reach 10.0.11.8 (admin server).

The Lambda function

Here, we present an example Lambda function that sets the EC2 instance’s SourceDestCheck attribute to false and updates the route rules for the dummy subnet CIDR (the “floating” IP on the admin server EC2 instance) in both the public and private route tables.

import { AutoScalingClient, CompleteLifecycleActionCommand } from "@aws-sdk/client-auto-scaling";
import { EC2Client, DeleteRouteCommand, CreateRouteCommand, ModifyInstanceAttributeCommand } from "@aws-sdk/client-ec2";

export const handler = async (event) => {
  console.log('LogAutoScalingEvent');
  console.log('Received event:', JSON.stringify(event, null, 2));

  // IMPORTANT: replace with your dummy subnet CIDR that the floating IP resides in
  const destCIDR = "10.0.11.0/28";
  // IMPORTANT: replace with your route table IDs
  const rtTables = ["rtb-**************ff0", "rtb-**************af5"];

  const asClient = new AutoScalingClient({ region: event.region });
  const ec2client = new EC2Client({ region: event.region });
  const eventDetail = event.detail;

  // Disable source/destination checking so the instance can receive
  // traffic addressed to the floating IP
  const inputModifyAttr = {
    SourceDestCheck: { Value: false },
    InstanceId: eventDetail['EC2InstanceId'],
  };
  await ec2client.send(new ModifyInstanceAttributeCommand(inputModifyAttr));

  // Modify the route in both route tables: delete the stale route (if any),
  // then point the dummy subnet CIDR at the newly launched admin instance
  for (const rt of rtTables) {
    const inputDelRoute = {
      DestinationCidrBlock: destCIDR,
      DryRun: false,
      RouteTableId: rt, // required
    };
    try {
      const response = await ec2client.send(new DeleteRouteCommand(inputDelRoute));
      console.log(response);
    } catch (error) {
      // The route may not exist on first launch; log and continue
      console.log(error);
    }

    const inputCreateRoute = {
      DestinationCidrBlock: destCIDR,
      DryRun: false,
      InstanceId: eventDetail['EC2InstanceId'],
      RouteTableId: rt, // required
    };
    await ec2client.send(new CreateRouteCommand(inputCreateRoute));
  }

  // Signal the ASG to continue the instance's lifecycle
  const params = {
    AutoScalingGroupName: eventDetail['AutoScalingGroupName'], // required
    LifecycleActionResult: 'CONTINUE', // required
    LifecycleHookName: eventDetail['LifecycleHookName'], // required
    InstanceId: eventDetail['EC2InstanceId'],
    LifecycleActionToken: eventDetail['LifecycleActionToken']
  };
  const response = await asClient.send(new CompleteLifecycleActionCommand(params));
  console.log(response);
  return response;
};

Amazon EC2 user data

The following Amazon EC2 user data code shows how to add the logical secondary IP address to the instance's primary ENI, poll the ASG lifecycle state, and start the admin server Java process once the instance enters the InService state.

Content-Type: text/x-shellscript; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="userdata.txt"

#!/bin/bash

# Bind the floating IP to the instance's primary network interface
ip addr add 10.0.11.8/28 brd 10.0.11.255 dev eth0

# Poll the ASG target lifecycle state (via IMDSv2) and start the admin server
# once the instance reaches InService
TOKEN=$(curl -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
for x in {1..30}
do
  target_state=$(curl -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/autoscaling/target-lifecycle-state)
  if [ "$target_state" = "InService" ]; then
    su -c 'nohup /mnt/efs/wls/fmw/install/Oracle/Middleware/Oracle_Home/user_projects/domains/domain1/bin/startWebLogic.sh &' ec2-user
    break
  fi
  sleep 10
done

Alternate solution for multi-AZ floating IP

An alternative solution for the floating IP is to use an IP address external to the VPC. The configurations for the ASG, Amazon EC2 launch template, and ASG lifecycle hook Lambda function remain the same. However, the ALB cannot expose the WebLogic admin console webapp to the internet, because its IP targets must reside in a VPC subnet. To access the webapp in this scenario, stand up a bastion host in a public subnet.

While this approach “saves” 16 VPC IP addresses by avoiding a dummy subnet, there are disadvantages:

  • Bastion hosts are not resilient to AZ failure.
  • It lacks the true multi-AZ resilience of the first solution.
  • It adds cost and complexity to manage multiple bastion hosts across AZs or a VPN.

Conclusion

AWS has a track record of efficiently running Oracle applications, Oracle EBS, PeopleSoft, and mission-critical JEE workloads. In this post, we delved into an HA solution using a multi-AZ floating IP for the WebLogic admin server, with an ASG ensuring a single admin server instance. We showed how to use ASG lifecycle hooks and Lambda to automate route updates for the floating IP, and how to configure an ALB to allow internet access to the admin server. This solution achieves multi-AZ resilience for the WebLogic admin server with automated recovery, transforming a traditional WebLogic admin server from a pet into cattle.

New – Amazon EC2 M2 Pro Mac Instances Built on Apple Silicon M2 Pro Mac Mini Computers

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/new-amazon-ec2-m2-pro-mac-instances-built-on-apple-silicon-m2-pro-mac-mini-computers/

Today, we are announcing the general availability of Amazon EC2 M2 Pro Mac instances. These instances deliver up to 35 percent faster performance than the existing M1 Mac instances when building and testing applications for Apple platforms.

New EC2 M2 Pro Mac instances are powered by Apple M2 Pro Mac mini computers featuring a 12-core CPU, 19-core GPU, 32 GiB of memory, and a 16-core Apple Neural Engine. They are uniquely enabled by the AWS Nitro System through high-speed Thunderbolt connections, offering these Mac mini computers as fully integrated and managed compute instances with up to 10 Gbps of Amazon VPC network bandwidth and up to 8 Gbps of Amazon EBS storage bandwidth. EC2 M2 Pro Mac instances support macOS Ventura (version 13.2 or later) AMIs.

A Story of EC2 Mac Instances
When Jeff Barr first introduced Amazon EC2 Mac instances in 2020, customers were surprised to be able to run macOS on Amazon EC2 to build, test, package, and sign applications developed with Xcode for Apple platforms, including macOS, iOS, iPadOS, tvOS, and watchOS.

In his keynote at AWS re:Invent 2020, Peter DeSantis revealed the secret behind building EC2 Mac instances: the AWS Nitro System, which makes it possible to offer Apple Mac mini computers as fully integrated and managed compute instances with Amazon VPC networking and Amazon EBS storage, just like any other EC2 instance.

“We did not need to make any changes to the Mac hardware. We simply connected a Nitro controller via the Mac’s Thunderbolt connection. When you launch a Mac instance, your Mac-compatible Amazon Machine Image (AMI) runs directly on the Mac Mini, with no hypervisor. The Nitro controller sets up the instance and provides secure access to the network and any storage attached. And that Mac Mini can now natively use any AWS service.”

In July 2022, we introduced Amazon EC2 M1 Mac instances built around the Apple-designed M1 System on Chip (SoC). Developers building applications for iPhone, iPad, Apple Watch, and Apple TV can choose either x86-based EC2 Mac instances or Arm-based EC2 M1 Mac instances. If you re-architect your apps to natively support Macs with Apple Silicon using EC2 M1 Mac instances, you can build and test your apps with up to 60 percent better price performance than x86-based EC2 Mac instances for iPhone and Mac app build workloads, with all the benefits of AWS.

Many customers take advantage of EC2 Mac instances to deliver a complete end-to-end build pipeline on macOS on AWS. With EC2 Mac instances, they can scale their iOS build fleet; easily use custom macOS environments with AMIs; and debug any build or test failures with fully reproducible macOS environments.

Customers have reported up to 4x reduction in build times, up to 3x increase in parallel builds, up to 80 percent reduction in machine-related build failures, and up to 50 percent reduction in fleet size. They can continue to prioritize their time on innovating products and features while reducing the tedious effort required to manage on-premises macOS infrastructure.

To accelerate this innovation, EC2 Mac instances recently began to support replacing root volumes on a running EC2 Mac instance, enabling you to restore the root volume of an EC2 Mac instance to its initial launch state or to a specific snapshot, without requiring you to stop or terminate the instance.

You can also use in-place operating system updates from within the guest environment on EC2 M1 Mac instances to a specific or latest macOS version, including the beta version, by registering your instances with the Apple Developer Program. Developers can now integrate the latest macOS features into their applications and test existing applications for compatibility before public macOS releases.

Getting Started with EC2 M2 Pro Instances
As with other EC2 Mac instances, EC2 M2 Pro Mac instances also support Dedicated Host tenancy with a minimum host allocation duration of 24 hours to align with macOS licensing.

To get started, allocate a Mac dedicated host, a physical server fully dedicated to your use in your AWS account. After the host is allocated, you can launch, stop, and start your own macOS environment as one instance on that host.

To launch an EC2 Mac instance on the host, the procedure is no different from starting any other EC2 instance type: choose your macOS AMI version and select the mac2-m2pro.metal instance type in the Application and OS Images section.

In the Advanced details section, select Dedicated host under Tenancy, and select the dedicated host you just created under Tenancy host ID.
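
If you prefer the CLI, a minimal sketch of both steps looks like the following; the Availability Zone, AMI ID, host ID, and key pair name are placeholders:

# Allocate a dedicated host for the M2 Pro Mac instance type
aws ec2 allocate-hosts \
    --instance-type mac2-m2pro.metal \
    --availability-zone us-west-2a \
    --quantity 1

# Launch the Mac instance onto the allocated host
aws ec2 run-instances \
    --image-id ami-0123456789abcdef0 \
    --instance-type mac2-m2pro.metal \
    --key-name my-key-pair \
    --placement Tenancy=host,HostId=h-0123456789abcdef0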

When you use EC2 Mac instances for the first time, you can connect to the newly launched instance over SSH as usual, or enable Apple Remote Desktop and start a VNC session to the EC2 instance. To learn more, see Sebastien’s series of articles on launching and connecting to your Mac instance.

When you no longer need the Mac dedicated host, you can terminate your running Mac instance and release the underlying host. Note again that after being allocated, a Mac dedicated host can only be released after 24 hours to align with Apple’s macOS licensing.

Now Available
Amazon EC2 M2 Pro Mac instances are available in the US West (Oregon) and US East (Ohio) AWS Regions, with additional regions coming soon.

To learn more or get started, see Amazon EC2 Mac Instances or visit the EC2 Mac documentation.  You can send feedback to AWS re:Post for EC2 or through your usual AWS Support contacts.

Channy

New – Amazon EC2 R7a Instances Powered By 4th Gen AMD EPYC Processors for Memory Optimized Workloads

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/new-amazon-ec2-r7a-instances-powered-by-4th-gen-amd-epyc-processors-for-memory-optimized-workloads/

We launched the memory optimized Amazon EC2 R6a instances in July 2022 powered by 3rd Gen AMD EPYC (Milan) processors, running at frequencies up to 3.6 GHz. Many customers who run workloads that are dependent on x86 instructions, such as SAP, are looking for ways to optimize their cloud utilization. They’re taking advantage of the compute choice that EC2 offers.

Today, we’re announcing the general availability of new memory optimized Amazon EC2 R7a instances powered by 4th Gen AMD EPYC (Genoa) processors with a maximum frequency of 3.7 GHz, which offer up to 50 percent higher performance compared to the previous generation instances. You can use this increased performance to process data faster, consolidate workloads, and lower the cost of ownership.

R7a instances also support AVX-512, Vector Neural Network Instructions (VNNI), and brain floating point (bfloat16). These instances feature Double Data Rate 5 (DDR5) memory, which enables high-speed access to data in memory, and deliver 2.25 times more memory bandwidth compared to R6a instances for lower latency. Moreover, these instances support always-on memory encryption using AMD Secure Memory Encryption (SME).

These instances are SAP-certified and ideal for high performance, memory-intensive workloads, such as SQL and NoSQL databases, distributed web scale in-memory caches, in-memory databases, real-time big data analytics, and Electronic Design Automation (EDA) applications.

R7a instances feature sizes of up to 192 vCPUs with 1536 GiB RAM. Here are the detailed specs:

Name vCPUs Memory (GiB) Network Bandwidth (Gbps) EBS Bandwidth (Gbps)
r7a.medium 1 8 Up to 12.5 Up to 10
r7a.large 2 16 Up to 12.5 Up to 10
r7a.xlarge 4 32 Up to 12.5 Up to 10
r7a.2xlarge 8 64 Up to 12.5 Up to 10
r7a.4xlarge 16 128 Up to 12.5 Up to 10
r7a.8xlarge 32 256 12.5 10
r7a.12xlarge 48 384 18.75 15
r7a.16xlarge 64 512 25 20
r7a.24xlarge 96 768 37.5 30
r7a.32xlarge 128 1024 50 40
r7a.48xlarge 192 1536 50 40

R7a instances have up to 50 Gbps enhanced networking and 40 Gbps EBS bandwidth, similar to R6a instances. There is a new medium instance size, which you can use to right-size your workloads more accurately, offering 1 vCPU and 8 GiB of memory. Additionally, with R7a instances you can attach up to 128 EBS volumes to an instance, compared with up to 28 EBS volume attachments on R6a instances. R7a instances support AES-256, compared to AES-128 in R6a instances, for enhanced security.

R7a instances are built on the AWS Nitro System and support Elastic Fabric Adapter (EFA) for workloads that benefit from lower network latency and highly scalable inter-node communication, such as high-performance computing and video processing.

Now Available
Amazon EC2 R7a instances are now available in the following AWS Regions: US East (Ohio), US East (N. Virginia), US West (Oregon), and EU (Ireland). As usual with Amazon EC2, you only pay for what you use. For more information, see the Amazon EC2 pricing page.

To learn more, visit the EC2 R7a instances page, and AWS/AMD partner page. You can send feedback to [email protected], AWS re:Post for EC2, or through your usual AWS Support contacts.

Channy

New – Amazon EC2 R7iz Instances Memory-Optimized for High CPU Performance, Memory-Intensive Workloads

Post Syndicated from Veliswa Boya original https://aws.amazon.com/blogs/aws/new-amazon-ec2-r7iz-instances-memory-optimized-for-high-cpu-performance-memory-intensive-workloads/

Today we’re announcing general availability of the Amazon EC2 R7iz instances. R7iz instances are the fastest 4th Generation Intel Xeon Scalable-based (Sapphire Rapids) instances in the cloud with 3.9 GHz sustained all-core turbo frequency. R7iz instances are suitable for workloads where there’s a requirement for more memory to process additional data, larger sizes of instances to scale up, higher compute and memory performance to reduce completion times, and higher networking and Amazon Elastic Block Store (Amazon EBS) performance to improve latency. The high compute performance of the R7iz instances, combined with a large amount of memory, results in increased overall performance for applications that include front-end electronic design automation (EDA), relational database workloads with high per core licensing fees, and financial, actuarial, and data analytics simulation workloads. This can help you speed time to market for product development while reducing licensing costs.

R7iz Instances

The specs for the R7iz instances are as follows.

Name vCPUs Memory (GiB) Network Bandwidth EBS Bandwidth
r7iz.large 2 16 Up to 12.5 Gbps Up to 10 Gbps
r7iz.xlarge 4 32 Up to 12.5 Gbps Up to 10 Gbps
r7iz.2xlarge 8 64 Up to 12.5 Gbps Up to 10 Gbps
r7iz.4xlarge 16 128 Up to 12.5 Gbps Up to 10 Gbps
r7iz.8xlarge 32 256 12.5 Gbps 10 Gbps
r7iz.12xlarge 48 384 25 Gbps 19 Gbps
r7iz.16xlarge 64 512 25 Gbps 20 Gbps
r7iz.32xlarge 128 1024 50 Gbps 40 Gbps

You can attach up to 88 EBS volumes to each R7iz instance; by way of comparison, the z1d instances allow you to attach up to 28 volumes.

We are also getting ready to launch two sizes of bare metal R7iz instances:

Name vCPUs Memory (GiB) Network Bandwidth EBS Bandwidth
r7iz.metal-16xl 64 512 25 Gbps 20 Gbps
r7iz.metal-32xl 128 1024 50 Gbps 40 Gbps

Built-in Accelerators
R7iz instances also include four built-in accelerators: Advanced Matrix Extensions (AMX), Intel Data Streaming Accelerator (DSA), Intel In-Memory Analytics Accelerator (IAA), and Intel QuickAssist Technology (QAT). Some of these accelerators require specific kernel versions, drivers, and/or compilers. AMX is available on all sizes of R7iz instances, while the Intel QAT, Intel IAA, and Intel DSA accelerators will be available on the r7iz.metal-16xl and r7iz.metal-32xl instances (coming soon).

Available Now
R7iz instances are generally available today in the US East (N. Virginia), and US West (Oregon) AWS Regions. As usual with Amazon EC2, you pay only for what you use. For more information, see Amazon EC2 pricing.

To learn more, visit our Amazon EC2 R7iz instances page, and please send feedback to AWS re:Post for EC2 or through your usual AWS Support contacts.

Veliswa

AWS Weekly Roundup: Farewell EC2-Classic, EBS at 15 Years, and More (Sept. 4, 2023)

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-farewell-ec2-classic-ebs-at-15-years-and-more-sept-4-2023/

Last week, there was some great reading about Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Elastic Block Store (Amazon EBS) written by AWS tech leaders.

Dr. Werner Vogels wrote Farewell EC2-Classic, it’s been swell, celebrating the 17 years of loyal duty of the original version that started what we now know as cloud computing. You can read how it made the process of acquiring compute resources simple, even though the stack running behind the scenes was incredibly complex.

We have come a long way since 2006, and we’re not done innovating for our customers. As celebrated in this year’s AWS Storage Day, Amazon EBS was launched 15 years ago this month. James Hamilton, SVP and distinguished engineer at Amazon, wrote Amazon EBS at 15 Years, about how the service has evolved to handle over 100 trillion I/O operations a day, and transfers over 13 exabytes of data daily.

As Dr. Werner said in his piece, “it’s a reminder that building evolvable systems is a strategy, and revisiting your architectures with an open mind is a must.” Our innovation efforts driven by customer feedback continue today, and this week is no different.

Last Week’s Launches
Here are some launches that got my attention:

Renaming Amazon Kinesis Data Analytics to Amazon Managed Service for Apache Flink – You can now use Amazon Managed Service for Apache Flink, a fully managed and serverless service for you to build and run real-time streaming applications using Apache Flink. All your existing running applications in Kinesis Data Analytics will work as-is, without any changes. To learn more, see my blog post.

Extended Support for Amazon Aurora and Amazon RDS – You can now get up to three years of additional support for Amazon Aurora and Amazon RDS database instances running MySQL 5.7, PostgreSQL 11, and higher major versions. This gives you time to upgrade to a new major version and meet your business requirements even after the community ends support for these versions.

Enhanced Starter Template for AWS Step Functions Workflow Studio – You can now use starter templates to streamline the process of creating and prototyping workflows swiftly, plus a new code mode, which enables builders to move easily between design and code authoring views. With the improved authoring experience in Workflow Studio, you can seamlessly alternate between a drag-and-drop visual builder experience or the new code editor so that you can pick your preferred tool to accelerate development.

To learn more, see Enhancing Workflow Studio with new features for streamlined authoring in the AWS Compute Blog.

Email Delivery History for Every Email in Amazon SES – You can now troubleshoot individual email delivery problems, confirm delivery of critical messages, and identify engaged recipients on a granular, single email basis. Email senders can investigate trends in delivery performance and see delivery and engagement status for each email sent using Amazon SES Virtual Deliverability Manager.

Response Streaming through Amazon SageMaker Real-time Inference – You can now continuously stream inference responses back to the client to help you build interactive experiences for various generative AI applications such as chatbots, virtual assistants, and music generators.

For more details on how to use response streaming along with examples, see Invoke to Stream an Inference Response and How containers should respond in the AWS documentation, and Elevating the generative AI experience: Introducing streaming support in Amazon SageMaker hosting in the AWS Machine Learning Blog.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS News
Some other updates and news that you might have missed:

AI & Sports: How AWS & the NFL are Changing the Game – Over the last 5 years, AWS has partnered with the National Football League (NFL), helping fans better understand the game, helping broadcasters tell better stories, and helping teams use data to improve operations and player safety. Watch AWS CEO, Adam Selipsky, former NFL All-Pro Larry Fitzgerald, and the NFL Network’s Cynthia Frelund during their earlier livestream discussing the intersection of artificial intelligence and machine learning in sports.

Amazon Bedrock Story from Amazon Science – This is a good article explaining the benefits of using Amazon Bedrock to build and scale generative AI applications with leading foundation models, including Amazon’s Titan FMs, which focus on responsible AI to avoid toxic content.

Amazon EC2 Flexibility Score – This is an open source tool developed by AWS to assess any configuration used to launch instances through an Auto Scaling Group (ASG) against the recommended EC2 best practices. It converts the best practice adoption into a “flexibility score” that can be used to identify, improve, and monitor the configurations.

To learn more open-source news and updates, see this newsletter curated by my colleague Ricardo to bring you the latest open source projects, posts, events, and more.

Upcoming AWS Events
Check your calendars and sign up for these AWS events:

AWS re:Invent 2023 – Ready to start planning your re:Invent? Browse the session catalog now. Join us to hear the latest from AWS, learn from experts, and connect with the global cloud community.

AWS Summits – The last in-person AWS Summit will be held in Johannesburg on Sept. 26.

AWS Community Days – Join a community-led conference run by AWS user group leaders in your region: Aotearoa (Sept. 6), Lebanon (Sept. 9), Munich (Sept. 14), Argentina (Sept. 16), Spain (Sept. 23), and Chile (Sept. 30). Visit the landing page to check out all the upcoming AWS Community Days.

CDK Day – A community-led fully virtual event on Sept. 29 with tracks in English and Spanish about CDK and related projects. Learn more at the website.

You can browse all upcoming AWS-led in-person and virtual events, and developer-focused events such as AWS DevDay.

Channy

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Using and Managing Security Groups on AWS Snowball Edge devices

Post Syndicated from Macey Neff original https://aws.amazon.com/blogs/compute/using-and-managing-security-groups-on-aws-snowball-edge-devices/

This blog post is written by Jared Novotny & Tareq Rajabi, Specialist Hybrid Edge Solution Architects. 

The AWS Snow Family of products consists of purpose-built devices that allow petabyte-scale movement of data from on-premises locations to AWS Regions. Snow devices also enable customers to run Amazon Elastic Compute Cloud (Amazon EC2) instances with Amazon Elastic Block Store (Amazon EBS) and Amazon Simple Storage Service (Amazon S3) in edge locations.

Security groups are used to protect EC2 instances by controlling ingress and egress traffic. Once a security group is created and associated with an instance, customers can add ingress and egress rules to control data flow. Just like the default VPC in a Region, there is a default security group on Snow devices. A default security group is applied when an instance is launched and no other security group is specified. The default security group in a Region allows all inbound traffic from network interfaces and instances that are assigned to the same security group, and allows all outbound traffic. On Snowball Edge, the default security group allows all inbound and outbound traffic.

In this post, we will review the tools and commands required to create, manage and use security groups on the Snowball Edge device.

Some things to keep in mind:

  1. AWS Snowball Edge is limited to 50 security groups.
  2. An instance can have only one security group, but each group can have a total of 120 rules, comprising 60 inbound and 60 outbound rules.
  3. Security groups can only contain allow rules to permit network traffic.
  4. Deny rules aren’t supported.
  5. Some commands in the Snowball Edge client (AWS CLI) don’t provide an output.
  6. AWS CLI commands can use the name or the security group ID.

Prerequisites and tools

Customers must place an order for a Snowball Edge device from the AWS Console before they can run the following AWS CLI commands and configure security groups to protect their EC2 instances.

The AWS Snowball Edge client is a standalone terminal application that customers can run on their local servers and workstations to manage and operate their Snowball Edge devices. It supports Windows, Mac, and Linux systems.

AWS OpsHub is a graphical user interface that you can use to manage your AWS Snowball devices. Furthermore, it’s the easiest tool to use to unlock Snowball Edge devices. It can also be used to configure the device, launch instances, manage storage, and provide monitoring.

Customers can download and install the Snowball Edge client and AWS OpsHub from AWS Snowball resources.

Getting Started

To get started, when a Snow device arrives at a customer site, the customer must unlock the device and launch an EC2 instance. This can be done via AWS OpsHub or the AWS Snowball Edge client. AWS Snow Family devices support both Virtual Network Interfaces (VNIs) and Direct Network Interfaces (DNIs); customers should review the types of interfaces before deciding which one is best for their use case. Note that security groups are only supported with VNIs, so that is what is used in this post. A post explaining how to use these interfaces should be reviewed before proceeding.

Viewing security group information

Once the AWS Snowball Edge is unlocked, configured, and has an EC2 instance running, we can dig deeper into using security groups to act as a virtual firewall and control incoming and outgoing traffic.

Although the AWS OpsHub tool provides various functionalities for compute and storage operations, it can only be used to view the name of the security group associated with an instance on a Snowball Edge device.

Every other interaction with security groups must be through the AWS CLI.
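
Before running those commands, the AWS CLI needs a profile that holds the device's credentials. A minimal sketch, assuming the device is already unlocked, is to retrieve the access keys with the Snowball Edge client and store them in the profile used throughout this post:

# List the access key ID on the device, then retrieve its secret access key
snowballEdge list-access-keys
snowballEdge get-secret-access-key --access-key-id <access-key-id>

# Store the credentials in the profile referenced by the commands below
aws configure --profile SnowballEdge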

The following command shows how to read the output describing the protocols, sources, and destinations. This particular command shows information about the default security group, which allows all inbound and outbound traffic on EC2 instances running on the Snowball Edge.

In the following sections we review the most common commands with examples and outputs.

View (all) existing security groups:

aws ec2 describe-security-groups --endpoint http://MySnowIPAddress:8008 --profile SnowballEdge
{
    "SecurityGroups": [
        {
            "Description": "default security group",
            "GroupName": "default",
            "IpPermissions": [
                {
                    "IpProtocol": "-1",
                    "IpRanges": [
                        {
                            "CidrIp": "0.0.0.0/0"
                        }
                    ]
                }
            ],
            "GroupId": "s.sg-8ec664a23666db719",
            "IpPermissionsEgress": [
                {
                    "IpProtocol": "-1",
                    "IpRanges": [
                        {
                            "CidrIp": "0.0.0.0/0"
                        }
                    ]
                }
            ]
        }
    ]
}

Create new security group:

aws ec2 create-security-group --group-name allow-ssh --description "allow only ssh inbound" --endpoint http://MySnowIPAddress:8008 --profile SnowballEdge

The output returns a GroupId:

{  "GroupId": "s.sg-8f25ee27cee870b4a" }

Add port 22 ingress to security group:

aws ec2 authorize-security-group-ingress --group-id s.sg-8f25ee27cee870b4a --protocol tcp --port 22 --cidr 10.100.10.0/24 --endpoint http://MySnowIPAddress:8008 --profile SnowballEdge

{    "Return": true }

Note that if you’re using the default security group, then the outbound rule is still to allow all traffic.

Revoke port 22 ingress rule from security group:

aws ec2 revoke-security-group-ingress --group-id s.sg-8f25ee27cee870b4a --ip-permissions IpProtocol=tcp,FromPort=22,ToPort=22,IpRanges=[{CidrIp=10.100.10.0/24}] --endpoint http://MySnowIPAddress:8008 --profile SnowballEdge

{ "Return": true }

Revoke default egress rule:

aws ec2 revoke-security-group-egress --group-id s.sg-8f25ee27cee870b4a --ip-permissions IpProtocol="-1",IpRanges=[{CidrIp=0.0.0.0/0}] --endpoint http://MySnowIPAddress:8008 --profile SnowballEdge

{ "Return": true }

Note that this rule will remove all outbound ephemeral ports.

Add default outbound rule (revoked above):

aws ec2 authorize-security-group-egress --group-id s.sg-8f25ee27cee870b4a --ip-permissions IpProtocol="-1",IpRanges=[{CidrIp=0.0.0.0/0}] --endpoint http://MySnowIPAddress:8008 --profile SnowballEdge

{  "Return": true }

Changing an instance’s existing security group:

aws ec2 modify-instance-attribute --instance-id s.i-852971d05144e1d63 --groups s.sg-8f25ee27cee870b4a --endpoint http://MySnowIPAddress:8008 --profile SnowballEdge

Note that this command produces no output. We can verify that it worked with the “aws ec2 describe-instances” command. See the example as follows (command output simplified):

aws ec2 describe-instances --instance-id s.i-852971d05144e1d63 --endpoint http://MySnowIPAddress:8008 --profile SnowballEdge


    "Reservations": [{
            "Instances": [{
                    "InstanceId": "s.i-852971d05144e1d63",
                    "InstanceType": "sbe-c.2xlarge",
                    "LaunchTime": "2022-06-27T14:58:30.167000+00:00",
                    "PrivateIpAddress": "34.223.14.193",
                    "PublicIpAddress": "10.100.10.60",
                    "SecurityGroups": [
                        {
                            "GroupName": "allow-ssh",
                            "GroupId": "s.sg-8f25ee27cee870b4a"
                        }      ], }  ] }

Changing an instance’s security group back to default:

aws ec2 modify-instance-attribute --instance-id s.i-852971d05144e1d63 --groups s.sg-8ec664a23666db719 --endpoint http://MySnowIPAddress:8008 --profile SnowballEdge

Note that this command produces no output. You can verify that it worked with the “aws ec2 describe-instances” command. See the example as follows:

aws ec2 describe-instances --instance-id s.i-852971d05144e1d63 --endpoint http://MySnowIPAddress:8008 --profile SnowballEdge

{
    "Reservations": [
        {
            "Instances": [
                {
                    "AmiLaunchIndex": 0,
                    "ImageId": "s.ami-8b0223704ca8f08b2",
                    "InstanceId": "s.i-852971d05144e1d63",
                    "InstanceType": "sbe-c.2xlarge",
                    "LaunchTime": "2022-06-27T14:58:30.167000+00:00",
                    "PrivateIpAddress": "34.223.14.193",
                    "PublicIpAddress": "10.100.10.60",
                    "SecurityGroups": [
                        {
                            "GroupName": "default",
                            "GroupId": "s.sg-8ec664a23666db719"
                        }
                    ]
                }
            ]
        }
    ]
}

Delete security group:

aws ec2 delete-security-group --group-id s.sg-8f25ee27cee870b4a --endpoint http://MySnowIPAddress:8008 --profile SnowballEdge

Sample walkthrough to add an SSH security group

As an example, assume a single EC2 instance “A” running on a Snowball Edge device. By default, all traffic is allowed to EC2 instance “A”. We want to tighten security and allow only the management PC to SSH to the instance.

1. Create an SSH security group:

aws ec2 create-security-group --group-name MySshGroup --description "ssh access" --endpoint http://MySnowIPAddress:8008 --profile SnowballEdge

2. This will return a “GroupId” as an output:

{   "GroupId": "s.sg-8a420242d86dbbb89" }

3. After the creation of the security group, we must allow port 22 ingress from the management PC’s IP:

aws ec2 authorize-security-group-ingress --group-name MySshGroup --protocol tcp --port 22 --cidr 192.168.26.193/32 --endpoint http://MySnowIPAddress:8008 --profile SnowballEdge

4. Verify that the security group has been created:

aws ec2 describe-security-groups --group-name MySshGroup --endpoint http://MySnowIPAddress:8008 --profile SnowballEdge

{
    "SecurityGroups": [
        {
            "Description": "ssh access",
            "GroupName": "MySshGroup",
            "IpPermissions": [
                {
                    "FromPort": 22,
                    "IpProtocol": "tcp",
                    "IpRanges": [
                        {
                            "CidrIp": "192.168.26.193/32"
                        }
                    ],
                    "ToPort": 22
                }
            ]
        }
    ]
}

5. After the security group has been created, we must associate it with the instance:

aws ec2 modify-instance-attribute --instance-id s.i-8f7ab16867ffe23d4 --groups s.sg-8a420242d86dbbb89 --endpoint http://MySnowIPAddress:8008 --profile SnowballEdge

6. Optionally, we can delete the Security Group after it is no longer required:

aws ec2 delete-security-group --group-id s.sg-8a420242d86dbbb89 --endpoint http://MySnowIPAddress:8008 --profile SnowballEdge

Note that for the above association, the instance ID is an output of the “aws ec2 describe-instances” command, while the security group ID is an output of the “describe-security-groups” command (or the “GroupId” returned in Step 2 above).

Conclusion

This post addressed the most common commands used to create and manage security groups with the AWS Snowball Edge device. We explored the prerequisites, tools, and commands used to view, create, and modify security groups to ensure the EC2 instances deployed on AWS Snowball Edge are restricted to authorized users. We concluded with a simple walkthrough of how to restrict access to an EC2 instance over SSH from a single IP address. If you would like to learn more about the Snowball Edge product, there are several resources available on the AWS Snow Family site.

AWS Weekly Roundup – AWS AppSync, AWS CodePipeline, Events and More – August 21, 2023

Post Syndicated from Marcia Villalba original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-aws-appsync-aws-codepipeline-events-and-more-august-21-2023/

In a few days, I will board a plane towards the south: my tour around Latin America starts. But I won’t be alone in this adventure; you can find some other News Blog authors, like Jeff or Seb, speaking at AWS Community Days and local events in Peru, Argentina, Chile, and Uruguay. If you see us, come and say hi. We would love to meet you.

Latam Community at re:Invent 2022

Last Week’s Launches
Here are some launches that got my attention during the previous week.

AWS AppSync now supports JavaScript for all resolvers in GraphQL APIs – Last year, we announced that AppSync supports JavaScript pipeline resolvers. And starting last week, developers can use JavaScript to write unit resolvers, pipeline resolvers, and AppSync functions that run on the AppSync JavaScript runtime.

AWS CodePipeline now supports GitLab – Now you can use your GitLab.com source repository to build, test, and deploy code changes using AWS CodePipeline, in addition to other providers like AWS CodeCommit, Bitbucket, GitHub.com, and GitHub Enterprise Server.

Amazon CloudWatch Agent adds support for OpenTelemetry traces and AWS X-Ray – With the new version of the agent, you can collect metrics, logs, and traces with a single agent, not only for CloudWatch but also for OpenTelemetry and AWS X-Ray. This simplifies the installation, configuration, and management of telemetry collection.

New instance types: Amazon EC2 M7a and Amazon EC2 Hpc7a – The new Amazon EC2 M7a is a general purpose instance type powered by 4th Gen AMD EPYC processors. In the announcement blog post, you can find all the specifics for this instance type. The new Amazon EC2 Hpc7a instances are also powered by 4th Gen AMD EPYC processors. These instance types are optimized for high performance computing, and Channy Yun wrote a blog post describing the different characteristics of the Amazon EC2 Hpc7a instance type.

AWS DeepRacer Educator Playbooks – Last week we introduced the AWS DeepRacer educator playbooks, a tool for educators to integrate foundational machine learning (ML) curriculum and labs into their classrooms. Educators can use these playbooks to easily upskill students in the basics of ML with autonomous vehicles.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS News
Some other updates and news that you might have missed:

Guide for using AWS Lambda to process Apache Kafka Streams – Julian Wood just published the most complete guide you can find on how to use Lambda with Apache Kafka. If you are an Amazon Kinesis user, don’t worry. We’ve got you covered with this video series where you will find similar topics.

Using AWS Lambda with Kafka guide

The Official AWS Podcast – Listen each week for updates on the latest AWS news and deep dives into exciting use cases. There are also official AWS podcasts in several languages. Check out the ones in French, German, Italian, and Spanish.

AWS Open-Source News and Updates – This is a newsletter curated by my colleague Ricardo to bring you the latest open source projects, posts, events, and more.

Upcoming AWS Events
Check your calendars and sign up for these AWS events:

Join AWS Hybrid Cloud & Edge Day to learn how to deploy your applications in the everywhere cloud

AWS Summits – The 2023 AWS Summits season is almost over, with the last two in-person events in Mexico City (August 30) and Johannesburg (September 26).

AWS re:Invent (November 27–December 1) – But don’t worry, because re:Invent season is coming closer. Join us to hear the latest from AWS, learn from experts, and connect with the global cloud community. Registration is now open.

AWS Community Days – Join a community-led conference run by AWS user group leaders in your region: Taiwan (August 26), Aotearoa (September 6), Lebanon (September 9), Munich (September 14), Argentina (September 16), Spain (September 23), and Chile (September 30). Check all the upcoming AWS Community Days here.

CDK Day (September 29) – A community-led, fully virtual event with tracks in English and in Spanish about CDK and related projects. Learn more at the website.

That’s all for this week. Check back next Monday for another Week in Review!

This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS!

— Marcia

New – Amazon EC2 Hpc7a Instances Powered by 4th Gen AMD EPYC Processors Optimized for High Performance Computing

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/new-amazon-ec2-hpc7a-instances-powered-by-4th-gen-amd-epyc-processors-optimized-for-high-performance-computing/

In January 2022, we launched Amazon EC2 Hpc6a instances for customers to efficiently run their compute-bound high performance computing (HPC) workloads on AWS with up to 65 percent better price performance over comparable x86-based compute-optimized instances.

As their jobs grow more complex, customers have asked for more cores with more compute performance and more memory and network performance to reduce the time to complete jobs. Additionally, as customers look to bring more of their HPC workloads to EC2, they have asked how we can make it easier to distribute processes to make the best use of memory and network bandwidth, to align with their workload requirements.

Today, we are announcing the general availability of Amazon EC2 Hpc7a instances, the next generation of instance types that are purpose-built for tightly coupled HPC workloads. Hpc7a instances powered by the 4th Gen AMD EPYC processors (Genoa) deliver up to 2.5 times better performance compared to Hpc6a instances. These instances offer 300 Gbps Elastic Fabric Adapter (EFA) bandwidth powered by the AWS Nitro System, for fast and low-latency internode communications.

Hpc7a instances feature Double Data Rate 5 (DDR5) memory, which provides 50 percent higher memory bandwidth compared to DDR4 memory to enable high-speed access to data in memory. These instances are ideal for compute-intensive, latency-sensitive workloads such as computational fluid dynamics (CFD) and numerical weather prediction (NWP).

If you are running on Hpc6a, you can use Hpc7a instances and take advantage of the 2 times higher core density, 2.1 times higher effective memory bandwidth, and 3 times higher network bandwidth to lower the time needed to complete jobs compared to Hpc6a instances.

Here’s a quick infographic that shows you how the Hpc7a instances and the 4th Gen AMD EPYC processor (Genoa) compare to the previous instances and processor:

Hpc7a instances feature sizes of up to 192 cores of 4th Gen AMD EPYC processors with 768 GiB of RAM. Here are the detailed specs:

Instance Name CPUs RAM (GiB) EFA Network Bandwidth (Gbps) Attached Storage
hpc7a.12xlarge 24 768 Up to 300 EBS Only
hpc7a.24xlarge 48 768 Up to 300 EBS Only
hpc7a.48xlarge 96 768 Up to 300 EBS Only
hpc7a.96xlarge 192 768 Up to 300 EBS Only

These instances provide higher compute, memory, and network performance to run the most compute-intensive workloads, such as CFD, weather forecasting, molecular dynamics, and computational chemistry on AWS.

Similar to the EC2 Hpc7g instances released a month earlier, we are offering smaller instance sizes that make it easier for customers to pick a smaller number of CPU cores to activate while keeping all other resources constant, based on their workload requirements. For HPC workloads, common scenarios include providing more memory bandwidth per core for CFD workloads, allocating fewer cores in license-bound scenarios, and supporting more memory per core. To learn more, see Instance sizes in the Amazon EC2 Hpc7 family – a different experience in the AWS HPC Blog.

As with Hpc6a instances, you can use the Hpc7a instance to run your largest and most complex HPC simulations on EC2 and optimize for cost and performance. You can also use the new Hpc7a instances with AWS Batch and AWS ParallelCluster to simplify workload submission and cluster creation. You can also use Amazon FSx for Lustre for submillisecond latencies and up to hundreds of gigabytes per second of throughput for storage.

To achieve the best performance for HPC workloads, these instances have Simultaneous Multithreading (SMT) disabled, they’re available in a single Availability Zone, and they have limited external network and EBS bandwidth.

Now Available
Amazon EC2 Hpc7a instances are available today in three AWS Regions: US East (Ohio), EU (Ireland), and AWS GovCloud (US), for purchase as On-Demand Instances, Reserved Instances, and Savings Plans. For more information, see the Amazon EC2 pricing page.

To learn more, visit our Hpc7a instances page and get in touch with our HPC team, AWS re:Post for EC2, or through your usual AWS Support contacts.

Channy

New – Amazon EC2 M7a General Purpose Instances Powered by 4th Gen AMD EPYC Processors

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/new-amazon-ec2-m7a-general-purpose-instances-powered-by-4th-gen-amd-epyc-processors/

In November 2021, we launched Amazon EC2 M6a instances, powered by 3rd Gen AMD EPYC (Milan) processors, running at frequencies up to 3.6 GHz, which offer you up to 35 percent improvement in price performance compared to M5a instances. Many customers who run workloads that are dependent on x86 instructions, such as SAP, are looking for ways to optimize their cloud utilization. They’re taking advantage of the compute choice that EC2 offers.

Today, we’re announcing the general availability of new, general purpose Amazon EC2 M7a instances, powered by the 4th Gen AMD EPYC (Genoa) processors with a maximum frequency of 3.7 GHz, which offer up to 50 percent higher performance compared to M6a instances. This increased performance gives you the ability to process data faster, consolidate workloads, and lower the cost of ownership.

M7a instances support AVX-512, Vector Neural Network Instructions (VNNI), and brain floating point (bfloat16). These instances feature Double Data Rate 5 (DDR5) memory, which enables high-speed access to data in memory, and deliver 2.25 times more memory bandwidth compared to M6a instances for lower latency.

M7a instances are SAP-certified and ideal for applications that benefit from high performance and high throughput, such as financial applications, application servers, simulation modeling, gaming, mid-size data stores, application development environments, and caching fleets.

M7a instances feature sizes of up to 192 vCPUs with 768 GiB RAM. Here are the detailed specs:

Name vCPUs Memory (GiB) Network Bandwidth (Gbps) EBS Bandwidth (Gbps)
m7a.medium 1 4 Up to 12.5 Up to 10
m7a.large 2 8 Up to 12.5 Up to 10
m7a.xlarge 4 16 Up to 12.5 Up to 10
m7a.2xlarge 8 32 Up to 12.5 Up to 10
m7a.4xlarge 16 64 Up to 12.5 Up to 10
m7a.8xlarge 32 128 12.5 10
m7a.12xlarge 48 192 18.75 15
m7a.16xlarge 64 256 25 20
m7a.24xlarge 96 384 37.5 30
m7a.32xlarge 128 512 50 40
m7a.48xlarge 192 768 50 40
m7a.metal-48xl 192 768 50 40

M7a instances have up to 50 Gbps enhanced networking and 40 Gbps EBS bandwidth, which is similar to M6a instances. But you have a new medium instance size, which enables you to right-size your workloads more accurately, with the smallest size offering 1 vCPU and 4 GiB of memory, and the largest offering 192 vCPUs and 768 GiB of memory.

The new instances are built on the AWS Nitro System, a collection of building blocks that offloads many of the traditional virtualization functions to dedicated hardware for high performance, high availability, and highly secure cloud instances.

Now Available
Amazon EC2 M7a instances are available today in the US East (Ohio), US East (N. Virginia), US West (Oregon), and EU (Ireland) AWS Regions. As usual with Amazon EC2, you only pay for what you use. For more information, see the Amazon EC2 pricing page.

To learn more, visit the EC2 M7a instance and AWS/AMD partner page. You can send feedback to [email protected], AWS re:Post for EC2, or through your usual AWS Support contacts.

Channy

Create, Use, and Troubleshoot Launch Scripts on Amazon Lightsail

Post Syndicated from Macey Neff original https://aws.amazon.com/blogs/compute/create-use-and-troubleshoot-launch-scripts-on-amazon-lightsail/

This blog post is written by Brian Graf, Senior Developer Advocate, Amazon Lightsail and Sophia Parafina, Senior Developer Advocate. 

Amazon Lightsail is a virtual private server (VPS) for deploying both operating systems (OS) and pre-packaged applications, such as WordPress, Plesk, cPanel, PrestaShop, and more. When deploying these instances, you can run launch scripts with additional commands such as installation of applications, configuration of system files, or installing pre-requisites for your application.

Where do I add a launch script?

If you’re deploying an instance with the Lightsail console, launch scripts can be added to an instance at deployment. They are added in the ‘deploy instance’ page:

Image of Amazon Lightsail deploy an instance page

The launch script must be added before the instance is deployed, because launch scripts can’t retroactively run after deployment.

Anatomy of a Windows Launch Script

When deploying a Lightsail Windows instance, you can use a batch script or a PowerShell script in the ‘launch script’ textbox.  Of the two options, PowerShell is more extensible and provides greater flexibility for configuration and control.

If you choose to write your launch script as a batch file, you must add <script> </script> tags at the beginning and end of your code respectively. Alternatively, a launch script in PowerShell, must use the <powershell></powershell> tags in a similar fashion.

After the closing </script> or </powershell> tag, you must add a <persist></persist> tag on the following line. The persist tag determines whether this is a run-once script or one that runs every time your instance is rebooted or changed from the ‘Stop’ to ‘Start’ state. If you want your script to run every time the instance is rebooted or started, set the persist tag to ‘true’. If you want your launch script to run only once, set the persist tag to ‘false’.

Anatomy of a Linux Launch Script

Like a Windows launch script, a Linux launch script requires specific code on the first line of the textbox to execute successfully during deployment. You must place ‘#!/bin/bash’ as the first line of code to set the shell that executes the rest of the script. After the first line, you can continue adding commands to achieve the results you want.
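
For illustration, a minimal Linux launch script might look like the following, assuming an Ubuntu-based blueprint (the package and page content are just examples):

#!/bin/bash

# Update the package index and install a web server
apt update -y
apt install -y apache2

# Write a simple landing page to confirm the script ran
echo "Deployed via Lightsail launch script" > /var/www/html/index.html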

How do I know if my Launch Script ran successfully?

Although running launch scripts is a convenient way to create a baseline instance, your instance may not reach the desired end state because of an error in your script or permission issues. To find out whether the launch script ran successfully, refer to the instance logs.

For Windows, the launch log can be found in C:\ProgramData\Amazon\EC2-Windows\launch\Log\UserdataExecution.log. Note that ProgramData is a hidden folder; unless you access the file from PowerShell or Command Prompt, you must enable hidden items in Windows File Explorer (`View > Show > Hidden items`) to see it.

For Linux, the launch log can be found in /var/log/cloud-init-output.log. You can monitor it after your instance launches by typing the following in the terminal:

tail -f /var/log/cloud-init-output.log

If you want to see the entire log file including commands that have already run before you opened the log file, then you can type the following in the terminal:

less +F /var/log/cloud-init-output.log

On a Windows instance, an easy way to monitor the UserdataExecution.log is to add the following code in your launch script, which creates a shortcut to tail or watch the log as commands are executing:

<powershell>
# Create a log-monitoring script to monitor the progress of the launch script execution
$monitorlogs = @"
get-content C:\ProgramData\Amazon\EC2-Windows\launch\Log\UserdataExecution.log -wait
"@

# Save the log-monitoring script to the desktop for the user
$monitorlogs | out-file -FilePath C:\Users\Administrator\Desktop\MonitorLogs.ps1 -Encoding utf8 -Force
</powershell>
<persist>false</persist>

If the script was executed, then the last line of the log should say ‘{Timestamp}: User data script completed’.

However, if you want more detail, you can build the logging into your launch script. For example, you can append a text or log file with each command so that you can read the output in an easy-to-access location:

<powershell>
# Set the location for the log file. In this case,
# it will appear on the desktop of your Lightsail instance
$loc = "c:\Users\Administrator\Desktop\mylog.txt"

# Write text to the log file
Write-Output "Starting Script" >> $loc

# Download and install Chocolatey to do unattended installations of the rest of the apps.
iex ((New-Object System.Net.WebClient).DownloadString('https://chocolatey.org/install.ps1'))

# You could run commands like this to output the progress to the log file:

# Install vscode and all dependencies
choco install -y vscode --force --force-dependencies --verbose >> $loc

# Install git and all dependencies
choco install -y git --force --force-dependencies --verbose >> $loc

# Completed
Write-Output "Completed" >> $loc
</powershell>
<persist>false</persist>

This code creates a log file, outputs data, and appends it along the way. If there is an issue, then you can see where the logs stopped or errors appeared.

For Ubuntu and Amazon Linux 2

If the cloud-init-output.log isn’t comprehensive enough, then you can re-direct the output from your commands to a log file of your choice. In this example, we create a log file in the /tmp/ directory and push all output from our commands to this file.

#!/bin/bash

# Create the log file
touch /tmp/launchscript.log

# Add text to the log file if you so choose
echo 'Starting' >> /tmp/launchscript.log

# Update package index
sudo apt update >> /tmp/launchscript.log

# Install software to manage independent software vendor sources
sudo apt -y install software-properties-common >> /tmp/launchscript.log

# Add the repository for all PHP versions
sudo add-apt-repository -y ppa:ondrej/php >> /tmp/launchscript.log

# Install Web server, mySQL client, PHP (and packages), unzip, and curl
sudo apt -y install apache2 mysql-client-core-8.0 php8.0 libapache2-mod-php8.0 php8.0-common php8.0-imap php8.0-mbstring php8.0-xmlrpc php8.0-soap php8.0-gd php8.0-xml php8.0-intl php8.0-mysql php8.0-cli php8.0-bcmath php8.0-ldap php8.0-zip php8.0-curl unzip curl >> /tmp/launchscript.log

# Any final text you want to include
echo 'Completed' >> /tmp/launchscript.log

It’s possible to check the logs before the launch script has finished executing. One way to follow along is to tail the log file, which streams all updates as they occur. You can monitor the log using:

tail -f /tmp/launchscript.log

Using Launch Scripts from AWS Command Line Interface (AWS CLI)

You can deploy your Lightsail instances from the AWS Command Line Interface (AWS CLI) instead of the Lightsail console. You can add a launch script to the AWS CLI command as a parameter, either by creating a variable with the script and referencing the variable, or by saving the launch script as a file and referencing the local file location on your computer.

The launch script is still written the same way as in the previous examples. For a Windows instance with a PowerShell launch script, you can deploy a Lightsail instance as follows:

# PowerShell script saved in the Downloads folder (for example, powershell_script.ps1):

<powershell>
# Set the location for the log file. In this case,
# it will appear on the desktop of your Lightsail instance
$loc = "c:\Users\Administrator\Desktop\mylog.txt"

# Write text to the log file
Write-Output "Starting Script" >> $loc

# Download and install Chocolatey to do unattended installations of the rest of the apps.
iex ((New-Object System.Net.WebClient).DownloadString('https://chocolatey.org/install.ps1'))

# Install vscode and all dependencies, logging progress to the log file
choco install -y vscode --force --force-dependencies --verbose >> $loc

# Install git and all dependencies, logging progress to the log file
choco install -y git --force --force-dependencies --verbose >> $loc

# Completed
Write-Output "Completed" >> $loc
</powershell>
<persist>false</persist>

AWS CLI code to deploy a Windows Server 2019 medium instance in the us-west-2a Availability Zone:

aws lightsail create-instances \
    --instance-names "my-windows-instance-1" \
    --availability-zone us-west-2a \
    --blueprint-id windows_server_2019 \
    --bundle-id medium_win_2_0 \
    --region us-west-2 \
    --user-data file://~/Downloads/powershell_script.ps1

Clean up

Remember to delete resources when you are finished using them to avoid incurring future costs.

Conclusion

You now have the understanding and examples of how to create and troubleshoot Lightsail launch scripts, both through the Lightsail console and the AWS CLI. As demonstrated in this blog, launch scripts can increase your productivity and reduce the time needed to deploy and configure your applications. For more examples of using launch scripts, check out the aws-samples GitHub repository. You now have all the foundational building blocks you need to successfully script automated instance configuration. To learn more about Lightsail, visit the Lightsail service page.

Building a high-performance Windows workstation on AWS for graphics intensive applications

Post Syndicated from Macey Neff original https://aws.amazon.com/blogs/compute/building-a-high-performance-windows-workstation-on-aws-for-graphics-intensive-applications/

This blog post is written by Mike Lim, Senior Public Sector SA.

Video editing, professional visualization, and video games can be resource-demanding applications that require high-performance Windows workstations with GPUs. When developing these applications, a high-performance remote display protocol is desirable for accessing the instances’ graphical desktops over the internet. NICE DCV provides a bandwidth-adaptive streaming protocol with near real-time responsiveness that doesn’t compromise image quality. Customers using NICE DCV can leverage Amazon EC2 G4 and G5 GPU instances, which support graphics-intensive applications in the cloud using a pay-as-you-go pricing model. By using Amazon Elastic Compute Cloud (Amazon EC2) with NICE DCV, customers can run graphically intensive applications remotely and stream their user interface to simpler client machines, eliminating the need for expensive dedicated workstations.

This post shows how you can provision and manage an Amazon EC2 GPU Windows instance and access it via the high-performance NICE DCV remote display protocol.

Solution overview

The solution is illustrated in the following figure.

Solution overview

Figure 1: Solution overview

We used the AWS CloudFormation Infrastructure-as-Code (IaC) service to provision our solution. Our CloudFormation template provides the following functionality:

  1. Using AWS CloudFormation, you can specify your choice of EC2 instance type, the Amazon Virtual Private Cloud (Amazon VPC), and subnet in which to provision. You also have the option to assign a static IPv4 address. NICE DCV server is installed to provide remote access, and you can specify the choice of graphics driver to install.
  2. A security group is created and associated with the EC2 instance, and it acts as a firewall.
  3. An AWS Identity and Access Management (IAM) role is created and associated with the EC2 instance using an instance profile. It lets your instance access Amazon Simple Storage Service (Amazon S3) buckets for NICE DCV server license validation, and download and install the latest graphics drivers.
  4. The IAM role also makes sure that your EC2 instance can be managed by AWS Systems Manager. This service provides in-browser command line and graphical access to your Windows instance via Session Manager and Fleet Manager from the AWS Management Console.

Walkthrough

The following sections walk through the steps to setup and maintain your graphics workstation. To begin, you need an AWS account. For this walkthrough, we provision a g5.xlarge instance for cloud gaming.

Check instance type availability

For best performance and lowest latency, you will want to provision EC2 in the AWS Region nearest to you. Before proceeding, verify that the g5.xlarge instance type is available in your desired AWS Region, and the Availability Zones (AZs) in which it is available.

Log in to your Amazon EC2 console and select your AWS Region. From the navigation pane, choose Instance Types to view the available instance types. In the search bar, filter instance types to the specific type you want, in this case g5.xlarge. Toggle the display preferences (gear) icon to display the Availability zones column.

In the following screenshot, the g5.xlarge instance is available in two of the three AZs in the Europe (London) eu-west-2 Region.

Amazon EC2 console instance types

Figure 2: Amazon EC2 console instance types
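You can also check availability from the AWS CLI. A sketch using describe-instance-type-offerings, assuming the eu-west-2 Region from this example:

aws ec2 describe-instance-type-offerings \
    --location-type availability-zone \
    --filters Name=instance-type,Values=g5.xlarge \
    --region eu-west-2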

Check Amazon EC2 running on-demand G instances quota

Your AWS account has a limit on the number and type of EC2 instance types you can run, and you need to make sure you have enough quota to run the g5.xlarge instance.

Go to the Service Quotas console for your AWS Region. Under AWS services in the navigation pane, select Amazon Elastic Compute Cloud (Amazon EC2) and search for Running On-Demand G and VT instances. Verify that the Applied quota value is equal to or greater than the number of vCPUs for the instance size you need. In the following screenshot, the applied quota value is 64, which lets us launch instance sizes from xlarge (4 vCPUs) up to 16xlarge (64 vCPUs).

Service Quotas console

Figure 3: Service Quotas console

You can request a higher quota value by selecting Request quota increase.
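You can also read the applied quota value from the AWS CLI in your target Region. A sketch, assuming L-DB2E81BA is the quota code for Running On-Demand G and VT instances:

aws service-quotas get-service-quota \
    --service-code ec2 \
    --quota-code L-DB2E81BA \
    --query 'Quota.Value'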

Using CloudFormation template

Download the CloudFormation template file from aws-samples GitHub repository. Go to the CloudFormation console for your AWS Region to create a stack, and upload your downloaded file.

The CloudFormation parameters page is divided into the following sections:

  1. AMI and instance type
  2. EC2 configuration
  3. Allowed inbound source IP prefixes to NICE DCV port 8443
  4. EBS volume configuration

We go through the configuration settings for each section in detail.

AMI and instance type

In this section, we select the Windows Amazon Machine Image (AMI) to use, EC2 instance type to provision, and graphics driver to install. The default AMI is Microsoft Windows Server 2022.

Replace the instanceType value with g5.xlarge.

CloudFormation parameters: AMI and instance type

Figure 4: CloudFormation parameters: AMI and instance type

Select driverType based on your instance type and the following use case:

  1. AMD: select this for instance types with AMD GPU (G4ad instance).
  2. NVIDIA-Gaming: select this to install the NVIDIA gaming driver, which is optimized for gaming (G5 and G4dn instances).
  3. NVIDIA-GRID: select this to install the GRID driver, which is optimized for professional visualization applications that render content such as 3D models or high-resolution videos (G5, G4dn, and G3 instances).
  4. none: select this option for accelerated computing instances, such as P2 and P3 instances where you download and install public NVIDIA drivers manually.
  5. NICE-DCV: this installs the NICE DCV Virtual Display driver and is suitable for all other instance types.

Note that the GRID and NVIDIA gaming driver downloads are available to AWS customers only. Upon installation of the software, you are bound by the terms of the NVIDIA GRID Cloud End User License Agreement.

For our walkthrough, select NVIDIA-Gaming for driverType.

Amazon EC2 configuration

In this section, we specify the VPC and subnet in which to provision our EC2 instance. You can select default VPC from the vpcID dropdown. Make sure that the subnetID value you select is in your selected VPC and resides in an AZ that has your instance type offering. You can also change the EC2 instance name.

Select Yes for the assignStaticIP option if you want to associate a static Internet IPv4 address. Note that there is a small hourly charge when the instance is not running.

CloudFormation parameters: Amazon EC2 configuration

Figure 5: CloudFormation parameters: Amazon EC2 configuration

Allowed inbound source IP prefixes to NICE DCV port 8443

Here, we specify the source prefixes allowed to access our instance. The default values allow access from all addresses. To secure access to your instance, you may want to limit the source prefix to your IP address.

To get your IPv4 address, go to https://checkip.amazonaws.com, append /32 to the returned address, and use it as the value for ingressIPv4. The default VPC and subnet are IPv4-only, so you can enter ::1/128 for ingressIPv6 to explicitly block all IPv6 access.

CloudFormation parameters: Allowed inbound source IP prefixes

Figure 6: CloudFormation parameters: Allowed inbound source IP prefixes
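If you prefer the command line, a sketch of one way to construct the ingressIPv4 value, assuming curl is installed:

echo "$(curl -s https://checkip.amazonaws.com)/32"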

Amazon EBS volume configuration

The default Amazon Elastic Block Store (Amazon EBS) volume size is 30 GiB. You can specify a larger size by changing the volumeSize value.

CloudFormation parameters: Amazon EBS volume configuration

Figure 7: CloudFormation parameters: Amazon EBS volume configuration

Continue to provision your stack.

NICE DCV client

NICE DCV provides an HTML5 client for web browser access. For performance and additional features, such as QUIC UDP transport protocol support and USB remotization support, install the native client from the NICE DCV download page. NICE DCV offers native clients for Windows, macOS for both Intel and Apple M1 processors, and modern Linux distributions including RHEL, SUSE Linux, and Ubuntu.

CloudFormation Outputs

Once provisioning is complete, go to the Outputs section.

CloudFormation Outputs

Figure 8: CloudFormation Outputs

The following URLs are available.

  1. DCVwebConsole
  2. EC2instance
  3. RdpConnect
  4. SSMsessionManager

We go through the purpose of each URL in the following sections.

SSMsessionManager: Change administrator password

To log in, you must first set an administrator password. Open the SSMsessionManager URL in a new browser tab and run the command net user administrator <YOUR-PASSWORD>, where <YOUR-PASSWORD> is the password you choose.

Systems Manager session manager

Figure 9: Systems Manager session manager

DCVwebConsole: Connecting to the EC2 instance

Copy the DCVwebConsole value, open the NICE DCV client on your local machine, and use the copied value or the IP address to connect. Log in as administrator with the password that you configured. Alternatively, enter a URL in the format dcv://<EC2-IP-Address> in a browser URL bar to automatically launch and connect a locally installed NICE DCV client to your EC2 instance.

NICE DCV client

Figure 10: NICE DCV client

EC2instance: manage EC2 instance

Use this link to manage your EC2 instance in the Amazon EC2 console. If you did not select the static IP address option, then use this page to get the assigned IP address whenever you stop and start your instance.

RdpConnect: Fleet Manager console access

The RdpConnect link provides in-browser Remote Desktop Protocol (RDP) console access to your Windows instance. Choose User credentials for Authentication Type. Enter administrator for username and the password that you have configured.

Fleet Manager Remote Desktop

Figure 11: Fleet Manager Remote Desktop

Updating NICE DCV server

To update NICE DCV server, log in via Fleet Manager Remote Desktop and run the c:\users\administrator\update-DCV.cmd script. In the following screenshot, we successfully upgraded NICE DCV server from version 2022.2-14521 to 2023.0-15065.

Updating NICE DCV server

Figure 12: Updating NICE DCV server

Updating graphics drivers

You can use the download-<DRIVER_TYPE>-driver.cmd batch file to download the latest graphics driver for your instance type GPU. Downloaded files are located in the Downloads\Drivers folder.

Graphics driver download scripts

Figure 13: Graphics driver download scripts

AWS Command Line Interface (AWS CLI v2) is installed on the instance. You can use it to view the driver versions available in the driver S3 buckets. For example, the command aws s3 ls --recursive s3://nvidia-gaming/windows/ | sort /R lists the NVIDIA gaming drivers available for download. NVIDIA GRID and AMD drivers are in the s3://ec2-windows-nvidia-drivers and s3://ec2-amd-windows-drivers S3 buckets, respectively.

Listing graphics drivers on S3 bucket

Figure 14: Listing graphics drivers on S3 bucket

Use the command aws s3 cp s3://<S3_BUCKET_PATH>/<FILE-NAME> . to copy a specific driver from the S3 bucket to your local directory.

You can refer to Install NVIDIA drivers on Windows instances and Install AMD drivers on Windows instances for NVIDIA and AMD drivers installation instructions respectively.

Customizing your EC2 instance environment

You may want to customize the instance to your needs. For NVIDIA GPU instances, you can optimize GPU settings to achieve the best performance.

If you are doing video editing, then you can enable high color accuracy, configure multi-channel audio, and enable accurate audio/video sync. For gaming, you may enable gamepad support to use a DualShock 4 or Xbox 360 controller. NICE DCV session storage is enabled, which lets you transfer files using the NICE DCV client. More configuration options are available from the NICE DCV User Guide and Administrator Guide.

Terminating your EC2 instance

When you have finished using your EC2 instance, you can release all provisioned resources by going to the CloudFormation console and deleting your stack.
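Alternatively, you can delete the stack from the AWS CLI; a sketch, where the stack name is a placeholder:

aws cloudformation delete-stack --stack-name <YOUR-STACK-NAME>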

Conclusion

Amazon EC2 G4 and G5 GPU instance types are suitable for graphics-intensive applications, and NICE DCV provides a responsive, high-image-quality display protocol for remote access. Using the CloudFormation template from the amazon-ec2-nice-dcv-samples GitHub site, you can build and maintain your own high-performance Windows graphics workstation in the AWS Cloud.

AWS Cloud service considerations for designing multi-tenant SaaS solutions

Post Syndicated from Dennis Greene original https://aws.amazon.com/blogs/architecture/aws-cloud-service-considerations-for-designing-multi-tenant-saas-solutions/

An increasing number of software as a service (SaaS) providers are considering the move from single to multi-tenant to utilize resources more efficiently and reduce operational costs. This blog aims to inform customers of considerations when evaluating a transformation to multi-tenancy in the Amazon Web Services (AWS) Cloud. You’ll find valuable information on how to optimize your cloud-based SaaS design to reduce operating expenses, increase resiliency, and offer a high-performing experience for your customers.

Single versus multi-tenancy

In a multi-tenant architecture, resources like compute, storage, and databases can be shared among independent tenants. In contrast, a single-tenant architecture allocates exclusive resources to each tenant.

Let’s consider a SaaS product that needs to support many customers, each with their own independent deployed website. Using a single-tenant model (see Figure 1), the SaaS provider may opt to utilize a dedicated AWS account to host each tenant’s workloads. To contain their respective workloads, each tenant would have their own Amazon Elastic Compute Cloud (Amazon EC2) instances organized within an Auto Scaling group. Access to the applications running in these EC2 instances would be done via an Application Load Balancer (ALB). Each tenant would be allocated their own database environment using Amazon Relational Database Service (RDS). The website’s storage (consisting of PHP, JavaScript, CSS, and HTML files) would be provided by Amazon Elastic Block Store (EBS) volumes attached to the EC2 instances. The SaaS provider would have a control plane AWS account used to create and modify these tenant-specific accounts.

Single-tenant configuration

Figure 1. Single-tenant configuration

To transition to a multi-tenant pattern, the SaaS provider can use containerization to package each website, and a container orchestrator to deploy the websites across shared compute nodes (EC2 instances). Kubernetes can be employed as a container orchestrator, and a website would then be represented by a Kubernetes deployment and its associated pods. A Kubernetes namespace would serve as the logical encapsulation of the tenant-specific resources, as each tenant would be mapped to one Kubernetes namespace. The Kubernetes HorizontalPodAutoscaler can be utilized for autoscaling purposes, dynamically adjusting the number of replicas in the deployment on a given namespace based on workload demands.
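As an illustrative sketch of this pattern (the tenant, deployment, and image names are hypothetical), the per-tenant namespace and autoscaling could be set up with kubectl:

# Create a namespace to encapsulate one tenant's resources
kubectl create namespace tenant-a

# Deploy the tenant's website into its namespace
kubectl create deployment website --image=example/website:latest -n tenant-a

# Scale between 2 and 10 replicas based on CPU utilization
kubectl autoscale deployment website --cpu-percent=70 --min=2 --max=10 -n tenant-a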

When additional compute resources are required, tools such as the Cluster Autoscaler, or Karpenter, can dynamically add more EC2 instances to the shared Kubernetes Cluster. An ALB can be reused by multiple tenants to route traffic to the appropriate pods. For RDS, SaaS providers can use tenant-specific database schemas to separate tenant data. For static data, Amazon Elastic File System (EFS) and tenant-specific directories can be employed. The SaaS provider would still have a control plane AWS account that would now interact with the Kubernetes and AWS APIs to create and update tenant-specific resources.

This transition to a multi-tenant design utilizing Kubernetes, Amazon Elastic Kubernetes Service (EKS), and other managed services offers numerous advantages. It enables efficient resource utilization by leveraging containerization and auto-scaling capabilities, reducing costs, and optimizing performance (see Figure 2).

Multi-tenant configuration

Figure 2. Multi-tenant configuration

EKS cluster sizing and customer segmentation considerations in multi-tenancy designs

A high concentration of SaaS tenants hosted within the same system results in a large “blast radius”: a failure within the system has the potential to impact all resident tenants, which can lead to downtime for multiple tenants at once. To address this problem, SaaS providers are encouraged to partition their customers amongst multiple AWS accounts, each with their own deployments of this multi-tenant architecture. Only the SaaS provider can determine how many tenants to place in a single cluster, after weighing the risk of shared fate for a subset of their customers against the possible efficiency benefits of a multi-tenant architecture.

EKS security

SaaS providers must evaluate whether it’s appropriate for them to use containers as a workload isolation boundary. This is of particular importance in multi-tenant Kubernetes architectures, given that containers running on a single Amazon EC2 instance share the underlying Linux kernel. Security vulnerabilities place this shared resource (the EC2 instance) at risk from attack vectors originating on the host Linux instance. Risk is elevated when any container running in a Kubernetes pod initiates untrusted code, and heightened further if SaaS providers permit tenants to “bring their own code”. Kubernetes is a single-tenant orchestrator: with a multi-tenant approach to SaaS architectures, a single instance of the Amazon EKS control plane is shared among all the workloads running within a cluster. Amazon EKS considers the cluster as the hard isolation security boundary, and every Amazon EKS managed Kubernetes cluster is isolated in a dedicated single-tenant Amazon VPC. At present, hard multi-tenancy can only be implemented by provisioning a unique cluster for each tenant.

EFS considerations

A SaaS provider may consider EFS as the storage solution for the static content of the multiple tenants. This provides them with a straightforward, serverless, and elastic file system. Directories may be used to separate the content for each tenant. While this approach of creating tenant-specific directories in EFS provides many benefits, there may be challenges harvesting per-tenant utilization and performance metrics. This can result in operational challenges for providers that need to granularly meter per-tenant usage of resources. Consequently, noisy neighbors will be difficult to identify and remediate. To resolve this, SaaS providers should consider building a custom solution to monitor the individual tenants in the multi-tenant file system by leveraging storage and throughput/IOPS metrics.

RDS considerations

Multi-tenant workloads, where data for multiple customers or end users is consolidated in the same RDS database cluster, can present operational challenges for per-tenant observability. Both MySQL Community Edition and open-source PostgreSQL have limited ability to provide per-tenant observability and resource governance. AWS customers operating multi-tenant workloads often use a combination of ‘database’ or ‘schema’ and ‘database user’ accounts as substitutes, and should use alternate mechanisms to establish a mapping between a tenant and these substitutes. This lets you process raw observability data from the database engine externally, map the substitutes back to tenants, and distinguish tenants in the observability data.

Conclusion

In this blog, we’ve shown what to consider when moving to a multi-tenant SaaS solution in the AWS Cloud, how to optimize your cloud-based SaaS design, and some challenges and remediations. Invest effort early in your SaaS design strategy to explore your customer requirements for tenancy. Work backwards from your SaaS tenants’ end goals. What level of computing performance do they require? What are the required cybersecurity features? How will you, as the SaaS provider, monitor and operate your platform with the target tenancy configuration? Your respective AWS account team is highly qualified to advise on these design decisions. Take advantage of reviewing and improving your design using the AWS Well-Architected Framework. The tenancy design process should be followed by extensive prototyping to validate functionality before production rollout.

New Seventh-Generation General Purpose Amazon EC2 Instances (M7i-Flex and M7i)

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-seventh-generation-general-purpose-amazon-ec2-instances-m7i-flex-and-m7i/

Today we are launching Amazon Elastic Compute Cloud (Amazon EC2) M7i-Flex and M7i instances powered by custom 4th generation Intel Xeon Scalable processors, available only on AWS, that offer the best performance among comparable Intel processors in the cloud – up to 15% faster than Intel processors utilized by other cloud providers. M7i-Flex instances are available in the five most common sizes, and are designed to give you up to 19% better price/performance than M6i instances for many workloads. The M7i instances are available in nine sizes (with two sizes of bare metal instances in the works), and offer 15% better price/performance than the previous generation of Intel-powered instances.

M7i-Flex Instances
The M7i-Flex instances are a lower-cost variant of the M7i instances, with 5% better price/performance and 5% lower prices. They are great for applications that don’t fully utilize all compute resources. The M7i-Flex instances deliver a baseline of 40% CPU performance, and can scale up to full CPU performance 95% of the time. M7i-Flex instances are ideal for running general purpose workloads such as web and application servers, virtual desktops, batch processing, microservices, databases, and enterprise applications. If you are currently using earlier generations of general-purpose instances, you can adopt M7i-Flex instances without having to make changes to your application or your workload.

Here are the specs for the M7i-Flex instances:

Instance Name      vCPUs   Memory    Network Bandwidth   EBS Bandwidth
m7i-flex.large     2       8 GiB     up to 12.5 Gbps     up to 10 Gbps
m7i-flex.xlarge    4       16 GiB    up to 12.5 Gbps     up to 10 Gbps
m7i-flex.2xlarge   8       32 GiB    up to 12.5 Gbps     up to 10 Gbps
m7i-flex.4xlarge   16      64 GiB    up to 12.5 Gbps     up to 10 Gbps
m7i-flex.8xlarge   32      128 GiB   up to 12.5 Gbps     up to 10 Gbps

M7i Instances
For workloads such as large application servers and databases, gaming servers, CPU based machine learning, and video streaming that need the largest instance sizes or high CPU continuously, you can get price/performance benefits by using M7i instances.

Here are the specs for the M7i instances:

Instance Name   vCPUs   Memory    Network Bandwidth   EBS Bandwidth
m7i.large       2       8 GiB     up to 12.5 Gbps     up to 10 Gbps
m7i.xlarge      4       16 GiB    up to 12.5 Gbps     up to 10 Gbps
m7i.2xlarge     8       32 GiB    up to 12.5 Gbps     up to 10 Gbps
m7i.4xlarge     16      64 GiB    up to 12.5 Gbps     up to 10 Gbps
m7i.8xlarge     32      128 GiB   12.5 Gbps           10 Gbps
m7i.12xlarge    48      192 GiB   18.75 Gbps          15 Gbps
m7i.16xlarge    64      256 GiB   25.0 Gbps           20 Gbps
m7i.24xlarge    96      384 GiB   37.5 Gbps           30 Gbps
m7i.48xlarge    192     768 GiB   50 Gbps             40 Gbps

You can attach up to 128 EBS volumes to each M7i instance; by way of comparison, the M6i instances allow you to attach up to 28 volumes.

We are also getting ready to launch two sizes of bare metal M7i instances:

Instance Name    vCPUs   Memory    Network Bandwidth   EBS Bandwidth
m7i.metal-24xl   96      384 GiB   37.5 Gbps           30 Gbps
m7i.metal-48xl   192     768 GiB   50.0 Gbps           40 Gbps

Built-In Accelerators
The Sapphire Rapids processors include four built-in accelerators, each providing hardware acceleration for a specific workload:

  • Advanced Matrix Extensions (AMX) – This set of extensions to the x86 instruction set improves deep learning and inferencing, and supports workloads such as natural language processing, recommendation systems, and image recognition. The extensions provide high-speed multiplication operations on 2-dimensional matrices of INT8 or BF16 values. To learn more, read Chapter 3 of the Intel AMX Instruction Set Reference.
  • Intel Data Streaming Accelerator (DSA) – This accelerator drives high performance for storage, networking, and data-intensive workloads by offloading common data movement tasks between CPU, memory, caches, network devices, and storage devices, improving streaming data movement and transformation operations. Read Introducing the Intel Data Streaming Accelerator (Intel DSA) to learn more.
  • Intel In-Memory Analytics Accelerator (IAA) – This accelerator runs database and analytic workloads faster, with the potential for greater power efficiency. In-memory compression, decompression, and encryption at very high throughput, and a suite of analytics primitives support in-memory databases, open source database, and data stores like RocksDB and ClickHouse. To learn more, read the Intel In-Memory Analytics Accelerator (Intel IAA) Architecture Specification.
  • Intel QuickAssist Technology (QAT) – This accelerator offloads encryption, decryption, and compression, freeing up processor cores and reducing power consumption. It also supports merged compression and encryption in a single data flow. To learn more, start at the Intel QuickAssist Technology (Intel QAT) Overview.

Some of these accelerators require the use of specific kernel versions, drivers, and/or compilers.

The Advanced Matrix Extensions are available on all sizes of M7i and M7i-Flex instances. The Intel QAT, Intel IAA, and Intel DSA accelerators will be available on the m7i.metal-24xl and m7i.metal-48xl instances.

Details
Here are a couple of things to keep in mind about the M7i-Flex and M7i instances:

Regions – The new instances are available in the US East (Ohio, N. Virginia), US West (Oregon), and Europe (Ireland) AWS Regions, and we plan to expand to additional regions throughout the rest of 2023.

Purchasing Options – M7i-Flex and M7i instances are available in On-Demand, Reserved Instance, Savings Plan, and Spot form. M7i instances are also available in Dedicated Host and Dedicated Instance form.

Jeff;

Prime Day 2023 Powered by AWS – All the Numbers

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/prime-day-2023-powered-by-aws-all-the-numbers/

As part of my annual tradition to tell you about how AWS makes Prime Day possible, I am happy to be able to share some chart-topping metrics (check out my 2016, 2017, 2019, 2020, 2021, and 2022 posts for a look back).

This year I bought all kinds of stuff for my hobbies including a small drill press, filament for my 3D printer, and irrigation tools. I also bought some very nice Alphablock books for my grandkids. According to our official release, the first day of Prime Day was the single largest sales day ever on Amazon and for independent sellers, with more than 375 million items purchased.

Prime Day by the Numbers
As always, Prime Day was powered by AWS. Here are some of the most interesting and/or mind-blowing metrics:

Amazon Elastic Block Store (Amazon EBS) – The Amazon Prime Day event resulted in an incremental 163 petabytes of EBS storage capacity allocated, generating a peak of 15.35 trillion requests and 764 petabytes of data transfer per day. Compared to the previous year, Amazon increased peak usage on EBS by only 7% year-over-year, yet delivered 35% more traffic per day, due to efficiency efforts including workload optimization using Amazon Elastic Compute Cloud (Amazon EC2) AWS Graviton-based instances.

AWS CloudTrail – AWS CloudTrail processed over 830 billion events in support of Prime Day 2023.

Amazon DynamoDB – DynamoDB powers multiple high-traffic Amazon properties and systems including Alexa, the Amazon.com sites, and all Amazon fulfillment centers. Over the course of Prime Day, these sources made trillions of calls to the DynamoDB API. DynamoDB maintained high availability while delivering single-digit millisecond responses and peaking at 126 million requests per second.

Amazon Aurora – On Prime Day, 5,835 database instances running the PostgreSQL-compatible and MySQL-compatible editions of Amazon Aurora processed 318 billion transactions, stored 2,140 terabytes of data, and transferred 836 terabytes of data.

Amazon Simple Email Service (SES) – Amazon SES sent 56% more emails for Amazon.com during Prime Day 2023 vs. 2022, delivering 99.8% of those emails to customers.

Amazon CloudFront – Amazon CloudFront handled a peak load of over 500 million HTTP requests per minute, for a total of over 1 trillion HTTP requests during Prime Day.

Amazon SQS – During Prime Day, Amazon SQS set a new traffic record by processing 86 million messages per second at peak. This is a 22% increase from Prime Day 2022, when SQS supported 70.5 million messages per second.

Amazon Elastic Compute Cloud (EC2) – During Prime Day 2023, Amazon used tens of millions of normalized AWS Graviton-based Amazon EC2 instances, 2.7x more than in 2022, to power over 2,600 services. By using more Graviton-based instances, Amazon was able to get the compute capacity needed while using up to 60% less energy.

Amazon Pinpoint – Amazon Pinpoint sent tens of millions of SMS messages to customers during Prime Day 2023 with a delivery success rate of 98.3%.

Prepare to Scale
Every year I reiterate the same message: rigorous preparation is key to the success of Prime Day and our other large-scale events. If you are preparing for a similar chart-topping event of your own, I strongly recommend that you take advantage of AWS Infrastructure Event Management (IEM). As part of an IEM engagement, my colleagues will provide you with architectural and operational guidance that will help you to execute your event with confidence!

Jeff;

New – AWS Public IPv4 Address Charge + Public IP Insights

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-aws-public-ipv4-address-charge-public-ip-insights/

We are introducing a new charge for public IPv4 addresses. Effective February 1, 2024, there will be a charge of $0.005 per IP per hour for all public IPv4 addresses, whether attached to a service or not (there is already a charge for public IPv4 addresses you allocate in your account but don’t attach to an EC2 instance).

Public IPv4 Charge
As you may know, IPv4 addresses are an increasingly scarce resource and the cost to acquire a single public IPv4 address has risen more than 300% over the past 5 years. This change reflects our own costs and is also intended to encourage you to be a bit more frugal with your use of public IPv4 addresses and to think about accelerating your adoption of IPv6 as a modernization and conservation measure.

This change applies to all AWS services including Amazon Elastic Compute Cloud (Amazon EC2), Amazon Relational Database Service (RDS) database instances, Amazon Elastic Kubernetes Service (EKS) nodes, and other AWS services that can have a public IPv4 address allocated and attached, in all AWS regions (commercial, AWS China, and GovCloud). Here’s a summary in tabular form:

Public IP Address Type                                                Current Price/Hour (USD)   New Price/Hour (USD, effective February 1, 2024)
In-use public IPv4 address (including Amazon-provided public IPv4
and Elastic IP) assigned to resources in your VPC, Amazon Global
Accelerator, and AWS Site-to-Site VPN tunnels                         No charge                  $0.005
Additional (secondary) Elastic IP address on a running EC2 instance   $0.005                     $0.005
Idle Elastic IP address in account                                    $0.005                     $0.005

The AWS Free Tier for EC2 will include 750 hours of public IPv4 address usage per month for the first 12 months, effective February 1, 2024. You will not be charged for IP addresses that you own and bring to AWS using Amazon BYOIP.

Starting today, your AWS Cost and Usage Reports automatically include public IPv4 address usage. When this price change goes into effect next year, you will also be able to use AWS Cost Explorer to see and better understand your usage.

As I noted earlier in this post, I would like to encourage you to consider accelerating your adoption of IPv6. A new blog post shows you how to use Elastic Load Balancers and NAT Gateways for ingress and egress traffic, while avoiding the use of a public IPv4 address for each instance that you launch. There are resources available that show you how you can use IPv6 with widely used services such as EC2, Amazon Virtual Private Cloud (Amazon VPC), Amazon Elastic Kubernetes Service (EKS), Elastic Load Balancing, and Amazon Relational Database Service (RDS).

Earlier this year we enhanced EC2 Instance Connect and gave it the ability to connect to your instances using private IPv4 addresses. As a result, you no longer need to use public IPv4 addresses for administrative purposes (generally using SSH or RDP).

Public IP Insights
In order to make it easier for you to monitor, analyze, and audit your use of public IPv4 addresses, today we are launching Public IP Insights, a new feature of Amazon VPC IP Address Manager that is available to you at no cost. In addition to helping you make efficient use of public IPv4 addresses, Public IP Insights gives you a better understanding of your security profile. You can see the breakdown of public IP types and EIP usage, with multiple filtering options.

You can also see, sort, filter, and learn more about each of the public IPv4 addresses that you are using.

Using IPv4 Addresses Efficiently
By using the new Public IP Insights tool and following the guidance that I shared above, you should be ready to update your application to minimize the effect of the new charge. You may also want to consider using AWS Direct Connect to set up a dedicated network connection to AWS.

Finally, be sure to read our new blog post, Identify and Optimize Public IPv4 Address Usage on AWS, for more information on how to make the best use of public IPv4 addresses.

Jeff;

New Amazon EC2 Instances (C7gd, M7gd, and R7gd) Powered by AWS Graviton3 Processor with Local NVMe-based SSD Storage

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/new-amazon-ec2-instances-c7gd-m7gd-and-r7gd-powered-by-aws-graviton3-processor-with-local-nvme-based-ssd-storage/

We launched Amazon EC2 C7g instances in May 2022 and M7g and R7g instances in February 2023. Powered by the latest AWS Graviton3 processors, the new instances deliver up to 25 percent higher performance, up to 2 times higher floating-point performance, and up to 2 times faster cryptographic workload performance compared to AWS Graviton2 processors.

Graviton3 processors deliver up to 3 times better performance compared to AWS Graviton2 processors for machine learning (ML) workloads, including support for bfloat16. They also support DDR5 memory that provides 50 percent more memory bandwidth compared to DDR4. Graviton3 also uses up to 60 percent less energy for the same performance as comparable EC2 instances, which helps you reduce your carbon footprint.

The C7g instances are well suited for compute-intensive workloads, such as high performance computing (HPC), batch processing, ad serving, video encoding, gaming, scientific modeling, distributed analytics, and CPU-based machine learning inference. The M7g instances are for general purpose workloads such as application servers, microservices, gaming servers, mid-sized data stores, and caching fleets. The R7g instances are a great fit for memory-intensive workloads such as open-source databases, in-memory caches, and real-time big data analytics.

Today, we’re adding a d variant to all three instance families. The new Amazon EC2 C7gd, M7gd, and R7gd instance types have up to 2 x 1.9 TB of locally attached NVM Express (NVMe) SSD drives that are physically connected to the host server and provide block-level storage that is coupled to the lifetime of the instance. These instances have up to 45 percent better real-time NVMe storage performance than comparable Graviton2-based instances.

These are a great fit for applications that need access to high-speed, low-latency local storage, including those that need temporary storage of data for scratch space, temporary files, and caches. The data on an instance store volume persists only during the life of the associated EC2 instance.

Here are the specs for these instances:

Instance Size   vCPUs   Memory (GiB, C7gd/M7gd/R7gd)   Local NVMe Storage (GB)   Network Bandwidth (Gbps)   EBS Bandwidth (Gbps)
medium          1       2 / 4 / 8                      1 x 59                    Up to 12.5                 Up to 10
large           2       4 / 8 / 16                     1 x 118                   Up to 12.5                 Up to 10
xlarge          4       8 / 16 / 32                    1 x 237                   Up to 12.5                 Up to 10
2xlarge         8       16 / 32 / 64                   1 x 474                   Up to 15                   Up to 10
4xlarge         16      32 / 64 / 128                  1 x 950                   Up to 15                   Up to 10
8xlarge         32      64 / 128 / 256                 1 x 1900                  15                         10
12xlarge        48      96 / 192 / 384                 2 x 1425                  22.5                       15
16xlarge        64      128 / 256 / 512                2 x 1900                  30                         20

These instances are built on the AWS Nitro System, a combination of AWS-designed dedicated hardware and a lightweight hypervisor that allows the delivery of isolated multitenancy, private networking, and fast local storage. They provide up to 20 Gbps Amazon Elastic Block Store (Amazon EBS) bandwidth and up to 30 Gbps network bandwidth. The 16xlarge instances also support Elastic Fabric Adapter (EFA) for applications that need a high level of inter-node communication.

Now Available
Amazon EC2 C7gd, M7gd, and R7gd instances are now available in the following AWS Regions: US East (Ohio), US East (N. Virginia), US West (Oregon), and Europe (Ireland). As usual with Amazon EC2, you only pay for what you use. For more information, see the Amazon EC2 pricing page.

If you’re optimizing applications for Arm architecture, be sure to have a look at our Getting Started collection of resources or learn more about AWS Graviton3-based EC2 instances.

To learn more, visit our Amazon EC2 C7g instances, M7g instances or R7g instances page, and please send feedback to AWS re:Post for EC2 or through your usual AWS Support contacts.

Channy

New – Amazon EC2 P5 Instances Powered by NVIDIA H100 Tensor Core GPUs for Accelerating Generative AI and HPC Applications

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/new-amazon-ec2-p5-instances-powered-by-nvidia-h100-tensor-core-gpus-for-accelerating-generative-ai-and-hpc-applications/

In March 2023, AWS and NVIDIA announced a multipart collaboration focused on building the most scalable, on-demand artificial intelligence (AI) infrastructure optimized for training increasingly complex large language models (LLMs) and developing generative AI applications.

We preannounced Amazon Elastic Compute Cloud (Amazon EC2) P5 instances powered by NVIDIA H100 Tensor Core GPUs and AWS’s latest networking and scalability that will deliver up to 20 exaflops of compute performance for building and training the largest machine learning (ML) models. This announcement is the product of more than a decade of collaboration between AWS and NVIDIA, delivering the visual computing, AI, and high performance computing (HPC) clusters across the Cluster GPU (cg1) instances (2010), G2 (2013), P2 (2016), P3 (2017), G3 (2017), P3dn (2018), G4 (2019), P4 (2020), G5 (2021), and P4de instances (2022).

Most notably, ML model sizes are now reaching trillions of parameters. But this complexity has increased customers’ time to train, where the latest LLMs are now trained over the course of multiple months. HPC customers also exhibit similar trends. With the fidelity of HPC customer data collection increasing and data sets reaching exabyte scale, customers are looking for ways to enable faster time to solution across increasingly complex applications.

Introducing EC2 P5 Instances
Today, we are announcing the general availability of Amazon EC2 P5 instances, the next-generation GPU instances to address those customer needs for high performance and scalability in AI/ML and HPC workloads. P5 instances are powered by the latest NVIDIA H100 Tensor Core GPUs and will provide a reduction of up to 6 times in training time (from days to hours) compared to previous generation GPU-based instances. This performance increase will enable customers to see up to 40 percent lower training costs.

P5 instances provide 8 x NVIDIA H100 Tensor Core GPUs with 640 GB of high bandwidth GPU memory, 3rd Gen AMD EPYC processors, 2 TB of system memory, and 30 TB of local NVMe storage. P5 instances also provide 3200 Gbps of aggregate network bandwidth with support for GPUDirect RDMA, enabling lower latency and efficient scale-out performance by bypassing the CPU on internode communication.

Here are the specs for these instances:

Instance Size   vCPUs   Memory (GiB)   GPUs (H100)   Network Bandwidth (Gbps)   EBS Bandwidth (Gbps)   Local Storage (TB)
p5.48xlarge     192     2048           8             3200                       80                     8 x 3.84


P5 instances are ideal for training and running inference for increasingly complex LLMs and computer vision models behind the most demanding and compute-intensive generative AI applications, including question answering, code generation, video and image generation, speech recognition, and more. P5 will provide up to 6 times lower time to train compared with previous generation GPU-based instances across those applications. Customers who can use lower-precision FP8 data types in their workloads, common in many language models that use a transformer backbone, will see a further benefit of up to 6 times higher performance through support for the NVIDIA Transformer Engine.

HPC customers using P5 instances can deploy demanding applications at greater scale in pharmaceutical discovery, seismic analysis, weather forecasting, and financial modeling. Customers using dynamic programming (DP) algorithms for applications like genome sequencing or accelerated data analytics will also see further benefit from P5 through support for a new DPX instruction set.

This enables customers to explore problem spaces that previously seemed unreachable, iterate on their solutions at a faster clip, and get to market more quickly.

You can see detailed instance specifications along with a comparison between p4d.24xlarge and the new p5.48xlarge below:

Feature                         p4d.24xlarge         p5.48xlarge       Comparison
Number & Type of Accelerators   8 x NVIDIA A100      8 x NVIDIA H100
FP8 TFLOPS per Server           –                    16,000            640% vs. A100 FP16
FP16 TFLOPS per Server          2,496                8,000
GPU Memory                      40 GB                80 GB             200%
GPU Memory Bandwidth            12.8 TB/s            26.8 TB/s         200%
CPU Family                      Intel Cascade Lake   AMD Milan
vCPUs                           96                   192               200%
Total System Memory             1152 GB              2048 GB           200%
Networking Throughput           400 Gbps             3200 Gbps         800%
EBS Throughput                  19 Gbps              80 Gbps           400%
Local Instance Storage          8 TBs NVMe           30 TBs NVMe       375%
GPU to GPU Interconnect         600 GB/s             900 GB/s          150%

Second-generation Amazon EC2 UltraClusters and Elastic Fabric Adapter
P5 instances provide market-leading scale-out capability for multi-node distributed training and tightly coupled HPC workloads. They offer up to 3,200 Gbps of networking using second-generation Elastic Fabric Adapter (EFA) technology, 8 times more than P4d instances.

To address customer needs for large-scale and low latency, P5 instances are deployed in the second-generation EC2 UltraClusters, which now provide customers with lower latency across up to 20,000+ NVIDIA H100 Tensor Core GPUs. Providing the largest scale of ML infrastructure in the cloud, P5 instances in EC2 UltraClusters deliver up to 20 exaflops of aggregate compute capability.

EC2 UltraClusters use Amazon FSx for Lustre, fully managed shared storage built on the most popular high-performance parallel file system. With FSx for Lustre, you can quickly process massive datasets on demand and at scale and deliver sub-millisecond latencies. The low-latency and high-throughput characteristics of FSx for Lustre are optimized for deep learning, generative AI, and HPC workloads on EC2 UltraClusters.

FSx for Lustre keeps the GPUs and ML accelerators in EC2 UltraClusters fed with data, accelerating the most demanding workloads. These workloads include LLM training, generative AI inferencing, and HPC workloads, such as genomics and financial risk modeling.

Getting Started with EC2 P5 Instances
To get started, you can use P5 instances in the US East (N. Virginia) and US West (Oregon) Regions.

When launching P5 instances, you will choose AWS Deep Learning AMIs (DLAMIs) to support P5 instances. DLAMI provides ML practitioners and researchers with the infrastructure and tools to quickly build scalable, secure distributed ML applications in preconfigured environments.

You will be able to run containerized applications on P5 instances with AWS Deep Learning Containers using libraries for Amazon Elastic Container Service (Amazon ECS) or Amazon Elastic Kubernetes Service (Amazon EKS). For a more managed experience, you can also use P5 instances via Amazon SageMaker, which helps developers and data scientists easily scale to tens, hundreds, or thousands of GPUs to train a model quickly at any scale without worrying about setting up clusters and data pipelines. HPC customers can leverage AWS Batch and AWS ParallelCluster with P5 to help orchestrate jobs and clusters efficiently.

Existing P4 customers will need to update their AMIs to use P5 instances. Specifically, you will need to update your AMIs to include the latest NVIDIA driver with support for NVIDIA H100 Tensor Core GPUs. You will also need to install the latest CUDA version (CUDA 12), cuDNN version, framework versions (for example, PyTorch and TensorFlow), and EFA driver with updated topology files. To make this process easy for you, we will provide new DLAMIs and Deep Learning Containers that come prepackaged with all the needed software and frameworks to use P5 instances out of the box.
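After updating an AMI, you can verify the driver stack from within a running instance; a quick check, assuming the NVIDIA tools are on the path:

# Confirm the NVIDIA driver version and the CUDA version it supports
nvidia-smi

# Confirm the installed CUDA toolkit version (assumes the toolkit is installed)
nvcc --version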

Now Available
Amazon EC2 P5 instances are available today in the US East (N. Virginia) and US West (Oregon) AWS Regions. For more information, see the Amazon EC2 pricing page. To learn more, visit our P5 instance page, and send feedback through AWS re:Post for EC2 or your usual AWS Support contacts.

You can choose a broad range of AWS services that have generative AI built in, all running on the most cost-effective cloud infrastructure for generative AI. To learn more, visit Generative AI on AWS to innovate faster and reinvent your applications.

Channy

How to scan EC2 AMIs using Amazon Inspector

Post Syndicated from Luke Notley original https://aws.amazon.com/blogs/security/how-to-scan-ec2-amis-using-amazon-inspector/

Amazon Inspector is an automated vulnerability management service that continually scans Amazon Web Services (AWS) workloads for software vulnerabilities and unintended network exposure. Amazon Inspector supports vulnerability reporting and deep inspection of Amazon Elastic Compute Cloud (Amazon EC2) instances, container images stored in Amazon Elastic Container Registry (Amazon ECR), and AWS Lambda functions. Operating system and programming language support is extensive, ranging from Bottlerocket to Windows Server.

Many customers use Amazon EC2 Auto Scaling groups as part of their resilience and scaling architecture for their workloads. With Auto Scaling groups, you can scale and deploy rapidly by using Amazon Machine Images (AMIs). However, AMIs within your environment can quickly become outdated as new vulnerabilities are discovered. A security best practice is to perform routine vulnerability assessments of your AMIs to identify whether newfound vulnerabilities apply to them. If you identify a vulnerability, you can update the AMI with the appropriate security patches, test the AMI in lower environments, and deploy the updated AMI in your environment. At this time, Amazon Inspector only supports scanning of running EC2 instances.

In this blog post, we’ll share a solution that you can use with Amazon EventBridge, AWS Lambda, AWS Step Functions, Amazon Simple Notification Service (Amazon SNS), and Amazon Simple Storage Service (Amazon S3) to scan AMIs and generate Amazon Inspector finding reports to help ensure that your AMIs are scanned for known vulnerabilities and updated prior to deployment. Then, we will show you how to periodically scan selected EC2 AMIs based on a tagging strategy, and take automated actions.

Prerequisites

The solution provided in this post has a number of items that you will need to review and address before you deploy the solution:

  1. Make sure that the AMI to be scanned by Amazon Inspector is based on one of the operating systems that AWS supports for EC2 scanning.
  2. To successfully complete a scan, Amazon Inspector requires the EC2 instance to be a managed instance in AWS Systems Manager that has the Systems Manager Agent installed and running, and has an attached AWS Identity and Access Management (IAM) instance profile that allows Systems Manager to manage the instance. For more information, see Scanning Amazon EC2 instances with Amazon Inspector.
  3. If you use customer managed keys to encrypt Amazon Elastic Block Store (Amazon EBS) volumes and you have a default EC2 configuration set to encrypt EBS volumes, you will need to configure additional key policy permissions. For the customer managed key that encrypts EBS volumes, add the following example policy statement to the key policy. Make sure to replace <111122223333> with your own AWS account ID.
    {
        "Sid": "Allow use of the key by AMI Scanner State Machine",
        "Effect": "Allow",
        "Principal": {
            "AWS": "arn:aws:iam::<111122223333>:role/service-role/AMIScanner-Statemachine-role"
        },
        "Action": [
            "kms:Encrypt",
            "kms:Decrypt",
            "kms:ReEncrypt*",
            "kms:GenerateDataKey*",
            "kms:DescribeKey"
        ],
        "Resource": "*"
    },

    If you don’t add this additional policy, the Step Functions state machine won’t allow the EC2 instances to launch. For more information, see Key policy sections that allow access to the customer managed key.

  4. The solution in this blog post requires that you activate Amazon Inspector in your AWS account. If you haven’t activated Amazon Inspector yet, learn more about the free trial and pricing, and follow the steps in the Amazon Inspector documentation to set up the service and start monitoring your account. Alternatively, you can activate Amazon Inspector by using the AWS Command Line Interface (AWS CLI) and this GitHub example.

Solution overview and architecture

In this solution, you will use the following AWS services and features:

  • Task orchestration
    • AWS Step Functions state machine workflows are used in this solution to verify that conditions are successfully validated before moving to the next task. This helps ensure that the Amazon Inspector scanning of the temporary instance launched in the first state machine is completed before the second state machine starts. This can help reduce the overall cost of the solution and can help prevent the first state machine from reaching state transition limitations.
    • Lambda functions handle the logic for retrieving AMIs to be scanned, launching temporary instances, creating Amazon EventBridge rules, tagging AMIs, and exporting Amazon Inspector reports to Amazon S3.
  • AMI tagging
    • To use this solution, you need to tag the AMIs that Amazon Inspector will scan, because a Lambda function will use these tags to start the solution orchestration. For this post, we use the tag InspectorScan with a value of true. With AMI tagging, you can configure automated processes as part of your deployment pipelines to implement the tagging (see the example command after this list).
  • Storage of exported Amazon Inspector findings
    • Amazon S3 helps you store the exported Amazon Inspector findings reports and use them in a standardized format for multiple use cases across AWS services, or use Amazon Athena to query the reports, which we will cover later in the post. Each scan report is stored in the S3 bucket and is named in the form AMI-NAME/guid.JSON or AMI-NAME/guid.CSV, depending on the export format that you specify.
    • You can also use S3 event notifications to alert different operational teams that there are Amazon Inspector scan results that require review.
  • Encryption of Amazon Inspector findings reports
    • AWS Key Management Service (AWS KMS) is used to encrypt the findings report. The AWS KMS key used must be a customer managed, symmetric KMS encryption key, and importantly, the key must be in the same AWS Region as the S3 bucket that you configured to store the report. The solution in this post creates a new KMS key, as well as a key policy that is configured to grant permissions for Amazon Inspector to use the key.
  • Event tracking and scheduling
    • This solution uses an Amazon EventBridge rule to listen for completed Amazon Inspector scan events for each temporary EC2 instance launch. When the EventBridge rule finds a matched event, the rule passes the required parameters and invokes the second Step Functions state machine. The event pattern used in this solution uses the following format:
      {
        "source": ["aws.inspector2"],
        "detail-type": ["Inspector2 Scan"],
        "resources": ["i-abcdef01234567890"]
      }

    • You can schedule the AMI scanning by using an EventBridge rule that invokes a Lambda function on a schedule. The rule uses a cron expression to run weekly (for example, cron(0 9 ? * 2 *) runs every Monday at 09:00 UTC); you can adjust this schedule according to your requirements. Initially, this rule is disabled to allow you to configure and enable it at a later stage.
    • Amazon SNS sends notifications during the AMI scanning solution process. From the SNS topic, you can configure different subscriptions, depending on your preferred use case and environment. An example of a subscription could be a shared mailbox email address for the security team or incident ticketing system.
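
If you tag AMIs as part of a pipeline, a minimal example (assuming the InspectorScan tag used in this post and an illustrative AMI ID) is the following AWS CLI command:

aws ec2 create-tags \
    --resources ami-abcdef01234567890 \
    --tags Key=InspectorScan,Value=true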

Figure 1 shows the solution architecture.

Figure 1: Amazon Inspector scanning of an AMI

Figure 1: Amazon Inspector scanning of an AMI

The high-level workflow of the solution is as follows:

  1. You can use EventBridge to create a scheduled rule to invoke a Lambda function. You can set the rule for daily, weekly, or monthly, depending on your use case.
  2. The Lambda function searches for AMIs with the appropriate tags and passes these as parameters to the Step Functions workflow.
  3. The first Step Functions state machine is invoked for each AMI to be scanned.
  4. The first Step Functions workflow deploys a temporary EC2 instance from the AMI that is defined.
  5. A Lambda function is invoked to create an EventBridge rule.
  6. An EventBridge rule is created to listen for the successful Amazon Inspector scanned event of the temporary EC2 instance.
  7. A Lambda function is invoked to tag the EC2 instance.
  8. The temporary EC2 instance is tagged to indicate that Amazon Inspector scanning is in progress.
  9. The first Step Functions workflow sends a notification to an SNS topic.
  10. The EventBridge rule parses the required parameters and invokes the second Step Functions state machine.
  11. A Lambda function is invoked to generate an Amazon Inspector report and export the findings to an S3 bucket.
  12. The scanned Amazon Inspector AMI results are saved to an S3 bucket.
  13. The Step Functions workflow terminates the temporary EC2 instance, which reduces cost and cleans up the process.
  14. A Lambda function is invoked to delete the temporary EventBridge rule.
  15. The temporary EventBridge rule and targets are deleted.
  16. A Lambda function is invoked to tag the AMI.
  17. The scanned AMI is updated with tagging metadata.
  18. The second Step Functions workflow sends a final notification to an SNS topic.

Deploy the solution

The solution will be deployed with the scheduled rule in Amazon EventBridge disabled to allow you to create your tagging strategy and to familiarize yourself with the solution. Later in this post, we’ll cover how to enable the Amazon EventBridge scheduled rule.

Step 1: Deploy the CloudFormation template

For this next step, make sure that you deploy the CloudFormation template provided for multi-AMI scanning in the AWS account and Region where you want to test this solution.

To deploy the CloudFormation template

  1. Choose the following Launch Stack button to launch a CloudFormation stack in your account. Note that the stack will launch in the N. Virginia (us-east-1) Region. To deploy this solution into other AWS Regions, download the solution’s CloudFormation template, modify it, and deploy it to the selected Region.

    Make sure that you configure the following parameters in the CloudFormation template so that it deploys successfully:

    • AMITagName — The AMI tag name to check if the AMI should be scanned by Amazon Inspector.
    • AMITagValue — The AMI tag value to check if the AMI should be scanned by Amazon Inspector.
    • InspectorReportFormat — The report format, which can be either CSV or JSON.
    • InstanceSubnetID — The subnet ID to launch the temporary EC2 instance into.
    • InstanceType — The instance type to deploy the AMI to for temporary scanning purposes.
    • KmsKeyAdministratorRole — The existing IAM role that needs to have administrator access to the KMS key created for the solution. This key provides access to encrypt and decrypt the Amazon Inspector report.
    • S3ReportBucketName — The name of the S3 bucket to be created.
    • SnsTopic — The name of the new SNS topic to be created. This name defines the SNS topic that notifications are published to.
  2. Review the stack name and the parameters for the template.
  3. On the Quick create stack screen, scroll to the bottom and select I acknowledge that AWS CloudFormation might create IAM resources.
  4. Choose Create stack. The deployment of the CloudFormation stack will take 3–4 minutes.

After the CloudFormation stack has deployed successfully, you can use the deployed solution.

Step 2: Manually run the first Step Functions workflow

The first Step Functions state machine requires parameters to be passed in; the SingleAMI Lambda function accomplishes this. You can start the Lambda function by creating a test event and passing the correct JSON text and parameters. The following parameters are available in the output section of the CloudFormation stack that the solution deployed:

  • AmiId — The ID of the AMI to be used for deploying the EC2 instance. This is the EC2 AMI to be scanned.
  • EC2InstanceProfile — The Amazon Resource Name (ARN) of the EC2 instance profile that the CloudFormation stack created.
  • InstanceType — The type of EC2 instance to use for deployment.
  • KmsKeyName — The ARN of the KMS key to be used for encrypting and decrypting the Amazon Inspector report that the CloudFormation stack created.
  • S3Bucket — The name of the S3 bucket to which the Amazon Inspector reports will be exported. The S3 bucket was created previously by the CloudFormation stack.
  • S3ReportFormat — The report format that Amazon Inspector will use to export the findings report; either the JSON or the CSV format is valid.
  • SnsTopic — The ARN of the SNS topic to which notifications will be sent. This SNS topic was created previously by the CloudFormation stack.
  • StateMachineArn — The ARN of the first Step Functions state machine, which the Lambda function will run first.
  • SubnetId — The ID of the VPC subnet to which the EC2 instance will be attached and launched into. This is a required parameter and could be a subnet that is created specifically for this scanning purpose.

The following is an example parameter configuration and JSON that you can use to run the Lambda function. Make sure to replace each <user input placeholder> with your own information.

{
    "AmiId" : "<AMI-ABCDEF01234567890>",
    "Ec2InstanceProfile" : "arn:aws:iam::<111122223333>:instance-profile/Ec2InstanceLaunchRole",
    "InstanceType" : "t3.medium",
    "KMSKeyName" : "arn:aws:kms:region-name:<111122223333>:key/<a1b2c3d4-5678-90ab-cdef-EXAMPLE11111>",
    "S3Bucket" : "<DOC-EXAMPLE-BUCKET-111122223333>",
    "S3ReportFormat" : "CSV",
    "SnsTopic" : "arn:aws:sns:region-name-2:<111122223333>:InspectorScanner",
    "StateMachine": "arn:aws:states:region-name:<111122223333>:stateMachine:AMIScanner-Part1-LaunchEC2",
    "SubnetId" : "<SUBNET-ABCDEF01234567890>"
}

After the first state machine is finished, the EventBridge rule listens for the successful Amazon Inspector scan event. An SNS notification is sent, similar to the following.


{"AWS Inspector AMI Scan status":"EC2 instance","For AMI":"ami-abcdef01234567890","Temporarily launched AMI using instance":"i-abcdef01234567890"}

After Amazon Inspector has finished scanning the EC2 instance, and the second state machine completes successfully, the Amazon Inspector finding report appears in the S3 bucket and notifications appear on the SNS topic that was created. The following is an example of an SNS notification.


{"AWS Inspector AMI Scan completed":"Successfully","For AMI":"ami-abcdef01234567890","AWS Inspector report located at S3 Bucket":"DOC-EXAMPLE-BUCKET-111122223333","Temporarily launched AMI using instance":"i-abcdef01234567890"}

Enable scheduled scanning

You can enable the EventBridge scheduled rule to handle multiple AMIs and automatic scheduling. The scheduled rule invokes a Lambda function on a scheduled basis that identifies AMIs with the appropriate tags and passes parameters to the Step Functions workflow.

To enable the rule

  • In the EventBridge rules console, navigate to AMIScanner-ScheduledSolutionTask, and choose Enable.
    Figure 2: Enable Amazon EventBridge scheduled rule

    Figure 2: Enable Amazon EventBridge scheduled rule

Extend the solution

With Amazon Athena, you can run SQL queries on raw data that is stored in S3 buckets. The Amazon Inspector reports are exported to S3, and you can query the data and create tables by using AWS Glue crawlers. To make sure that AWS Glue can crawl the S3 data, you need to add the role that you create for AWS Glue to the AWS KMS key permissions, so that AWS Glue can decrypt the S3 data. The following is an example key policy statement that you can update. Make sure to replace the AWS account ID <111122223333> with your own information.

{
    "Sid": "Allow the AWS Glue crawler usage of the KMS key",
    "Effect": "Allow",
    "Principal": {
        "AWS": "arn:aws:iam::<111122223333>:role/service-role/AWSGlueServiceRole-S3InspectorReports"
    },
    "Action": [
        "kms:Decrypt",
        "kms:GenerateDataKey*"
    ],
    "Resource": "*"
},

After an AWS Glue Data Catalog has been built, you can run the crawler on a scheduled basis to help keep the catalog up to date with the latest Amazon Inspector findings as they are exported into the S3 bucket.

Using Amazon Athena, you can run queries against the Amazon Inspector reports to generate output data that is relevant to your environment. For example, to list the AMIs that are affected by high-severity findings, you can run the following SQL query. Make sure to replace <DOC-EXAMPLE-BUCKET-111122223333> with your own information.

SELECT DISTINCT partition_0 from "<DOC-EXAMPLE-BUCKET-111122223333>" where severity='HIGH'

With the results, you can use AWS Systems Manager to update the relevant AMIs to include the latest patches and update the launch template used in your Auto Scaling groups.

To further extend this solution, you can also use Amazon QuickSight to visualize the data by connecting to the AWS Glue table and producing dashboards for consumption.

Conclusion

By performing security assessments of your AMIs on a regular basis, you can gain greater visibility and control over the security of your EC2 instances that are created from those AMIs. In this blog post, you learned how to set up AMI vulnerability assessments, and how the results of these continuous vulnerability assessments can help you keep your environment up to date with security patches. For additional hands-on walkthroughs for Amazon Inspector, see Amazon Inspector workshops. You can find the code for this blog post in the inspector-ami-scanning-solution GitHub repository.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Luke Notley

Luke is a Senior Solutions Architect with Amazon Web Services and is based in Western Australia. Luke has a passion for helping customers connect business outcomes with technology and assisting customers throughout their cloud journey, helping them design scalable, flexible, and resilient architectures. In his spare time, he enjoys traveling, coaching basketball teams, and DJing.

Mixing AWS Graviton with x86 CPUs to optimize cost and resiliency using Amazon EKS

Post Syndicated from Macey Neff original https://aws.amazon.com/blogs/compute/mixing-aws-graviton-with-x86-cpus-to-optimize-cost-and-resilience-using-amazon-eks/

This post is written by Yahav Biran, Principal SA, and Yuval Dovrat, Israel/Iberia Head Compute SAs.

Introduction

Deploying applications on a mixed CPU architecture allows for a wider selection of available Amazon Elastic Compute Cloud (Amazon EC2) capacity pools. This enhances your applications’ resiliency, optimizes your costs, and enables a seamless transition between architectures.

Using AWS Graviton-based Amazon EC2 instances can deliver up to 40% better price performance compared with x86-based Amazon EC2 instances. However, adopting a new CPU architecture can be challenging. The challenges include, but are not limited to, code compatibility, application performance, and adding new instance types to an active deployment. These challenges can affect time-to-market and cause service interruptions, both of which degrade the end-user experience.

In this blog, we will show you how to integrate AWS Graviton-based Amazon EC2 instances into an existing Amazon Elastic Kubernetes Service (Amazon EKS) environment. We will start with resources to help you with code migration to AWS Graviton-based instances. Then, we will use Karpenter, an open-source node provisioning project, to enable automatic compute provisioning; and AWS Load Balancer Controller to simplify the application’s CPU mix. Using a sample Python-based web application, we are going to build and deploy two multi-architecture configurations. Both configurations will use AWS Graviton-based Amazon EC2 instances and x86-based Amazon EC2 instances.

The first configuration (shown in Figure 1 below) is the “A/B Configuration”. Here, we set the ALB ingress to manually control the amount of traffic flowing to each instance type, and we set Karpenter to provision nodes based on traffic demand. We will use this configuration to help us test, measure, and compare the application’s performance on both CPU types before we deploy the multi-architecture solution to the end users.

A/B Configuration: Mix CPU architecture with ALB ingress connected to two separate CPU services

Figure 1, config I: A/B Configuration

In our second configuration, the “Karpenter Controlled Configuration” (shown in Figure 2 below as Config II), we allow Karpenter to automatically control the instance blend. These Karpenter settings optimize availability and costs: Karpenter is set to use weighted provisioners with values that prioritize AWS Graviton-based Amazon EC2 instances over x86-based Amazon EC2 instances.

Karpenter Controlled Configuration: Mix CPU architecture with ALB ingress connected to a single service

Figure 2, config II:  Karpenter Controlled Configuration, with Weighting provisioners topology

From our experience, we recommend that you build the first configuration to experiment with and test AWS Graviton-based Amazon EC2 instances in a multi-architecture environment. Once you have determined that a multi-architecture solution is suitable for your applications, you can build the second configuration to realize the full potential of Karpenter.

Code Migration to AWS Graviton

We start the migration process by opting in to AWS Compute Optimizer. Compute Optimizer helps you find the workloads that will deliver the biggest returns for the smallest migration effort.

Once we find these workloads, we will start migrating our code from x86-based Amazon EC2 instances to AWS Graviton-based Amazon EC2 instances. AWS has multiple resources to help you migrate your code. These include AWS Graviton Fast Start Program, AWS Graviton Technical Guide GitHub Repository, AWS Graviton Transition Guide, and Porting Advisor for Graviton.

From our experience in migrating open-source or self-managed applications, we learned that the migration requires developers to assess their code and the accompanying packages and libraries. To accelerate the migration process, we recommend using Porting Advisor for Graviton. In addition to Porting Advisor for Graviton, AWS offers programming language-specific considerations and guidance for platforms such as PyTorch and TensorFlow.

After completing the code migration, we will now recompile the code for Arm architecture so that it is compatible with AWS Graviton-based Amazon EC2 instances. Then we will create a Docker image package followed by a Docker manifest.

The Docker manifest needs to point to the relevant Docker image for the target CPU architecture. We will show you how to modify your existing continuous integration pipeline with Docker multi-platform images. This will allow your Docker image to run on any CPU with your continuous delivery system.

To simplify the process, we recommend avoiding platform-specific software package settings such as amd64 and aarch64.

For example, use:

RUN pip install awscli

instead of:

# ARCH can be aarch64 or x86_64
ARG ARCH=aarch64
RUN curl "https://awscli.amazonaws.com/awscli-exe-linux-${ARCH}.zip" -o "awscliv2.zip"
RUN unzip awscliv2.zip
RUN ./aws/install

To automate the Docker image build process, we use AWS CodePipeline. AWS CodePipeline starts by building a Docker image from the code in AWS CodeBuild that is pushed to Amazon Elastic Container Registry (Amazon ECR). You can also use Amazon CodeCatalyst, as described in this blog post, to achieve the same task.

Our current build process creates Docker images for x86-based Amazon EC2 instances. We will add an additional build process to create Docker images for AWS Graviton-based Amazon EC2 instances (Figure 3). This new build process creates a Docker manifest that references both Docker images. The Docker client (the K8s node that hosts the container app) pulls the platform-specific image at deploy time.
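
As a minimal sketch (the repository name and tags below are hypothetical, and both architecture-specific images must already exist in Amazon ECR), the manifest can be created and pushed with the Docker CLI:

REPO=<111122223333>.dkr.ecr.<region-name>.amazonaws.com/simplemultiarchapp
# Create a manifest list that references the two architecture-specific images
docker manifest create $REPO:latest $REPO:amd64 $REPO:arm64
docker manifest push $REPO:latest

When a node pulls $REPO:latest, the registry serves the image that matches the node’s CPU architecture.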

Application build process where CodePipeline splits into two CPU builds before both builds are pushed into a single ECR

Figure 3 Application build process

Application Deployment

As stated earlier, we are going to build and deploy two multi-architecture configurations:

  1. The A/B Configuration with controlled topology
  2. The Karpenter Controlled Configuration with a weighting provisioners topology that favors AWS Graviton-based Amazon EC2 instances while allowing x86-based Amazon EC2 instances for resiliency

Config I – A/B controlled topology

We use this topology to measure performance for each CPU variant (though it can be utilized for other use-cases like gradual migrations). As shown in Figure 1, our topology has a single endpoint that equally splits the load between an AWS Graviton-based Amazon EC2 instances target group, and an x86-based Amazon EC2 instances target group.

After we create the topology, we will add Amazon CloudWatch Container Insights on Amazon EKS. This allows us to measure application performance on AWS Graviton-based Amazon EC2 instances and x86-based Amazon EC2 instances. We use two dedicated Karpenter provisioners and K8s deployments (replica sets) to separate AWS Graviton-based Amazon EC2 instances nodes from x86-based Amazon EC2 instances nodes. We also use nodeSelector to delineate AWS Graviton-based pods from x86-based pods.

We limit each provisioner to a single processor architecture, using the kubernetes.io/arch key:

The x86-based Amazon EC2 instances Karpenter provisioner:

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: x86provisioner
spec:
  requirements:
  - key: karpenter.k8s.aws/instance-generation
    operator: In
    values:
    - "6"
  - key: karpenter.k8s.aws/instance-cpu
    operator: In 
    values:
    - "2"
  - key: karpenter.sh/capacity-type
    operator: In
    values: 
    - on-demand
  - key: kubernetes.io/arch
    operator: In
    values:
    - amd64
  limits:
    resources:
      cpu: 1000
  providerRef:
    name: default
  ttlSecondsAfterEmpty: 30 

The AWS Graviton-based Amazon EC2 instances Karpenter provisioner:

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: arm64provisioner
spec:
  requirements:
  - key: karpenter.k8s.aws/instance-generation
    operator: In
    values:
    - "7"
  - key: karpenter.k8s.aws/instance-cpu
    operator: In
    values:
    - "2"
  - key: karpenter.sh/capacity-type
    operator: In
    values: 
    - on-demand
  - key: kubernetes.io/arch
    operator: In
    values:
    - arm64
  limits:
    resources:
      cpu: 1000
  providerRef:
    name: default
  ttlSecondsAfterEmpty: 30

Note: We want to use similar machine types for an accurate comparison. To do so, we requested two vCPUs from comparable same-generation instances. We will remove these constraints in the next topology to allow more allocation flexibility.

Next, we will deploy two K8s Deployments that use nodeSelector to instruct the scheduler to place AWS Graviton pods on AWS Graviton-based Amazon EC2 instances, and x86 pods on x86-based Amazon EC2 instances (a minimal example follows). Similarly, we will create two K8s NodePort Services and a K8s Ingress that routes HTTP requests from the load balancer endpoint to the services.
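
The following is a minimal sketch of the Graviton-side Deployment (the application name, image, and label values are illustrative); the x86 variant is identical except that its nodeSelector requests amd64:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: armsimplemultiarchapp
spec:
  replicas: 2
  selector:
    matchLabels:
      app: armsimplemultiarchapp
  template:
    metadata:
      labels:
        app: armsimplemultiarchapp
    spec:
      nodeSelector:
        kubernetes.io/arch: arm64 # pin these pods to Graviton (arm64) nodes
      containers:
      - name: app
        image: <111122223333>.dkr.ecr.<region-name>.amazonaws.com/simplemultiarchapp:latest
        ports:
        - containerPort: 80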

Afterwards, we add an Application Load Balancer with the AWS Load Balancer Controller to spread the load among the AWS Graviton-based Amazon EC2 instances and the x86-based Amazon EC2 instances compute pools. We control the traffic routing with the alb.ingress.kubernetes.io/actions.weighted-routing annotation. In the code snippet below, we split the traffic 50:50; you can tweak the weights for your specific needs.

…
alb.ingress.kubernetes.io/actions.weighted-routing: |
  {
    "type": "forward",
    "forwardConfig": {
      "targetGroups": [
        {
          "serviceName": "armsimplemultiarchapp-svc",
          "servicePort": "80",
          "weight": 50
        },
        {
          "serviceName": "amdsimplemultiarchapp-svc",
          "servicePort": "80",
          "weight": 50
        }
      ]
    }
  }

spec:
  ingressClassName: alb
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: weighted-routing
                port:
                  name: use-annotation

The last step is to generate the test load. We created a loader app that issues requests to our example application and pointed it at our ALB endpoint. We use a sinusoidal load pattern to simulate variable load and measure performance. Amazon CloudWatch populates the ALB target group requests-per-second metric per HTTP response code, which we use to assess application throughput and CPU usage.

While analyzing our results, we want to verify that:

  1. Both the AWS Graviton-based Amazon EC2 instance pods and the x86-based Amazon EC2 instance pods are processing the same amount of traffic.
  2. The AWS Graviton-based Amazon EC2 instance pods are using less CPU to perform the task.

The orange (Graviton) and blue (x86) curves of HTTP 2XXs – USE1 (Figure 4) overlap, which indicates that both sets of pods processed the same number of HTTP requests per second.

Two curves with different colors overlapping each other, indicating both Graviton and x86-based pods are processing the same amount of HTTP requests/seconds

Figure 4 HTTP 2XXs – USE1

Also, the red (Graviton) and purple (x86) curves in Network RX – USE1 (Figure 5) and Network TX – USE1 (Figure 6) overlap, which indicates that both the Graviton and the x86 instances ingest and process the same amount of data:

Two curves with different colors overlapping each other, indicating both Graviton and x86-based pods are ingesting and processing the same amount of network RX

Figure 5 Network RX – USE1

Two curves with different colors overlapping each other, indicating both Graviton and x86-based pods are ingesting and processing the same amount of network TX

Figure 6 Network TX – USE1

We then measured the CPU usage (Pod CPU Utilization – USE1, figure 7) for each application variant: AWS Graviton in green and x86 in orange. We observed a higher CPU utilization on x86-based Amazon EC2 instances (83%), compared to AWS Graviton-based Amazon EC2 instances (61%).

Two curves with different colors to indicate CPU utilization of each pod. The Graviton-based pods have an average of 61% CPU utilization while the x86-based pods have an average of 83% CPU utilization.

Figure 7 Pod CPU Utilization – USE1

Based on these results, we found that pods on AWS Graviton-based Amazon EC2 instances require 36% fewer EC2 instances than pods on x86-based Amazon EC2 instances for the same application throughput. Fewer instances can lead not only to cost reductions, but also to a reduced carbon footprint for the workload.

Config II – Karpenter Controlled Configuration

Once you have determined that AWS Graviton-based Amazon EC2 instance performance is suitable for your workload, it is time to simplify the application deployment topology with Karpenter weighted provisioners. We reuse the CPU-based provisioners from the previous topology, but assign a higher .spec.weight to the AWS Graviton-based Amazon EC2 instances provisioner spec. This results in Karpenter prioritizing AWS Graviton-based instances, while preserving the ability to launch x86-based instances as needed.

The x86-based Amazon EC2 instances Karpenter provisioner:

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: x86provisioner
spec:
  requirements:
  - key: karpenter.sh/capacity-type
    operator: In
    values: 
    - on-demand
  - key: kubernetes.io/arch
    operator: In
    values:
    - amd64
  providerRef:
    name: default
  ttlSecondsAfterEmpty: 30 

The AWS Graviton-based Amazon EC2 instances Karpenter provisioner:

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: priority-arm64provisioner
spec:
  weight: 10
  requirements:
  - key: karpenter.sh/capacity-type
    operator: In
    values: 
    - on-demand
  - key: kubernetes.io/arch
    operator: In
    values:
    - arm64
  providerRef:
    name: default
  ttlSecondsAfterEmpty: 30

The above configuration lets Karpenter choose any instance size or type. For optimal cluster utilization, we recommend omitting smaller instances by adding the following to the provisioner configurations:

  - key: karpenter.k8s.aws/instance-size
    operator: NotIn
    values:
    - nano
    - micro
    - small
    - medium

Next, we will unify the two K8s deployments (Figure 1) into a single K8s deployment by removing the manual mapping between pods and CPU-specific instances. To do so, we remove the nodeSelector section of the deployments’ configuration, which yields a single deployment (Figure 2).

In addition, we consolidate the two services into a single K8s service (single ALB target group).

To do so, we remove the actions.weighted-routing annotation and create a single target group with the following config:

spec:
  rules:
    - http:
        paths:
          - path: /app
            pathType: Prefix
            backend:
              service:
                name: simplemultiarchapp-svc
                port:
                  number: 80

Now your application can use both CPU architectures. Karpenter will scale your infrastructure based on your application’s demand by using AWS Graviton-based Amazon EC2 instances and x86-based Amazon EC2 instances.

We simulated load on the weighted provisioner topology (Figure 8) to show how the Karpenter prioritization works. We configured a synthetic limit on the AWS Graviton-based Amazon EC2 instances’ CPU capacity (Provisioner.limits.resources.cpu) and used the same load simulator from the first configuration. We measured the total application throughput (Figure 8), denoted by aarch64_200 (blue) and x86_64_200 (orange). We counted the requests and observed that Karpenter behaved as expected: prioritizing AWS Graviton-based Amazon EC2 instances, and bursting to x86-based Amazon EC2 instances when we cross the CPU limit.

Two curves with different colors to indicate instance count. The curves show Graviton as baseline and bursting with x86.

Figure 8 Config II – Weighting provisioners results

In the weighted provisioner experiment (Figure 2), we showed how we set a target of 100% Graviton usage for cost optimization while allowing our workload to fall back to x86-based instances when necessary.

Conclusion

Deploying applications on a mixed CPU architecture enhances your applications’ resiliency, optimizes your costs, and enables a seamless transition between architectures. This blog provided you with a process to test and build a multi-architecture solution using AWS Graviton-based Amazon EC2 instances. We discussed resources to help you migrate to AWS Graviton-based Amazon EC2 instances and offered two multi-architecture configurations: one to help you evaluate AWS Graviton-based Amazon EC2 instances, and another for a production environment that includes x86-based Amazon EC2 instances. You can reuse these configurations for any AWS Graviton-based Amazon EC2 instances, and for other instance types and sizes. Finally, we also include an open-sourced step-by-step guide on GitHub for you to try the example app deployment.

Validating attestation documents produced by AWS Nitro Enclaves

Post Syndicated from maceneff original https://aws.amazon.com/blogs/compute/validating-attestation-documents-produced-by-aws-nitro-enclaves/

This blog post is written by Paco Gonzalez, Senior EMEA IoT Specialist SA.

AWS Nitro Enclaves offers an isolated, hardened, and highly constrained environment to host security-critical applications. Think of AWS Nitro Enclaves as regular Amazon Elastic Compute Cloud (Amazon EC2) virtual machines (VMs) but with the added benefit of the environment being highly constrained.

A great benefit of using AWS Nitro Enclaves is that you can run your software as if it were a regular EC2 instance, but with no persistent storage and limited access to external systems. The only way to communicate with AWS Nitro Enclaves is using a VSOCK socket. This special type of communication mechanism acts as an isolated communication channel between the parent EC2 instance and AWS Nitro Enclaves.

Diagram that shows how Nitro Enclaves uses the proven isolation of the Nitro Hypervisor to further isolate the CPU and memory of the Nitro Enclaves from users, applications, and libraries on the parent instance.

 Fig 1 – AWS Nitro Enclaves uses the proven isolation of the Nitro Hypervisor to further isolate the CPU and memory of the Nitro Enclaves from users, applications, and libraries on the parent instance.
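
Applications on the parent instance exchange data with the enclave over this VSOCK channel. The following is a minimal parent-side client sketch (the CID and port are hypothetical; the enclave's CID is assigned when the enclave starts):

#include <stdio.h>
#include <sys/socket.h>
#include <linux/vm_sockets.h>

int main(void) {
    int fd = socket(AF_VSOCK, SOCK_STREAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    struct sockaddr_vm addr = {0};
    addr.svm_family = AF_VSOCK;
    addr.svm_cid = 16;    /* hypothetical enclave CID */
    addr.svm_port = 5000; /* hypothetical port the enclave listens on */

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("connect"); return 1;
    }
    /* ... send requests to, and read responses from, the enclave ... */
    return 0;
}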

AWS Nitro Enclaves comes with a custom Linux device called the Nitro Security Module (NSM), which is accessible via /dev/nsm. This device provides attestation capability to the Nitro Enclaves. The attestation comes in the form of an attestation document. The attestation document makes it easy and safe to build trust between systems that interact with the Nitro Enclaves. The external system must have a mechanism to process the attestation document to determine the validity of the attestation document.

In this post, I go through the anatomy of an attestation document produced by the NSM API. I then show you an example of how to perform different validations that help determine the accuracy of an attestation document produced by the AWS Nitro Enclaves Security Module. I use syntactic and semantic validations to check the attestation document’s correctness before proceeding with a cryptographic validation of the contents of the document’s payload. The examples in this post use the C language. See the companion repository available on GitHub for access to all of the source code used in this post.

Anatomy of an attestation document produced by AWS Nitro Enclaves

The attestation document uses the Concise Binary Object Representation (CBOR) format to encode the data. The CBOR object is wrapped using the CBOR Object Signing and Encryption (COSE) protocol. The COSE format used is a single-signer data structure called “COSE_Sign1”. The object is comprised of headers, the payload, and a signature.
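
Conceptually, in CDDL-style notation based on RFC 8152 (the comments describe how AWS Nitro Enclaves populates each field), a COSE_Sign1 object looks like this:

COSE_Sign1 = [
    protected   : bstr, ; serialized CBOR map of protected headers (signing algorithm)
    unprotected : {},   ; empty map - not used by AWS Nitro Enclaves
    payload     : bstr, ; CBOR-encoded attestation document
    signature   : bstr  ; 96-byte ECDSA P-384 signature
]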

For more information about COSE, see RFC 8152: CBOR Object Signing and Encryption (COSE). For more information about CBOR, see RFC 8949 Concise Binary Object Representation (CBOR).

We published a library to make it easy to interact with the NSM. The library contains helpers which your application, running on the Nitro Enclaves, can use to communicate with the NSM device.

Here is the minimum code needed to generate an attestation document:

#include <stdlib.h>
#include <stdio.h>
#include <nsm.h>

#define NSM_MAX_ATTESTATION_DOC_SIZE (16 * 1024)

int main(void) {

    /// NSM library initialization function.  
    /// *Returns*: A descriptor for the opened device file.

    int nsm_fd = nsm_lib_init();
    if (nsm_fd < 0) {
        exit(1);
    }

    /// NSM `GetAttestationDoc` operation for non-Rust callers.  
    /// *Argument 1 (input)*: The descriptor to the NSM device file.  
    /// *Argument 2 (input)*: User data.  
    /// *Argument 3 (input)*: The size of the user data buffer.  
    /// *Argument 4 (input)*: Nonce data.  
    /// *Argument 5 (input)*: The size of the nonce data buffer.  
    /// *Argument 6 (input)*: Public key data.  
    /// *Argument 7 (input)*: The size of the public key data buffer.  
    /// *Argument 8 (output)*: The obtained attestation document.  
    /// *Argument 9 (input / output)*: The document buffer capacity (as input)
    /// and the size of the received document (as output).  
    /// *Returns*: The status of the operation.

    int status;
    uint8_t att_doc_buff[NSM_MAX_ATTESTATION_DOC_SIZE];
    uint32_t att_doc_cap_and_size = NSM_MAX_ATTESTATION_DOC_SIZE;

    status = nsm_get_attestation_doc(nsm_fd, NULL, 0, NULL, 0, NULL, 0, att_doc_buff, 
                                    &att_doc_cap_and_size);
    if (status != ERROR_CODE_SUCCESS) {
        printf("[Error] Request::Attestation got invalid response: %s\n",status);
        exit(1);
    }

    printf("########## attestation_document_buff ##########\r\n");
    for(int i=0; i<att_doc_cap_and_size; i++)
        fprintf(stdout, "%02X", att_doc_buff[i]);

    exit(0);
}

To produce a sample attestation document, initialize the device, call the function ‘nsm_get_attestation_doc’ inside the AWS Nitro Enclaves, and dump the contents. The library is written using Rust, but it contains bindings for C. You can read more about the library and some of the other relevant capabilities here.

The COSE headers contain a protected and an un-protected data section. The cryptographic algorithm used for the signature is specified inside the protected area. AWS Nitro Enclaves use a 384-bit elliptic curve algorithm (P-384) to sign attestation documents. AWS Nitro Enclaves do not use the unprotected data field so it is always left blank.

The payload contains the following fixed parameters:

  • Information about the issuing NSM.
  • A timestamp of the issuing event.
  • A map of all the locked Platform Configuration Registers (PCRs) at the moment the attestation document was generated.
  • The hashing algorithm used to produce the digest from which the PCR values were calculated – AWS Nitro Enclaves use a 384-bit secure hashing algorithm (SHA384).
  • An x509 certificate signed by the AWS Nitro Enclaves’ Private Public Key Infrastructure (PKI). An AWS Nitro Enclaves certificate expires three hours after it has been issued. The common name (CN) contains information about the issuing NSM.
  • The issuing Certificate Authority (CA) bundle.

The payload also contains optional parameters that a third-party application can use to create custom authentication and authorization workflows: a public key, a cryptographic nonce, and additional arbitrary data.
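
Decoded into CBOR diagnostic-style notation, the payload map uses the following keys (values abbreviated here; the field names follow the ‘Verifying the root of trust’ section of the AWS Nitro Enclaves User Guide):

{
    "module_id": "...",         ; ID of the issuing NSM
    "digest": "SHA384",         ; digest algorithm used for the PCR values
    "timestamp": 1234567890123, ; issuing time, in milliseconds
    "pcrs": { 0: h'...', ... }, ; map of locked PCR values
    "certificate": h'...',      ; DER-encoded certificate signed by the AWS Nitro Enclaves PKI
    "cabundle": [ h'...' ],     ; issuing CA bundle
    "public_key": h'...',       ; optional user-supplied public key
    "user_data": h'...',        ; optional additional arbitrary data
    "nonce": h'...'             ; optional cryptographic nonce
}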

Finally, the signature is the result of a signing operation using the private key related to the public key contained inside the certificate that is part of the payload.

Diagram that illustrates the components of a attestation document produced by a Nitro Enclave

Fig 2. An attestation document is generated and signed by the Nitro Hypervisor. It contains information about the Nitro Enclaves and it can be used by an external service to verify the identity of Nitro Enclaves and to establish trust. You can use the attestation document to build your own cryptographic attestation mechanisms.

Syntactical validation

Early validation of the attestation document format makes sure that only documents that conform to the expected structure are processed in subsequent steps.

I start by attempting to decode the CBOR object and testing whether it corresponds to a COSE object signed with one signer, or ‘COSE_Sign1’, structure. This can be done by looking at the three most significant bits (MSB) of the first byte – I am expecting a CBOR tag (major type 6). Then, I take the remaining five least significant bits (LSB) of the first byte – I am expecting a tag number that tells me it is a COSE_Sign1 object (decimal 18).

assert(att_doc_buff[0] == (6 << 5 | 18)); // 0xD2

Note that, at the time of writing, the NSM does not include the COSE tag, so this validation cannot be made; it is mentioned in this post for informational purposes only. However, it is important to keep this in mind, as the tag is part of the standard, and the NSM device or library could include it in the future.

The next step is to parse the actual CBOR object. A COSE_Sign1 object is an array of size 4 (protected headers, unprotected headers, payload, and signature). Therefore, I must check that the next three MSB correspond to Type 4 (array) and that the size is exactly 4.

assert(att_doc_buff[0] == (4 << 5 | 4)); // 0x84

The next byte determines what the first CBOR item of the array looks like. I am expecting the protected COSE header as the first item of the array. The CBOR field should indicate that the contents of the item are of a Type 2 (raw bytes) and the size should be exactly 4.

assert(att_doc_buff[1] == (2 << 5 | 4)); // 0x44

The next four bytes represent the protected header. The contents of this item are a regular CBOR object. The object should contain a Type 5 (map) with a single item. The item’s first key is expected to be the number 1. The first three MSB of the next byte should be a Type 1 (negative integer). The remaining five LSB should indicate that the value is an 8-bit number (decimal 24). The last byte should be negative 35, as it maps to the P-384 curve that AWS Nitro Enclaves use. Note that CBOR negative numbers are stored as their absolute value minus 1.

assert(att_doc_buff[2] == (5 << 5 | 1)); // 0xA1
assert(att_doc_buff[3] == 0x01); // 0x01
assert(att_doc_buff[4] == (1 << 5 | 24)); // 0x38
assert(att_doc_buff[5] == 35-1); // 0x22

The next byte corresponds to the unprotected header. AWS Nitro Enclaves do not use unprotected headers, so the expected value is a Type 5 (map) with zero items.

assert(att_doc_buff[6] == (5 << 5 | 0)); // 0xA0

Now that I am done inspecting the headers, I can move on to the payload. The CBOR object used for the payload is Type 2 (raw bytes). This time we are expecting a large stream of bytes. The remaining five LSB indicate the width of the length field that encodes the size of the byte stream (8-bit, 16-bit, and so on). AWS Nitro Enclaves attestation documents are about 5 KiB without using any of the three optional parameters. The optional parameters have a size limit of 1 KiB each. This means that it would be highly unlikely for the buffer to be larger than a 16-bit number (CBOR short count: 25).

assert(att_doc_buff[7] == (2 << 5 | 25)); // 0x59

The next two bytes represent the size of the payload, which I am going to skip for now, as the contents of the payload are validated in subsequent steps. I’ll move on to the final portion of the attestation document: the signature. The signature has to be a Type 2 (raw bytes) of exactly 96 bytes.

    uint16_t payload_size = att_doc_buff[8] << 8 | att_doc_buff[9];
    assert(att_doc_buff[9+payload_size+1] == (2<<5 | 24));   // 0x58
    assert(att_doc_buff[9+payload_size+1+1] == 96);         // 0x60

At this point, I have validated that the data produced by the NSM looks the way it should. My application is ready to start looking into the contents of the attestation document.

I want to make sure that the document contains all mandatory fields and I can check that the fields have the right structure and their sizes are within the expected boundaries. I have evidence that the data looks the way it should, so I am ready to use an off-the-shelf CBOR library to make the validation process easier instead of doing it by hand.

Here is an example of how to load a CBOR object using libcbor and standard C libraries to check the contents. I am showing just one example to illustrate the process. Refer to the section ‘Verifying the root of trust’ in the AWS Nitro Enclaves User Guide for a detailed description of each parameter and the validations that your application should perform to make sure that the document is valid.

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <assert.h>

#include <cbor.h>
#include <openssl/ssl.h>

#define APP_X509_BUFF_LEN                   (1024*2)
#define APP_ATTDOC_BUFF_LEN                 (1024*10)

void output_handler(char * msg){
    fprintf(stdout, "\r\n%s\r\n", msg);
}

void output_handler_bytes(uint8_t * buffer, int buffer_size){    
    for(int i=0; i<buffer_size; i++)
        fprintf(stdout, "%02X", buffer[i]);
    fprintf(stdout, "\r\n");
}

int read_file( unsigned char * file, char * file_name, size_t elements) {
    FILE * fp; size_t file_len = 0;
    fp = fopen(file_name, "r");
    file_len = fread(file, sizeof(char), elements, fp);
    if (ferror(fp) != 0 ) {
        fputs("Error reading file", stderr);
    } 
    fclose(fp);
    return file_len; 
}  

int main(int argc, char* argv[]) {

    // STEP 0 - LOAD ATTESTATION DOCUMENT

    // Check inputs, expect two
    if (argc != 3) { 
        fprintf(stderr, "%s\r\n", "ERROR: usage: ./main {att_doc_sample.bin} {AWS_NitroEnclaves_Root-G1.pem}"); exit(1);
    }

    // Load file into buffer, use 1st argument
    unsigned char * att_doc_buff = malloc(APP_ATTDOC_BUFF_LEN);
    int att_doc_len = read_file(att_doc_buff, argv[1], APP_ATTDOC_BUFF_LEN );

    // STEP 1 - SYNTACTIC VALIDATION

    // Check COSE TAG (skipping - not currently implemented by AWS Nitro Enclaves)
    // assert(att_doc_buff[0] == 6 <<5 | 18); // 0xD2
    // Check if this is an array of exactly 4 items
    assert(att_doc_buff[0] == (4<<5 | 4));      // 0x84
    // Check if next item is a byte stream of 4 bytes
    assert(att_doc_buff[1] == (2<<5 | 4));      // 0x44
    // Check if the first item of the byte stream is a map with 1 item
    assert(att_doc_buff[2] == (5<<5 | 1));      // 0xA1
    // Check that the first key of the map is 0x01
    assert(att_doc_buff[3] == 0x01);            // 0x01
    // Check that the value of the first key of the map is -35 (P-384 curve)
    assert(att_doc_buff[4] == (1 <<5 | 24));    // 0x38
    assert(att_doc_buff[5] == 35-1);            // 0x22
    // Check that next item is a map of 0 items
    assert(att_doc_buff[6] == (5<<5 | 0));      // 0xA0
    // Check that the next item is a byte stream and the size is a 16-bit number (dec. 25)
    assert(att_doc_buff[7] == (2<<5 | 25));     // 0x59
    // Cast the 16-bit number
    uint16_t payload_size = att_doc_buff[8] << 8 | att_doc_buff[9];
    // Check that the item after the payload is a byte stream and the size is 8-bit number (dec. 24)
    assert(att_doc_buff[9+payload_size+1] == (2<<5 | 24));   // 0x58
    // Check that the size of the signature is exactly 96 bytes
    assert(att_doc_buff[9+payload_size+1+1] == 96);         // 0x60

    // Parse buffer using library
    struct cbor_load_result ad_result;
    cbor_item_t * ad_item = cbor_load(att_doc_buff, att_doc_len, &ad_result);
    free(att_doc_buff); // not needed anymore

    // Parse protected header -> item 0 
    cbor_item_t * ad_pheader = cbor_array_get(ad_item, 0); 
    size_t ad_pheader_len = cbor_bytestring_length(ad_pheader);

    // Parse signed bytes -> item 2 (skip un-protected headers as they are always empty)
    cbor_item_t * ad_signed = cbor_array_get(ad_item, 2);
    size_t ad_signed_len = cbor_bytestring_length(ad_signed);

    // Load signed bytes as a new CBOR object
    unsigned char * ad_signed_d = cbor_bytestring_handle(ad_signed);
    struct cbor_load_result ad_signed_result;
    cbor_item_t * ad_signed_item = cbor_load(ad_signed_d, ad_signed_len, &ad_signed_result);

    // Create the pair structure
    struct cbor_pair * ad_signed_item_pairs = cbor_map_handle(ad_signed_item);

    // Parse signature -> item 3
    cbor_item_t * ad_sig = cbor_array_get(ad_item, 3); 
    size_t ad_sig_len = cbor_bytestring_length(ad_sig);
    unsigned char * ad_sig_d = cbor_bytestring_handle(ad_sig);

    // Example 01: Check that the first item's key is the string "module_id" and that is not empty
    size_t module_k_len = cbor_string_length(ad_signed_item_pairs[0].key);
    unsigned char * module_k_str = realloc(cbor_string_handle(ad_signed_item_pairs[0].key), module_k_len+1); //null char
    module_k_str[module_k_len] = '\0';
    size_t module_v_len = cbor_string_length(ad_signed_item_pairs[0].value);
    unsigned char * module_v_str = realloc(cbor_string_handle(ad_signed_item_pairs[0].value), module_v_len+1); //null char
    module_v_str[module_v_len] = '\0';
    assert(module_k_len != 0);
    assert(module_v_len != 0);

    // Example 02: Check that the module id key is actually the string "module_id"
    assert(!strcmp("module_id",(const char *)module_k_str));

    // Example 03: Check that the signature is exactly 96 bytes long
    assert(ad_sig_len == 96);

    // Example 04: Check that the protected header is exactly 4 bytes long
    assert(ad_pheader_len == 4);

Semantic validation

The next step is to look at the data contained in the attestation document and check that it conforms to pre-defined business rules. The attestation document contains a certificate that was signed by the AWS Nitro Enclaves’ PKI. This validation is important, as it proves that the document was signed by the AWS Nitro Enclaves’ PKI.

The signature of an x509 certificate is based on the certificate’s payload digest. Validating this signature means that I trust the information contained within the certificate, including the public key which I can later use to validate the attestation document itself. Furthermore, the information in the document contains details about the NSM module and a timestamp. Passing this check provides the assurances I need to trust that the document originated from my software running on AWS Nitro Enclaves at a specific time.

Diagram that illustrates the components of an x.509 certificate, part of the payload of an attestation document produced by AWS Nitro Enclaves.

Fig 3. The attestation document contains a x.509 certificate that was signed by the AWS Nitro Enclaves’ PKI.

Here is an example of how I load the AWS Nitro Enclaves’ Private PKI root certificate from an external file and then use the CA bundle contained in the attestation document to validate the authenticity of the certificate contained in the document. In this example, I am using the OpenSSL library.

// STEP 2 -  SEMANTIC VALIDATION

    // Load AWS Nitro Enclave's Private PKI root certificate
    unsigned char * x509_root_ca = malloc(APP_X509_BUFF_LEN);
    int x509_root_ca_len = read_file(x509_root_ca, argv[2], APP_X509_BUFF_LEN );
    BIO * bio = BIO_new_mem_buf((void*)x509_root_ca, x509_root_ca_len);
    X509 * caX509 = PEM_read_bio_X509(bio, NULL, NULL, NULL);
    if (caX509 == NULL) {
        fprintf(stderr, "%s\r\n", "ERROR: PEM_read_bio_X509 failed"); exit(1);
    }
    free(x509_root_ca); BIO_free(bio);
    // Create CA_STORE
    X509_STORE * ca_store = NULL;
    ca_store = X509_STORE_new();
    /* ADD X509_V_FLAG_NO_CHECK_TIME FOR TESTING! TODO REMOVE */
    X509_STORE_set_flags (ca_store, X509_V_FLAG_NO_CHECK_TIME);
    if (X509_STORE_add_cert(ca_store, caX509) != 1) {
        fprintf(stderr, "%s\r\n", "ERROR: X509_STORE_add_cert failed"); exit(1);
    }
    // Add certificates to CA_STORE from cabundle
    // Skip the first one [0] as that is the Root CA and we want to read it from an external source
    for (int i = 1; i < cbor_array_size(ad_signed_item_pairs[5].value); ++i){ 
        cbor_item_t * ad_cabundle = cbor_array_get(ad_signed_item_pairs[5].value, i); 
        size_t ad_cabundle_len = cbor_bytestring_length(ad_cabundle);
        unsigned char * ad_cabundle_d = cbor_bytestring_handle(ad_cabundle);
        X509 * cabnX509 = X509_new();
        cabnX509 = d2i_X509(&cabnX509, (const unsigned char **)&ad_cabundle_d, ad_cabundle_len);
        if (cabnX509 == NULL) {
            fprintf(stderr, "%s\r\n", "ERROR: d2i_X509 failed"); exit(1);
        }
        if (X509_STORE_add_cert(ca_store, cabnX509) != 1) {
            fprintf(stderr, "%s\r\n", "ERROR: X509_STORE_add_cert failed"); exit(1);
        }
    }

    // Load certificate from attestation document - this is a certificate that we don't trust (yet)
    size_t ad_signed_cert_len = cbor_bytestring_length(ad_signed_item_pairs[4].value);
    unsigned char * ad_signed_cert_d = realloc(cbor_bytestring_handle(ad_signed_item_pairs[4].value), ad_signed_cert_len);
    X509 * pX509 = X509_new();
    pX509 = d2i_X509(&pX509, (const unsigned char **)&ad_signed_cert_d, ad_signed_cert_len);
    if (pX509 == NULL) {
        fprintf(stderr, "%s\r\n", "ERROR: d2i_X509 failed"); exit(1);
    }
    // Initialize X509 store context and verify the untrusted certificate
    STACK_OF(X509) * ca_stack = NULL;
    X509_STORE_CTX * store_ctx = X509_STORE_CTX_new();
    if (X509_STORE_CTX_init(store_ctx, ca_store, pX509, ca_stack) != 1) {
        fprintf(stderr, "%s\r\n", "ERROR: X509_STORE_CTX_init failed"); exit(1);
    }
    if (X509_verify_cert(store_ctx) != 1) {
        fprintf(stderr, "%s\r\n", "ERROR: X509_verify_cert failed"); exit(1);
    }
    fprintf(stdout, "%s\r\n", "OK: ########## Root of Trust Verified! ##########");

Having proof that the certificate was signed by the expected CA is just the beginning. I also want to make sure that the contents of the certificate are correct. This involves checking that the certificate has not expired, as well as making sure that the critical extensions contain correct information to name a few.
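
As a minimal sketch of the expiry check (assuming OpenSSL 1.1.0 or later, and reusing the pX509 handle from the previous listing), you could reject a certificate outside its validity window like this:

    // X509_cmp_current_time() returns < 0 if the ASN1_TIME is in the past
    // and > 0 if it is in the future
    if (X509_cmp_current_time(X509_get0_notBefore(pX509)) >= 0 ||
        X509_cmp_current_time(X509_get0_notAfter(pX509)) <= 0) {
        fprintf(stderr, "%s\r\n", "ERROR: certificate is not within its validity window"); exit(1);
    }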

Cryptographic validation

The syntactic validation helped me determine that the attestation document has the right shape, and the semantic validation helped me determine whether the document meets my business rules. However, I still don’t know for sure if the document is valid.

The attestation document contains critical information, such as the PCRs and the AWS Identity and Access Management (IAM) role, among other details. I can safely use these two values in my authentication or authorization workflows if I can prove that they are trustworthy.

The attestation document was signed using a private key that is never exposed. However, the corresponding public key is contained within the certificate that was issued and stored within the attestation document. I know I can trust the contents of this certificate because I have proof that the certificate was signed by an entity that I trust.

Here is an example where I cryptographically prove that all the protected contents of the attestation document are related to the public key contained in the certificate. To validate the COSE signature, I must first recreate the original message that was used during the signature operation – COSE uses a specific format. Then, I use OpenSSL to check if there is a match between the message, signature, and public key. If the signature checks out, then I can trust the contents of the already semantically verified payload.

 // STEP 3 - CRYPTOGRAPHIC VALIDATION

    #define SIG_STRUCTURE_BUFFER_S (1024*10)
    // Load the public key from the attestation document (we trust it now);
    // it is a public key on the P-384 elliptic curve
    EVP_PKEY * pkey = X509_get_pubkey(pX509);
    if (pkey == NULL) {
        fprintf(stderr, "%s\r\n", "ERROR: X509_get_pubkey failed"); exit(1);
    }
    // Allocate, initialize and return a digest context
    EVP_MD_CTX * ctx = EVP_MD_CTX_create();
    // Set up verification context
    if (EVP_DigestVerifyInit(ctx, NULL, EVP_sha384(), NULL, pkey) <= 0) {
        fprintf(stderr, "%s\r\n", "ERROR: EVP_DigestVerifyInit failed"); exit(1);
    }
    // Recreate the COSE_Sign1 signature structure and serialize it into a buffer
    cbor_item_t * cose_sig_arr = cbor_new_definite_array(4);
    cbor_item_t * cose_sig_arr_0_sig1 = cbor_build_string("Signature1"); 
    cbor_item_t * cose_sig_arr_2_empty = cbor_build_bytestring(NULL, 0);

    assert(cbor_array_push(cose_sig_arr, cose_sig_arr_0_sig1));
    assert(cbor_array_push(cose_sig_arr, ad_pheader));
    assert(cbor_array_push(cose_sig_arr, cose_sig_arr_2_empty));
    assert(cbor_array_push(cose_sig_arr, ad_signed));

    unsigned char sig_struct_buffer[SIG_STRUCTURE_BUFFER_S];
    size_t sig_struct_buffer_len = cbor_serialize(cose_sig_arr, sig_struct_buffer, SIG_STRUCTURE_BUFFER_S);
    // Hash the message and load it into the verification context
    if (EVP_DigestVerifyUpdate(ctx, sig_struct_buffer, sig_struct_buffer_len) <= 0) {
        fprintf(stderr, "%s\r\n", "ERROR: nEVP_DigestVerifyUpdate failed"); exit(1);
    }
    // Create R and S BIGNUM structures (the 96-byte signature is R and S concatenated)
    BIGNUM * sig_r = BN_new(); BIGNUM * sig_s = BN_new();
    BN_bin2bn(ad_sig_d, 48, sig_r); BN_bin2bn(ad_sig_d + 48, 48, sig_s);
    // Allocate an empty ECDSA_SIG structure
    ECDSA_SIG * ec_sig = ECDSA_SIG_new();
    // Set R and S values
    ECDSA_SIG_set0(ec_sig, sig_r, sig_s);
    // Convert R and V values into DER format
    int sig_size = i2d_ECDSA_SIG(ec_sig, NULL);
    unsigned char * sig_bytes = malloc(sig_size); unsigned char * p;
    memset(sig_bytes, 0xFF, sig_size);
    p = sig_bytes;
    sig_size = i2d_ECDSA_SIG(ec_sig, &p);
    // Verify the data in the context against the signature and get final result
    if (EVP_DigestVerifyFinal(ctx, sig_bytes, sig_size) != 1) {
        fprintf(stderr, "%s\r\n", "ERROR: EVP_DigestVerifyFinal failed"); exit(1);
    } else {
        fprintf(stdout, "%s\r\n", "OK: ########## Message Verified! ##########"); 
        free(sig_bytes);
        exit(0);
    }

    exit(1);

}

Conclusion

In this post, I went through a detailed examination of attestation documents produced by the AWS Nitro Enclaves. Then, I went over different types of validations (syntactic, semantic, and cryptographic) that safely help determine if an attestation document should be trusted. I’ve also included access to a public repository that contains the source code used in this post. New AWS Nitro Enclaves users can use it as a starting point when looking to integrate their applications with AWS Nitro Enclaves and build highly secure and confidential solutions.