All posts by Sheila Busser

Building diversified and cost-optimized EC2 server groups in Spinnaker

Post Syndicated from Sheila Busser original https://aws.amazon.com/blogs/compute/building-diversified-and-cost-optimized-ec2-server-groups-in-spinnaker/

This blog post is written by Sandeep Palavalasa, Sr. Specialist Containers SA, and Prathibha Datta-Kumar, Software Development Engineer

Spinnaker is an open source continuous delivery platform created by Netflix for releasing software changes rapidly and reliably. It enables teams to automate deployments into pipelines that are run whenever a new version is released with proven deployment strategies that are faster and more dependable with zero downtime. For many AWS customers, Spinnaker is a critical piece of technology that allows developers to deploy their applications safely and reliably across different AWS managed services.

Listening to customer requests on the Spinnaker open source project and in the Amazon EC2 Spot Instances integrations roadmap, we have further enhanced Spinnaker’s ability to deploy on Amazon Elastic Compute Cloud (Amazon EC2). The enhancements make it easier to combine Spot Instances with On-Demand, Reserved, and Savings Plans Instances to optimize workload costs with performance. You can improve workload availability when using Spot Instances with features such as allocation strategies and proactive Spot capacity rebalancing, when you are flexible about Instance types and Availability Zones. Combinations of these features offer the best possible experience when using Amazon EC2 with Spinnaker.

In this post, we detail the recent enhancements, along with a walkthrough of how you can use them following the best practices.

Amazon EC2 Spot Instances

EC2 Spot Instances are spare compute capacity in the AWS Cloud available at steep discounts of up to 90% when compared to On-Demand Instance prices. The primary difference between an On-Demand Instance and a Spot Instance is that a Spot Instance can be interrupted by Amazon EC2 with a two-minute notification when Amazon EC2 needs the capacity back. Amazon EC2 now sends rebalance recommendation notifications when Spot Instances are at an elevated risk of interruption. This signal can arrive sooner than the two-minute interruption notice. This lets you proactively replace your Spot Instances before it’s interrupted.

The best way to adhere to Spot best practices and instance fleet management is by using an Amazon EC2 Auto Scaling group When using Spot Instances in Auto Scaling group, enabling Capacity Rebalancing helps you maintain workload availability by proactively augmenting your fleet with a new Spot Instance before a running instance is interrupted by Amazon EC2.

Spinnaker concepts

Spinnaker uses three key concepts to describe your services, including applications, clusters, and server groups, and how your services are exposed to users is expressed as Load balancers and firewalls.

An application is a collection of clusters, a cluster is a collection of server groups, and a server group identifies the deployable artifact and basic configuration settings such as the number of instances, autoscaling policies, metadata, etc. This corresponds to an Auto Scaling group in AWS. We use Auto Scaling groups and server groups interchangeably in this post.

Spinnaker and Amazon EC2 Integration

In mid-2020, we started looking into customer requests and gaps in the Amazon EC2 feature set supported in Spinnaker. Around the same time, Spinnaker OSS added support for Amazon EC2 Launch Templates. Thanks to their effort, we could follow-up and expand the Amazon EC2 feature set supported in Spinnaker. Now that we understand the new features, let’s look at how to use some of them in the following tutorial spinnaker.io.

Here are some highlights of the features contributed recently:

Feature Why use it? (Example use cases)
  Multiple Instance Types   Tap into multiple capacity pools to achieve and maintain the desired scale using Spot Instances.
  Combining On-Demand and Spot Instances

  – Control the proportion of On-Demand and Spot Instances launched in your sever group.

– Combine Spot Instances with Amazon EC2 Reserved Instances or Savings Plans.

  Amazon EC2 Auto Scaling allocation strategies   Reduce overall Spot interruptions by launching from Spot pools that are optimally chosen based on the available Spot capacity, using capacity-optimized Spot allocation strategy.
  Capacity rebalancing   Improve your workload availability by proactively shifting your Spot capacity to optimal pools by enabling capacity rebalancing along with capacity-optimized allocation strategy.
  Improved support for burstable performance instance types with custom credit specification   Reduce costs by preventing wastage of CPU cycles.

We recommend using Spinnaker stable release 1.28.x for API users and 1.29.x for UI users. Here is the Git issue for related PRs and feature releases.

Now that we understand the new features, let’s look at how to use some of them in the following tutorial.

Example tutorial: Deploy a demo web application on an Auto Scaling group with On-Demand and Spot Instances

In this example tutorial, we setup Spinnaker to deploy to Amazon EC2, create an Application Load Balancer, and deploy a demo application on a server group diversified across multiple instance types and purchase options – this case On-Demand and Spot Instances.

We leverage Spinnaker’s API throughout the tutorial to create new resources, along with a quick guide on how to deploy the same using Spinnaker UI (Deck) and leverage UI to view them.

Prerequisites

As a prerequisite to complete this tutorial, you must have an AWS Account with an AWS Identity and Access Management (IAM) User that has the AdministratorAccess configured to use with AWS Command Line Interface (AWS CLI).

1. Spinnaker setup

We will use the AWS CloudFormation template setup-spinnaker-with-deployment-vpc.yml to setup Spinnaker and the required resources.

1.1 Create an Secure Shell(SSH) keypair used to connect to Spinnaker and EC2 instances launched by Spinnaker.

AWS_REGION=us-west-2 # Change the region where you want Spinnaker deployed
EC2_KEYPAIR_NAME=spinnaker-blog-${AWS_REGION}
aws ec2 create-key-pair --key-name ${EC2_KEYPAIR_NAME} --region ${AWS_REGION} --query KeyMaterial --output text > ~/${EC2_KEYPAIR_NAME}.pem
chmod 600 ~/${EC2_KEYPAIR_NAME}.pem

1.2 Deploy the Cloudformation stack.

STACK_NAME=spinnaker-blog
SPINNAKER_VERSION=1.29.1 # Change the version if newer versions are available
NUMBER_OF_AZS=3
AVAILABILITY_ZONES=${AWS_REGION}a,${AWS_REGION}b,${AWS_REGION}c
ACCOUNT_ID=$(aws sts get-caller-identity --query "Account" --output text)
S3_BUCKET_NAME=spin-persitent-store-${ACCOUNT_ID}

# Download template
curl -o setup-spinnaker-with-deployment-vpc.yml https://raw.githubusercontent.com/awslabs/ec2-spot-labs/master/ec2-spot-spinnaker/setup-spinnaker-with-deployment-vpc.yml

# deploy stack
aws cloudformation deploy --template-file setup-spinnaker-with-deployment-vpc.yml \
    --stack-name ${STACK_NAME} \
    --parameter-overrides NumberOfAZs=${NUMBER_OF_AZS} \
    AvailabilityZones=${AVAILABILITY_ZONES} \
    EC2KeyPairName=${EC2_KEYPAIR_NAME} \
    SpinnakerVersion=${SPINNAKER_VERSION} \
    SpinnakerS3BucketName=${S3_BUCKET_NAME} \
    --capabilities CAPABILITY_NAMED_IAM --region ${AWS_REGION}

1.3 Connecting to Spinnaker

1.3.1 Get the SSH command to port forwarding for Deck – the browser-based UI (9000) and Gate – the API Gateway (8084) to access the Spinnaker UI and API.

SPINNAKER_INSTANCE_DNS_NAME=$(aws cloudformation describe-stacks --stack-name ${STACK_NAME} --region ${AWS_REGION} --query "Stacks[].Outputs[?OutputKey=='SpinnakerInstance'].OutputValue" --output text)
echo 'ssh -A -L 9000:localhost:9000 -L 8084:localhost:8084 -L 8087:localhost:8087 -i ~/'${EC2_KEYPAIR_NAME}' [email protected]$'{SPINNAKER_INSTANCE_DNS_NAME}''

1.3.2 Open a new terminal and use the SSH command (output from the previous command) to connect to the Spinnaker instance. After you successfully connect to the Spinnaker instance via SSH, access the Spinnaker UI here and API here.

2. Deploy a demo web application

Let’s make sure that we have the environment variables required in the shell before proceeding. If you’re using the same terminal window as before, then you might already have these variables.

STACK_NAME=spinnaker-blog
AWS_REGION=us-west-2 # use the same region as before
EC2_KEYPAIR_NAME=spinnaker-blog-${AWS_REGION}
VPC_ID=$(aws cloudformation describe-stacks --stack-name ${STACK_NAME} --region ${AWS_REGION} --query "Stacks[].Outputs[?OutputKey=='VPCID'].OutputValue" --output text)

2.1 Create a Spinnaker Application

We start by creating an application in Spinnaker, a placeholder for the service that we deploy.

curl 'http://localhost:8084/tasks' \
-H 'Content-Type: application/json;charset=utf-8' \
--data-raw \
'{
   "job":[
      {
         "type":"createApplication",
         "application":{
            "cloudProviders":"aws",
            "instancePort":80,
            "name":"demoapp",
            "email":"[email protected]",
            "providerSettings":{
               "aws":{
                  "useAmiBlockDeviceMappings":true
               }
            }
         }
      }
   ],
   "application":"demoapp",
   "description":"Create Application: demoapp"
}'

Spin Create Server Group

2.2 Create an Application Load Balancer

Let’s create an Application Load Balanacer and a target group for port 80, spanning the three availability zones in our public subnet. We use the Demo-ALB-SecurityGroup for Firewalls to allow public access to the ALB on port 80.

As Spot Instances are interrupted with a two minute warning, you must adjust the Target Group’s deregistration delay to a slightly lower time. Recommended values are 90 seconds or less. This allows time for in-flight requests to complete and gracefully close existing connections before the instance is interrupted.

curl 'http://localhost:8084/tasks' \
-H 'Content-Type: application/json;charset=utf-8' \
--data-binary \
'{
   "application":"demoapp",
   "description":"Create Load Balancer: demoapp",
   "job":[
      {
         "type":"upsertLoadBalancer",
         "name":"demoapp-lb",
         "loadBalancerType":"application",
         "cloudProvider":"aws",
         "credentials":"my-aws-account",
         "region":"'"${AWS_REGION}"'",
         "vpcId":"'"${VPC_ID}"'",
         "subnetType":"public-subnet",
         "idleTimeout":60,
         "targetGroups":[
            {
               "name":"demoapp-targetgroup",
               "protocol":"HTTP",
               "port":80,
               "targetType":"instance",
               "healthCheckProtocol":"HTTP",
               "healthCheckPort":"traffic-port",
               "healthCheckPath":"/",
               "attributes":{
                  "deregistrationDelay":90
               }
            }
         ],
         "regionZones":[
            "'"${AWS_REGION}"'a",
            "'"${AWS_REGION}"'b",
            "'"${AWS_REGION}"'c"
         ],
         "securityGroups":[
            "Demo-ALB-SecurityGroup"
         ],
         "listeners":[
            {
               "protocol":"HTTP",
               "port":80,
               "defaultActions":[
                  {
                     "type":"forward",
                     "targetGroupName":"demoapp-targetgroup"
                 }
               ]
            }
         ]
      }
   ]
}'

Spin Create ALB

2.3 Create a server group

Before creating a server group (Auto Scaling group), here is a brief overview of the features used in the example:

      • onDemandBaseCapacity (default 0): The minimum amount of your ASG’s capacity that must be fulfilled by On-Demand instances (can also be applied toward Reserved Instances or Savings Plans). The example uses an onDemandBaseCapacity of three.
      • onDemandPercentageAboveBaseCapacity (default 100): The percentages of On-Demand and Spot Instances for additional capacity beyond OnDemandBaseCapacity. The example uses onDemandPercentageAboveBaseCapacity of 10% (i.e. 90% Spot).
      • spotAllocationStrategy: This indicates how you want to allocate instances across Spot Instance pools in each Availability Zone. The example uses the recommended Capacity Optimized strategy. Instances are launched from optimal Spot pools that are chosen based on the available Spot capacity for the number of instances that are launching.
      • launchTemplateOverridesForInstanceType: The list of instance types that are acceptable for your workload. Specifying multiple instance types enables tapping into multiple instance pools in multiple Availability Zones, designed to enhance your service’s availability. You can use the ec2-instance-selector, an open source AWS Command Line Interface(CLI) tool to narrow down the instance types based on resource criteria like vcpus and memory.
      • capacityRebalance: When enabled, this feature proactively manages the EC2 Spot Instance lifecycle leveraging the new EC2 Instance rebalance recommendation. This increases the emphasis on availability by automatically attempting to replace Spot Instances in an ASG before they are interrupted by Amazon EC2. We enable this feature in this example.

Learn more on spinnaker.io: feature descriptions and use cases and sample API requests.

Let’s create a server group with a desired capacity of 12 instances diversified across current and previous generation instance types, attach the previously created ALB, use Demo-EC2-SecurityGroup for the Firewalls which allows http traffic only from the ALB, use the following bash script for UserData to install httpd, and add instance metadata into the index.html.

2.3.1 Save the userdata bash script into a file user-date.sh.

Note that Spinnaker only support base64 encoded userdata. We use base64 bash command to encode the file contents in the next step.

cat << "EOF" > user-data.sh
#!/bin/bash
yum update -y
yum install httpd -y
echo "<html>
    <head>
        <title>Demo Application</title>
        <style>body {margin-top: 40px; background-color: #Gray;} </style>
    </head>
    <body>
        <h2>You have reached a Demo Application running on</h2>
        <ul>
            <li>instance-id: <b> `curl http://169.254.169.254/latest/meta-data/instance-id` </b></li>
            <li>instance-type: <b> `curl http://169.254.169.254/latest/meta-data/instance-type` </b></li>
            <li>instance-life-cycle: <b> `curl http://169.254.169.254/latest/meta-data/instance-life-cycle` </b></li>
            <li>availability-zone: <b> `curl http://169.254.169.254/latest/meta-data/placement/availability-zone` </b></li>
        </ul>
    </body>
</html>" > /var/www/html/index.html
systemctl start httpd
systemctl enable httpd
EOF

2.3.2 Create the server group by running the following command. Note we use the KeyPairName that we created as part of the prerequisites.

curl 'http://localhost:8084/tasks' \
-H 'Content-Type: application/json;charset=utf-8' \
-d \
'{
   "job":[
      {
         "type":"createServerGroup",
         "cloudProvider":"aws",
         "account":"my-aws-account",
         "application":"demoapp",
         "stack":"",
         "credentials":"my-aws-account",
	"healthCheckType": "ELB",
	"healthCheckGracePeriod":600,
	"capacityRebalance": true,
         "onDemandBaseCapacity":3, 
         "onDemandPercentageAboveBaseCapacity":10,
         "spotAllocationStrategy":"capacity-optimized",
         "setLaunchTemplate":true,
         "launchTemplateOverridesForInstanceType":[
            {
               "instanceType":"m4.large"
            },
            {
               "instanceType":"m5.large"
            },
            {
               "instanceType":"m5a.large"
            },
            {
               "instanceType":"m5ad.large"
            },
            {
               "instanceType":"m5d.large"
            },
            {
               "instanceType":"m5dn.large"
            },
            {
               "instanceType":"m5n.large"
            }

         ],
         "capacity":{
            "min":6,
            "max":21,
            "desired":12
         },
         "subnetType":"private-subnet",
         "availabilityZones":{
            "'"${AWS_REGION}"'":[
               "'"${AWS_REGION}"'a",
               "'"${AWS_REGION}"'b",
               "'"${AWS_REGION}"'c"
            ]
         },
         "keyPair":"'"${EC2_KEYPAIR_NAME}"'",
         "securityGroups":[
            "Demo-EC2-SecurityGroup"
         ],
         "instanceType":"m5.large",
         "virtualizationType":"hvm",
         "amiName":"'"$(aws ec2 describe-images --owners amazon --filters "Name=name,Values=amzn2-ami-hvm-2*x86_64-gp2" --query 'reverse(sort_by(Images, &CreationDate))[0].Name' --region ${AWS_REGION} --output text)"'",
         "targetGroups":[
            "demoapp-targetgroup"
         ],
         "base64UserData":"'"$(base64 user-data.sh)"'",,
        "associatePublicIpAddress":false,
         "instanceMonitoring":false
      }
   ],
   "application":"demoapp",
   "description":"Create New server group in cluster demoapp"
}'

Spin Create ServerGroup

Spinnaker creates an Amazon EC2 Launch Template and an ASG with specified parameters and waits until the ALB health check passes before sending traffic to the EC2 Instances.

The server group and launch template that we just created will look like this in Spinnaker UI:

Spin View ServerGroup

The UI also displays capacity type, such as the purchase option for each instance type in the Instance Information section:

Spin View ServerGroup Purchase Options 1Spin View ServerGroup Purchase Options 2

3. Access the application

Copy the Application Load Balancer URL by selecting the tree icon in the right top corner of the server group, and access it in a browser. You can refresh multiple times to see that the requests are going to different instances every time.

Spin Access App

Congratulations! You successfully deployed the demo application on an Amazon EC2 server group diversified across multiple instance types and purchase options.

Moreover, you can clone, modify, disable, and destroy these server groups, as well as use them with Spinnaker pipelines to effectively release new versions of your application.

Cost savings

Check the savings you realized by deploying your demo application on EC2 Spot Instances by going to EC2 console > Spot Requests > Saving Summary.

Spin Spot Savings

Cleanup

To avoid incurring any additional charges, clean up the resources created in the tutorial.

Frist, delete the server group, application load balancer and application in Spinnaker.

curl 'http://localhost:8084/tasks' \
-H 'Content-Type: application/json;charset=utf-8' \
--data-raw \
'{
   "job":[
      {
         "reason":"Cleanup",
         "asgName":"demoapp-v000",
         "moniker":{
            "app":"demoapp",
            "cluster":"demoapp",
            "sequence":0
         },
         "serverGroupName":"demoapp-v000",
         "type":"destroyServerGroup",
         "region":"'"${AWS_REGION}"'",
         "credentials":"my-aws-account",
         "cloudProvider":"aws"
      },
      {
         "cloudProvider":"aws",
         "loadBalancerName":"demoapp-lb",
         "loadBalancerType":"application",
         "regions":[
            "'"${AWS_REGION}"'"
         ],
         "credentials":"my-aws-account",
         "vpcId":"'"${VPC_ID}"'",
         "type":"deleteLoadBalancer"
      },
      {
         "type":"deleteApplication",
         "application":{
            "name":"demoapp",
            "cloudProviders":"aws"
         }
      }
   ],
   "application":"demoapp",
   "description":"Deleting ServerGroup, ALB and Application: demoapp"
}'

Wait for Spinnaker to delete all of the resources before proceeding further. You can confirm this either on the Spinnaker UI or AWS Management Console.

Then delete the Spinnaker infrastructure by running the following command:

aws ec2 delete-key-pair --key-name ${EC2_KEYPAIR_NAME} --region ${AWS_REGION}
rm ~/${EC2_KEYPAIR_NAME}.pem
aws s3api delete-objects \
--bucket ${S3_BUCKET_NAME} \
--delete "$(aws s3api list-object-versions \
--bucket ${S3_BUCKET_NAME} \
--query='{Objects: Versions[].{Key:Key,VersionId:VersionId}}')" #If error occurs, there are no Versions and is OK
aws s3api delete-objects \
--bucket ${S3_BUCKET_NAME} \
--delete "$(aws s3api list-object-versions \
--bucket ${S3_BUCKET_NAME} \
--query='{Objects: DeleteMarkers[].{Key:Key,VersionId:VersionId}}')" #If error occurs, there are no DeleteMarkers and is OK
aws s3 rb s3://${S3_BUCKET_NAME} --force #Delete Bucket
aws cloudformation delete-stack --region ${AWS_REGION} --stack-name ${STACK_NAME}

Conclusion

In this post, we learned about the new Amazon EC2 features recently added to Spinnaker, and how to use them to build diversified and optimized Auto Scaling Groups. We also discussed recommended best practices for EC2 Spot and how they can improve your experience with it.

We would love to hear from you! Tell us about other Continuous Integration/Continuous Delivery (CI/CD) platforms that you want to use with EC2 Spot and/or Auto Scaling Groups by adding an issue on the Spot integrations roadmap.

Architecting for data residency with AWS Outposts rack and landing zone guardrails

Post Syndicated from Sheila Busser original https://aws.amazon.com/blogs/compute/architecting-for-data-residency-with-aws-outposts-rack-and-landing-zone-guardrails/

This blog post was written by Abeer Naffa’, Sr. Solutions Architect, Solutions Builder AWS, David Filiatrault, Principal Security Consultant, AWS and Jared Thompson, Hybrid Edge SA Specialist, AWS.

In this post, we will explore how organizations can use AWS Control Tower landing zone and AWS Organizations custom guardrails to enable compliance with data residency requirements on AWS Outposts rack. We will discuss how custom guardrails can be leveraged to limit the ability to store, process, and access data and remain isolated in specific geographic locations, how they can be used to enforce security and compliance controls, as well as, which prerequisites organizations should consider before implementing these guardrails.

Data residency is a critical consideration for organizations that collect and store sensitive information, such as Personal Identifiable Information (PII), financial, and healthcare data. With the rise of cloud computing and the global nature of the internet, it can be challenging for organizations to make sure that their data is being stored and processed in compliance with local laws and regulations.

One potential solution for addressing data residency challenges with AWS is to use Outposts rack, which allows organizations to run AWS infrastructure on premises and in their own data centers. This lets organizations store and process data in a location of their choosing. An Outpost is seamlessly connected to an AWS Region where it has access to the full suite of AWS services managed from a single plane of glass, the AWS Management Console or the AWS Command Line Interface (AWS CLI).  Outposts rack can be configured to utilize landing zone to further adhere to data residency requirements.

The landing zones are a set of tools and best practices that help organizations establish a secure and compliant multi-account structure within a cloud provider. A landing zone can also include Organizations to set policies – guardrails – at the root level, known as Service Control Policies (SCPs) across all member accounts. This can be configured to enforce certain data residency requirements.

When leveraging Outposts rack to meet data residency requirements, it is crucial to have control over the in-scope data movement from the Outposts. This can be accomplished by implementing landing zone best practices and the suggested guardrails. The main focus of this blog post is on the custom policies that restrict data snapshots, prohibit data creation within the Region, and limit data transfer to the Region.

Prerequisites

Landing zone best practices and custom guardrails can help when data needs to remain in a specific locality where the Outposts rack is also located.  This can be completed by defining and enforcing policies for data storage and usage within the landing zone organization that you set up. The following prerequisites should be considered before implementing the suggested guardrails:

1. AWS Outposts rack

AWS has installed your Outpost and handed off to you. An Outpost may comprise of one or more racks connected together at the site. This means that you can start using AWS services on the Outpost, and you can manage the Outposts rack using the same tools and interfaces that you use in AWS Regions.

2. Landing Zone Accelerator on AWS

We recommend using Landing Zone Accelerator on AWS (LZA) to deploy a landing zone for your organization. Make sure that the accelerator is configured for the appropriate Region and industry. To do this, you must meet the following prerequisites:

    • A clear understanding of your organization’s compliance requirements, including the specific Region and industry rules in which you operate.
    • Knowledge of the different LZAs available and their capabilities, such as the compliance frameworks with which you align.
    • Have the necessary permissions to deploy the LZAs and configure it for your organization’s specific requirements.

Note that LZAs are designed to help organizations quickly set up a secure, compliant multi-account environment. However, it’s not a one-size-fits-all solution, and you must align it with your organization’s specific requirements.

3. Set up the data residency guardrails

Using Organizations, you must make sure that the Outpost is ordered within a workload account in the landing zone.

Figure 1 Landing Zone Accelerator Outposts workload on AWS high level Architecture

Figure 1: Landing Zone Accelerator – Outposts workload on AWS high level Architecture

Utilizing Outposts rack for regulated components

When local regulations require regulated workloads to stay within a specific boundary, or when an AWS Region or AWS Local Zone isn’t available in your jurisdiction, you can still choose to host your regulated workloads on Outposts rack for a consistent cloud experience. When opting for Outposts rack, note that, as part of the shared responsibility model, customers are responsible for attesting to physical security, access controls, and compliance validation regarding the Outposts, as well as, environmental requirements for the facility, networking, and power. Utilizing Outposts rack requires that you procure and manage the data center within the city, state, province, or country boundary for your applications’ regulated components, as required by local regulations.

Procuring two or more racks in the diverse data centers can help with the high availability for your workloads. This is because it provides redundancy in case of a single rack or server failure. Additionally, having redundant network paths between Outposts rack and the parent Region can help make sure that your application remains connected and continue to operate even if one network path fails.

However, for regulated workloads with strict service level agreements (SLA), you may choose to spread Outposts racks across two or more isolated data centers within regulated boundaries. This helps make sure that your data remains within the designated geographical location and meets local data residency requirements.

In this post, we consider a scenario with one data center, but consider the specific requirements of your workloads and the regulations that apply to determine the most appropriate high availability configurations for your case.

Outposts rack workload data residency guardrails

Organizations provide central governance and management for multiple accounts. Central security administrators use SCPs with Organizations to establish controls to which all AWS Identity and Access Management (IAM) principals (users and roles) adhere.

Now, you can use SCPs to set permission guardrails.  A suggested preventative controls for data residency on Outposts rack that leverage the implementation of SCPs are shown as follows. SCPs enable you to set permission guardrails by defining the maximum available permissions for IAM entities in an account. If an SCP denies an action for an account, then none of the entities in the account can take that action, even if their IAM permissions let them. The guardrails set in SCPs apply to all IAM entities in the account, which include all users, roles, and the account root user.

Upon finalizing these prerequisites, you can create the guardrails for the Outposts Organization Unit (OU).

Note that while the following guidelines serve as helpful guardrails – SCPs – for data residency, you should consult internally with legal and security teams for specific organizational requirements.

 To exercise better control over workloads in the Outposts rack and prevent data transfer from Outposts to the Region or data storage outside the Outposts, consider implementing the following guardrails. Additionally, local regulations may dictate that you set up these additional guardrails.

  1. When your data residency requirements require restricting data transfer/saving to the Region, consider the following guardrails:

a. Deny copying data from Outposts to the Region for Amazon Elastic Compute Cloud (Amazon EC2), Amazon Relational Database Service (Amazon RDS), Amazon ElastiCache and data sync “DenyCopyToRegion”.

b. Deny Amazon Simple Storage Service (Amazon S3) put action to the Region “DenyPutObjectToRegionalBuckets”.

If your data residency requirements mandate restrictions on data storage in the Region,  consider implementing this guardrail to prevent  the use of S3 in the Region.

Note: You can use Amazon S3 for Outposts.

c. If your data residency requirements mandate restrictions on data storage in the Region, consider implementing “DenyDirectTransferToRegion” guardrail.

Out of Scope is metadata such as tags, or operational data such as KMS keys.

{
  "Version": "2012-10-17",
  "Statement": [
      {
      "Sid": "DenyCopyToRegion",
      "Action": [
        "ec2:ModifyImageAttribute",
        "ec2:CopyImage",  
        "ec2:CreateImage",
        "ec2:CreateInstanceExportTask",
        "ec2:ExportImage",
        "ec2:ImportImage",
        "ec2:ImportInstance",
        "ec2:ImportSnapshot",
        "ec2:ImportVolume",
        "rds:CreateDBSnapshot",
        "rds:CreateDBClusterSnapshot",
        "rds:ModifyDBSnapshotAttribute",
        "elasticache:CreateSnapshot",
        "elasticache:CopySnapshot",
        "datasync:Create*",
        "datasync:Update*"
      ],
      "Resource": "*",
      "Effect": "Deny"
    },
    {
      "Sid": "DenyDirectTransferToRegion",
      "Action": [
        "dynamodb:PutItem",
        "dynamodb:CreateTable",
        "ec2:CreateTrafficMirrorTarget",
        "ec2:CreateTrafficMirrorSession",
        "rds:CreateGlobalCluster",
        "es:Create*",
        "elasticfilesystem:C*",
        "elasticfilesystem:Put*",
        "storagegateway:Create*",
        "neptune-db:connect",
        "glue:CreateDevEndpoint",
        "glue:UpdateDevEndpoint",
        "datapipeline:CreatePipeline",
        "datapipeline:PutPipelineDefinition",
        "sagemaker:CreateAutoMLJob",
        "sagemaker:CreateData*",
        "sagemaker:CreateCode*",
        "sagemaker:CreateEndpoint",
        "sagemaker:CreateDomain",
        "sagemaker:CreateEdgePackagingJob",
        "sagemaker:CreateNotebookInstance",
        "sagemaker:CreateProcessingJob",
        "sagemaker:CreateModel*",
        "sagemaker:CreateTra*",
        "sagemaker:Update*",
        "redshift:CreateCluster*",
        "ses:Send*",
        "ses:Create*",
        "sqs:Create*",
        "sqs:Send*",
        "mq:Create*",
        "cloudfront:Create*",
        "cloudfront:Update*",
        "ecr:Put*",
        "ecr:Create*",
        "ecr:Upload*",
        "ram:AcceptResourceShareInvitation"
      ],
      "Resource": "*",
      "Effect": "Deny"
    },
    {
      "Sid": "DenyPutObjectToRegionalBuckets",
      "Action": [
        "s3:PutObject"
      ],
      "Resource": ["arn:aws:s3:::*"],
      "Effect": "Deny"
    }
  ]
}
  1. If your data residency requirements require limitations on data storage in the Region, consider implementing this guardrail “DenySnapshotsToRegion” and “DenySnapshotsNotOutposts” to restrict the use of snapshots in the Region.

a. Deny creating snapshots of your Outpost data in the Region “DenySnapshotsToRegion”

 Make sure to update the Outposts “<outpost_arn_pattern>”.

b. Deny copying or modifying Outposts Snapshots “DenySnapshotsNotOutposts”

Make sure to update the Outposts “<outpost_arn_pattern>”.

Note: “<outpost_arn_pattern>” default is arn:aws:outposts:*:*:outpost/*

{
  "Version": "2012-10-17",
  "Statement": [

    {
      "Sid": "DenySnapshotsToRegion",
      "Effect":"Deny",
      "Action":[
        "ec2:CreateSnapshot",
        "ec2:CreateSnapshots"
      ],
      "Resource":"arn:aws:ec2:*::snapshot/*",
      "Condition":{
         "ArnLike":{
            "ec2:SourceOutpostArn":"<outpost_arn_pattern>"
         },
         "Null":{
            "ec2:OutpostArn":"true"
         }
      }
    },
    {

      "Sid": "DenySnapshotsNotOutposts",          
      "Effect":"Deny",
      "Action":[
        "ec2:CopySnapshot",
        "ec2:ModifySnapshotAttribute"
      ],
      "Resource":"arn:aws:ec2:*::snapshot/*",
      "Condition":{
         "ArnLike":{
            "ec2:OutpostArn":"<outpost_arn_pattern>"
         }
      }
    }

  ]
}
  1. This guardrail helps to prevent the launch of Amazon EC2 instances or creation of network interfaces in non-Outposts subnets. It is advisable to keep data residency workloads within the Outposts rather than the Region to ensure better control over regulated workloads. This approach can help your organization achieve better control over data residency workloads and improve governance over your AWS Organization.

Make sure to update the Outposts subnets “<outpost_subnet_arns>”.

{
"Version": "2012-10-17",
  "Statement":[{
    "Sid": "DenyNotOutpostSubnet",
    "Effect":"Deny",
    "Action": [
      "ec2:RunInstances",
      "ec2:CreateNetworkInterface"
    ],
    "Resource": [
      "arn:aws:ec2:*:*:network-interface/*"
    ],
    "Condition": {
      "ForAllValues:ArnNotEquals": {
        "ec2:Subnet": ["<outpost_subnet_arns>"]
      }
    }
  }]
}

Additional considerations

When implementing data residency guardrails on Outposts rack, consider backup and disaster recovery strategies to make sure that your data is protected in the event of an outage or other unexpected events. This may include creating regular backups of your data, implementing disaster recovery plans and procedures, and using redundancy and failover systems to minimize the impact of any potential disruptions. Additionally, you should make sure that your backup and disaster recovery systems are compliant with any relevant data residency regulations and requirements. You should also test your backup and disaster recovery systems regularly to make sure that they are functioning as intended.

Additionally, the provided SCPs for Outposts rack in the above example do not block the “logs:PutLogEvents”. Therefore, even if you implemented data residency guardrails on Outpost, the application may log data to CloudWatch logs in the Region.

Highlights

By default, application-level logs on Outposts rack are not automatically sent to Amazon CloudWatch Logs in the Region. You can configure CloudWatch logs agent on Outposts rack to collect and send your application-level logs to CloudWatch logs.

logs: PutLogEvents does transmit data to the Region, but it is not blocked by the provided SCPs, as it’s expected that most use cases will still want to be able to use this logging API. However, if blocking is desired, then add the action to the first recommended guardrail. If you want specific roles to be allowed, then combine with the ArnNotLike condition example referenced in the previous highlight.

Conclusion

The combined use of Outposts rack and the suggested guardrails via AWS Organizations policies enables you to exercise better control over the movement of the data. By creating a landing zone for your organization, you can apply SCPs to your Outposts racks that will help make sure that your data remains within a specific geographic location, as required by the data residency regulations.

Note that, while custom guardrails can help you manage data residency on Outposts rack, it’s critical to thoroughly review your policies, procedures, and configurations to make sure that they are compliant with all relevant data residency regulations and requirements. Regularly testing and monitoring your systems can help make sure that your data is protected and your organization stays compliant.

References

Enabling Microsoft Defender Credential Guard on Amazon EC2

Post Syndicated from Sheila Busser original https://aws.amazon.com/blogs/compute/enabling-microsoft-defender-credential-guard-on-amazon-ec2/

This blog post is written by Jason Nicholls, Principal Solutions Architect AWS.

In this post we show you how to enable that Windows Defender Credential Guard (Credential Guard) on Amazon Elastic Compute Cloud (Amazon EC2) running Microsoft Windows Server. Credential Guard, when enabled on Amazon EC2 Windows Instances protects sensitive user login information from being extracted from the Operating System (OS) memory.

Microsoft Windows stores credential material such as authentication tokens in the Local Security Authority (LSA), a process in the Microsoft Windows operating system that is responsible for enforcing the security policy of the system. Traditionally these credentials are stored in LSA’s process memory accessible to the rest of the OS, making it possible for a privileged user to extract these credentials. The use of complex passwords, limiting privileged account use, and training users not to use the same password across multiple systems are steps that limit the use of credentials extracted from the LSA, but customers want to isolate these credentials from other rest of the OS.

With Credential Guard enabled, the LSA is isolated by Windows virtualization-based security (VBS). VBS is a suite of Windows security mechanisms that use hardware virtualization features to create an isolated compute environment to store user credentials referred to as the isolated LSA. The isolated LSA is inaccessible to the rest of the OS.

Prerequisites

Credential Guard requires a virtualization technology that supports Virtualization-based Security (VBS) and Unified Extensible Firmware Interface (UEFI) Secure Boot. Optionally, Credential Guard can leverage a Trusted Platform Module (TPM) to further secure credentials. In this walk-through we will use NitroTPM, a virtual TPM 2.0-compliant (ISO/IEC 11889:2015) module for your Amazon EC2 instances to enhance the security of the isolated LSA.

Launch your instance

A list of Windows Amazon Machine Image (AMI)s preconfigured to enable UEFI Secure Boot is available in our “Launch an instance with UEFI Secure Boot support ” guide. You can verify that your AMI supports UEFI Secure Boot and NitroTPM using the AWS Command Line Interface (AWS CLI) describe-images command as follows:

aws ec2 describe-images --image-ids ami-0123456789

When UEFI Secure Boot and NitroTPM are enabled for the AMI, “TpmSupport“: “v2.0“, and “BootMode”: “uefi” appear in the output respectively, such as in the following example.

{
   "Images": [
      {
         ...
         "BootMode": "uefi",
         "TpmSupport": "v2.0"
      }
   ]
}

Before launching the AMI verify that the instance type you want to launch supports UEFI boot mode and uses the Nitro System by using the DescribeInstanceTypes API call. Using the AWS CLI and calling describe-instance-types as follows:

aws ec2 describe-instance-types --instance-types [INSTANCE_TYPE] --region [REGION]

Where INSTANCE_TYPE is a supported instance type as defined in the documentation, an example is c5.large and REGION is a supported AWS Region.

The output of the command should display nitro as the hypervisor and list uefi as a supported boot mode. For example:

{
   "InstanceTypes": [
   {
      ...
      "Hypervisor": "nitro",
      ...
      "SupportedBootModes": [
                "legacy-bios",
                "uefi"
      ]
   }
}

Use the Amazon EC2 console or AWS Command Line Interface (AWS CLI) to launch an Amazon EC2 instance which has “uefi” boot mode enabled. Administrator access to the Amazon EC2 instance is required to enable Credential Guard.

Walkthrough:

Enabling Credential Guard with Amazon EC2 Launch an Amazon EC2 Windows Instance

Launch an Amazon EC2 Windows Instance using a Windows AMI preconfigured to enable UEFI Secure Boot with Microsoft Windows Secure Boot Keys on an instance type that supports UEFI Secure Boot.

You can use the launch wizard in the Amazon EC2 console or run-instances command via the AWS CLI to launch an instance that can support Credential Guard. You need a compatible AMI ID for launching your instance which is unique for each AWS Region. You can use the following link to discover and launch instances with compatible Amazon-provided AMIs in the Amazon EC2 console:

Enable Credential Guard on the launched Instance

Now that you know how to create an AMI with UEFI Secure Boot support enabled, let’s create a Windows instance and configure Credential Guard.

Credential Guard can be enabled either by using Group Policies, the Windows Registry, or the Windows Defender Device Guard and Windows Defender Credential Guard hardware readiness tool (DG-Readiness tool). In this post we will show you how to enable Credential Guard using the Windows Registry and via the DG-Readiness tool.

Option 1: Enabling Credential Guard via the Windows Registry

Start the Amazon EC2 instance using an AMI that has the BootMode set to “uefi“. Windows AMI must be preconfigured to enable UEFI Secure Boot with Microsoft Windows Secure Boot keys as we defined earlier. Once you’re connected to the instance use the Windows System Information tool to check that Credential Guard isn’t running on the instance:

  1. Select Start, type msinfo32.exe, and then select System Information.
  2. Select System Summary on the left.
  3. Confirm that Virtualization-based security is Not Enabled.

Figure 1 Overview of System Information in WindowsFigure 1 Overview of System Information in Windows

Figure 2 System Information confirming that Credential Guard is Not Enabled

Figure 2 System Information confirming that Credential Guard is Not Enabled

4. Open the Windows Command Shell by selecting Start, type cmd.exe, and then press Enter.

5. Run the following commands from the Windows Command Shell to enable Credential Guard using the Windows Registry:

REG ADD "HKLM\System\CurrentControlSet\Control\DeviceGuard" /v EnableVirtualizationBasedSecurity /t REG_DWORD /d 1 /f

REG ADD "HKLM\System\CurrentControlSet\Control\DeviceGuard" /v RequirePlatformSecurityFeatures /t REG_DWORD /d 1 /f

REG ADD "HKLM\System\CurrentControlSet\Control\LSA" /v LsaCfgFlags /t REG_DWORD /d 2 /f

REG ADD "HKLM\SYSTEM\CurrentControlSet\Control\DeviceGuard\Capabilities\" /v "HyperVEnabled" /t REG_DWORD /d 1 /f

Figure 3 Update the Windows Registry to enable Credential GuardFigure 3 Update the Windows Registry to enable Credential Guard

6. Once the changes have been made, reboot the instance.

7. When the instance is back up, reconnect to the instance and select Start, type msinfo32.exe, and then select System Information.

8. Select System Summary on the left.

9. Note that Virtualisation-based security is now changed to Running, and that Secure Boot and Credential Guard are enabled.

Figure 4 Overview of System Information showing Credential Guard is EnabledFigure 4 Overview of System Information showing Credential Guard is Enabled

Figure 5 System Information shows Credential Guard is now EnabledFigure 5 System Information shows Credential Guard is now Enabled

Option 2: Enabling Credential Guard via the DG-Readiness tool

Option 1 requires a remote desktop session to your Windows Instance. Option 2 is run via PowerShell which can either be done in a remote desktop session as described here or via a remote PowerShell session.

Start the Amazon EC2 instance using an AMI that has the BootMode set to “uefi“. Windows AMI must be preconfigured to enable UEFI Secure Boot with Microsoft Windows Secure Boot keys as we defined earlier. Once you’re connected to the instance start a PowerShell session:

  1. Select Start, type PowerShell, and then click Windows PowerShell.

Figure 6 Start PowerShell via the Start BarFigure 6 Start PowerShell via the Start Bar

      2. Download the DG-Readiness tool by running the command:

wget https://download.microsoft.com/download/B/D/8/BD821B1F-05F2-4A7E-AA03-DF6C4F687B07/dgreadiness_v3.6.zip -outfile dgreadiness.zip

Figure 7 Download the DG-Readiness toolFigure 7 Download the DG-Readiness tool

  1. Uncompress the downloaded zip file using the Expand-Archive function within PowerShell
Expand-Archive -Path C:\Users\Administrator\dgreadiness.zip -DestinationPath C:\dgreadiness
  1. Move the DG-Readiness tool to the current folder

copy C:\dgreadiness\dgreadiness_v3.6\DG_Readiness_Tool_v3.6.ps1 .\

Figure 8 Copy the DG-Readiness tool to the current folderFigure 8 Copy the DG-Readiness tool to the current folder

  1. Confirm that Credential Guard is disabled by running the DG-Readiness tool with the -Ready option, as follows:
DG_Readiness_Tool_v3.6.ps1 -Ready

Figure 9 Using the DG-Readiness tool to confirm Credential Guard is not enabledFigure 9 Using the DG-Readiness tool to confirm Credential Guard is not enabled

  1. Enable Credential Guard using the -Enable -CG options as follows:
DG_Readiness_Tool_v3.6.ps1 -Enable -CG
  1. Reboot the instance
  2. Reconnect to the instance after the reboot and confirm that Credential Guard is now running by running the command:
DG_Readiness_Tool_v3.6.ps1 -Ready

Figure 10 DG Readiness Tool shows Credential Guard is enabled and runningFigure 10 DG Readiness Tool shows Credential Guard is enabled and running

Conclusion

With support for Windows Defender Credential Guard on Amazon EC2 Windows Instances customers can create an isolated compute environment that is inaccessible to the rest of the OS. Credential Guard requires UEFI Secure Boot support. Credential Guard can leverage NitroTPM to further secure credentials.

To learn more about the support of Credential Guard visit documentation.

Using Porting Advisor for Graviton

Post Syndicated from Sheila Busser original https://aws.amazon.com/blogs/compute/using-porting-advisor-for-graviton/

This blog post is written by Ryan Doty Solutions Architect , AWS and Vishal Manan Sr. SSA, EC2 Graviton , AWS.

Introduction

AWS customers recognize that Graviton-based EC2 instances deliver price-performance benefits but many are concerned about the effort to port existing applications. Porting code from one architecture to another can result in a substantial investment in time and effort. AWS has worked continuously to improve the migration process for customers. We recently introduced the Porting Advisor for Graviton as a tool to further simplify the migration process. In this blog, we’ll walk you through how to use Porting Advisor for Graviton so that you can learn how to use it.

Porting Advisor for Graviton is an open-source, command-line tool that analyzes source code and generates a report highlighting missing or outdated libraries and code constructs that may require modification and provides a user with alternative recommendations. It helps customers and developers accelerate their transition to Graviton-based Amazon EC2 instances by reducing the iterative process of identifying and resolving source code and library dependencies. This blog post will provide you with a step-by-step implementation on how to use Porting Advisor for Graviton. At the end of the blog, you will be able to run Porting Advisor for Graviton on your source code tree, generating findings that will help simplify the effort required to port your application.

Porting Advisor for Graviton scans for potentially unsupported or non-portable arm64 code in source code trees. The tool only scans source code files for the programming languages C/C++, Fortran, Go 1.11+, Java 8+, Python 3+, and dependency files such as project/requirements.txt file(s). Most importantly the Porting Advisor doesn’t make any code modifications, API-level recommendations, or send data back to AWS.

You can utilize Porting Advisor for Graviton either as a Python script or compiled into a binary and run on x86-64 or arm64 systems. Therefore, it can be easily implemented as part of your build processes.

Expected Results

Porting Advisor for Graviton reports the following issues:

  1. Inline assembly with no corresponding arm64 inline assembly.
  2. Assembly source files with no corresponding arm64 assembly source files.
  3. Missing arm64 architecture detection in autoconf config.guess scripts.
  4. Linking against libraries that aren’t available on the arm64 architecture.
  5. Use  architecture specific intrinsics.
  6. Preprocessor errors that trigger when compiling on arm64.
  7. Usages of old Visual C++ runtime (Windows specific).

Compiler specific code guarded by compiler specific pre-defined macros is detected, but not reported by default. The following cross-compile specific issues are detected, but not reported by default:

  • Architecture detection that depends on the host rather than the target.
  • Use of build artifacts in the build process.

Skillsets needed for using the tool

Porting Advisor for Graviton is designed to be easy to use. Users though should be versed in the following skills in order to take advantage of the recommendations the tool provides:

  • An understanding of the build system – project requirements and dependencies, versioning, etc.
  • Basic scripting language skills around Python, PowerShell/Bash.
  • Understanding hardware (inline assembly for C/C++) or compiler specific (intrinsic for C/C++) constructs when applicable.
  • The ability to follow best practices in the AWS Graviton Technical Guide for code optimization.

How to use Porting Advisor for Graviton

Prerequisites

The tool requires a minimum version of Python 3.10 and Java (8+ to be installed). The installation of Java is also optional and only required if you want to scan JAR files for native method calls. You can run the tool on a Windows/Linux/MacOS machine, or in an EC2 instance. I will show case usage on Windows and Amazon Linux 2(AL2) running on an EC2 instance. It supports both arm64 and x86-64 processors.

You don’t need to be on an arm64-based processor to run the tool.

The tool doesn't need a lot of CPU Horsepower and even a system with few processors will do

You can run the tool as a Python script or an executable. The executable requires extra steps to build. However, it can be used on another machine without the need to install Python packages.

You must copy the complete “dist” folder for it to work, as there are libraries that are present in that folder along with the executable.

Porting Advisor for Graviton can be run as a binary or as a script. You must run the script to generate the binaries

./porting-advisor-linux-x86_64 ~/my/path/to/my/repo --output report.html 
./porting-advisor-linux-x86_64.exe ../test/CppCode/inline_assembly --output report.html

Running Porting Advisor for Graviton as an executable

The first step is to build the executable.

Building the executable

The first step is to set up the Python Environment. If you run into any errors building the binary, see the Troubleshooting section for more details.

Building the binary on Windows

Building Porting Advisor using Powershell

Building Porting Advisor binary

Building the binary on Linux/Mac

Using shell script to build on Linux or macOS

Porting Advisor binary saved in dist folder

Running the binary

Here you can see how you can run the tool on Linux as a binary for a C++ project.

Porting advisor binary run on a C++ codebase with 350 files

The “dist” folder will have the executable.

Running Porting Advisor for Graviton as a script

Enable the Python environment for the following:

Linux/Mac:

$. python3 -m venv .venv
$. source .venv/bin/activate

PowerShell:

PS> python -m venv .venv
PS> .\.venv\Scripts\Activate.ps1

The following shows how the tool would work on Windows when run as a script:

Running Porting Advisor on Windows as a powershell script

Running the Porting Advisor for Graviton on Linux as a Script

Setting up Python environment to run the Porting Advisor as a script

Running Porting Advisor on Linux as a script

Output of the Porting Advisor for Graviton

The Porting Advisor for Graviton requires the directory parameter to point to the folder where your source code lives. If there are multiple locations, then you can run the tool as part of the script.

If no output file is supplied only standard output will be produced. The following is the output of the tool in HTML format. The first line shows the total files scanned.

  1. If no issues are found, then you’ll see an output like the following:

Results of Porting Advisor run on C++ code with 2836 files with no issues found

  1. With x86-64 specific intrinsics, such as _bswap64, you’ll see it flagged. arm64 specific intrinsics won’t be flagged. Therefore, if your code has arm64 specific intrinsics, then the porting advisor will only flag x86-64 to arm64 and not vice versa. The primary goal of the tool is determining the arm64 readiness of your code.

Porting Advisor reporting inline assembly files in C++ code

  1. The scanner typically looks for source code files only, but it can also look for assembly files with *.s extensions. An example of a file with the C++ code with inline assembly code is as follows:

Porting Advisor reporting use of intrinsics in C++ code

  1. The tool has pointed out errors such as a preprocessor error and architecture-specific intrinsic usage errors.

Porting Advisor results run on C++ code pointing out missing preprocessor macros for arm64 and x86-64 specific intrinsics

Next steps

If you don’t see any issues reported by the tool, then you are in good shape for doing the port. If no issues are reported it does not guarantee that it will work as port. Porting Advisor for Graviton is a tool used best as a helper. Regardless of the issues reported by the tool, we still highly suggest testing the application thoroughly.

As a good practice, we recommend that you use the latest version of your libraries.

Based on the issue, you can consider further actions. For Compiler intrinsic errors, we recommend studying Intel and arm64 intrinsics.

Once you’ve migrated your code and are using Gravition, you can start to look at taking steps to optimize your performance. If you’re interested in doing that please look at our Getting Started Guide.

If you run into any issues, then see our CONTRIBUTING file.

FAQs

  1. How fast is the tool? The tool is able to scan 4048 files in 1.18 seconds.

On an arm64 Based Instance:

Porting Advisor scans 4048 files in 1.18 seconds

  1. False Positives?

This tool scans all files in a source tree, regardless of whether they are included by the build system or not. Therefore, it may misreport issues in files that appear in the source tree but are excluded by the build system. Currently, the tool supports the following languages/dependencies: C/C++, Fortran, Go 1.11+, Java 8+, and Python 3+.

For example: You may have legacy code using Python version 2.7 that is in the source tree but isn’t being used. The tool will scan the code base and point out issues in that particular codebase even though you may not be using that piece of code.  To mitigate, either  remove that folder from the source code or ignore the error pointed by the tool.

  1. I see mention of Ruby and .Net in the Open source tool, but they don’t work on my tool.

Ruby and .Net haven’t been implemented yet, but please consider contributing to it and open an issue requesting support. If you need support then see our CONTRIBUTING file.

Troubleshooting

Errors that you may encounter while building the tool binary:

PyInstaller needs a shared version of libraries.

  1. Python 3.10+ not having shared libraries for use by the PyInstaller tool.

Building Porting Advisor binaries

Pyinstaller failed at building binary and suggesting building Python configure script with --enable-shared on Linux or --enable-framework on macOS

The fix for this is to build your version of Python(3.10+) with the right flags:

./configure --enable-optimizations --enable-shared

If the two flags don’t work together, try doing the build with each flag enabled sequentially.

pyinstaller tool needs python configure script with --enable-shared and enable-optimizations flag

  1. Incorrect Python version (version less than 3.10).If you aren’t on the correct version of Python:

You will get errors similar to the ones here:

Python version on host is 3.7.15 which is less than the recommended version

If you want to run the tool in an EC2 instance on Amazon Linux 2(AL2), then you could try upgrading/installing Python 3.10 as pointed out here.

If you run into any issues, then see our CONTRIBUTING file.

Trying to run Porting Advisor as script will result in Syntax errors on Python version less than 3.10

Conclusion

Porting Advisor for Graviton helps customers quantify the amount of work that is required to port an application. It accelerates your ability to transition to Graviton-based Amazon EC2 instances by reducing the iterative process of identifying and resolving source code and library dependencies.

Resources

To learn how to migrate your workloads to Graviton-based instances, see the AWS Graviton Technical Guide GitHub Repository and AWS Graviton Transition Guide. To get started with Graviton-based Amazon EC2 instances, see the AWS Management Console, AWS Command Line Interface (AWS CLI), and AWS SDKs.

Some other resources include:

How to create custom health checks for your Amazon EC2 Auto Scaling Fleet

Post Syndicated from Sheila Busser original https://aws.amazon.com/blogs/compute/how-to-create-custom-health-checks-for-your-amazon-ec2-auto-scaling-fleet/

This blog post is written by Gaurav Verma, Cloud Infrastructure Architect, Professional Services AWS.

Amazon EC2 Auto Scaling helps you maintain application availability and lets you automatically add or remove Amazon Elastic Compute Cloud (Amazon EC2) instances according to the conditions that you define. You can use dynamic and predictive scaling to scale-out and scale-in EC2 instances. Auto Scaling helps to maintain the self-healing Amazon EC2 environment for an application where Auto Scaling can use the status of Amazon EC2 health checks to determine if an instance is faulty and needs replacement. Amazon EC2 Auto Scaling provides three types of health checks, which are discussed below.

EC2 Status check: AWS provides two types of health checks for EC2 instances: System status check and instance status check. System status checks monitor the AWS system on which an instance is running. If the problem is with underlying system, AWS will fix the problem. Instance status check monitors the software and network configuration of an instance. If the instance status check fails, then you can fix the problem by following the steps in the troubleshoot instances with failed status checks documentation.

Elastic Load Balancer Health Check: Auto Scaling groups are generally connected to Elastic Load Balancers (ELB). ELB provides the application-level health check by monitoring the endpoint (a webpage, or a health page in a web application) of an application. ELB health check monitors the application and marks the instance unhealthy if there is no response from an instance in the configured time.

Custom Health Check: You can use custom health checks to mark any instance as unhealthy if the instance fails for the check you define. Custom health checks can be used to implement various user requirements, such as the presence of instance tags added upon completion of a required workflow. The user data script is executed at instance boot time, and it can perform additional investigation into whether or not the user requirements are met before confirming that the instance is ready to accept load. For example, this approach could be used to confirm that the instance was successfully integrated with other parts of a complex application stack.

In some cases, a customer may add multiple checks either in the Amazon EC2 AMI or in the boot sequence to keep the instance secure and compliant. These checks can increase the boot time for the EC2 instance, and they can reboot the EC2 instance multiple times before an instance can be marked as compliant. Therefore, in some cases, an EC2 instance boot period can take forty to fifty minutes or longer.

If an EC2 instance isn’t marked as healthy within a defined time, Auto Scaling will mark an instance unhealthy, even though the instance wasn’t yet ready for evaluation. Custom health checks can help manage these situations. You can write the Amazon EC2 user data script to perform the custom health check and force Auto Scaling to wait until the instance is truly healthy (i.e., functional, secure, and compliant).

This blog describes a method to write a custom health check. We write an Amazon EC2 user data script to perform the custom health check and automate it for future EC2 instances in the Auto Scaling group. This script can wait for an instance to successfully complete the boot process and then mark the instance as healthy.

Prerequisites

You must have an AWS Identity and Access Management (IAM) role for Amazon EC2 with an Auto Scaling policy, which has these two actions allowed for the Auto Scaling group:

autoscaling:CompleteLifecycleAction

autoscaling:RecordLifecycleActionHeartbeat

Furthermore, we use the Amazon EC2 Auto Scaling lifecycle hooks. Lifecycle hooks let you create a solutions that are aware of events in Auto Scaling instance lifecycle, and then perform a custom action on instances when the lifecycle event occurs. As mentioned previously, typically a custom health check is needed when determining the workload readiness of an instance would be longer than the usual boot time for an EC2 instance that Auto Scaling assumes. Therefore, we utilize lifecycle hooks to keep the checks running until the instance is marked healthy.

Create custom health check

Let’s look at an example where an instance can only be marked as healthy if the instance has a tag with the key “Compliance-Check” and value “Successful”. Until this tag is both (a) present and (b) carries the value “Successful”, the instance shouldn’t be marked as “InService”.

  1. Create the Auto Scaling launch template for Amazon EC2 Auto Scaling. Name your Launch template “test”. In the additional configuration for user data, use this shell script as text.

The Following script will install the AWS Command Line Interface (AWS CLI) to interact with the AWS tagging and Auto Scaling APIs. Then, the script will run the while loop until the instance has a tag with the key “Compliance-Check” and value “Successful”. Once the instance has a tag, it will mark the instance as healthy and the instance will move into the “InService” state.

#!/bin/bash
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install
#get instance id
instance=$(curl http://169.254.169.254/latest/meta-data/instance-id)

#Checking instance status
while true
do
readystatus=$(aws ec2 describe-instances --instance-ids $instance --filters "Name=tag: Compliance-Check,Values= Successful" |grep -i $instance)
if [[ $readystatus = *"InstanceId"* ]]; then
        echo $readystatus >> /home/ec2-user/user-script-output.txt
        aws autoscaling set-instance-health --instance-id $instance --health-status Healthy
        aws autoscaling complete-lifecycle-action --lifecycle-action-result CONTINUE --instance-id $instance --lifecycle-hook-name test --auto-scaling-group-name my-asg
        break 
else
	aws autoscaling set-instance-health --instance-id $instance --health-status Unhealthy
	sleep 5  
fi

done
  1. Create an Amazon EC2 Auto Scaling group using the AWS CLI with the “test” launch template that you just created and a predefined lifecycle hook. First, create a JSON file “config.json” in a system where you will run the AWS command to create the Auto Scaling group.
{
    "AutoScalingGroupName": "my-asg",
    "LaunchTemplate": {"LaunchTemplateId": "lt-1234567890abcde12"} ,
    "MinSize": 2,
    "MaxSize": 4,
    "DesiredCapacity": 2,
    "VPCZoneIdentifier": "subnet-12345678, subnet-90123456",
    "NewInstancesProtectedFromScaleIn": true,
    "LifecycleHookSpecificationList": [
        {
            "LifecycleHookName": "test",
            "LifecycleTransition": "autoscaling:EC2_INSTANCE_LAUNCHING",
            "HeartbeatTimeout": 300,
            "DefaultResult": "ABANDON"
        }
    ],
    "Tags": [
        {
            "ResourceId": "my-asg",
            "ResourceType": "auto-scaling-group",
            "Key": "Compliance-Check",
            "Value": "UnSuccessful",
            "PropagateAtLaunch": true
        }
    ]
}

To create the Auto Scaling group with the AWS CLI, you must run the following command at the same location where you saved the preceding JSON file. Make sure to replace the relevant subnets that you intend to use in the VPCZoneIdentifier.

>> aws autoscaling create-auto-scaling-group –cli-input-json file://config.json

This command will create the Auto Scaling group with a configuration defined in the JSON file. This Auto Scaling group should have two instances and a lifecycle hook called “test” with a 300 second wait period at the time of launch of an instance.

Tests

Now is the time to test the newly-created instances with a custom health check. Instances in Auto Scaling should be in the “Pending:Wait” stage, not the “InService” stage. Instances will be in this stage for approximately five minutes because we have a lifecycle hook time of 300 seconds in the config.json file.

If the workload readiness evaluation takes more than 300 seconds in your environment, then you can increase the lifecycle hook period to as long as 7200 seconds.

Change the tag value for one instance from “UnSuccessful” to “Successful”. If you’ve changed the tag within the five minutes of instances creation, then the instance should be in the “InService” state and marked as healthy.

Change Compliance-Check value from "UnSuccessful" to "Successful".

This test is a simulation of the situation where the health check of an instance depends on the tag values, and the tag values are only updated if the instance passes all of the checks as per the organization standards. Here we change the tag value manually, but in a real use case scenario, this value would be changed by the booting process when instances are marked as compliant.

Another test case could be that an instance should be marked as healthy if it’s added to the configuration management database, but not before that. For these checks, you can use the API with the curl command and look for the desired result. An example to call an API is in the above script, where it calls the AWS API to get the instance ID.

In case your custom health check script needs more than 7200 seconds, you can use this command to increase the lifecycle hook time:

>> aws autoscaling record-lifecycle-action-heartbeat –lifecycle-hook-name <lh_
name> –auto-scaling-group-name <asg_name> –instance-id <instance_id>

This command will give you the extra time equal to the time that you have configured in the life cycle hook.

Cleanup

Once you successfully test the solution, to avoid ongoing charges for resources you created, you should delete the resources.

To delete the Amazon EC2 Auto Scaling group, run the following command:

aws autoscaling delete-auto-scaling-group --auto-scaling-group-name my-asg

To delete the launch template, run the following command:

aws ec2 delete-launch-template --launch-template-id lt-1234567890abcde12

Delete the role and policy as well if you no longer need it.

Conclusion

EC2 Auto Scaling custom health checks are useful when system or instance health is insufficient, and you want instances to be marked as healthy only after additional checks. Typically, because of these different checks, the Amazon EC2 boot period can be longer than usual, and this may impact the scale-out process when an application needs more resources.

You can start by exploring EC2 Auto Scaling warm pools for these environments. You can keep the healthy marked instances in the warm pool in the Stopped stage. Then, these instances can be brought into the main pool at the time of scale-out without spending time on the boot process and lengthy health check. If you enable scale-in protection, then these healthy instances can move back to the warm pool at the time of scale-in rather than being terminated altogether.

Scaling an ASG using target tracking with a dynamic SQS target

Post Syndicated from Sheila Busser original https://aws.amazon.com/blogs/compute/scaling-an-asg-using-target-tracking-with-a-dynamic-sqs-target/

This blog post is written by Wassim Benhallam, Sr Cloud Application Architect AWS WWCO ProServe, and Rajesh Kesaraju, Sr. Specialist Solution Architect, EC2 Flexible Compute.

Scaling an Amazon EC2 Auto Scaling group based on Amazon Simple Queue Service (Amazon SQS) is a commonly used design pattern in decoupled applications. For example, an EC2 Auto Scaling Group can be used as a worker tier to offload the processing of audio files, images, or other files sent to the queue from an upstream tier (e.g., web tier). For latency-sensitive applications, AWS guidance describes a common pattern that allows an Auto Scaling group to scale in response to the backlog of an Amazon SQS queue while accounting for the average message processing duration (MPD) and the application’s desired latency.

This post builds on that guidance to focus on latency-sensitive applications where the MPD varies over time. Specifically, we demonstrate how to dynamically update the target value of the Auto Scaling group’s target tracking policy based on observed changes in the MPD. We also cover the utilization of Amazon EC2 Spot instances, mixed instance policies, and attribute-based instance selection in the Auto Scaling Groups as well as best practice implementation to achieve greater cost savings.

The challenge

The key challenge that this post addresses is applications that fail to honor their acceptable/target latency in situations where the MPD varies over time. Latency refers here to the time required for any queue message to be consumed and fully processed.

Consider the example of a customer using a worker tier to process image files (e.g., resizing, rescaling, or transformation) uploaded by users within a target latency of 100 seconds. The worker tier consists of an Auto Scaling group configured with a target tracking policy. To achieve the target latency mentioned previously, the customer assumes that each image can be processed in one second, and configures the target value of the scaling policy so that the average image backlog per instance is maintained at approximately 100 images.

In the first week, the customer submits 1000 images to the Amazon SQS queue for processing, each of which takes one second of processing time. Therefore, the Auto Scaling group scales to 10 instances, each of which processes 100 images in 100s, thereby honoring the target latency of 100s.

In the second week, the customer submits 1000 slightly larger images for processing. Since an image’s processing duration scales with its size, each image takes two seconds to process. As in the first week, the Auto Scaling group scales to 10 instances, but this time each instance processes 100 images in 200s, which is twice as long as was needed in the first round. As a result, the application fails to process the latter images within its acceptable latency.

Therefore, the challenge is common to any latency-sensitive application where the MPD is subject to change. Applications where the processing duration scales with input data size are particularly vulnerable to this problem. This includes image processing, document processing, computational jobs, and others.

Solution overview

Before we dive into the solution, let’s briefly review the target tracking policy’s scaling metric and its corresponding target value. A target tracking scaling policy works by adjusting the capacity to keep a scaling metric at, or close to, the specified target value. When scaling in response to an Amazon SQS backlog, it’s good practice to use a scaling metric known as the Backlog Per Instance (BPI) and a target value based on the acceptable BPI. These are computed as follows:

BPI equation, Saling metric and target value.

Given the acceptable BPI equation, a longer MPD requires us to use a smaller target value if we are to process these messages in the same acceptable latency, and vice versa. Therefore, the solution we propose here works by monitoring the average MPD over time and dynamically adjusting the target value of the Auto Scaling group’s target tracking policy (acceptable BPI) based on the observed changes in the MPD. This allows the scaling policy to adapt to variations in the average MPD over time, and thus enables the application to honor its acceptable latency.

Solution architecture

To demonstrate how the approach above can be implemented in practice, we put together an example architecture highlighting the services involved (see the following figure). We also provide an automated deployment solution for this architecture using an AWS Serverless Application Model (AWS SAM) template and some Python code (repository link). The repository also includes a README file with detailed instructions that you can follow to deploy the solution. The AWS SAM template deploys several resources, including an autoscaling group, launch template, target tracking scaling policy, an Amazon SQS queue, and a few AWS Lambda functions that serve various functions described as follows.

The Amazon SQS queue is used to accumulate messages intended for processing, while the Auto Scaling group instances are responsible for polling the queue and processing any messages received. To do this, a launch template defines a bootstrap script that allows the group’s instances to download and execute a Python code when first launched. The Python code consumes messages from the Amazon SQS queue and simulates their processing by sleeping for the MPD duration specified in the message body. After processing each message, the instance publishes the MPD as an Amazon CloudWatch metric (see the following figure).

Figure 1: Architecture diagram showing the components deployed by the AWS SAM template. These include an SQS queue, an Auto Scaling group responsible for polling and processing queue messages, a Lambda function that regularly updates the BPI CloudWatch metric, and a “Target Setter” Lambda function that regularly updates the Auto Scaling group’s target tracking scaling policy.

Figure 1: Architecture diagram showing the components deployed by the AWS SAM template.

To enable scaling, the Auto Scaling group is configured with a target tracking scaling policy that specifies BPI as the scaling metric and with an initial target value provided by the user.

The BPI CloudWatch metric is calculated and published by the “Metric-Publisher” Lambda function which is invoked every one minute using an Amazon EventBridge rate expression. To calculate BPI, the Lambda function simply takes the ratio of the number of messages visible in the Amazon SQS queue by the total number of in-service instances in the Auto Scaling group, as shown in equation (2) above.

On the other hand, the scaling policy’s target value is updated by the “Target-Setter” Lambda function, which is invoked every 30 minutes using another EventBridge rate expression. To calculate the new target value, the Lambda function simply takes the ratio of the user-defined acceptable latency value by the current average MPD queried from the corresponding CloudWatch metric, as shown in the previous equation (1).

Finally, to help you quickly test this solution, a Lambda function “Testing Lambda” is also provided and can be used to send messages to the Amazon SQS queue with a processing duration of your choice. This is specified within each message’s body. You can invoke this Lambda function with different MPDs (by modifying the corresponding environment variable) to verify how the Auto Scaling group scales in response. A CloudWatch dashboard is also deployed to enable you to track key scaling metrics through time. These include the number of messages visible in the queue, the number of in-service instances in the Auto Scaling group, the MPD, and BPI vs acceptable BPI.

Solution testing

To demonstrate the solution in action and its impact on application latency, we conducted two tests that you can reproduce by following the instructions described in the “Testing” section of the repository’s README file (repository link). In both tests, we assume a hypothetical application with a target latency of 300s. We also modified the invocation frequency of the “Target Setter” Lambda function to one minute to quickly assess the impact of target value changes. In both tests, we submit 50 messages to the Amazon SQS queue through the provided helper lambda. An MPD of 25s and 50s was used for the first and second test, respectively. The provided CloudWatch dashboard shows that the ASG scales to a total of four instances in the first test, and eight instances in the second (see the following figure). See README file for a detailed description of how various metrics evolve over time.

Comparison of Tests 1 and 2

Since Test 2 messages take twice as long to process, the Auto Scaling group launched twice as many instances to attempt to process all of the messages in the same amount of time as Test 1 (latency). The following figure shows that the total time to process all 50 messages in Test 1 was 9 mins vs 10 mins in Test 2. In contrast, if we were to use a static/fixed acceptable BPI of 12, then a total of four instances would have been operational in Test 2, thereby requiring double the time of Test 1 (~20 minutes) to process all of the messages. This demonstrates the value of using a dynamic scaling target when processing messages from Amazon SQS queues, especially in circumstances where the MPD is prone to vary with time.

Figure 2: CloudWatch dashboard showing Auto Scaling group scaling test results (Test 1 and 2). Although Test 2 messages require double the MPD of Test 1 messages, the Auto Scaling group processed Test 2 messages in the same amount of time as Test 1 by launching twice as many instances.

Figure 2: CloudWatch dashboard showing Auto Scaling group scaling test results (Test 1 and 2).

Recommended best practices for Auto Scaling groups

This section highlights a few key best practices that we recommend adopting when deploying and working with Auto Scaling groups.

Reducing cost using EC2 Spot instances

Amazon SQS helps build loosely coupled application architectures, while providing reliable asynchronous communication between the various layers/components of an application. If a worker node fails to process a message within the Amazon SQS message visibility time-out, then the message is returned to the queue and another worker node can pick up and process that message. This makes Amazon SQS-backed applications fault-tolerant by design and thus a great fit for EC2 Spot instances. EC2 spot instances are spare compute capacity in the AWS cloud that is available to you at steep discounts as compared to On-Demand prices.

Maximizing capacity using attribute-based instance selection

With the recently released attribute-based instance selection feature, you can define infrastructure requirements based on application needs such as vCPU, RAM, and processor family (e.g., x86, ARM). This removes the need to define specific instances in your Auto Scaling group configuration, and it eliminates the burden of identifying the correct instance families and sizes. In addition, newly released instance types will be automatically considered if they fit your requirements. Attribute-based instance selection lets you tap into hundreds of different EC2 instance pools, which increases the chance of getting EC2 (Spot/On-demand) instances. When using attribute-based instance selection with the capacity optimized allocation strategy, Amazon EC2 allocates instances from deeper Spot capacity pools, thereby further reducing the chance of Spot interruption.

The following sample configuration creates an Auto Scaling group with attribute-based instance selection:

AutoScalingGroupName: 'my-asg' # [REQUIRED] 
MixedInstancesPolicy:
  LaunchTemplate:
    LaunchTemplateSpecification:
      LaunchTemplateId: 'lt-0537239d9aef10a77'
    Overrides:
    - InstanceRequirements:
        VCpuCount: # [REQUIRED] 
          Min: 2
          Max: 4
        MemoryMiB: # [REQUIRED] 
          Min: 2048
  InstancesDistribution:
    SpotAllocationStrategy: 'capacity-optimized'
MinSize: 0 # [REQUIRED] 
MaxSize: 100 # [REQUIRED] 
DesiredCapacity: 4
VPCZoneIdentifier: 'subnet-e76a128a,subnet-e66a128b,subnet-e16a128c'

Conclusion

As can be seen from the test results, this approach demonstrates how an Auto Scaling group can honor a user-provided acceptable latency constraint while accomodating variations in the MPD over time. This is possible because the average MPD is monitored and regularly updated as a CloudWatch metric. In turn, this is continously used to update the target value of the group’s target tracking policy. Moreover, we have covered additional Auto Scaling group best practices suitable for this use case, including the use of Spot instances to reduce costs and attribute-based instance selection to simplify the selection of relevant instance types.

For more information on scaling options for Auto Scaling groups, visit the Amazon EC2 Auto Scaling documentation page and the SQS-based scaling guide.

How to choose between CoIP and Direct VPC routing modes on AWS Outposts rack

Post Syndicated from Sheila Busser original https://aws.amazon.com/blogs/compute/how-to-choose-between-coip-and-direct-vpc-routing-modes-on-aws-outposts-rack/

This blog post is written by Sumit Menaria, Senior Hybrid Solutions Architect AWS WWSO Core Services.

AWS Outposts Rack is a fully-managed service that extends AWS infrastructure, services, APIs, and tools to customer premises. By providing local access to AWS managed infrastructure and services, Outposts rack enables customers to build and run applications on premises using the same programming interfaces as in AWS Regions, while using local compute and storage resources for low latency, local data processing, and data residency needs.

There are various data sources on premises that you might want to connect from your Outpost. These sources can include field devices, on-premises databases, mainframes, storage arrays, or end users. Each Outpost supports a single Local Gateway (LGW) construct, which enables connectivity from your Outpost subnets to an on-premises network. Note that this post is specific to Outposts racks and a different method of local communication is used for AWS Outposts servers.

Two different options for facilitating communication between your Outpost based resources and on-premises network: Direct VPC routing and customer-owned IP pool. Both of these are mutually exclusive options, and routing works differently based on your choice of the mode. The two modes are the attributes of the LGW route table that your Outpost subnets’ VPC is associated with, which specifies the communication mode for the Outpost subnets.

Direct VPC routing mode

Direct VPC routing uses the private IP address of the instances in the VPC CIDR block to facilitate communication with your on-premises network. These addresses are advertised to your on-premises network with Border Gateway Protocol (BGP). Advertisement via BGP is only for the private IP addresses that belong to the subnets on your Outpost and have a route pointing to the LGW via the subnet’s route table. This type of routing is the default mode for Outposts Rack. In this mode, the LGW doesn’t perform Network Address Translation (NAT) for instances. Furthermore, you don’t have to assign an Elastic IP address to your Amazon Elastic Compute Cloud (Amazon EC2) instance from a (CoIP) to enable communication with your on-premises resources.

A diagram showing how an EC2 instance on an Outpost communicates with on-premises network using direct VPC routing mode

In this diagram, when the instance Y wants to communicate with an on-premises server, it traverses the LGW and can talk to the on-premises server using its source address (10.0.1.11) in the Subnet CIDR range (10.0.1.0/24) that is advertised over BGP from the LGW to the Customer Network Device. Similarly when the on-premises server wants to initiate communication with the Outpost based EC2 instance, it uses the instance’s private IP address (10.0.1.11) as the destination IP address to set up the connection.

CoIP mode

Utilizing CoIP mode means that you must provide a separate IP address range from your on-premises IP space for AWS to create an address pool, known as a CoIP. With CoIP, when an Outpost based resources, such as EC2 instances, Application Load Balancer (ALB), or Amazon Relational Database Service (Amazon RDS) instances, need to communicate to your on-premises network, the Local Gateway will perform 1:1 NAT from the resource’s private IP address from the Outpost subnet range to an IP address from the CoIP pool. The subnet-to-CoIP address mapping is done by assigning an Elastic IP (EIP) from the CoIP address range allocated for resources such as EC2 instances. To enable the communication with the CoIP pool from the on-premises network, and then the LGW advertises the CoIP pool through BGP over its peering with the Customer Network Device.

A diagram showing how an EC2 instance on an Outpost communicates with on-premises network using CoIP mode

In this diagram, when the instance Y wants to talk to an on-premises server, the traffic traverses the LGW and the source IP address (10.0.1.11) of the instance gets translated to an IP address (192.168.0.11) in the CoIP range that is associated with the instance. Similarly, when the on-premises server initiates the communication, the request will be sent with the CoIP address (192.168.0.11) of the instance as the destination IP address. This will be changed to the instance’s private IP address (10.0.1.11) via NAT at the LGW. The CoIP pool (192.168.0.0/26) is advertised via BGP to the Customer Network Device to provide the route to the on-premises environment for reaching the Outpost based resources.

When to choose CoIP routing mode

CoIP is particularly useful when you want to isolate your Outpost based workloads from the on-premises infrastructure and only need specific resources on the Outpost to be able to communicate to the on-premises infrastructure. This is useful in situations where large enterprise networks have hundreds of IP pools allocated and there is a high chance of overlap between IP addresses allocated to Outpost based VPCs/Subnets and those allocated to on-premises infrastructure. Furthermore, CoIP can act as another layer of security, as you may choose to allocate the CoIP addresses to only the resources which must communicate with the on-premises network. Then you can allocate for the rest of them using the subnet private IP address range for communication within the Outpost or Region based resources.

This means that you don’t need to have the number of IPs in the CoIP pool be equal to the number of resources on your Outpost. For example, you may choose to configure a /26 CoIP range and a /22 pool for subnets to meet your workload requirements.

CoIP mode can also be useful when using an external ALB on Outpost and you want to make it routable through the local internet connectivity. By using a smaller internet routable CoIP address range assignment for your external ALB, you can route traffic to the ALB on the Outpost without needing to traverse through the internet gateway (IGW) in the parent Region.

When to choose Direct VPC routing mode

You can choose Direct VPC routing if you don’t want the operational overhead of managing the additional IP pools for NAT between your Outpost based resources and on-premises network. There are also few applications which may not work well if there is an NAT of IPs between the two endpoints communicating with each other. Some examples you may see are Active Directory communication with on-premises based servers, or iSCSI mount of your instances as an additional storage to on-premises Storage Area Network (SAN). These applications may not work or may need additional tuning if they encounter NATed IP addresses between an Outpost based client on an EC2 instance and on-premises based server for a two-way communication.

When Direct VPC routing mode is used, multiple VPCs can be associated to an Outpost LGW route table, and the Outpost subnets with the LGW as the route target, are automatically advertised to the on-premises network through BGP. Therefore, you must make sure that appropriate IP planning is in place to avoid any overlap of the Outpost VPC/Subnet IP range with the on-premises IP range, as they are directly advertised from LGW toward the Customer Network Device. Having overlapping IP subnets in your network can lead to undesired effects on your application connectivity and you must pay special attention when allocating IP pools for your on-premises and Outpost based VPC address space. You can use Amazon VPC IP Address Manager (IPAM) to plan the IP space of your VPCs and CoIP pools, as well as add on-premises based IP Pools using manual allocation.

Conclusion

You can select either Direct VPC routing or CoIP mode for routing through an Outpost Local Gateway. Since this selection affects the routing for all of the subnets on your Outpost associated with the LGW route table, it should be selected based on your workload requirements and existing IP infrastructure planning. You can also change the LGW route table mode at a later stage. However, that involves network disruption and the creation of a new LGW route table. To learn more about Outposts Racks routing, visit the LGW Route table documentation.

Quick Restoration through Replacing the Root Volumes of Amazon EC2 instances

Post Syndicated from Sheila Busser original https://aws.amazon.com/blogs/compute/quick-restoration-through-replacing-the-root-volumes-of-amazon-ec2/

This blog post is written by Katja-Maja Krödel, IoT Specialist Solutions Architect, and Benjamin Meyer, Senior Solutions Architect, Game Tech.

Customers use Amazon Elastic Compute Cloud (Amazon EC2) instances to develop, deploy, and test applications. To use those instances most effectively, customers have expressed the need to set back their instance to a previous state within minutes or even seconds. They want to find a quick and automated way to manage setting back their instances at scale.

The feature of replacing Root Volumes of Amazon EC2 instances enables customers to replace the root volumes of running EC2 instances to a specific snapshot or its launch state. Without stopping the instance, this allows customers to fix issues while retaining the instance store data, networking, and AWS Identity and Access Management (IAM) configuration. Customers can resume their operations with their instance store data intact. This works for all virtualized EC2 instances and bare metal EC2 Mac instances today.

In this post, we show you how to design your architecture for automated Root Volume Replacement using this Amazon EC2 feature. We start with the automated snapshot creation, continue with automatically replacing the root volume, and finish with how to keep your environment clean after your replacement job succeeds.

What is Root Volume Replacement?

Amazon EC2 enables customers to replace the root Amazon Elastic Block Store (Amazon EBS) volume for an instance without stopping the instance to which it’s attached. An Amazon EBS root volume is replaced to the launch state, or any snapshot taken from the EBS volume itself. This allows issues to be fixed, such as root volume corruption or guest OS networking errors. Replacing the root volume of an instance includes the following steps:

  • A new EBS volume is created from a previously taken snapshot or the launch state
  • Reboot of the instance
  • While rebooting, the current root volume is detached and the new root volume is attached

The previous EBS root volume isn’t deleted and can be attached to an instance for later investigation of the volume. If replacing to a different state of the EBS than the launch state, then a snapshot of the current root volume is used.

An example use case is a continuous integration/continuous deployment (CI/CD) System that builds on EC2 instances to build artifacts. Within this system, you could alter the installed tools on the host and may cause failing builds on the same machine. To prevent any unclean builds, the introduced architecture is used to clean up the machine by replacing the root volume to a previously known good state. This is especially interesting for EC2 Mac Instances, as their Dedicated Host won’t undergo the scrubbing process, and the instance is more quickly restored than launching a fresh EC2 Mac instance on the same host.

Overview

The feature of replacing Root Volumes was introduced in April 2021 and has just been <TBD> extended to work for Bare Metal EC2 Mac Instances. This means that EC2 Mac Instances are included. If you want to reset an EC2 instance to a previously known good state, then you can create Snapshots of your EBS volumes. To reset the root volume to its launch state, no snapshot is needed. For non-root volumes, you can use these Snapshots to create new EBS volumes, and then attach those to your instance as well as detach them. To automate the process of replacing your root volume not only once, but also in a repeatable manner, we’re introducing you to an architecture that can fully-automate this process.

In the case that you use a snapshot to create a new root volume, you must take a new snapshot of that volume to be able to get back to that state later on. You can’t use a snapshot of a different volume to restore to, which is the reason that the architecture includes the automatic snapshot creation of a fresh root volume.

The architecture is built in three steps:

  1. Automation of Snapshot Creation for new EBS volumes
  2. Automation of replacing your Root Volume
  3. Preparation of the environment for the next Root Volume Replacement

The following diagram illustrates the architecture of this solution.

 Architecture of the automated creation of Root Volumes for Amazon EC2 Instances

In the next sections, we go through these concepts to design the automatic Root Volume Replacement Task.

Automation of Snapshot Creation for new EBS volumes

Architecture of the automated creation of Snapshots of new EBS Volumes.

The figure above illustrates the architecture for automatically creating a snapshot of an existing EBS volume. In this architecture, we focus on the automation of creating a snapshot whenever a new EBS root volume is created.

Amazon EventBridge is used to invoke an AWS Lambda function on an emitted createVolume event. For automated reaction to the event, you can add a rule to the EventBridge which will forward the event to an AWS Lambda function whenever a new EBS volume is created. The rule within EventBridge looks like this:

{
  "source": ["aws.ec2"],
  "detail-type": ["EBS Volume Notification"],
  "detail": {
    "event": ["createVolume"]
  }
}

An example event is emitted when an EBS root volume is created, which will then invoke the Lambda function to look like this:

{
   "version": "0",
   "id": "01234567-0123-0123-0123-012345678901",
   "detail-type": "EBS Volume Notification",
   "source": "aws.ec2",
   "account": "012345678901",
   "time": "yyyy-mm-ddThh:mm:ssZ",
   "region": "us-east-1",
   "resources": [
      "arn:aws:ec2:us-east-1:012345678901:volume/vol-01234567"
   ],
   "detail": {
      "result": "available",
      "cause": "",
      "event": "createVolume",
      "request-id": "01234567-0123-0123-0123-0123456789ab"
   }
}

The code of the function uses the resource ARN within the received event and requests resource details about the EBS volume from the Amazon EC2 APIs. Since the event doesn’t include information if it’s a root volume, then you must verify this using the Amazon EC2 API.

The following is a summary of the tasks of the Lambda function:

  1. Extract the EBS ARN from the EventBridge Event
  2. Verify that it’s a root volume of an EC2 Instance
  3. Call the Amazon EC2 API create-snapshot to create a snapshot of the root volume and add a tag replace-snapshot=true

Then, the tag is used to clean up the environment and get rid of snapshots that aren’t needed.

As an alternative, you can emit your own event to EventBridge. This can be used to automatically create snapshots to which you can restore your volume. Instead of reacting to the createVolume event, you can use a customized approach for this architecture.

Automation of replacing your Root Volume

Architecture of the automated creation of Snapshots of new EBS Volumes.

The figure above illustrates the procedure of replacing the EBS root volume. It starts with the event, which is created through the AWS Command Line Interface (AWS CLI), console, or usage of the API. This leads to creating a new volume from a snapshot or using the initial launch state. The EC2 instance is rebooted, and during that time the old root volume is detached and a new volume gets attached as the root volume.

To invoke the create-replace-root-volume-task, you can call the Amazon EC2 API with the following AWS CLI command:

aws ec2 create-replace-root-volume-task --instance-id <value> --snapshot <value> --tag-specifications ResourceType=string,Tags=[{Key=replaced-volume,Value=true}]

If you want to restore to launch state, then omit the --snapshot parameter:

aws ec2 create-replace-root-volume-task --instance-id <value> --tag-specifications ResourceType=string,Tags=[{Key=delete-volume,Value=true}]

After running this command, AWS will create a new EBS volume, add the tag to the old EBS replaced-volume=true, restart your instance, and attach the new volume to the instance as the root volume. The tag is used later to detect old root volumes and clean up the environment.

If this is combined with the earlier explained automation, then the automation will immediately take a snapshot from the new EBS volume. A restore operation can only be done to a snapshot of the current EBS root volume. Therefore, if no snapshot is taken from the freshly restored EBS volume, then no restore operation is possible except the restore to launch state.

Preparation of the Environment for the next Root Volume Replacement

After the task is completed, the old root volume isn’t removed. Additionally, snapshots of previous root volumes can’t be used to restore current root volumes. To clean up your environment, you can schedule a Lambda function which does the following steps:

  • Delete detached EBS volumes with the tag delete-volume=true
  • Delete snapshots with the tag replace-snapshot=true, which aren’t associated with an existing EBS volume

Conclusion

In this post, we described an architecture to quickly restore EC2 instances through Root Volume Replacement. The feature of replacing Root Volumes of Amazon EC2 instances, now including Bare Metal EC2 Mac instances, enables customers to replace the root volumes of running EC2 instances to a specific snapshot or its launch state. Customers can resume their operations with their instance store data intact. We’ve split the process of doing this in an automated and quick manner into three steps: Create a snapshot, run the replacement task, and reset your environment to be prepared for a following replacement task. If you want to learn more about this feature, then see the Announcement of replacing Root Volumes, as well as the documentation for this feature. <TBD Announcement Bare Metal>

Automating your workload deployments in AWS Local Zones

Post Syndicated from Sheila Busser original https://aws.amazon.com/blogs/compute/automating-your-workload-deployments-in-aws-local-zones/

This blog post is written by Enrico Liguori, SA – Solutions Builder , WWPS Solution Architecture.

AWS Local Zones are a type of infrastructure deployment that places compute, storage,and other select AWS services close to large population and industry centers.

We now have a total of 32 Local Zones; 15 outside of the US (Bangkok, Buenos Aires, Copenhagen, Delhi, Hamburg, Helsinki, Kolkata, Lagos, Lima, Muscat, Perth, Querétaro, Santiago, Taipei, and Warsaw) and 17 in the US. We will continue to launch Local Zones in 21 metro areas in 18 countries, including Australia, Austria, Belgium, Brazil, Canada, Colombia, Czech Republic, Germany, Greece, India, Kenya, Netherlands, New Zealand, Norway, Philippines, Portugal, South Africa, and Vietnam.

Customers using AWS Local Zones can provision the infrastructure and services needed to host their workloads with the same APIs and tools for automation that they use in the AWS Region, included the AWS Cloud Development Kit (AWS CDK).

The AWS CDK is an open source software development framework to model and provision your cloud application resources using familiar programming languages, including TypeScript, JavaScript, Python, C#, and Java. For the solution in this post, we use Python.

Overview

In this post we demonstrate how to:

  1. Programmatically enable the Local Zone of your interest.
  2. Explore the supported APIs to check the types of Amazon Elastic Compute Cloud (Amazon EC2) instances available in a specific Local Zone and get their associated price per hour;
  3. Deploy a simple WordPress application in the Local Zone through AWS CDK.

Prerequisites

To be able to try the examples provided in this post, you must configure:

  1. AWS Command Line Interface (AWS CLI)
  2. Python version 3.8 or above
  3. AWS CDK

Enabling a Local Zone programmatically

To get started with Local Zones, you must first enable the Local Zone that you plan to use in your AWS account. In this tutorial, you can learn how to select the Local Zone that provides the lowest latency to your site and understand how to opt into the Local Zone from the AWS Management Console.

If you prefer to interact with AWS APIs programmatically, then you can enable the Local Zone of your interest by calling the ModifyAvailabilityZoneGroup API through the AWS CLI or one of the supported AWS SDKs.

The following examples show how to opt into the Atlanta Local Zone through the AWS CLI and through the Python SDK:

AWS CLI:

aws ec2 modify-availability-zone-group \
  --region us-east-1 \
  --group-name us-east-1-atl-1 \
  --opt-in-status opted-in

Python SDK:

ec2 = boto3.client('ec2', config=Config(region_name='us-east-1'))
response = ec2.modify_availability_zone_group(
                  GroupName='us-east-1-atl-1',
                  OptInStatus='opted-in'
           )

The opt in process takes approximately five minutes to complete. After this time, you can confirm the opt in status using the DescribeAvailabilityZones API.

From the AWS CLI, you can check the enabled Local Zones with:

aws ec2 describe-availability-zones --region us-east-1

Or, once again, we can use one of the supported SDKs. Here is an example using Phyton:

ec2 = boto3.client('ec2', config=Config(region_name='us-east-1'))
response = ec2.describe_availability_zones()

In both cases, a JSON object similar to the following, will be returned:

{
"State": "available",
"OptInStatus": "opted-in",
"Messages": [],
"RegionName": "us-east-1",
"ZoneName": "us-east-1-atl-1a",
"ZoneId": "use1-atl1-az1",
"GroupName": "us-east-1-atl-1",
"NetworkBorderGroup": "us-east-1-atl-1",
"ZoneType": "local-zone",
"ParentZoneName": "us-east-1d",
"ParentZoneId": "use1-az4"
}

The OptInStatus confirms that we successful enabled the Atlanta Local Zone and that we can now deploy resources in it.

How to check available EC2 instances in Local Zones

The set of instance types available in a Local Zone might change from one Local Zone to another. This means that before starting deploying resources, it’s a good practice to check which instance types are supported in the Local Zone.

After enabling the Local Zone, we can programmatically check the instance types that are available by using DescribeInstanceTypeOfferings. To use the API with Local Zones, we must pass availability-zone as the value of the LocationType parameter and use a Filter object to select the correct Local Zone that we want to check. The resulting AWS CLI command will look like the following example:

aws ec2 describe-instance-type-offerings --location-type "availability-zone" --filters 
Name=location,Values=us-east-1-atl-1a --region us-east-1

Using Python SDK:

ec2 = boto3.client('ec2', config=Config(region_name='us-east-1'))
response = ec2.describe_instance_type_offerings(
      LocationType='availability-zone',
      Filters=[
            {
            'Name': 'location',
            'Values': ['us-east-1-atl-1a']
            }
            ]
      )

How to check prices of EC2 instances in Local Zones

EC2 instances and other AWS resources in Local Zones will have different prices than in the parent Region. Check the pricing page for the complete list of pricing options and associated price-per-hour.

To access the pricing list programmatically, we can use the GetProducts API. The API returns the list of pricing options available for the AWS service specified in the ServiceCode parameter. We also recommend defining Filters to restrict the number of results returned. For example, to retrieve the On-Demand pricing list of a T3 Medium instance in Atlanta from the AWS CLI, we can use the following:

aws pricing get-products --format-version aws_v1 --service-code AmazonEC2 --region us-east-1 \
--filters 'Type=TERM_MATCH,Field=instanceType,Value=t3.medium' \
--filters 'Type=TERM_MATCH,Field=location,Value=US East (Atlanta)'

Similarly, with Python SDK we can use the following:

pricing = boto3.client('pricing',config=Config(region_name="us-east-1")) response = pricing.get_products(
         ServiceCode='AmazonEC2',
         Filters= [
          {
          "Type": "TERM_MATCH",
          "Field": "instanceType",
          "Value": "t3.medium"
          },
          {
          "Type": "TERM_MATCH",
          "Field": "regionCode",
          "Value": "us-east-1-atl-1"
          }
        ],
         FormatVersion='aws_v1',
)

Note that the Region specified in the CLI command and in Boto3, is the location of the AWS Price List service API endpoint. This API is available only in us-east-1 and ap-south-1 Regions.

Deploying WordPress in Local Zones using AWS CDK

In this section, we see how to use the AWS CDK and Python to deploy a simple non-production WordPress installation in a Local Zone.

Architecture overview

architecture overview

The AWS CDK stack will deploy a new standard Amazon Virtual Private Cloud (Amazon VPC) in the parent Region (us-east-1) that will be extended to the Local Zone. This creates two subnets associated with the Atlanta Local Zone: a public subnet to expose resources on the Internet, and a private subnet to host the application and database layers. Review the AWS public documentation for a definition of public and private subnets in a VPC.

The application architecture is made of the following:

  • A front-end in the private subnet where a WordPress application is installed, through a User Data script, in a type T3 medium EC2 instance.
  • A back-end in the private subnet where MySQL database is installed, through a User Data script, in a type T3 medium EC2 instance.
  • An Application Load Balancer (ALB) in the public subnet that will act as the entry point for the application.
  • A NAT instance to allow resources in the private subnet to initiate traffic to the Internet.

Clone the sample code from the AWS CDK examples repository

We can clone the AWS CDK code hosted on GitHub with:

$ git clone https://github.com/aws-samples/aws-cdk-examples.git

Then navigate to the directory aws-cdk-examples/python/vpc-ec2-local-zones using the following:

$ cd aws-cdk-examples/python/vpc-ec2-local-zones

Before starting the provisioning, let’s look at the code in the following sections.

Networking infrastructure

The networking infrastructure is usually the first building block that we must define. In AWS CDK, this can be done using the VPC construct:

import aws_cdk.aws_ec2 as ec2
vpc = ec2.Vpc(
            self,
            "Vpc",
            cidr=”172.31.100.0/24”,
            subnet_configuration=[
                ec2.SubnetConfiguration(
                    name = 'Public-Subnet',
                    subnet_type = ec2.SubnetType.PUBLIC,
                    cidr_mask = 26,
                ),
                ec2.SubnetConfiguration(
                    name = 'Private-Subnet',
                    subnet_type = ec2.SubnetType.PRIVATE_ISOLATED,
                    cidr_mask = 26,
                ),
            ]      
        )

Together with the VPC CIDR (i.e. 172.31.100.0/24), we define also the subnets configuration through the subnet_configuration parameter.

Note that in the subnet definitions above there is no specification of the Availability Zone or Local Zone that we want to associate them with. We can define this setting at the VPC level, overwriting the availability_zones method as shown here:

@property
def availability_zones(self):
   return [“us-east-1-atl-1a”]

As an alternative, you can use a Local Zone Name as the value of the availability_zones parameter in each Subnet definition. For a complete list of Local Zone Names, check out the Zone Names on the Local Zones Locations page.

Specifying ec2.SubnetType.PUBLIC  in the subnet_type parameter, AWS CDK  automatically creates an Internet Gateway (IGW) associated with our VPC and a default route in its routing table pointing to the IGW. With this setup, the Internet traffic will go directly to the IGW in the Local Zone without going through the parent AWS Region. For other connectivity options, check the AWS Local Zone User Guide.

The last piece of our networking infrastructure is a self-managed NAT instance. This will allow instances in the private subnet to communicate with services outside of the VPC and simultaneously prevent them from receiving unsolicited connection requests.

We can implement the best practices for NAT instances included in the AWS public documentation using a combination of parameters of the Instance construct, as shown here:

nat = ec2.Instance(self, "NATInstanceInLZ",
                 vpc=vpc,
                 security_group=self.create_nat_SG(vpc),
                 instance_type=ec2.InstanceType.of(ec2.InstanceClass.T3, ec2.InstanceSize.MEDIUM),
                 machine_image=ec2.MachineImage.latest_amazon_linux(),
                 user_data=ec2.UserData.custom(user_data),
                 vpc_subnets=ec2.SubnetSelection(availability_zones=[“us-east-1-atl-1a”], subnet_type=ec2.SubnetType.PUBLIC),
                 source_dest_check=False
                )

In the previous code example, we specify the following as parameters:

The final required step is to update the route table of the private subnet with the following:

priv_subnet.add_route("DefRouteToNAT",
            router_id=nat_instance.instance_id,
            router_type=ec2.RouterType.INSTANCE,
            destination_cidr_block="0.0.0.0/0",
            enables_internet_connectivity=True)

The application stack

The other resources, including the front-end instance managed by AutoScaling, the back-end instance, and ALB are deployed using the standard AWS CDK constructs. Note that the ALB service is only available in some Local Zones. If you plan to use a Local Zone where ALB isn’t supported, then you must deploy a load balancer on a self-managed EC2 instance, or use a load balancer available in AWS Marketplace.

Stack deployment

Next, let’s go through the AWS CDK bootstrapping process. This is required only for the first time that we use AWS CDK in a specific AWS environment (an AWS environment is a combination of an AWS account and Region).

$ cdk bootstrap

Now we can deploy the stack with the following:

$ cdk deploy

After the deployment is completed, we can connect to the application with a browser using the URL returned in the output of the cdk deploy command:

terminal screenshot

The WordPress install wizard will be displayed in the browser, thereby confirming that the deployment worked as expected:

The WordPress install wizard

Note that in this post we use the Local Zone in Atlanta. Therefore, we must deploy the stack in its parent Region, US East (N. Virginia). To select the Region used by the stack, configure the AWS CLI default profile.

Cleanup

To terminate the resources that we created in this post, you can simply run the following:

$ cdk destroy

Conclusion

In this post, we demonstrated how to interact programmatically with the different AWS APIs available for Local Zones. Furthermore, we deployed a simple WordPress application in the Atlanta Local Zone after analyzing the AWS CDK code used for the deployment.

We encourage you to try the examples provided in this post and get familiar with the programmatic configuration and deployment of resources in a Local Zone.

Adopt Recommendations and Monitor Predictive Scaling for Optimal Compute Capacity

Post Syndicated from Sheila Busser original https://aws.amazon.com/blogs/compute/evaluating-predictive-scaling-for-amazon-ec2-capacity-optimization/

This post is written by Ankur Sethi, Sr. Product Manager, EC2, and Kinnar Sen, Sr. Specialist Solution Architect, AWS Compute.

Amazon EC2 Auto Scaling helps customers optimize their Amazon EC2 capacity by dynamically responding to varying demand. Based on customer feedback, we enhanced the scaling experience with the launch of predictive scaling policies. Predictive scaling proactively adds EC2 instances to your Auto Scaling group in anticipation of demand spikes. This results in better availability and performance for your applications that have predictable demand patterns and long initialization times. We recently launched a couple of features designed to help you assess the value of predictive scaling – prescriptive recommendations on whether to use predictive scaling based on its potential availability and cost impact, and integration with Amazon CloudWatch to continuously monitor the accuracy of predictions. In this post, we discuss the features in detail and the steps that you can easily adopt to enjoy the benefits of predictive scaling.

Recap: Predictive Scaling

EC2 Auto Scaling helps customers maintain application availability by managing the capacity and health of the underlying cluster. Prior to predictive scaling, EC2 Auto Scaling offered dynamic scaling policies such as target tracking and step scaling. These dynamic scaling policies are configured with an Amazon CloudWatch metric that represents an application’s load. EC2 Auto Scaling constantly monitors this metric and responds according to your policies, thereby triggering the launch or termination of instances. Although it’s extremely effective and widely used, this model is reactive in nature, and for larger spikes, may lead to unfulfilled capacity momentarily as the cluster is scaling out. Customers mitigate this by adopting aggressive scale out and conservative scale in to manage the additional buffer of instances. However, sometimes applications take a long time to initialize or have a recurring pattern with a sudden spike of high demand. These can have an impact on the initial response of the system when it is scaling out. Customers asked for a proactive scaling mechanism that can scale capacity ahead of predictable spikes, and so we delivered predictive scaling.

Predictive scaling was launched to make the scaling action proactive as it anticipates the changes required in the compute demand and scales accordingly. The scaling action is determined by ensemble machine learning (ML) built with data from your Auto Scaling group’s scaling patterns, as well as billions of data points from our observations. Predictive scaling should be used for applications where demand changes rapidly but with a recurring pattern, instances require a long time to initialize, or where you’re manually invoking scheduled scaling for routine demand patterns. Predictive scaling not only forecasts capacity requirements based on historical usage, but also learns continuously, thereby making forecasts more accurate with time. Furthermore, predictive scaling policy is designed to only scale out and not scale in your Auto Scaling groups, eliminating the risk of ending with lesser capacity because of inexact predictions. You must use dynamic scaling policy, scheduled scaling, or your own custom mechanism for scale-ins. In case of exceptional demand spikes, this addition of dynamic scaling policy can also improve your application performance by bridging the gap between demand and predicted capacity.

What’s new with predictive scaling

Predictive scaling policies can be configured in a non-mutative ‘Forecast Only’ mode to evaluate the accuracy of forecasts. When you’re ready to start scaling, you can switch to the ‘Forecast and Scale’ mode. Now we prescriptively recommend whether your policy should be switched to ‘Forecast and Scale’ mode if it can potentially lead to better availability and lower costs, saving you the time and effort of doing such an evaluation manually. You can test different configurations by creating multiple predictive scaling policies in ‘Forecast Only’ mode, and choose the one that performs best in terms of availability and cost improvements.

Monitoring and observability are key elements of AWS Well Architected Framework. Now we also offer CloudWatch metrics for your predictive scaling policies so that that you can programmatically monitor your predictive scaling policy for demand pattern changes or prolonged periods of inaccurate predictions. This will enable you to monitor the key performance metrics and make it easier to adopt AWS Well-Architected best practices.

In the following sections, we deep dive into the details of these two features.

Recommendations for predictive scaling

Once you set up an Auto Scaling group with predictive scaling policy in Forecast Only mode as explained in this introduction to predictive scaling blog post , you can review the results of the forecast visually and adjust any parameters to more accurately reflect the behavior that you desire. Evaluating simply on the basis of visualization may not be very intuitive if the scaling patterns are erratic. Moreover, if you keep higher minimum capacities, then the graph may show a flat line for the actual capacity as your Auto Scaling group capacity is an outcome of existing scaling policy configurations and the minimum capacity that you configured. This makes it difficult to contemplate whether the lower capacity predicted by predictive scaling wouldn’t leave your Auto Scaling group under-scaled.

This new feature provides a prescriptive guidance to switch on predictive scaling in Forecast and Scale mode based on the factors of availability and cost savings. To determine the availability and cost savings, we compare the predictions against the actual capacity and the optimal, required capacity. This required capacity is inferred based on whether your instances were running at a higher or lower value than the target value for scaling metric that you defined as part of the predictive scaling policy configuration. For example, if an Auto Scaling group is running 10 instances at 20% CPU Utilization while the target defined in predictive scaling policy is 40%, then the instances are running under-utilized by 50% and the required capacity is assumed to be 5 instances (half of your current capacity). For an Auto Scaling group, based on the time range in which you’re interested (two weeks as default), we aggregate the cost saving and availability impact of predictive scaling. The availability impact measures for the amount of time that the actual metric value was higher than the target value that you defined to be optimal for each policy. Similarly, cost savings measures the aggregated savings based on the capacity utilization of the underlying Auto Scaling group for each defined policy. The final cost and availability will lead us to a recommendation based on:

  • If availability increases (or remains same) and cost reduces (or remains same), then switch on Forecast and Scale
  • If availability reduces, then disable predictive scaling
  • If availability increase comes at an increased cost, then the customer should take the call based on their cost-availability tradeoff threshold

This figure shows the console view of how the recommendations look like on the Auto Scaling console. For each policy we make prescriptive recommendation of whether to switch to Forecast And Scale mode along with whether doing so can lead to better availability and lower costFigure 1: Predictive Scaling Recommendations on EC2 Auto Scaling console

The preceding figure shows how the console reflects the recommendation for a predictive scaling policy. You get information on whether the policy can lead to higher availability and lower cost, which leads to a recommendation to switch to Forecast and Scale. To achieve this cost saving, you might have to lower your minimum capacity and aim for higher utilization in dynamic scaling policies.

To get the most value from this feature, we recommend that you create multiple predictive scaling policies in Forecast Only mode with different configurations, choosing different metrics and/or different target values. Target value is an important lever that changes how aggressive the capacity forecasts must be. A lower target value increases your capacity forecast resulting in better availability for your application. However, this also means more dollars to be spent on the Amazon EC2 cost. Similarly, a higher target value can leave you under-scaled while reactive scaling bridges the gap in just a few minutes. Separate estimates of cost and availability impact are provided for each of the predictive scaling policies. We recommend using a policy if either availability or cost are improved and the other variable improves or stays the same. As long as there is a predictable pattern, Auto Scaling enhanced with predictive scaling maintains high availability for your applications.

Continuous Monitoring of predictive scaling

Once you’re using a predictive scaling policy in Forecast and Scale mode based on the recommendation, you must monitor the predictive scaling policy for demand pattern changes or inaccurate predictions. We introduced two new CloudWatch Metrics for predictive scaling called ‘PredictiveScalingLoadForecast’ and ‘PredictiveScalingCapacityForecast’. Using CloudWatch mertic math feature, you can create a customized metric that measures the accuracy of predictions. For example, to monitor whether your policy is over or under-forecasting, you can publish separate metrics to measure the respective errors. In the following graphic, we show how the metric math expressions can be used to create a Mean Absolute Error for over-forecasting on the load forecasts. Because predictive scaling can only increase capacity, it is useful to alert when the policy is excessively over-forecasting to prevent unnecessary cost.This figure shows the CloudWatch graph of three metrics – the total CPU Utilization of the Auto Scaling group, the load forecast generated by predictive scaling, and the derived metric using metric math that measures error for over-forecastingFigure 2: Graphing an accuracy metric using metric math on CloudWatch

In the previous graph, the total CPU Utilization of the Auto Scaling group is represented by m1 metric in orange color while the predicted load by the policy is represented by m2 metric in green color. We used the following expression to get the ratio of over-forecasting error with respect to the actual value.

IF((m2-m1)>0, (m2-m1),0))/m1

Next, we will setup an alarm to automatically send notifications using Amazon Simple Notification Service (Amazon SNS). You can create similar accuracy monitoring for capacity forecasts, but remember that once the policy is in Forecast and Scale mode, it already starts influencing the actual capacity. Hence, putting alarms on load forecast accuracy might be more intuitive as load is generally independent of the capacity of an Auto Scaling group.

This figure shows creation of alarm when 10 out of 12 data points breach 0.02 threshold for the accuracy metricFigure 3: Creating a CloudWatch Alarm on the accuracy metric

In the above screenshot, we have set an alarm that triggers when our custom accuracy metric goes above 0.02 (20%) for 10 out of last 12 data points which translates to 10 hours of the last 12 hours. We prefer to alarm on a greater number of data points so that we get notified only when predictive scaling is consistently giving inaccurate results.

Conclusion

With these new features, you can make a more informed decision about whether predictive scaling is right for you and which configuration makes the most sense. We recommend that you start off with Forecast Only mode and switch over to Forecast and Scale based on the recommendations. Once in Forecast and Scale mode, predictive scaling starts taking proactive scaling actions so that your instances are launched and ready to contribute to the workload in advance of the predicted demand. Then continuously monitor the forecast to maintain high availability and cost optimization of your applications. You can also use the new predictive scaling metrics and CloudWatch features, such as metric math, alarms, and notifications, to monitor and take actions when predictions are off by a set threshold for prolonged periods.

Building Sustainable, Efficient, and Cost-Optimized Applications on AWS

Post Syndicated from Sheila Busser original https://aws.amazon.com/blogs/compute/building-sustainable-efficient-and-cost-optimized-applications-on-aws/

This blog post is written by Isha Dua Sr. Solutions Architect AWS, Ananth Kommuri Solutions Architect AWS, and Dr. Sam Mokhtari Sr. Sustainability Lead SA WA for AWS.

Today, more than ever, sustainability and cost-savings are top of mind for nearly every organization. Research has shown that AWS’ infrastructure is 3.6 times more energy efficient than the median of U.S. enterprise data centers and up to five times more energy efficient than the average in Europe. That said, simply migrating to AWS isn’t enough to meet the Environmental, Social, Governance (ESG) and Cloud Financial Management (CFM) goals that today’s customers are setting. In order to make conscious use of our planet’s resources, applications running on the cloud must be built with efficiency in mind.

That’s because cloud sustainability is a shared responsibility. At AWS, we’re responsible for optimizing the sustainability of the cloud – building efficient infrastructure, enough options to meet every customer’s needs, and the tools to manage it all effectively. As an AWS customer, you’re responsible for sustainability in the cloud – building workloads in a way that minimizes the total number of resource requirements and makes the most of what must be consumed.

Most AWS service charges are correlated with hardware usage, so reducing resource consumption also has the added benefit of reducing costs. In this blog post, we’ll highlight best practices for running efficient compute environments on AWS that maximize utilization and decrease waste, with both sustainability and cost-savings in mind.

First: Measure What Matters

Application optimization is a continuous process, but it has to start somewhere. The AWS Well Architected Framework Sustainability pillar includes an improvement process that helps customers map their journey and understand the impact of possible changes. There is a saying “you can’t improve what you don’t measure.”, which is why it’s important to define and regularly track metrics which are important to your business. Scope 2 Carbon emissions, such as those provided by the AWS Customer Carbon Footprint Tool, are one metric that many organizations use to benchmark their sustainability initiatives, but they shouldn’t be the only one.

Even after AWS meets our 2025 goal of powering our operations with 100% renewable energy, it’s still be important to maximize the utilization and minimize the consumption of the resources that you use. Just like installing solar panels on your house, it’s important to limit your total consumption to ensure you can be powered by that energy. That’s why many organizations use proxy metrics such as vCPU Hours, storage usage, and data transfer to evaluate their hardware consumption and measure improvements made to infrastructure over time.

In addition to these metrics, it’s helpful to baseline utilization against the value delivered to your end-users and customers. Tracking utilization alongside business metrics (orders shipped, page views, total API calls, etc) allows you to normalize resource consumption with the value delivered to your organization. It also provides a simple way to track progress towards your goals over time. For example, if the number of orders on your ecommerce site remained constant over the last month, but your AWS infrastructure usage decreased by 20%, you can attribute the efficiency gains to your optimization efforts, not changes in your customer behavior.

Utilize all of the available pricing models

Compute tasks are the foundation of many customers’ workloads, so it typically sees biggest benefit by optimization. Amazon EC2 provides resizable compute across a wide variety of compute instances, is well-suited to virtually every use case, is available via a number of highly flexible pricing options. One of the simplest changes you can make to decrease your costs on AWS is to review the purchase options for the compute and storage resources that you already use.

Amazon EC2 provides multiple purchasing options to enable you to optimize your costs based on your needs. Because every workload has different requirements, we recommend a combination of purchase options tailored for your specific workload needs. For steady-state workloads that can have a 1-3 year commitment, using Compute Savings Plans helps you save costs, move from one instance type to a newer, more energy-efficient alternative, or even between compute solutions (e.g., from EC2 instances to AWS Lambda functions, or AWS Fargate).

EC2 Spot instances are another great way to decrease cost and increase efficiency on AWS. Spot Instances make unused Amazon EC2 capacity available for customers at discounted prices. At AWS, one of our goals it to maximize utilization of our physical resources. By choosing EC2 Spot instances, you’re running on hardware that would otherwise be sitting idle in our datacenters. This increases the overall efficiency of the cloud, because more of our physical infrastructure is being used for meaningful work. Spot instances use market-based pricing that changes automatically based on supply and demand. This means that the hardware with the most spare capacity sees the highest discounts, sometimes up to XX% off on-demand prices, to encourage our customers to choose that configuration.

Savings Plans are ideal for predicable, steady-state work. On-demand is best suited for new, stateful, and spiky workloads which can’t be instance, location, or time flexible. Finally, Spot instances are a great way to supplement the other options for applications that are fault tolerant and flexible. AWS recommends using a mix of pricing models based on your workload needs and ability to be flexible.

By using these pricing models, you’re creating signals for your future compute needs, which helps AWS better forecast resource demands, manage capacity, and run our infrastructure in a more sustainable way.

Choose efficient, purpose-built processors whenever possible

Choosing the right processor for your application is as equally important consideration because under certain use cases a more powerful processor can allow for the same level of compute power with a smaller carbon footprint. AWS has the broadest choice of processors, such as Intel – Xeon scalable processors, AMD – AMD EPYC processors, GPU’s FPGAs, and Custom ASICs for Accelerated Computing.

AWS Graviton3, AWS’s latest and most power-efficient processor, delivers 3X better CPU performance per-watt than any other processor in AWS, provides up to 40% better price performance over comparable current generation x86-based instances for various workloads, and helps customers reduce their carbon footprint. Consider transitioning your workload to Graviton-based instances to improve the performance efficiency of your workload (see AWS Graviton Fast Start and AWS Graviton2 for ISVs). Note the considerations when transitioning workloads to AWS Graviton-based Amazon EC2 instances.

For machine learning (ML) workloads, use Amazon EC2 instances based on purpose-built Amazon Machine Learning (Amazon ML) chips, such as AWS TrainiumAWS Inferentia, and Amazon EC2 DL1.

Optimize for hardware utilization

The goal of efficient environments is to use only as many resources as required in order to meet your needs. Thankfully, this is made easier on the cloud because of the variety of instance choices, the ability to scale dynamically, and the wide array of tools to help track and optimize your cloud usage. At AWS, we offer a number of tools and services that can help you to optimize both the size of individual resources, as well as scale the total number of resources based on traffic and load.

Two of the most important tools to measure and track utilization are Amazon CloudWatch and the AWS Cost & Usage Report (CUR). With CloudWatch, you can get a unified view of your resource metrics and usage, then analyze the impact of user load on capacity utilization over time. The Cost & Usage Report (CUR) can help you understand which resources are contributing the most to your AWS usage, allowing you to fine-tune your efficiency and save on costs. CUR data is stored in S3, which allows you to query it with tools like Amazon Athena or generate custom reports in Amazon QuickSight or integrate with AWS Partner tools for better visibility and insights.

An example of a tool powered by CUR data is the AWS Cost Intelligence Dashboard. The Cost Intelligence Dashboard provides a detailed, granular, and recommendation-driven view of your AWS usage. With its prebuilt visualizations, it can help you identify which service and underlying resources are contributing the most towards your AWS usage, and see the potential savings you can realize by optimizing. It even provides right sizing recommendations and the appropriate EC2 instance family to help you optimize your resources.

Cost Intelligence Dashboard is also integrated with AWS Compute Optimizer, which makes instance type and size recommendations based on workload characteristics. For example, it can identify if the workload is CPU-intensive, if it exhibits a daily pattern, or if local storage is accessed frequently. Compute Optimizer then infers how the workload would have performed on various hardware platforms (for example, Amazon EC2 instance types) or using different configurations (for example, Amazon EBS volume IOPS settings, and AWS Lambda function memory sizes) to offer recommendations. For stable workloads, check AWS Compute Optimizer at regular intervals to identify right-sizing opportunities for instances. By right sizing with Compute Optimizer, you can increase resource utilization and reduce costs by up to 25%. Similarly, Lambda Power Tuning can help choose the memory allocated to Lambda functions is an optimization process that balances speed (duration) and cost while lowering your carbon emission in the process.

CloudWatch metrics are used to power EC2 Autoscaling, which can automatically choose the right instance to fit your needs with attribute-based instance selection and scale your entire instance fleet up and down based on demand in order to maintain high utilization. AWS Auto Scaling makes scaling simple with recommendations that let you optimize performance, costs, or balance between them. Configuring and testing workload elasticity will help save money, maintain performance benchmarks, and reduce the environmental impact of workloads. You can utilize the elasticity of the cloud to automatically increase the capacity during user load spikes, and then scale down when the load decreases. Amazon EC2 Auto Scaling allows your workload to automatically scale up and down based on demand. You can set up scheduled or dynamic scaling policies based on metrics such as average CPU utilization or average network in or out. Then, you can integrate AWS Instance Scheduler and Scheduled scaling for Amazon EC2 Auto Scaling to schedule shut downs and terminate resources that run only during business hours or on weekdays to further reduce your carbon footprint.

Design applications to minimize overhead and use fewer resources

Using the latest Amazon Machine Image (AMI) gives you updated operating systems, packages, libraries, and applications, which enable easier adoption as more efficient technologies become available. Up-to-date software includes features to measure the impact of your workload more accurately, as vendors deliver features to meet their own sustainability goals.

By reducing the amount of equipment that your company has on-premises and using managed services, you can help facilitate the move to a smaller, greener footprint. Instead of buying, storing, maintaining, disposing of, and replacing expensive equipment, businesses can purchase services as they need that are already optimized with a greener footprint. Managed services also shift responsibility for maintaining high average utilization and sustainability optimization of the deployed hardware to AWS. Using managed services will help distribute the sustainability impact of the service across all of the service tenants, thereby reducing your individual contribution. The following services help reduce your environmental impact because capacity management is automatically optimized.

 AWS  Managed Service   Recommendation for sustainability improvement

Amazon Aurora

You can use Amazon  Aurora Serverless to automatically start up, shut down, and scale capacity up or down based on your application’s needs.

Amazon Redshift

You can use Amazon Redshift Serverless to run and scale data warehouse capacity.

AWS Lambda

You can Migrate AWS Lambda functions to Arm-based AWS Graviton2 processors.

Amazon ECS

You can run Amazon ECS on AWS Fargate to avoid the undifferentiated heavy lifting by leveraging sustainability best practices AWS put in place for management of the control plane.

Amazon EMR

You can use EMR Serverless to avoid over- or under-provisioning resources for your data processing jobs.

AWS Glue

You can use Auto-scaling for AWS Glue to enable on-demand scaling up and scaling down of the computing resources.

 Centralized data centers consume a lot of energy, produce a lot of carbon emissions and cause significant electronic waste. While more data centers are moving towards green energy, an even more sustainable approach (alongside these so-called “green data centers”) is to actually cut unnecessary cloud traffic, central computation and storage as much as possible by shifting computation to the edge. Edge Computing stores and uses data locally, on or near the device it was created on. This reduces the amount of traffic sent to the cloud and, at scale, can limit the overall energy used and carbon emissions.

Use storage that best supports how your data is accessed and stored to minimize the resources provisioned while supporting your workload. Solid state devices (SSDs) are more energy intensive than magnetic drives and should be used only for active data use cases. You should look into using ephemeral storage whenever possible and categorize, centralize, deduplicate, and compress persistent storage.

AWS OutpostsAWS Local Zones and AWS Wavelength services deliver data processing, analysis, and storage close to your endpoints, allowing you to deploy APIs and tools to locations outside AWS data centers. Build high-performance applications that can process and store data close to where it’s generated, enabling ultra-low latency, intelligent, and real-time responsiveness. By processing data closer to the source, edge computing can reduce latency, which means that less energy is required to keep devices and applications running smoothly. Edge computing can help to reduce the carbon footprint of data centers by using renewable energy sources such as solar and wind power.

Conclusion

In this blog post, we discussed key methods and recommended actions you can take to optimize your AWS compute infrastructure for resource efficiency. Using the appropriate EC2 instance types with the right size, processor, instance storage and pricing model can enhance the sustainability of your applications. Use of AWS managed services, options for edge computing and continuously optimizing your resource usage can further improve the energy efficiency of your workloads. You can also analyze the changes in your emissions over time as you migrate workloads to AWS, re-architect applications, or deprecate unused resources using the Customer Carbon Footprint Tool.

Ready to get started? Check out the AWS Sustainability page to find out more about our commitment to sustainability and learn more about renewable energy usage, case studies on sustainability through the cloud, and more.

Building a Cloud in the Cloud: Running Apache CloudStack on Amazon EC2, Part 2

Post Syndicated from Sheila Busser original https://aws.amazon.com/blogs/compute/building-a-cloud-in-the-cloud-running-apache-cloudstack-on-amazon-ec2-part-2/

This blog is written by Mark Rogers, SDE II – Customer Engineering AWS.

In part 1, I showed you how to run Apache CloudStack with KVM on a single Amazon Elastic Compute Cloud (Amazon EC2) instance. That simple setup is great for experimentation and light workloads. In this post, things will get a lot more interesting. I’ll show you how to create an overlay network in your Amazon Virtual Private Cloud (Amazon VPC) that allows CloudStack to scale horizontally across multiple EC2 instances. This same method could work with other hypervisors, too.

If you haven’t read it yet, then start with part 1. It explains why this network setup is necessary. The same prerequisites apply to both posts.

Making things easier

I wrote some scripts to automate the CloudStack installation and OS configuration on CentOS 7. You can customize them to meet your needs. I also wrote some AWS CloudFormation templates you can copy in order to create a demo environment. The README file has more details.

The scalable method

Our team started out using a single EC2 instance, as described in my last post. That worked at first, but it didn’t have the capacity we needed. We were limited to a couple of dozen VMs, but we needed hundreds. We also needed to scale up and down as our needs changed. This meant we needed the ability to add and remove CloudStack hosts. Using a Linux bridge as a virtual subnet was no longer adequate.

To support adding hosts, we need a subnet that spans multiple instances. The solution I found is Virtual Extensible LAN (VXLAN). It’s lightweight, easy to configure, and included in the Linux kernel. VXLAN creates a layer 2 overlay network that abstracts away the details of the underlying network. It allows machines in different parts of a network to communicate as if they’re all attached to the same simple network switch.

Another example of an overlay network is an Amazon VPC. It acts like a physical network, but it’s actually a layer on top of other networks. It’s networks all the way down. VXLAN provides a top layer where CloudStack can sit comfortably, handling all of your VM needs, blissfully unaware of the world below it.

An overlay network comes with some big advantages. The biggest improvement is that you can have multiple hosts, allowing for horizontal scaling. Having more hosts not only gives you more computing power, but also lets you do rolling maintenance. Instead of putting the database and file storage on the management server, I’ll show you how to use Amazon Elastic File System (Amazon EFS) and Amazon Relational Database Service (Amazon RDS) for scalable and reliable storage.

EC2 Instances

Let’s start with three Amazon EC2 instances. One will be a router between the overlay network and your Amazon VPC, the second one will be your CloudStack management server, and the third one will be your VM host. You’ll also need a way to connect to your instances, such as a bastion host or a VPN endpoint.

Three EC2 instances are connected to an AWS subnet. There's an overlay network that spans all three instances.The router instance connects the overlay network to the AWS subnet. The management instance contains the CloudStack management service, which is attached to the overlay network. The host instance contains the CloudStack agent and some VMs, all of which are connected to the overlay network.VXLAN must send and receive multicast traffic. Only Nitro instances can be multicast senders. As you plan, look at the list of Nitro instance types.

The router won’t need much computing power, but it will need enough network bandwidth to meet your needs. If you put Amazon EFS in the same subnet as your instances, then they’ll communicate with it directly, thereby reducing the load on the router. Decide how much network throughput you want, and then pick a suitable Nitro instance type.

After creating the router instance, configure AWS to use it as a router. Stop source/destination checking in the instance’s network settings. Then update the applicable AWS route tables to use the router as the target for the overlay network. The router’s security group needs to allow ingress to the CloudStack UI (TCP port 8080) and any services you plan to offer from VMs.

For the management server, you’ll want a Nitro instance type. It’s going to use more CPU than your router, so plan accordingly.

In addition to being a Nitro type, the host instance must also be a metal type. Metal instances have hardware virtualization support, which is needed by KVM. If you have a new AWS account with a low on-demand vCPU limit, then consider starting with an m5zn.metal, which has 48 vCPUs. Otherwise, I suggest going directly to a c5.metal because it provides 96 vCPUs for a similar price. There are bigger types available depending on your compute needs, budget, and vCPU limit. If your account’s on-demand vCPU limit is too low, then you can file a support ticket to have it raised.

Networking

All of the instances should be on a dedicated subnet. Sharing the subnet with other instances can cause communication issues. For an example, refer to the following figure. The subnet has an instance named TroubleMaker that’s not on the overlay network. If TroubleMaker sends a request to the management instance’s overlay network address, then here’s what happens:

  1. The request goes through the AWS subnet to the router.
  2. The router forwards the request via the overlay network.
  3. The CloudStack management instance has a connection to the same AWS subnet that TroubleMaker is on. Therefore, it responds directly instead of using the router. This isn’t the return path that AWS is expecting, so the response is dropped.

This diagram depicts the steps described in the previous paragraph.

If you move TroubleMaker to a different subnet, then the requests and responses will all go through the router. That will fix the communication issues.

The instances in the overlay network will use special interfaces that serve as VXLAN tunnel endpoints (VTEPs). The VTEPs must know how to contact each other via the underlay network. You could manually give each instance a list of all of the other instances, but that’s a maintenance nightmare. It’s better to let the VTEPs discover each other, which they can do using multicast. You can add multicast support using AWS Transit Gateway.

Here are the steps to make VXLAN multicasts work:

  1. Enable multicast support when you create the transit gateway.
  2. Attach the transit gateway to your subnet.
  3. Create a transit gateway multicast domain with IGMPv2 support enabled.
  4. Associate the multicast domain with your subnet.
  5. Configure the eth0 interface on each instance to use IGMPv2. The following sample code shows how to do this.
  6. Make sure that your instance security groups allow ingress for IGMP queries (protocol 2 traffic from 0.0.0.0/32) and VXLAN traffic (UDP port 4789 from the other instances).

CloudStack VMs must connect to the same bridge as the VXLAN interface. As mentioned in the previous post, CloudStack cares about names. I recommend giving the interface a name starting with “eth”. Moreover, this naming convention tells CloudStack which bridge to use, thereby avoiding the need for a dummy interface like the one in the simple setup.

The following snippet shows how I configured the networking in CentOS 7. You must provide values for these variables:

  • $overlay_host_ip_address, $overlay_netmask, and $overlay_gateway_ip: Use values for the overlay network that you’re creating.
  • $dns_address: I recommend using the base of the VPC IPv4 network range, plus two. You shouldn’t use 169.654.169.253 because CloudStack reserves link-local addresses for its own use.
  • $multicast_address: The multicast address that you want VXLAN to use. Pick something in the multicast range that won’t conflict with anything else. I recommend choosing from the IPv4 local scope (239.255.0.0/16).
  • $interface_name: The name of the interface VXLAN should use to communicate with the physical network. This is typically eth0.

A couple of the steps are different for the router instance than for the other instances. Pay attention to the comments!

yum install -y bridge-utils net-tools

# IMPORTANT: Omit the GATEWAY setting on the router instance!
cat << EOF > /etc/sysconfig/network-scripts/ifcfg-cloudbr0
DEVICE=cloudbr0
TYPE=Bridge
ONBOOT=yes
BOOTPROTO=none
IPV6INIT=no
IPV6_AUTOCONF=no
DELAY=5
STP=no
USERCTL=no
NM_CONTROLLED=no
IPADDR=$overlay_host_ip_address
NETMASK=$overlay_netmask
DNS1=$dns_address
GATEWAY=$overlay_gateway_ip
EOF

cat << EOF > /sbin/ifup-local
#!/bin/bash
# Set up VXLAN once cloudbr0 is available.
if [[ \$1 == "cloudbr0" ]]
then
    ip link add ethvxlan0 type vxlan id 100 dstport 4789 group "$multicast_address" dev "$interface_name"
    brctl addif cloudbr0 ethvxlan0
    ip link set up dev ethvxlan0
fi
EOF

chmod +x /sbin/ifup-local

# Transit Gateway requires IGMP version 2
echo "net.ipv4.conf.$interface_name.force_igmp_version=2" >> /etc/sysctl.conf
sysctl -p

# Enable IPv4 forwarding
# IMPORTANT: Only do this on the router instance!
echo 'net.ipv4.ip_forward=1' >> /etc/sysctl.conf
sysctl -p

# Restart the network service to make the changes take effect.
systemctl restart network

Storage

Let’s look at storage. Create an Amazon RDS database using the MySQL 8.0 engine, and set a master password for CloudStack’s database setup tool to use. Refer to the CloudStack documentation to find the MySQL settings that you’ll need. You can put the settings in an RDS parameter group. In case you’re wondering why I’m not using Amazon Aurora, it’s because CloudStack needs the MyISAM storage engine, which isn’t available in Aurora.

I recommend Amazon EFS for file storage. For efficiency, create a mount target in the subnet with your EC2 instances. That will enable them to communicate directly with the mount target, thereby bypassing the overlay network and router. Note that the system VMs will use Amazon EFS via the router.

If you want, you can consolidate your CloudStack file systems. Just create a single file system with directories for each zone and type of storage. For example, I use directories named /zone1/primary, /zone1/secondary, /zone2/primary, etc. You should also consider enabling provisioned throughput on the file system, or you may run out of bursting credits after booting a few VMs.

One consequence of the file system’s scalability is that the amount of free space (8 exabytes) will cause an integer overflow in CloudStack! To avoid this problem, reduce storage.overprovisioning.factor in CloudStack’s global settings from 2 to 1.

When your environment is ready, install CloudStack. When it asks for a default gateway, remember to use the router’s overlay network address. When you add a host, make sure that you use the host’s overlay network IP address.

Cleanup

If you used my CloudFormation template, delete the stack and remove any route table entries you added.

If you didn’t use CloudFormation, here are the things to delete:

  1. The CloudStack EC2 instances
  2. The Amazon RDS database and parameter group
  3. The Amazon EFS file system
  4. The transit gateway multicast domain subnet association
  5. The transit gateway multicast domain
  6. The transit gateway VPC attachment
  7. The transit gateway
  8. The route table entries that you created
  9. The security groups that you created for the instances, database, and file system

Conclusion

The approach I shared has many steps, but they’re not bad when you have a plan. Whether you need a simple setup for experiments, or you need a scalable environment for a data center migration, you now have a path forward. Give it a try, and comment here about the things you learned. I hope you find it useful and fun!

“Apache”, “Apache CloudStack”, and “CloudStack” are trademarks of the Apache Software Foundation.

Building a Cloud in the Cloud: Running Apache CloudStack on Amazon EC2, Part 1

Post Syndicated from Sheila Busser original https://aws.amazon.com/blogs/compute/building-a-cloud-in-the-cloud-running-apache-cloudstack-on-amazon-ec2-part-1/

This blog is written by Mark Rogers, SDE II – Customer Engineering AWS.

How do you put a cloud inside another cloud? Some features that make Amazon Elastic Compute Cloud (Amazon EC2) secure and wonderful also make running CloudStack difficult. The biggest obstacle is that AWS and CloudStack both want to manage network resources. Therefore, we must keep them out of each other’s way. This requires some steps that aren’t obvious, and it took a long time to figure out. I’m going to share what I learned, so that you can navigate the process more easily.

Apache CloudStack is an open-source platform for deploying and managing virtual machines (VMs) and the associated network and storage infrastructure. You would normally run it on your own hardware to create your own cloud. But there can be advantages to running it inside of an Amazon Virtual Private Cloud (Amazon VPC), including how it could help you migrate out of a data center. It’s a great way to create disposable environments for experiments or training. Furthermore, it’s a convenient way to test-drive the new CloudStack support in Amazon Elastic Kubernetes Service (Amazon EKS) Anywhere. In my case, I needed to create development and test environments for a project that uses the CloudStack API. The environments needed to be shared and scalable. Our build pipelines were already in AWS, so it made sense to put the new environments there, too.

CloudStack can work with a number of hypervisors. The instructions in this article will use Kernel-based Virtual Machine (KVM) on Linux. KVM will manage the VMs at a low level, and CloudStack will manage KVM.

Prerequisites

Most of the information in this article should be applicable to a range of CloudStack versions. I targeted CloudStack 4.14 on CentOS 7. I also tested CloudStack versions 4.16 and 4.17, and I recommend them.

The official CentOS 7 x86_64 HVM image works well. If you use a different Linux flavor or version, then you might have to modify some of the implementation details.

You’ll need to know the basics of CloudStack. The scope of this article is making CloudStack and AWS coexist peacefully. Once CloudStack is running, I’m assuming that you’ll handle things from there.  Refer to the AWS documentation and CloudStack documentation for information on security and other best practices.

Making things easier

I wrote some scripts to automate the installation. You can run them on EC2 instances with CentOS 7, and they’ll do all the installation and OS configuration for you. You can use them as they are, or customize them to meet your needs. I also wrote some AWS CloudFormation templates you can copy in order to create a demo environment. The README file has more details.

Amazon EC2 instance types

KVM requires hardware virtualization support. Most EC2 instances are VMs that don’t support nested virtualization. To get access to the bare hardware, you need a metal instance type.

I like c5.metal because it’s one of the least expensive metal types, and has a low cost per vCPU. It has 96 vCPUs and 192 GiB of memory. If you run 20 VMs on it, with 4 CPU cores and 8 GiB of memory each, then you’d still have 16 vCPUs and 32 GiB to share between the operating system, CloudStack, and MySQL. Using CloudStack’s overprovisioning feature, you could fit even more VMs if they’re running light loads.

Networking

The biggest challenge is the network. AWS knows which IP and MAC addresses should exist, and it knows the machines to which they should belong. It blocks any traffic that doesn’t fit its idea of how the network should behave. Simultaneously, CloudStack assumes that any IP or MAC address it invents should work just fine. When CloudStack assigns addresses to VMs on an AWS subnet, their network traffic gets blocked.

You could get around this by enabling network address translation (NAT) on the instance running CloudStack. That’s a great solution if it fits your needs, but it makes it hard for other machines in your Amazon VPC to contact your VMs. I recommend a different approach.

Although AWS restricts what you can do with its layer 2 network, it’s perfectly happy to let you run your own layer 3 router. Your EC2 instance can act as a router to a new virtual subnet that’s outside of the jurisdiction of AWS. The instance integrates with AWS just like a VPN appliance, routing traffic to wherever it needs to go. CloudStack can do whatever it wants in the virtual subnet, and everybody’s happy.

What do I mean by a virtual subnet? This is a subnet that exists only inside the EC2 instance.  It consists of logical network interfaces attached to a Linux bridge. The entire subnet exists inside a single EC2 instance. It doesn’t scale well, but it’s simple. In my next post, I’ll cover a more complicated setup with an overlay network that spans multiple instances to allow horizontal scaling.

The simple way

The simple way is to put everything in one EC2 instance, including the database, file storage, and a virtual subnet. Because everything’s stored locally, allocate enough disk space for your needs. 500 GB will be enough to support a few basic VMs. Create or select a security group for your instance that gives users access to the CloudStack UI (TCP port 8080). The security group should also allow access to any services that you’ll offer from your VMs.

EC2 instance summary info showing 1 instance, CentOS 7 (x86_64) AMI, c5.metal instance type, a security group name, and a 500 GiB volume

When you have your instance, configure AWS to treat it as a router.

  1. Go to Amazon EC2 in the AWS Management Console.
  2. Select your instance, and stop source/destination checking.

In the EC2 Actions menu, select Networking, then Change source/destination check.

3. Update the subnet route tables.

a. Go to the VPC settings, and select Route Tables.

b. Identify the tables for subnets that need CloudStack access.

c. In each of these tables, add a route to the new virtual subnet. The route target should be your EC2 instance.

4. Depending on your network needs, you may also need to add routes to transit gateways, VPN endpoints, etc.

Because everything will be on one server, creating a virtual subnet is simply a matter of creating a Linux bridge. CloudStack must find a network adapter attached to the bridge. Therefore, add a dummy interface with a name that CloudStack will recognize.

A single EC2 instance contains the CloudStack management service, the CloudStack agent, a dummy network interface, several virtual machines, and a router. All of those things are connected to each other by a virtual subnet that exists inside the instance. The instance's elastic network interface is connected between the router and the Amazon VPC.The following snippet shows how I configure networking in CentOS 7. You must provide values for the variables $virutal_host_ip_address and $virtual_netmask to reflect the virtual subnet that you want to create. For $dns_address, I recommend the base of the VPC IPv4 network range, plus two. You shouldn’t use 169.654.169.253 because CloudStack reserves link-local addresses for its own use.

yum install -y bridge-utils net-tools

# The bridge must be named cloudbr0.

cat << EOF > /etc/sysconfig/network-scripts/ifcfg-cloudbr0
DEVICE=cloudbr0
TYPE=Bridge
ONBOOT=yes
BOOTPROTO=none
IPV6INIT=no
IPV6_AUTOCONF=no
DELAY=5
STP=yes
USERCTL=no
NM_CONTROLLED=no
IPADDR=$virtual_host_ip_address
NETMASK=$virtual_netmask
DNS1=$dns_address
EOF

# Create a dummy network interface.
cat << EOF > /etc/sysconfig/modules/dummy.modules
#!/bin/sh
/sbin/modprobe dummy numdummies=1
/sbin/ip link set name ethdummy0 dev dummy0
EOF

chmod +x /etc/sysconfig/modules/dummy.modules
/etc/sysconfig/modules/dummy.modules

cat << EOF > /etc/sysconfig/network-scripts/ifcfg-ethdummy0
TYPE=Ethernet
BOOTPROTO=none
NAME=ethdummy0
DEVICE=ethdummy0
ONBOOT=yes
BRIDGE=cloudbr0
NM_CONTROLLED=no
EOF

# Turn the instance into a router

echo 'net.ipv4.ip_forward=1' >> /etc/sysctl.conf
sysctl -p

# Must kill dhclient or the network service won't restart properly.
# A reboot would also work, if you’d rather do that.

pkill dhclient
systemctl restart network

CloudStack must know which IP addresses to use for inter-service communication. It will select by resolving the machine’s fully qualified domain name (FQDN) to an address. The following commands will make it to choose the right one. You must provide a value for $virtual_host_ip_address.

hostnamectl set-hostname cloudstack.localdomain

echo "$virtual_host_ip_address cloudstack.localdomain" >> 
/etc/hosts

You can finish the setup by following the Quick Installation Guide.

Remember that CloudStack is only directly connected to your virtual network. The EC2 instance is the router that connects the virtual subnet to the Amazon VPC. When you’re configuring CloudStack, use your instance’s virtual subnet address as the default gateway.

Use the EC2 instance's virtual subnet IP address as the default gateway in CloudStack. In this example, the virtual subnet is 10.100.0.0/16, and the instance's address in that subnet is 10.100.0.1. CloudStack then uses 10.100.0.1 as the default gateway.

To access CloudStack from your workstation, you’ll need a connection to your VPC. This can be through a client VPN or a bastion host. If you use a bastion, its subnet needs a route to your virtual subnet, and you’ll need an SSH tunnel for your browser to access the CloudStack UI. The UI is at http://x.x.x.x:8080/client/, where x.x.x.x is your CloudStack instance’s virtual subnet address. Note that CloudStack’s console viewer won’t work if you’re using an SSH tunnel.

If you’re just experimenting with CloudStack, then I suggest saving money by stopping your instance when it isn’t needed. The safe way to do that is:

  1. Disable your zone in the CloudStack UI.
  2. Put the primary storage into maintenance mode.
  3. Wait for the switch to maintenance mode to be complete.
  4. Stop the EC2 instance.

When you’re ready to turn everything back on, simply reverse those steps. If you have any virtual routers in CloudStack, then you may need to start those, too.

Cleanup

If you used my CloudFormation template, then delete the stack and remove any route table entries you added. If you didn’t use CloudFormation, then terminate the EC2 instance, delete the security group you created for it, and remove any route table entries that you added.

Conclusion

Getting CloudStack to run on AWS isn’t so bad. The hardest part is simply knowing how. The setup explained here is great for small installations, but it can only scale vertically. In my next post, I’ll show you how to create an installation that scales horizontally. Instead of using a virtual subnet that exists in a single EC2 instance, we’ll build an overlay network that spans multiple instances. It will use more components and features, including some that might be new to you. I hope you find it interesting!

Now that you can create a simple setup, give it a try! I hope you have fun and learn something new along the way. Comment with the results of your experiments.

“Apache”, “Apache CloudStack”, and “CloudStack” are trademarks of the Apache Software Foundation.

Enabling load-balancing of non-HTTP(s) traffic on AWS Wavelength

Post Syndicated from Sheila Busser original https://aws.amazon.com/blogs/compute/enabling-load-balancing-of-non-https-traffic-on-aws-wavelength/

This blog post is written by Jack Chen, Telco Solutions Architect, and Robert Belson, Developer Advocate.

AWS Wavelength embeds AWS compute and storage services within 5G networks, providing mobile edge computing infrastructure for developing, deploying, and scaling ultra-low-latency applications. AWS recently introduced support for Application Load Balancer (ALB) in AWS Wavelength zones. Although ALB addresses Layer-7 load balancing use cases, some low latency applications that get deployed in AWS Wavelength Zones rely on UDP-based protocols, such as QUIC, WebRTC, and SRT, which can’t be load-balanced by Layer-7 Load Balancers. In this post, we’ll review popular load-balancing patterns on AWS Wavelength, including a proposed architecture demonstrating how DNS-based load balancing can address customer requirements for load-balancing non-HTTP(s) traffic across multiple Amazon Elastic Compute Cloud (Amazon EC2) instances. This solution also builds a foundation for automatic scale-up and scale-down capabilities for workloads running in an AWS Wavelength Zone.

Load balancing use cases in AWS Wavelength

In the AWS Regions, customers looking to deploy highly-available edge applications often consider Amazon Elastic Load Balancing (Amazon ELB) as an approach to automatically distribute incoming application traffic across multiple targets in one or more Availability Zones (AZs). However, at the time of this publication, AWS-managed Network Load Balancer (NLB) isn’t supported in AWS Wavelength Zones and ALB is being rolled out to all AWS Wavelength Zones globally. As a result, this post will seek to document general architectural guidance for load balancing solutions on AWS Wavelength.

As one of the most prominent AWS Wavelength use cases, highly-immersive video streaming over UDP using protocols such as WebRTC at scale often require a load balancing solution to accommodate surges in traffic, either due to live events or general customer access patterns. These use cases, relying on Layer-4 traffic, can’t be load-balanced from a Layer-7 ALB. Instead, Layer-4 load balancing is needed.

To date, two infrastructure deployments involving Layer-4 load balancers are most often seen:

  • Amazon EC2-based deployments: Often the environment of choice for earlier-stage enterprises and ISVs, a fleet of EC2 instances will leverage a load balancer for high-throughput use cases, such as video streaming, data analytics, or Industrial IoT (IIoT) applications
  • Amazon EKS deployments: Customers looking to optimize performance and cost efficiency of their infrastructure can leverage containerized deployments at the edge to manage their AWS Wavelength Zone applications. In turn, external load balancers could be configured to point to exposed services via NodePort objects. Furthermore, a more popular choice might be to leverage the AWS Load Balancer Controller to provision an ALB when you create a Kubernetes Ingress.

Regardless of deployment type, the following design constraints must be considered:

  • Target registration: For load balancing solutions not managed by AWS, seamless solutions to load balancer target registration must be managed by the customer. As one potential solution, visit a recent HAProxyConf presentation, Practical Advice for Load Balancing at the Network Edge.
  • Edge Discovery: Although DNS records can be populated into Amazon Route 53 for each carrier-facing endpoint, DNS won’t deterministically route mobile clients to the most optimal mobile endpoint. When available, edge discovery services are required to most effectively route mobile clients to the lowest latency endpoint.
  • Cross-zone load balancing: Given the hub-and-spoke design of AWS Wavelength, customer-managed load balancers should proxy traffic only to that AWS Wavelength Zone.

Solution overview – Amazon EC2

In this solution, we’ll present a solution for a highly-available load balancing solution in a single AWS Wavelength Zone for an Amazon EC2-based deployment. In a separate post, we’ll cover the needed configurations for the AWS Load Balancer Controller in AWS Wavelength for Amazon Elastic Kubernetes Service (Amazon EKS) clusters.

The proposed solution introduces DNS-based load balancing, a technique to abstract away the complexity of intelligent load-balancing software and allow your Domain Name System (DNS) resolvers to distribute traffic (equally, or in a weighted distribution) to your set of endpoints.

Our solution leverages the weighted routing policy in Route 53 to resolve inbound DNS queries to multiple EC2 instances running within an AWS Wavelength zone. As EC2 instances for a given workload get deployed in an AWS Wavelength zone, Carrier IP addresses can be assigned to the network interfaces at launch.

Through this solution, Carrier IP addresses attached to AWS Wavelength instances are automatically added as DNS records for the customer-provided public hosted zone.

To determine how Route 53 responds to queries, given an arbitrary number of records of a public hosted zone, Route53 offers numerous routing policies:

Simple routing policy – In the event that you must route traffic to a single resource in an AWS Wavelength Zone, simple routing can be used. A single record can contain multiple IP addresses, but Route 53 returns the values in a random order to the client.

Weighted routing policy – To route traffic more deterministically using a set of proportions that you specify, this policy can be selected. For example, if you would like Carrier IP A to receive 50% of the traffic and Carrier IP B to receive 50% of the traffic, we’ll create two individual A records (one for each Carrier IP) with a weight of 50 and 50, respectively. Learn more about Route 53 routing policies by visiting the Route 53 Developer Guide.

The proposed solution leverages weighted routing policy in Route 53 DNS to route traffic to multiple EC2 instances running within an AWS Wavelength zone.

Reference architecture

The following diagram illustrates the load-balancing component of the solution, where EC2 instances in an AWS Wavelength zone are assigned Carrier IP addresses. A weighted DNS record for a host (e.g., www.example.com) is updated with Carrier IP addresses.

DNS-based load balancing

When a device makes a DNS query, it will be returned to one of the Carrier IP addresses associated with the given domain name. With a large number of devices, we expect a fair distribution of load across all EC2 instances in the resource pool. Given the highly ephemeral mobile edge environments, it’s likely that Carrier IPs could frequently be allocated to accommodate a workload and released shortly thereafter. However, this unpredictable behavior could yield stale DNS records, resulting in a “blackhole” – routes to endpoints that no longer exist.

Time-To-Live (TTL) is a DNS attribute that specifies the amount of time, in seconds, that you want DNS recursive resolvers to cache information about this record.

In our example, we should set to 30 seconds to force DNS resolvers to retrieve the latest records from the authoritative nameservers and minimize stale DNS responses. However, a lower TTL has a direct impact on cost, as a result of increased number of calls from recursive resolvers to Route53 to constantly retrieve the latest records.

The core components of the solution are as follows:

Alongside the services above in the AWS Wavelength Zone, the following services are also leveraged in the AWS Region:

  • AWS Lambda – a serverless event-driven function that makes API calls to the Route 53 service to update DNS records.
  • Amazon EventBridge– a serverless event bus that reacts to EC2 instance lifecycle events and invokes the Lambda function to make DNS updates.
  • Route 53– cloud DNS service with a domain record pointing to AWS Wavelength-hosted resources.

In this post, we intentionally leave the specific load balancing software solution up to the customer. Customers can leverage various popular load balancers available on the AWS Marketplace, such as HAProxy and NGINX. To focus our solution on the auto-registration of DNS records to create functional load balancing, this solution is designed to support stateless workloads only. To support stateful workloads, sticky sessions – a process in which routes requests to the same target in a target group – must be configured by the underlying load balancer solution and are outside of the scope of what DNS can provide natively.

Automation overview

Using the aforementioned components, we can implement the following workflow automation:

Event-driven Auto Scaling Workflow

Amazon CloudWatch alarm can trigger the Auto Scaling group Scale out or Scale in event by adding or removing EC2 instances. Eventbridge will detect the EC2 instance state change event and invoke the Lambda function. This function will update the DNS record in Route53 by either adding (scale out) or deleting (scale in) a weighted A record associated with the EC2 instance changing state.

Configuration of the automatic auto scaling policy is out of the scope of this post. There are many auto scaling triggers that you can consider using, based on predefined and custom metrics such as memory utilization. For the demo purposes, we will be leveraging manual auto scaling.

In addition to the core components that were already described, our solution also utilizes AWS Identity and Access Management (IAM) policies and CloudWatch. Both services are key components to building AWS Well-Architected solutions on AWS. We also use AWS Systems Manager Parameter Store to keep track of user input parameters. The deployment of the solution is automated via AWS CloudFormation templates. The Lambda function provided should be uploaded to an AWS Simple Storage Service (Amazon S3) bucket.

Amazon Virtual Private Cloud (Amazon VPC), subnets, Carrier Gateway, and Route Tables are foundational building blocks for AWS-based networking infrastructure. In our deployment, we are creating a new VPC, one subnet in an AWS Wavelength zone of your choice, a Carrier Gateway, and updating the route table for this subnet to point the default route to the Carrier Gateway.

Wavelength VPC architecture.

Deployment prerequisites

The following are prerequisites to deploy the described solution in your account:

  • Access to an AWS Wavelength zone. If your account is not allow-listed to use AWS Wavelength zones, then opt-in to AWS Wavelength zones here.
  • Public DNS Hosted Zone hosted in Route 53. You must have access to a registered public domain to deploy this solution. The zone for this domain should be hosted in the same account where you plan to deploy AWS Wavelength workloads.
    If you don’t have a public domain, then you can register a new one. Note that there will be a service charge for the domain registration.
  • Amazon S3 bucket. For the Lambda function that updates DNS records in Route 53, store the source code as a .zip file in an Amazon S3 bucket.
  • Amazon EC2 Key pair. You can use an existing Key pair for the deployment. If you don’t have a KeyPair in the region where you plan to deploy this solution, then create one by following these instructions.
  • 4G or 5G-connected device. Although the infrastructure can be deployed independent of the underlying connected devices, testing the connectivity will require a mobile device on one of the Wavelength partner’s networks. View the complete list of Telecommunications providers and Wavelength Zone locations to learn more.

Conclusion

In this post, we demonstrated how to implement DNS-based load balancing for workloads running in an AWS Wavelength zone. We deployed the solution that used the EventBridge Rule and the Lambda function to update DNS records hosted by Route53. If you want to learn more about AWS Wavelength, subscribe to AWS Compute Blog channel here.

Scaling AWS Outposts rack deployments with ACE racks

Post Syndicated from Sheila Busser original https://aws.amazon.com/blogs/compute/scaling-aws-outposts-rack-deployments-with-ace-racks/

This blog post is written by Eric Vasquez, Specialist Hybrid Edge Solutions Architect, and Paul Scherer, Senior Network Service Tech.

Overview

AWS Outposts brings managed, monitored AWS infrastructure, compute, and storage to your on-premises environment. It provides the same AWS APIs, and console experience you would get within the AWS Region to which the Outpost is homed to. You may already have an Outposts rack. An Outpost can consist of one or more racks creating a pool of consumable resources as a single logical Outpost. In this post, we will introduce you to an Aggregation, Core, Edge (ACE) rack.

Depending on your familiarity with the Outpost family, you might have already heard about an ACE rack. An ACE rack serves as an aggregation point for multi-rack Outpost deployments. ACE racks reduce the physical networking port requirements as well as the logical interfaces needed, while allowing for connectivity between multiple racks in your logical Outpost. ACE racks are recommended for customers with planned deployments beyond three racks excluding the ACE rack itself.

We recommend that all customers leverage an ACE rack if planning expansions beyond three racks in the long-term, even if the initial deployment is a single rack. An ACE rack contains four routers, and these routers can connect to either two or four customer upstream devices. For the best redundancy, reliability, and resiliency, we recommend deploying an ACE rack to four upstream customer devices.

ACE racks support 10G, 40G, and 100G connections to a customer network. However, 100G connections between each ACE router to a customer device are recommended.

Outpost architecture aceOutpost extension from region and ACE rack deployment in a 15 rack Outpost configuration

Each Outposts rack comes standard with redundant Outpost networking devices, power supplies, and two top-of-rack patch panels which serve as demarcation points between the Outpost rack and your customer networking device (CND). For the remainder of this post, we’ll refer to the Outpost Networking Devices as OND and customer switches/routers as CND. The Outpost rack ONDs form Border Gateway Protocol (BGP) neighbor relationships with either your CND or the ACE rack using point-to-point (P2P) Virtual LAN (VLAN) interfaces.

For Outposts installation without an ACE rack, each Outposts OND connects to your LAN using single-mode or multi-mode fiber with LC connectors supporting 1G, 10G, 40G, or 100G connectivity. We provide flexibility for the CNDs and allow either Layer 2 or Layer 3 devices, including firewalls. Each OND uses a single LACP port channel that carries 2 VLAN point-to-points virtual interfaced (VIF)to establish 2 BGP relationships over the port channel to your upstream CND and aggregate total bandwidth. This results in each Outpost rack requiring a minimum of two physical uplinks, but as a general best practice we recommend two-per-device for a total of four uplinks, along with two LACP port channels and 4 VLAN to establish point-to-points (P2P) BGP peering’s. Note that the IP’s used in the following diagram are just examples.

Outpost Service link and Local Gateway VLANOutpost Service link and Local Gateway VLAN

As we continue to expand rack deployments, so will the number of physical uplinks and VLAN interfaces required for the added OND to a CND. When we introduce the ACE rack, the OND is no longer attached to your CND. Instead, it goes directly to ACE devices, which provide at least one uplink to your network switch/router. In this topology, AWS owns the VLAN interface allocation and configuration between compute rack OND and the ACE routers.

Let’s cover the potential downsides to a multi-rack installation without an ACE rack. In this case, we have a three-rack Outpost deployment, with one uplink (two per rack) from each rack OND to the CND. This would require you to provide: six physical ports on your devices, six fiber cables,12 VLAN VIFs, 12 P2P subnets potentially exhausting 24 ips, and six port channels.

In comparison to a three-rack install that sits behind an ACE rack, you provide fewer physical network ports on your devices, fewer fiber cabling uplinks, fewer VLAN VIFs, fewer port channels, and fewer P2P’s. Each ACE router will have its own LACP port channel with 2x VLAN VIFs in each channel (the same as an Outposts Networking Devices (OND) <> Customer connection). The following table highlights the advantages in using an ACE rack when running a multi-rack Outpost, which becomes more desirable as you continue to scale.

2-Rack Outpost

Installation

3-Rack Outpost

Installation

4-Rack Outpost

Installation

Requirement

Without ACE With ACE Without ACE With ACE Without ACE With ACE

Physical Ports

4

4

6

4

8

4

Fiber Cables

4

4 6 4 8

4

LACP Port Channels

4

4 6 4

8

4

VLAN VIFs

8

8 12 8 16

8

P2P Subnets 8 8 12 8 16

8

ACE VS Non-ACE Rack Components Comparison

Furthermore, you should consider the additional weight, and power requirements that an ACE rack introduces when planning for multi-rack deployments. In addition to initial kVA requirements for the Outpost racks you must account for the resources required for an ACE rack. An ACE rack consumes up to 10kVA of power and weighs up to 705 lbs. Carefully planning additional capacity for these resources with your AWS account team will be critical for a successful deployment.

Similar to an Outpost rack, an ACE rack deployment is monitored by AWS. The rack provides telemetry data transmitted over a set of VPN tunnels back to the anchor points in the Region to which the Outpost is homed. This allows AWS to monitor the rack for hardware failures, performance degradation, and other alarm conditions including Links, Interfaces going down, and BGP drops.

As part of the Outpost ordering process, AWS will work closely with you to determine the location for install, power availability on-site, and the network configuration of both the Outposts rack and ACE rack. This includes BGP configuration, and the Customer Owned IP Address (CoIP), which is the pool of IP addresses for route advertisements back to your CND. The COIP pool allows resources inside your Outpost rack to communicate with on-premises resources and vice-versa. Another connectivity option would be the Direct VPC Routing (DVR) where we advertise VPC subnets associated with your LGW to your on-premises networks. Outposts uses a networking connectivity back to the Region for management purposes called the service link (SL). The SL is an encrypted set of VPN connections used whenever the Outpost communicates with your chosen home Region.

Conclusion

This post addresses the most common questions surrounding ACE racks, how an ACE rack can be deployed, and why an ACE rack would be leveraged for a multi-Outpost rack deployment. In this post, we demonstrated how an ACE rack serves as a consolidation point in your on-premises environment, making multi-rack deployments scalable, while reducing complexity and physical port allocation for connectivity between an Outpost and your LAN. In addition, we described how you can get this process started. If you want to learn more about Outposts fundamentals and how you can build your applications with AWS services using Outposts for hybrid cloud deployments you can learn more check out the Outposts user guide.

AWS Local Zones and AWS Outposts, choosing the right technology for your edge workload

Post Syndicated from Sheila Busser original https://aws.amazon.com/blogs/compute/aws-local-zones-and-aws-outposts-choosing-the-right-technology-for-your-edge-workload/

This blog post is written by Joe Sacco, Senior Technical Account Manager.

The AWS Global Cloud Infrastructure includes 30 Launched Regions, 96 Availability Zones (AZs), 410+ Points of Presence with 400+ Edge Locations, and 13 Regional Edge Caches.  With over 200 AWS services, most customer workloads can run in the AWS Regions. However, for some location-sensitive workloads with low-latency or data residency requirements, and when an AWS Region isn’t close enough, AWS offers two additional infrastructure options: AWS Local Zones and AWS Outposts. Although Local Zones and Outposts solve for similar problems, we’ll review use cases as well as the services and features available that can help you decide which offering best suits your needs.

Let’s start with an overview of Local Zones and Outposts.

What are Local Zones?

Local Zones are a new type of infrastructure deployment that places AWS compute, storage, database, and other select AWS services in large metropolitan areas closer to end users. This gives you access to single-digit millisecond latency with the use of AWS Direct Connect and the ability to meet data residency requirements. Local Zones are also connected to their parent Region via AWS’s redundant and high bandwidth private network. This gives applications running in Local Zones fast, secure, and seamless access to a complete list of services in the parent Region.

Unlike Outposts, which you deploy within your datacenter or a co-location of your choice, Local Zones are owned, managed, and operated by AWS. Local Zones eliminate the need for you to manage power, connectivity, and capacity. Furthermore, you can provision workloads on a Local Zone from your AWS Management Console just as you would for AZs and Regions today.

AWS Local Zones how it worksWhat is Outposts?

Outposts is a family of fully managed solutions delivering AWS infrastructure and services to virtually any on-premises or edge location for a truly consistent hybrid experience. Outposts lets you run some AWS services locally and connect to a broad range of services available in the local AWS Region. Outposts comes in two types of offerings: Outposts rack and Outposts servers, with which you can run applications and workloads on-premises using the same AWS infrastructure, services, tools, and APIs as in AWS Regions.

The Outposts rack is available as an industry standard 42U form factor. It provides the same AWS infrastructure, services, tools, and APIs to your data center or co-location space  that you would find in an AWS Region.

Outposts Rack

The Outposts servers come in a 1U or 2U form factor and are designed for locations that have limited space or smaller capacity requirements. Both support different compute instances, as detailed in the Outposts servers feature page.

Outposts ServersCustomer use cases

Now that we have an overview of both Local Zones and Outposts service offerings, let’s dive into use cases, the differences between them, and how your business can leverage each to accomplish your workloads requirements.

Low latency

Customers today require low latency computing for workloads, such as medical imaging, transaction processing for Enterprise Resource Planning (ERP) applications, enterprise migration with hybrid architecture, real-time multiplayer gaming, telco network function virtualization, and regulated gaming workloads.

Outposts can meet ultra-low latency requirements. This is accomplished by bringing AWS services on premises and to the edge at Outpost Sites. An Outpost site is the physical location where your Outpost operates, and it can be local within one of your data centers or at a co-location facility of your choice.

When accessing from within the same metro, Local Zones will provide you with a low, single millisecond latency experience when communicating with your applications. Latency between Local Zones and AWS Regions or Local Zones and on-premises environments varies, and these will depend on how close the nearest Local Zone is as well as the type of modality used for the connection (Public Internet, VPN, and AWS Direct Connect). You should always choose the closest Local Zone location to achieve the lowest possible latency. For use cases such as mobile gaming, you can utilize Local Zones by deploying your applications to a Local Zone location nearest to your end users. Local Zones are generally available in 17 metros across the US, 4 outside the US, and we are continuing to launch Local Zones in 30 cities across 25 countries. Check out updates for more general availability of Local Zones.

Data residency

On occasion, data must remain in a specific geographic region for regulatory or information security reasons. Healthcare and other regulated industries, such as financial services or Oil & Gas, have specific data residency requirements.

Outposts helps meet a customer’s data residency requirements because it’s installed on premises and essentially brings AWS to where the data currently resides. This allows you to pick and control where your workloads run, and where your data will stay. Check out the full list of countries and territories where Outposts is available on the FAQs page of Outposts rack and the FAQs page of Outposts servers.

Local Zones bring AWS closer or within a customer’s geographic boundary in a fully AWS owned and operated mode. Although Local Zones can help meet data residency use cases in some scenarios, data residency requirements vary depending on the jurisdictions. Therefore, you should work closely with your compliance and information security teams when choosing the Local Zone location in which to deploy your regulated workloads.

Migration and modernization

When trying to migrate to the cloud and modernize your stack, some workloads can be challenging. Often there are on-premises applications which are difficult to move into Regions due to latency-sensitive system intermittencies between their various components. As dependencies arise, you may choose to segment these migrations into smaller pieces. Then this will require latency-sensitive connectivity between the various parts of the application.

Outposts and Local Zones both allow for a gradual migration and modernization of your stack. You can choose to migrate parts of their workloads while still maintaining latency-sensitive connectivity between components until the entirety is ready to move.

Factors in selecting Local Zones or Outposts

Choosing between Local Zones and Outposts will depend on the following factors, and you should examine all of them together when selecting a service for your use case.

  1. Latency requirements

Local Zones can achieve low single millisecond latency when accessing within the same metro. On the other hand, Outposts can achieve ultra-low latency requirements when deployed within your datacenter or at a co-location facility of your choice. When selecting one over the other, you must work backward from your goal and workload requirements.

If you’re conducting a migration and modernization strategy which requires ultra-low latency between a workloads application and database tiers that are difficult to migrate to the AWS Regions, then Outposts would be the right solution for you.

Alternatively, if your workload involves streaming live broadcasts to end users which requires low single millisecond latency, but your end users are located where an AWS Region isn’t available, then Local Zones distributed across various metros would work best to serve your content.

  1. Availability of services needed to support your workload

Local Zones and Outposts differ with their list of supported AWS services, and you must review your workload’s service requirements when determining the best fit for you. For example, if a customer has a computer vision workload that requires storing and retrieving large volumes of images locally using Amazon Simple Storage Service (Amazon S3), then Outposts and certain Local Zones meet this requirement while other Local Zones don’t. Learn how you can use Amazon S3 on Outposts for computer vision workloads.

Outposts rack and servers support different sets of AWS services locally. You can view comparisons between them, or visit the Outposts servers and Outposts rack feature sites for more details.

Local Zones’ features vary depending on the location in which you choose to deploy. You can view more details and a full list of supported features and services per location on our Local Zones features page.

  1. Investment and management of infrastructure on-premises

Management of the infrastructure and prerequisites are another factor when considering which AWS service best suits your needs.

Outposts is ordered through AWS, and it requires installation in a customer’s on-premises datacenter or co-location provider of their choice. Outposts rack installation is handled by AWS, while Outposts servers installation is done by the customer or a third-party of their choosing. There are power and redundant networking requirements for the Outpost Site, as well as a required subscription to AWS Enterprise Support or On-Ramp Support.

Local Zones infrastructure is fully-managed by AWS, including the power, networking, and capacity. This reduces operational management as well as the overhead cost for customers. An Enterprise support agreement isn’t required to utilize Local Zones.

You should always choose Regions or Local Zones if your use case allows, and use Outposts when a Region or Local Zone isn’t a good fit. If both Outposts and Local Zones fit a customer’s use case and requirements, then Local Zones will be the preferred choice.

  1. Regulations, compliance, and information security

If a Local Zone is either unavailable or unable to meet your residency requirements within your geographic boundary consider Outposts, which can be deployed to a data center or co-location facility of your choice. Data residency requirements can be a factor based on your industry and the regulations to which your workload must adhere. Furthermore, you should work closely with your compliance and information security teams when choosing between Local Zones or Outposts.

Conclusion

Whether you’re dealing with latency-sensitive applications, data residency requirements, or a migration and modernization strategy, AWS provides options and flexibility for you to leverage the same AWS infrastructure, services, APIs, and tools to metro areas and on-premises locations with Local Zones and Outposts.

The decision of which technology to use will depend on several factors that we discussed above. You must work across teams within your organization to make sure that the latency requirements (low single millisecond latency within a metro for Local Zones vs the ultra low latency of Outposts when deployed close to or within your datacenter), data reseidency needs, installation prerequisites, and availability of services to support your workload are met.

Once these factors are taken into account, and you have made a choice, visit our product pages for Outposts and Local Zones with information on how you can get started.

Monitoring shared AWS Outposts rack capacity

Post Syndicated from Sheila Busser original https://aws.amazon.com/blogs/compute/monitoring-shared-aws-outposts-rack-capacity/

This post is written by Adam Imeson, Sr. Hybrid Edge Specialist Solutions Architect.

AWS Outposts rack is a fully-managed service that offers the same AWS infrastructure, APIs, tools, and a subset of AWS services to any data center, colocation space, or on-premises facility for a consistent hybrid experience. Outposts rack is ideal for workloads that require low latency, access to on-premises systems, local data processing, data residency, and migration of applications with local system interdependencies.

An Outpost is a pool of AWS compute and storage capacity deployed at a customer site. In an Outposts rack deployment, an Outpost may comprise of one or more racks connected together at the site. It’s common for customers to order their Outpost in a dedicated account and then integrate with their multi-account organizational architecture by sharing the Outpost via AWS Resource Access Manager (AWS RAM). This post will explain how to set up cross-account Amazon CloudWatch metrics so that disparate stakeholders within your organization can effectively monitor your Outpost’s capacity to meet their specific needs.

Overview

The AWS account that you use to order an Outpost owns that Outpost. This includes all metrics and health events pertaining to that Outpost. Many customers must integrate Outposts into their multi-account environments, as discussed in the “Best practices: AWS Outposts in a multi-account AWS environment” posts (part 1 and part 2). This post will go into more detail on how to monitor Outposts in these environments.

The nuance here stems from the different ways to share access to AWS resources. AWS RAM allows infrastructure resources to be shared across multiple accounts. Then, the consumer accounts can launch resources on the infrastructure as though they owned it. AWS Identity and Access Management (IAM) allows customers to modify a given account’s permissions such that users in other accounts can make AWS API calls that affect the given account.

An Outpost provides infrastructure resources, so customers can share Outposts via AWS RAM. CloudWatch metrics about Outposts are data which customers retrieve using AWS API calls, so customers can share access to those metrics using IAM.

In a typical customer’s AWS Organization, there are two cases to consider. First, when the customer is sharing an Outpost to multiple development accounts, each account needs to view metrics relevant to the Outpost so that the development accounts can deploy and operate their applications.Diagram depicting an Outpost that a customer has shared to three different accounts using RAM. The three different accounts each have a different application deployed in them.

Second, when the customer has several accounts that each own different Outposts, the customer’s centralized monitoring account needs to track metrics relevant to each of the Outposts.

Diagram depicting three accounts that each own a separate Outpost, with all three accounts sharing Outpost metrics to CloudWatch in the customer’s central monitoring account.

This post will explain strategies for both cases.

Customers must monitor the health of the Outpost’s connection to its regional control plane (the Outpost’s service link), as an Outpost is an extension of an AWS Availability Zone (AZ) and is designed to be connected to an AZ at all times. The health of the Outpost’s service link is a crucial variable when application owners are diagnosing disruptions to their application, and also when infrastructure owners are diagnosing disruptions to a site. Customers can monitor their service link’s status with the ConnectedStatus metric.

Customers also must monitor their Outposts’ current capacity. Outposts necessarily have a limited capacity footprint when compared to an AWS Region. Application owners must make informed decisions about capacity as they scale their apps over time or respond to occasional hardware failures. Infrastructure owners also must maintain a holistic view of capacity across all of the Outposts for which they are responsible so that they can plan for capacity expansion over time. Customers can monitor their Outposts’ capacity using the various capacity metrics that Outposts provide.

For an overview of how to set up a capacity dashboard and capacity-based CloudWatch alarms within a single account, see “Monitoring AWS Outposts capacity.” This post will expand on the single-account strategy by introducing cross-account capabilities. See also “Cross-Account Cross-Region Dashboards with Amazon CloudWatch.” These two posts provide practical walkthroughs for setting up the metric flows explained below.

Setting up Outposts metric permissions for your organization

This post assumes that you have multiple Outposts in different accounts that are all part of the same Organization. You’re sharing these Outposts into accounts that development teams use to deploy and operate their applications. You also have a centralized monitoring account where your infrastructure team tracks various metrics across all accounts. Your Organization might look something like this:

A base diagram depicting six AWS accounts with different names. Outpost Account 1 contains an Outpost. Outpost Account 2 contains a different Outpost. Monitoring Account contains Amazon CloudWatch. Accounts A through C contain Applications A through C respectively.

The first Outpost is shared to Accounts A and B, and the second Outpost is only shared to Account B. This is just an example of how a customer might set up their environment so that Application A can deploy on Outpost 1, and Application B can deploy on both Outpost 1 and 2.

The same base diagram of the six AWS accounts as before, with arrows added to depict AWS RAM resource shares. Outpost Account 1 shares its Outpost to Accounts A and B. Outpost Account 2 shares its Outpost to Account B.

To enable centralized monitoring, each account shares CloudWatch metrics with the central monitoring account as described in “Cross-Account Cross-Region Dashboards with Amazon CloudWatch.”

The same base diagram of the six AWS accounts, with arrows added to depict CloudWatch metrics being shared from all five of the other accounts to the Monitoring Account.

Now there are application accounts which can launch on the desired Outposts, and all of the accounts are sharing metrics with the central monitoring account. The team responsible for procuring and managing the Outposts can now set up dashboards in the central monitoring account in accordance with “Monitoring AWS Outposts capacity” to get a holistic view of capacity. This is valuable for capacity planning as applications naturally grow over time.

However, this may not be sufficient for operations. Consider that each application team needs to understand how much capacity is available on the Outpost that they’re using. This is crucial for teams operating highly available applications to maintain awareness of whether they still have N+1 capacity available on the Outpost to use in the event of a hardware failure. This is also important for planning expansions to the application ahead of time, as application teams have the best understanding of the future needs of their applications. Finally, application teams can use the metrics to track the operational health of the Outpost, which is crucial for root-causing any application disruptions.

You can implement this by sharing CloudWatch metrics from the Outpost accounts to the application accounts which are consuming the Outposts’ capacity, as shown in the following diagram.

The same base diagram of the six AWS accounts, with arrows depicting CloudWatch metrics being shared. Outpost Account 1 is sharing CloudWatch metrics to Accounts A and B. Outpost Account 2 is sharing CloudWatch metrics to Account B.

Walkthrough

Log in to your application account and navigate to the CloudWatch console. Open the Settings menu and choose Configure.

Screenshot of the CloudWatch Console’s Settings menu.

Scroll to the bottom. In the View cross-account cross-region section, choose Edit.

Screenshot of the Cross-account cross-region sub-menu in the CloudWatch console.

Choose your preferred account selection method from the three options and choose Save changes. I recommend the Custom account selector option, as it strikes a good balance between a simple setup and ease of use. If you choose this option, then input the Outpost owner account’s account ID and a human-readable name for the account. This name will appear in the drop-down when you’re using the CloudWatch console to view metrics from other accounts later.

Screenshot of the Cross-account cross-region sub-menu in the CloudWatch console, with the “Custom account selector” option selected and “123456789012 Outpost owner account” in the input field.

Your application account is now prepared to view metrics from the Outpost owner account. Now log in to the account that owns the Outpost and navigate to the CloudWatch console. You still need to share the Outpost’s metrics to the application account. Open the Settings page again, and choose Configure in the Cross-account cross-region section as before. This time, choose Share data in the Share your CloudWatch data section:

Screenshot of the Cross-account cross-region sub-menu in the CloudWatch console, with the “Share data” button circled in red in the “Share your CloudWatch data” section.

Choose Add account and input the application account’s account ID. Then scroll to the bottom of the page and choose Launch CloudFormation template.

Screenshot of the “Share your CloudWatch data sub-menu in the CloudWatch console. The “Specific accounts” option in the “Sharing” section is highlighted, and the sample account ID “234567890123” is typed into the input field.

The AWS CloudFormation template will create the CloudWatch-CrossAccountSharingRole. This role gives CloudWatch read access to the AWS account that you specified, the application account. You can view and modify this role using the IAM console if you want to. For example, you might adjust the role to allow read access to an entire Organizational Unit (OU).

Now, log back in to the application account and navigate to the CloudWatch console. Choose All metrics in the left-side menu. In the Metrics section, select the Outpost owner account from the drop-down.

Screenshot of the CloudWatch console’s “All metrics” sub-page. The account selection drop-down is circled in red in the “Metrics” subsection.

You can now view the metrics from the Outpost owner account and incorporate them into the dashboards in the application account. Now the application teams can track the Outposts’ ConnectedStatus metrics to be alerted on any disconnections from the region, and they can track the Outposts’ capacity metrics as well. It’s a best practice to alarm on Outpost capacity metrics once a consumption threshold defined by business needs has been breached.

Conclusion

Outposts rack allows customers to deploy AWS infrastructure into virtually any data center, colocation space, or on-premises facility. Outposts are tied to the AWS account that ordered them, and customers can share Outposts among AWS accounts within the same Organization. When multiple teams within a customer’s Organization are interacting with the same Outpost, that introduces additional monitoring surface area for capacity and service health. This post explains how customers can accommodate their teams’ different needs by sharing Outposts metrics around their Organization along with their Outposts. As best practices, customers should share their Outposts capacity and ConnectedStatus metrics to teams who are running applications on Outposts. Customers’ operations teams should also work with their stakeholders to define a maximum capacity utilization threshold for a given Outpost and alarm on that threshold.

Our guide to AWS Compute at re:Invent 2022

Post Syndicated from Sheila Busser original https://aws.amazon.com/blogs/compute/our-guide-to-aws-compute-at-reinvent-2022/

This blog post is written by Shruti Koparkar, Senior Product Marketing Manager, Amazon EC2.

AWS re:Invent is the most transformative event in cloud computing and it is starting on November 28, 2022. AWS Compute team has many exciting sessions planned for you covering everything from foundational content, to technology deep dives, customer stories, and even hands on workshops. To help you build out your calendar for this year’s re:Invent, let’s look at some highlights from the AWS Compute track in this blog. Please visit the session catalog for a full list of AWS Compute sessions.

Learn what powers AWS Compute

AWS offers the broadest and deepest functionality for compute. Amazon Elastic Cloud Compute (Amazon EC2) offers granular control for managing your infrastructure with the choice of processors, storage, and networking.

The AWS Nitro System is the underlying platform for our all our modern EC2 instances. It enables AWS to innovate faster, further reduce cost for our customers, and deliver added benefits like increased security and new instance types.

Discover the benefits of AWS Silicon

AWS has invested years designing custom silicon optimized for the cloud. This investment helps us deliver high performance at lower costs for a wide range of applications and workloads using AWS services.

  • Explore the AWS journey into silicon innovation with our “CMP201: Silicon Innovation at AWS” session. We will cover some of the thought processes, learnings, and results from our experience building silicon for AWS Graviton, AWS Nitro System, and AWS Inferentia.
  • To learn about customer-proven strategies to help you make the move to AWS Graviton quickly and confidently while minimizing uncertainty and risk, attend “CMP410: Framework for adopting AWS Graviton-based instances”.

 Explore different use cases

Amazon EC2 provides secure and resizable compute capacity for several different use-cases including general purpose computing for cloud native and enterprise applications, and accelerated computing for machine learning and high performance computing (HPC) applications.

High performance computing

  • HPC on AWS can help you design your products faster with simulations, predict the weather, detect seismic activity with greater precision, and more. To learn how to solve world’s toughest problems with extreme-scale compute come join us for “CMP205: HPC on AWS: Solve complex problems with pay-as-you-go infrastructure”.
  • Single on-premises general-purpose supercomputers can fall short when solving increasingly complex problems. Attend “CMP222: Redefining supercomputing on AWS” to learn how AWS is reimagining supercomputing to provide scientists and engineers with more access to world-class facilities and technology.
  • AWS offers many solutions to design, simulate, and verify the advanced semiconductor devices that are the foundation of modern technology. Attend “CMP320: Accelerating semiconductor design, simulation, and verification” to hear from ARM and Marvel about how they are using AWS to accelerate EDA workloads.

Machine Learning

Cost Optimization

Hear from our customers

We have several sessions this year where AWS customers are taking the stage to share their stories and details of exciting innovations made possible by AWS.

Get started with hands-on sessions

Nothing like a hands-on session where you can learn by doing and get started easily with AWS compute. Our speakers and workshop assistants will help you every step of the way. Just bring your laptop to get started!

You’ll get to meet the global cloud community at AWS re:Invent and get an opportunity to learn, get inspired, and rethink what’s possible. So build your schedule in the re:Invent portal and get ready to hit the ground running. We invite you to stop by the AWS Compute booth and chat with our experts. We look forward to seeing you in Las Vegas!

Introducing the price-capacity-optimized allocation strategy for EC2 Spot Instances

Post Syndicated from Sheila Busser original https://aws.amazon.com/blogs/compute/introducing-price-capacity-optimized-allocation-strategy-for-ec2-spot-instances/

This blog post is written by Jagdeep Phoolkumar, Senior Specialist Solution Architect, Flexible Compute and Peter Manastyrny, Senior Product Manager Tech, EC2 Core.

Amazon EC2 Spot Instances are unused Amazon Elastic Compute Cloud (Amazon EC2) capacity in the AWS Cloud available at up to a 90% discount compared to On-Demand prices. One of the best practices for using EC2 Spot Instances is to be flexible across a wide range of instance types to increase the chances of getting the aggregate compute capacity. Amazon EC2 Auto Scaling and Amazon EC2 Fleet make it easy to configure a request with a flexible set of instance types, as well as use a Spot allocation strategy to determine how to fulfill Spot capacity from the Spot Instance pools that you provide in your request.

The existing allocation strategies available in Amazon EC2 Auto Scaling and Amazon EC2 Fleet are called “lowest-price” and “capacity-optimized”. The lowest-price allocation strategy allocates Spot Instance pools where the Spot price is currently the lowest. Customers told us that in some cases the lowest-price strategy picks the Spot Instance pools that are not optimized for capacity availability and results in more frequent Spot Instance interruptions. As an improvement over lowest-price allocation strategy, in August 2019 AWS launched the capacity-optimized allocation strategy for Spot Instances, which helps customers tap into the deepest Spot Instance pools by analyzing capacity metrics. Since then, customers have seen a significantly lower interruption rate with capacity-optimized strategy when compared to the lowest-price strategy. You can read more about these customer stories in the Capacity-Optimized Spot Instance Allocation in Action at Mobileye and Skyscanner blog post. The capacity-optimized allocation strategy strictly selects the deepest pools. Therefore, sometimes it can pick high-priced pools even when there are low-priced pools available with marginally less capacity. Customers have been telling us that, for an optimal experience, they would like an allocation strategy that balances the best trade-offs between lowest-price and capacity-optimized.

Today, we’re excited to share the new price-capacity-optimized allocation strategy that makes Spot Instance allocation decisions based on both the price and the capacity availability of Spot Instances. The price-capacity-optimized allocation strategy should be the first preference and the default allocation strategy for most Spot workloads.

This post illustrates how the price-capacity-optimized allocation strategy selects Spot Instances in comparison with lowest-price and capacity-optimized. Furthermore, it discusses some common use cases of the price-capacity-optimized allocation strategy.

Overview

The price-capacity-optimized allocation strategy makes Spot allocation decisions based on both capacity availability and Spot prices. In comparison to the lowest-price allocation strategy, the price-capacity-optimized strategy doesn’t always attempt to launch in the absolute lowest priced Spot Instance pool. Instead, price-capacity-optimized attempts to diversify as much as possible across the multiple low-priced pools with high capacity availability. As a result, the price-capacity-optimized strategy in most cases has a higher chance of getting Spot capacity and delivers lower interruption rates when compared to the lowest-price strategy. If you factor in the cost associated with retrying the interrupted requests, then the price-capacity-optimized strategy becomes even more attractive from a savings perspective over the lowest-price strategy.

We recommend the price-capacity-optimized allocation strategy for workloads that require optimization of cost savings, Spot capacity availability, and interruption rates. For existing workloads using lowest-price strategy, we recommend price-capacity-optimized strategy as a replacement. The capacity-optimized allocation strategy is still suitable for workloads that either use similarly priced instances, or ones where the cost of interruption is so significant that any cost saving is inadequate in comparison to a marginal increase in interruptions.

Walkthrough

In this section, we illustrate how the price-capacity-optimized allocation strategy deploys Spot capacity when compared to the other two allocation strategies. The following example configuration shows how Spot capacity could be allocated in an Auto Scaling group using the different allocation strategies:

{
    "AutoScalingGroupName": "myasg ",
    "MixedInstancesPolicy": {
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {
                "LaunchTemplateId": "lt-abcde12345"
            },
            "Overrides": [
                {
                    "InstanceRequirements": {
                        "VCpuCount": {
                            "Min": 4,
                            "Max": 4
                        },
                        "MemoryMiB": {
                            "Min": 0,
                            "Max": 16384
                        },
                        "InstanceGenerations": [
                            "current"
                        ],
                        "BurstablePerformance": "excluded",
                        "AcceleratorCount": {
                            "Max": 0
                        }
                    }
                }
            ]
        },
        "InstancesDistribution": {
            "OnDemandPercentageAboveBaseCapacity": 0,
            "SpotAllocationStrategy": "spot-allocation-strategy"
        }
    },
    "MinSize": 10,
    "MaxSize": 100,
    "DesiredCapacity": 60,
    "VPCZoneIdentifier": "subnet-a12345a,subnet-b12345b,subnet-c12345c"
}

First, Amazon EC2 Auto Scaling attempts to balance capacity evenly across Availability Zones (AZ). Next, Amazon EC2 Auto Scaling applies the Spot allocation strategy using the 30+ instances selected by attribute-based instance type selection, in each Availability Zone. The results after testing different allocation strategies are as follows:

  • Price-capacity-optimized strategy diversifies over multiple low-priced Spot Instance pools that are optimized for capacity availability.
  • Capacity-optimize strategy identifies Spot Instance pools that are only optimized for capacity availability.
  • Lowest-price strategy by default allocates the two lowest priced Spot Instance pools that aren’t optimized for capacity availability

To find out how each allocation strategy fares regarding Spot savings and capacity, we compare ‘Cost of Auto Scaling group’ (number of instances x Spot price/hour for each type of instance) and ‘Spot interruptions rate’ (number of instances interrupted/number of instances launched) for each allocation strategy. We use fictional numbers for the purpose of this post. However, you can use the Cloud Intelligence Dashboards to find the actual Spot Saving, and the Amazon EC2 Spot interruption dashboard to log Spot Instance interruptions. The example results after a 30-day period are as follows:

Allocation strategy

Instance allocation

Cost of Auto Scaling group

Spot interruptions rate

price-capacity-optimized

40 c6i.xlarge

20 c5.xlarge

$4.80/hour 3%

capacity-optimized

60 c5.xlarge

$5.00/hour

2%

lowest-price

30 c5a.xlarge

30 m5n.xlarge

$4.75/hour

20%

As per the above table, with the price-capacity-optimized strategy, the cost of the Auto Scaling group is only 5 cents (1%) higher, whereas the rate of Spot interruptions is six times lower (3% vs 20%) than the lowest-price strategy. In summary, from this exercise you learn that the price-capacity-optimized strategy provides the optimal Spot experience that is the best of both the lowest-price and capacity-optimized allocation strategies.

Common use-cases of price-capacity-optimized allocation strategy

Earlier we mentioned that the price-capacity-optimized allocation strategy is recommended for most Spot workloads. To elaborate further, in this section we explore some of these common workloads.

Stateless and fault-tolerant workloads

Stateless workloads that can complete ongoing requests within two minutes of a Spot interruption notice, and the fault-tolerant workloads that have a low cost of retries, are the best fit for the price-capacity-optimized allocation strategy. This category has workloads such as stateless containerized applications, microservices, web applications, data and analytics jobs, and batch processing.

Workloads with a high cost of interruption

Workloads that have a high cost of interruption associated with an expensive cost of retries should implement checkpointing to lower the cost of interruptions. By using checkpointing, you make the price-capacity-optimized allocation strategy a good fit for these workloads, as it allocates capacity from the low-priced Spot Instance pools that offer a low Spot interruptions rate. This category has workloads such as long Continuous Integration (CI), image and media rendering, Deep Learning, and High Performance Compute (HPC) workloads.

Conclusion

We recommend that customers use the price-capacity-optimized allocation strategy as the default option. The price-capacity-optimized strategy helps Amazon EC2 Auto Scaling groups and Amazon EC2 Fleet provision target capacity with an optimal experience. Updating to the price-capacity-optimized allocation strategy is as simple as updating a single parameter in an Amazon EC2 Auto Scaling group and Amazon EC2 Fleet.

To learn more about allocation strategies for Spot Instances, visit the Spot allocation strategies documentation page.

Running AI-ML Object Detection Model to Process Confidential Data using Nitro Enclaves

Post Syndicated from Sheila Busser original https://aws.amazon.com/blogs/compute/running-ai-ml-object-detection-model-to-process-confidential-data-using-nitro-enclaves/

This blog post was written by, Antoine Awad, Solutions Architect, Kevin Taylor, Senior Solutions Architect and Joel Desaulniers, Senior Solutions Architect.

Machine Learning (ML) models are used for inferencing of highly sensitive data in many industries such as government, healthcare, financial, and pharmaceutical. These industries require tools and services that protect their data in transit, at rest, and isolate data while in use. During processing, threats may originate from the technology stack such as the operating system or programs installed on the host which we need to protect against. Having a process that enforces the separation of roles and responsibilities within an organization minimizes the ability of personnel to access sensitive data. In this post, we walk you through how to run ML inference inside AWS Nitro Enclaves to illustrate how your sensitive data is protected during processing.

We are using a Nitro Enclave to run ML inference on sensitive data which helps reduce the attack surface area when the data is decrypted for processing. Nitro Enclaves enable you to create isolated compute environments within Amazon EC2 instances to protect and securely process highly sensitive data. Enclaves have no persistent storage, no interactive access, and no external networking. Communication between your instance and your enclave is done using a secure local channel called a vsock. By default, even an admin or root user on the parent instance will not be able to access the enclave.

Overview

Our example use-case demonstrates how to deploy an AI/ML workload and run inferencing inside Nitro Enclaves to securely process sensitive data. We use an image to demonstrate the process of how data can be encrypted, stored, transferred, decrypted and processed when necessary, to minimize the risk to your sensitive data. The workload uses an open-source AI/ML model to detect objects in an image, representing the sensitive data, and returns a summary of the type of objects detected. The image below is used for illustration purposes to provide clarity on the inference that occurs inside the Nitro Enclave. It was generated by adding bounding boxes to the original image based on the coordinates returned by the AI/ML model.

Image of airplanes with bounding boxes

Figure 1 – Image of airplanes with bounding boxes

To encrypt this image, we are using a Python script (Encryptor app – see Figure 2) which runs on an EC2 instance, in a real-world scenario this step would be performed in a secure environment like a Nitro Enclave or a secured workstation before transferring the encrypted data. The Encryptor app uses AWS KMS envelope encryption with a symmetrical Customer Master Key (CMK) to encrypt the data.

Image Encryption with AWS KMS using Envelope Encryption

Figure 2 – Image Encryption with AWS KMS using Envelope Encryption

Note, it’s also possible to use asymmetrical keys to perform the encryption/decryption.

Now that the image is encrypted, let’s look at each component and its role in the solution architecture, see Figure 3 below for reference.

  1. The Client app reads the encrypted image file and sends it to the Server app over the vsock (secure local communication channel).
  2. The Server app, running inside a Nitro Enclave, extracts the encrypted data key and sends it to AWS KMS for decryption. Once the data key is decrypted, the Server app uses it to decrypt the image and run inference on it to detect the objects in the image. Once the inference is complete, the results are returned to the Client app without exposing the original image or sensitive data.
  3. To allow the Nitro Enclave to communicate with AWS KMS, we use the KMS Enclave Tool which uses the vsock to connect to AWS KMS and decrypt the encrypted key.
  4. The vsock-proxy (packaged with the Nitro CLI) routes incoming traffic from the KMS Tool to AWS KMS provided that the AWS KMS endpoint is included on the vsock-proxy allowlist. The response from AWS KMS is then sent back to the KMS Enclave Tool over the vsock.

As part of the request to AWS KMS, the KMS Enclave Tool extracts and sends a signed attestation document to AWS KMS containing the enclave’s measurements to prove its identity. AWS KMS will validate the attestation document before decrypting the data key. Once validated, the data key is decrypted and securely returned to the KMS Tool which securely transfers it to the Server app to decrypt the image.

Solution architecture diagram for this blog post

Figure 3 – Solution architecture diagram for this blog post

Environment Setup

Prerequisites

Before we get started, you will need the following prequisites to deploy the solution:

  1. AWS account
  2. AWS Identity and Access Management (IAM) role with appropriate access

AWS CloudFormation Template

We are going to use AWS CloudFormation to provision our infrastructure.

  1. Download the CloudFormation (CFN) template nitro-enclave-demo.yaml. This template orchestrates an EC2 instance with the required networking components such as a VPC, Subnet and NAT Gateway.
  2. Log in to the AWS Management Console and select the AWS Region where you’d like to deploy this stack. In the example, we select Canada (Central).
  3. Open the AWS CloudFormation console at: https://console.aws.amazon.com/cloudformation/
  4. Choose Create Stack, Template is ready, Upload a template file. Choose File to select nitro-enclave-demo.yaml that you saved locally.
  5. Choose Next, enter a stack name such as NitroEnclaveStack, choose Next.
  6. On the subsequent screens, leave the defaults, and continue to select Next until you arrive at the Review step
  7. At the Review step, scroll to the bottom and place a checkmark in “I acknowledge that AWS CloudFormation might create IAM resources with custom names.” and click “Create stack”
  8. The stack status is initially CREATE_IN_PROGRESS. It will take around 5 minutes to complete. Click the Refresh button periodically to refresh the status. Upon completion, the status changes to CREATE_COMPLETE.
  9. Once completed, click on “Resources” tab and search for “NitroEnclaveInstance”, click on its “Physical ID” to navigate to the EC2 instance
  10. On the Amazon EC2 page, select the instance and click “Connect”
  11. Choose “Session Manager” and click “Connect”

EC2 Instance Configuration

Now that the EC2 instance has been provisioned and you are connected to it, follow these steps to configure it:

  1. Install the Nitro Enclaves CLI which will allow you to build and run a Nitro Enclave application:
    sudo amazon-linux-extras install aws-nitro-enclaves-cli -y
    sudo yum install aws-nitro-enclaves-cli-devel -y
    
  2. Verify that the Nitro Enclaves CLI was installed successfully by running the following command:
    nitro-cli --version

    Nitro Enclaves CLI

  3. To download the application from GitHub and build a docker image, you need to first install Docker and Git by executing the following commands:
    sudo yum install git -y
    sudo usermod -aG ne ssm-user
    sudo usermod -aG docker ssm-user
    sudo systemctl start docker && sudo systemctl enable docker
    

Nitro Enclave Configuration

A Nitro Enclave is an isolated environment which runs within the EC2 instance, hence we need to specify the resources (CPU & Memory) that the Nitro Enclaves allocator service dedicates to the enclave.

  1. Enter the following commands to set the CPU and Memory available for the Nitro Enclave allocator service to allocate to your enclave container:
    ALLOCATOR_YAML=/etc/nitro_enclaves/allocator.yaml
    MEM_KEY=memory_mib
    DEFAULT_MEM=20480
    sudo sed -r "s/^(\s*${MEM_KEY}\s*:\s*).*/\1${DEFAULT_MEM}/" -i "${ALLOCATOR_YAML}"
    sudo systemctl start nitro-enclaves-allocator.service && sudo systemctl enable nitro-enclaves-allocator.service
    
  2. To verify the configuration has been applied, run the following command and note the values for memory_mib and cpu_count:
    cat /etc/nitro_enclaves/allocator.yaml

    Enclave Configuration File

Creating a Nitro Enclave Image

Download the Project and Build the Enclave Base Image

Now that the EC2 instance is configured, download the workload code and build the enclave base Docker image. This image contains the Nitro Enclaves Software Development Kit (SDK) which allows an enclave to request a cryptographically signed attestation document from the Nitro Hypervisor. The attestation document includes unique measurements (SHA384 hashes) that are used to prove the enclave’s identity to services such as AWS KMS.

  1. Clone the Github Project
    cd ~/ && git clone https://github.com/aws-samples/aws-nitro-enclaves-ai-ml-object-detection.git
  2. Navigate to the cloned project’s folder and build the “enclave_base” image:
    cd ~/aws-nitro-enclaves-ai-ml-object-detection/enclave-base-image
    sudo docker build ./ -t enclave_base

    Note: The above step will take approximately 8-10 minutes to complete.

Build and Run The Nitro Enclave Image

To build the Nitro Enclave image of the workload, build a docker image of your application and then use the Nitro CLI to build the Nitro Enclave image:

  1. Download TensorFlow pre-trained model:
    cd ~/aws-nitro-enclaves-ai-ml-object-detection/src
    mkdir -p models/faster_rcnn_openimages_v4_inception_resnet_v2_1 && cd models/
    wget -O tensorflow-model.tar.gz https://tfhub.dev/google/faster_rcnn/openimages_v4/inception_resnet_v2/1?tf-hub-format=compressed
    tar -xvf tensorflow-model.tar.gz -C faster_rcnn_openimages_v4_inception_resnet_v2_1
  2. Navigate to the use-case folder and build the docker image for the application:
    cd ~/aws-nitro-enclaves-ai-ml-object-detection/src
    sudo docker build ./ -t nitro-enclave-container-ai-ml:latest
  3. Use the Nitro CLI to build an Enclave Image File (.eif) using the docker image you built in the previous step:
    sudo nitro-cli build-enclave --docker-uri nitro-enclave-container-ai-ml:latest --output-file nitro-enclave-container-ai-ml.eif
  4. The output of the previous step produces the Platform configuration registers or PCR hashes and a nitro enclave image file (.eif). Take note of the PCR0 value, which is a hash of the enclave image file.Example PCR0:
    {
        "Measurements": {
            "PCR0": "7968aee86dc343ace7d35fa1a504f955ee4e53f0d7ad23310e7df535a187364a0e6218b135a8c2f8fe205d39d9321923"
            ...
        }
    }
  5. Launch the Nitro Enclave container using the Enclave Image File (.eif) generated in the previous step and allocate resources to it. You should allocate at least 4 times the EIF file size for enclave memory. This is necessary because the tmpfs filesystem uses half of the memory and the remainder of the memory is used to uncompress the initial initramfs where the application executable resides. For CPU allocation, you should allocate CPU in full cores i.e. 2x vCPU for x86 hyper-threaded instances.
    In our case, we are going to allocate 14GB or 14,366 MB for the enclave:

    sudo nitro-cli run-enclave --cpu-count 2 --memory 14336 --eif-path nitro-enclave-container-ai-ml.eif

    Note: Allow a few seconds for the server to boot up prior to running the Client app in the below section “Object Detection using Nitro Enclaves”.

Update the KMS Key Policy to Include the PCR0 Hash

Now that you have the PCR0 value for your enclave image, update the KMS key policy to only allow your Nitro Enclave container access to the KMS key.

  1. Navigate to AWS KMS in your AWS Console and make sure you are in the same region where your CloudFormation template was deployed
  2. Select “Customer managed keys”
  3. Search for a key with alias “EnclaveKMSKey” and click on it
  4. Click “Edit” on the “Key Policy”
  5. Scroll to the bottom of the key policy and replace the value of “EXAMPLETOBEUPDATED” for the “kms:RecipientAttestation:PCR0” key with the PCR0 hash you noted in the previous section and click “Save changes”

AI/ML Object Detection using a Nitro Enclave

Now that you have an enclave image file, run the components of the solution.

Requirements Installation for Client App

  1. Install the python requirements using the following command:
    cd ~/aws-nitro-enclaves-ai-ml-object-detection/src
    pip3 install -r requirements.txt
  2. Set the region that your CloudFormation stack is deployed in. In our case we selected Canada (Centra)
    CFN_REGION=ca-central-1
  3. Run the following command to encrypt the image using the AWS KMS key “EnclaveKMSKey”, make sure to replace “ca-central-1” with the region where you deployed your CloudFormation template:
    python3 ./envelope-encryption/encryptor.py --filePath ./images/air-show.jpg --cmkId alias/EnclaveKMSkey --region $CFN_REGION
  4. Verify that the output contains: file encrypted? True
    Note: The previous command generates two files: an encrypted image file and an encrypted data key file. The data key file is generated so we can demonstrate an attempt from the parent instance at decrypting the data key.

Launching VSock Proxy

Launch the VSock Proxy which proxies requests from the Nitro Enclave to an external endpoint, in this case, to AWS KMS. Note the file vsock-proxy-config.yaml contains a list of endpoints which allow-lists the endpoints that an enclave can communicate with.

cd ~/aws-nitro-enclaves-ai-ml-object-detection/src
vsock-proxy 8001 "kms.$CFN_REGION.amazonaws.com" 443 --config vsock-proxy-config.yaml &

Object Detection using Nitro Enclaves

Send the encrypted image to the enclave to decrypt the image and use the AI/ML model to detect objects and return a summary of the objects detected:

cd ~/aws-nitro-enclaves-ai-ml-object-detection/src
python3 client.py --filePath ./images/air-show.jpg.encrypted | jq -C '.'

The previous step takes around a minute to complete when first called. Inside the enclave, the server application decrypts the image, runs it through the AI/ML model to generate a list of objects detected and returns that list to the client application.

Parent Instance Credentials

Attempt to Decrypt Data Key using Parent Instance Credentials

To prove that the parent instance is not able to decrypt the content, attempt to decrypt the image using the parent’s credentials:

cd ~/aws-nitro-enclaves-ai-ml-object-detection/src
aws kms decrypt --ciphertext-blob fileb://images/air-show.jpg.data_key.encrypted --region $CFN_REGION

Note: The command is expected to fail with AccessDeniedException, since the parent instance is not allowed to decrypt the data key.

Cleaning up

  1. Open the AWS CloudFormation console at: https://console.aws.amazon.com/cloudformation/.
  2. Select the stack you created earlier, such as NitroEnclaveStack.
  3. Choose Delete, then choose Delete Stack.
  4. The stack status is initially DELETE_IN_PROGRESS. Click the Refresh button periodically to refresh its status. The status changes to DELETE_COMPLETE after it’s finished and the stack name no longer appears in your list of active stacks.

Conclusion

In this post, we showcase how to process sensitive data with Nitro Enclaves using an AI/ML model deployed on Amazon EC2, as well as how to integrate an enclave with AWS KMS to restrict access to an AWS KMS CMK so that only the Nitro Enclave is allowed to use the key and decrypt the image.

We encrypt the sample data with envelope encryption to illustrate how to protect, transfer and securely process highly sensitive data. This process would be similar for any kind of sensitive information such as personally identifiable information (PII), healthcare or intellectual property (IP) which could also be the AI/ML model.

Dig deeper by exploring how to further restrict your AWS KMS CMK using additional PCR hashes such as PCR1 (hash of the Linux kernel and bootstrap), PCR2 (Hash of the application), and other hashes available to you.

Also, try our comprehensive Nitro Enclave workshop which includes use-cases at different complexity levels.