Tag Archives: Amazon EC2

Architecting for Disaster Recovery on AWS Outposts Racks with AWS Elastic Disaster Recovery

2024-04-19 Macey Neff

Post Syndicated from Macey Neff original https://aws.amazon.com/blogs/compute/architecting-for-disaster-recovery-on-aws-outposts-racks-with-aws-elastic-disaster-recovery/

This blog post is written by Brianna Rosentrater, Hybrid Edge Specialist SA.

AWS Elastic Disaster Recovery Service (AWS DRS) now supports disaster recovery (DR) architectures that include on-premises Windows and Linux workloads running on AWS Outposts. AWS DRS minimizes downtime and data loss with fast, reliable recovery of on-premises and cloud-based applications using affordable storage, minimal compute, and point-in-time recovery. Both services are billed and managed from your AWS Management Console.

Like workloads running in AWS Regions, it’s critical to plan for failures. Outposts are designed with resiliency in mind, providing redundant power, networking, and are available to order with N+M active compute instance capacity. In other words, for every physical N compute servers, you have the option of including M redundant hosts capable of handling the workload during a failure. When leveraging AWS DRS with Outpost, you can plan for larger-scale failure modes, such as data center outages, by replicating mission-critical workloads to other remote data center locations or the AWS Region.

In this post, you’ll learn how AWS DRS can be used with Outpost rack to architect for high availability in the event of a site failure. The post will examine several different architectures enabled by AWS DRS that provide DR for Outpost, and the benefits of each method described.

Prerequisites

Each of these architectures described below need the following:

At least one Outpost rack
Outpost(s) must be in Direct VPC Routing (DVR) mode
Amazon S3 on Outposts (required for all AWS DRS replication destinations)
An AWS Replication Agent installed on each source server

Public internet access isn’t needed, AWS PrivateLink and AWS Direct Connect are supported for replication and failback which is a significant security benefit.

Planning for failure

Disasters come in many forms and are often unplanned and unexpected events. Regardless of whether your workload resides on premises, in a colocation facility, or in an AWS Region, it’s critical to define the Recovery Time Objective (RTO) and Recovery Point Objective (RPO) which are often workload-specific. These two metrics profile how long a service can be down during recovery and quantify the acceptable amount of data loss. RTO and RPO guide you in choosing the appropriate strategy such as backup and recovery, pilot light, warm standby, or a multi-site (active-active) approach.

With AWS DRS, while failing back to a test machine (not the original source server), replication of the source server continues. This allows failback drills without impacting RPO, and non-disruptive failback drills are an important part of disaster planning to validate your recovery plan meets your expected RPO/RTO as per your business requirements.

How AWS DRS integrates with Outpost

AWS DRS uses an AWS Replication Agent at the source to capture the workload and transfer it to a lightweight staging area, which resides on an Outpost equipped with Amazon S3 on Outposts. This method also provides the ability to perform low-effort, non-disruptive DR drills before making the final cutover. The AWS Replication Agent doesn’t need a reboot nor does it impact your applications during installation.

When an Outpost’s subnet is selected as the target for replication or launch, all associated AWS DRS components remain within the Outpost, including the AWS DRS server conversion technology. These conversion servers convert source disks of servers being migrated so that they can boot and run in the target infrastructure, Amazon EBS volumes, snapshots, and replication servers. The replication servers replicate the disks to the target infrastructure. With AWS DRS you can control the data replication path using private connectivity options such as a virtual private network (VPN), AWS Direct Connect, VPC peering, or another private connection. Learn more about using a private IP for data replication.

AWS DRS provides nearly continuous replication for mission-critical workloads and supports deployment patterns including on-premises to Outpost, Outpost to Region, Region to Outpost, and between two logical Outposts through local networks. To leverage Outpost with AWS DRS, simply select the Outpost subnet as your target or source for replication when configuring AWS DRS for your workload. If you are currently using CloudEndure DR for disaster recovery with Outpost, see these detailed instructions for migrating to AWS DRS from CloudEndure DR.

DR from on-premises to Outpost

Outpost can be used as a DR target for on-premises workloads. By deploying an Outpost in a remote data center or colocation a significant distance from the source within the same geo-political boundary, you can replicate workloads across great distances and increase resiliency of the data while ensuring adherence to data residency policies or legislation.

Figure 1 – DR from on-premises to Outposts

In Figure 1, on premises sources replicate traffic from a LAN to a staging area residing in an Outpost subnet via the local gateway. This allows workloads to failover from their on-premises environment to an Outpost in a different physical location during a disaster.

The staging areas and replication servers run on Amazon Elastic Compute Cloud (Amazon EC2) with Amazon EBS volumes and require Amazon S3 on Outposts where the Amazon EBS snapshots reside.

The replication agent is responsible for providing nearly continuous, block-level replication from your LAN using TCP/1500 with traffic routing to Amazon EC2 instances using the Outposts local gateway.

DR from Outpost to Region

Since its initial release, Outpost has supported Amazon EBS snapshots written to Amazon S3 located in the AWS Region. Backup to an AWS Region is one of the most cost-effective and easiest-to-configure DR approaches, enabling data redundancy outside of your Outpost and data center.

This method also offers flexibility for restoration within an AWS Region if the original deployment is irrecoverable. However, depending on the frequency of the snapshots and the timing of the failure, backup, and recovery to the Region has the potential to have an RPO/RTO spanning hours depending on the throughput of the service link.

For critical workloads, AWS DRS can reduce RTO to minutes and RPO in the sub-second range. After creating an initial replication of workloads that reside on the Outpost, AWS DRS provides nearly continuous, block-level replication in the Region. Just like replication from non-AWS virtual machines or bare metal servers, AWS DRS resources, including Replication Servers, Conversion Servers, Amazon EBS Volumes, and Snapshots reside in the Region.

Figure 2 – DR from Outpost to Region

In Figure 2, data replication is performed over the service link from Amazon EC2 instances running locally on an Outpost to an AWS Region. The service link traverses either public Region connectivity or AWS Direct Connect.

AWS Direct Connect is the recommended option because it provides low latency and consistent bandwidth for the service link back to a Region, which also improves the reliability of transmission for AWS DRS replication traffic.

The service link is comprised of redundant, encrypted VPN tunnels. Replication traffic can also be sent privately without traversing the public internet by leveraging Private Virtual Interfaces with Direct Connect for the service link.

With this architecture in place, you can mitigate disasters and reduce downtime by failing over to the AWS Region using AWS DRS.

DR from Region to Outpost

AWS provides multiple Availability Zones (AZs) within a Region and isolated AWS Regions globally for the greatest possible fault tolerance and stability. The reliability pillar of AWS’s Well-Architected Framework encourages distributing workloads across AZs and replicating data between Regions when the need for distances exceeds those of AZs.

AWS DRS supports nearly continuous replication of workloads from a Region to an Outpost within your data center or colocation facility for DR. This deployment model provides increased durability from a source AWS Region to an Outpost anchored to a different Region.

In this model, AWS DRS components remain on-premises within the Outpost, but data charges are applicable as data egresses from the Region back to the data center and Amazon S3 on Outposts is required on the destination Outpost.

Figure 3 – DR from Region to Outpost

Implementing the preceding architecture diagram enables failover of critical workloads from the Region to on-premises Outposts seamlessly. Keep in mind that AWS Regions provide the management and control plane for Outpost, making it critical to consider probability and frequency of service link interruptions as a part of your DR planning. Scenarios such as warm standby with pre-allocated Amazon EC2 and Amazon EBS resources may prove more resilient during service link disruptions.

DR between two Outposts

Each logical Outpost is comprised of one or more physical racks. Logical Outposts are in independent colocations of one another, and support deployments in disparate data centers or colocation facilities. You can elect to have multiple logical Outposts anchored to different Availability Zones or Regions. AWS DRS unlocks options for replication between two logical Outposts, leading to increased resiliency and reducing the impact of your data center as a single point of failure. In the following architecture, nearly continuous replication captured from a single Outpost source is applied at a second logical Outpost.

Figure 4 – DR between two Outposts

Supporting both directional and bidirectional replication between Outposts can minimize disruption caused by events that take down a data center, Availability Zone, or even the entire Region result in minimal disruption. In the following architecture diagram, bidirectional data replication occurs between the Outposts by routing traffic via the local gateways, minimizing outbound data charges from the Region and allowing for more direct routing between deployment sites that could potentially span significant distances. AWS DRS cannot communicate with resources directly utilizing a customer-owned IP address pool (CoIP pool).

Figure 5 – DR between two Outposts – bidirectional

Architecture Considerations

When planning an Outpost deployment leveraging AWS DRS, it’s critical to consider the impact on storage. As a general best practice, AWS recommends planning for a 2:1 ratio consisting of EBS volumes used for nearly continuous replication and Amazon EBS snapshots on Amazon S3 for point-in-time recovery. While it’s unlikely that all servers would need recovery simultaneously, it’s also important to allocate a reserve of EBS volume capacity, which will launch at the time of recovery. Amazon S3 on Outpost is needed for each Outpost used as a replication destination, and the recommendation is to plan for a 1:1 ratio consisting of S3 on Outposts storage, plus the rate of data change. For example, if your data change rate is 10%, you’d want to plan for 110% S3 on Outpost use with AWS DRS.

Amazon CloudWatch has integrated metrics for EC2, Amazon EBS, and Amazon S3 capacity on Outposts, making it easy to create custom tailored dashboards and integrate with Simple Notification Service (Amazon SNS) for alerts at defined thresholds. Monitoring these metrics is critical in making sure that proper free space is available for data replication to occur unimpeded. CloudWatch has metrics available for AWS DRS as well. You can also use the AWS DRS service page in the AWS Console to monitor the status of your recovery instances.

Consider taking advantage of Recovery Plans within AWS DRS to make sure that related services are recovered in a particular order. For example, during a disaster, it might be critical to first bring up a database before recovering application tiers. Recovery plans provide the ability to group related services and apply wait times to individual targets.

Conclusion

AWS Outpost enables low latency, data residency, or data gravity-constrained workloads by supplying managed cloud compute and storage services within your data center or colocation. When coupled with AWS DRS, you can decrease RPO and RTO through a variety of flexible deployment models with sources and destinations ranging from on-premises, the Region, or another AWS Outpost.

AWS Weekly Roundup: Amazon EC2 G6 instances, Mistral Large on Amazon Bedrock, AWS Deadline Cloud, and more (April 8, 2024)

2024-04-08 Donnie Prakoso

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-mistral-large-aws-clean-rooms-ml-aws-deadline-cloud-and-more-april-8-2024/

We’re just two days away from AWS Summit Sydney (April 10–11) and a month away from the AWS Summit season in Southeast Asia, starting with the AWS Summit Singapore (May 7) and the AWS Summit Bangkok (May 30). If you happen to be in Sydney, Singapore, or Bangkok around those dates, please join us.

Last Week’s Launches
If you haven’t read last week’s Weekly Roundup yet, Channy wrote about the AWS Chips Taste Test, a new initiative from Jeff Barr as part of April’ Fools Day.

Here are some launches that caught my attention last week:

New Amazon EC2 G6 instances — We announced the general availability of Amazon EC2 G6 instances powered by NVIDIA L4 Tensor Core GPUs. G6 instances can be used for a wide range of graphics-intensive and machine learning use cases. G6 instances deliver up to 2x higher performance for deep learning inference and graphics workloads compared to Amazon EC2 G4dn instances. To learn more, visit the Amazon EC2 G6 instance page.

Mistral Large is now available in Amazon Bedrock — Veliswa wrote about the availability of the Mistral Large foundation model, as part of the Amazon Bedrock service. You can use Mistral Large to handle complex tasks that require substantial reasoning capabilities. In addition, Amazon Bedrock is now available in the Paris AWS Region.

Amazon Aurora zero-ETL integration with Amazon Redshift now in additional Regions — Zero-ETL integration announcements were my favourite launches last year. This Zero-ETL integration simplifies the process of transferring data between the two services, allowing customers to move data between Amazon Aurora and Amazon Redshift without the need for manual Extract, Transform, and Load (ETL) processes. With this announcement, Zero-ETL integrations between Amazon Aurora and Amazon Redshift is now supported in 11 additional Regions.

Announcing AWS Deadline Cloud — If you’re working in films, TV shows, commercials, games, and industrial design and handling complex rendering management for teams creating 2D and 3D visual assets, then you’ll be excited about AWS Deadline Cloud. This new managed service simplifies the deployment and management of render farms for media and entertainment workloads.

AWS Clean Rooms ML is Now Generally Available — Last year, I wrote about the preview of AWS Clean Rooms ML. In that post, I elaborated a new capability of AWS Clean Rooms that helps you and your partners apply machine learning (ML) models on your collective data without copying or sharing raw data with each other. Now, AWS Clean Rooms ML is available for you to use.

Knowledge Bases for Amazon Bedrock now supports private network policies for OpenSearch Serverless — Here’s exciting news for you who are building with Amazon Bedrock. Now, you can implement Retrieval-Augmented Generation (RAG) with Knowledge Bases for Amazon Bedrock using Amazon OpenSearch Serverless (OSS) collections that have a private network policy.

Amazon EKS extended support for Kubernetes versions now generally available — If you’re running Kubernetes version 1.21 and higher, with this Extended Support for Kubernetes, you can stay up-to-date with the latest Kubernetes features and security improvements on Amazon EKS.

AWS Lambda Adds Support for Ruby 3.3 — Coding in Ruby? Now, AWS Lambda supports Ruby 3.3 as its runtime. This update allows you to take advantage of the latest features and improvements in the Ruby language.

Amazon EventBridge Console Enhancements — The Amazon EventBridge console has been updated with new features and improvements, making it easier for you to manage your event-driven applications with a better user experience.

Private Access to the AWS Management Console in Commercial Regions — If you need to restrict access to personal AWS accounts from the company network, you can use AWS Management Console Private Access. With this launch, you can use AWS Management Console Private Access in all commercial AWS Regions.

From community.aws
The community.aws is a home for us, builders, to share our learnings with building on AWS. Here’s my Top 3 posts from last week:

Other AWS News
Here are some additional news items, open-source projects, and Twitch shows that you might find interesting:

Build On Generative AI – Join Tiffany and Darko to learn more about generative AI, see their demos and discuss different aspects of generative AI with the guest speakers. Streaming every Monday on Twitch, 9:00 AM US PT.

AWS open source news and updates – If you’re looking for various open-source projects and tools from the AWS community, please read the AWS open-source newsletter maintained by my colleague, Ricardo.

Upcoming AWS events
Check your calendars and sign up for these AWS events:

AWS Summits – Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Amsterdam (April 9), Sydney (April 10–11), London (April 24), Singapore (May 7), Berlin (May 15–16), Seoul (May 16–17), Hong Kong (May 22), Milan (May 23), Dubai (May 29), Thailand (May 30), Stockholm (June 4), and Madrid (June 5).

AWS re:Inforce – Explore cloud security in the age of generative AI at AWS re:Inforce, June 10–12 in Pennsylvania for two-and-a-half days of immersive cloud security learning designed to help drive your business initiatives.

AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Poland (April 11), Bay Area (April 12), Kenya (April 20), and Turkey (May 18).

You can browse all upcoming in-person and virtual events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— Donnie

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Applying Spot-to-Spot consolidation best practices with Karpenter

2024-03-26 Chris Munns

Post Syndicated from Chris Munns original https://aws.amazon.com/blogs/compute/applying-spot-to-spot-consolidation-best-practices-with-karpenter/

This post is written by Robert Northard – AWS Container Specialist Solutions Architect, and Carlos Manzanedo Rueda – AWS WW SA Leader for Efficient Compute

Karpenter is an open source node lifecycle management project built for Kubernetes. In this post, you will learn how to use the new Spot-to-Spot consolidation functionality released in Karpenter v0.34.0, which helps further optimize your cluster. Amazon Elastic Compute Cloud (Amazon EC2) Spot Instances are spare Amazon EC2 capacity available for up to 90% off compared to On-Demand prices. One difference between On-Demand and Spot is that Spot Instances can be interrupted by Amazon EC2 when the capacity is needed back. Karpenter’s built-in support for Spot Instances allows users to seamlessly implement Spot best practices and helps users optimize the cost of stateless, fault tolerant workloads. For example, when Karpenter observes a Spot interruption, it automatically starts a new node in response.

Karpenter provisions nodes in response to unschedulable pods based on aggregated CPU, memory, volume requests, and other scheduling constraints. Over time, Karpenter has added functionality to simplify instance lifecycle configuration, providing a termination controller, instance expiration, and drift detection. Karpenter also helps optimize Kubernetes clusters by selecting the optimal instances while still respecting Kubernetes pod-to-node placement nuances, such as nodeSelector, affinity and anti-affinity, taints and tolerations, and topology spread constraints.

The Kubernetes scheduler assigns pods to nodes based on their scheduling constraints. Over time, as workloads are scaled out and scaled in or as new instances join and leave, the cluster placement and instance load might end up not being optimal. In many cases, it results in unnecessary extra costs. Karpenter has a consolidation feature that improves cluster placement by identifying and taking action in situations such as:

when a node is empty
when a node can be removed as the pods that are running on it can be rescheduled into other existing nodes
when the number of pods in a node has gone down and the node can now be replaced with a lower-priced and rightsized variant (which is shown in the following figure)

Karpenter consolidation, replacing one 2xlarge Amazon EC2 Instance with an xlarge Amazon EC2 Instance.

Karpenter versions prior to v0.34.0 only supported consolidation for Amazon EC2 On-Demand Instances. On-Demand consolidation allowed consolidating from On-Demand into Spot Instances and to lower-priced On-Demand Instances. However, once a pod was placed on a Spot Instance, Spot nodes were only removed when the nodes were empty. In v0.34.0, you can enable the feature gate to use Spot-to-Spot consolidation.

Solution overview

When launching Spot Instances, Karpenter uses the price-capacity-optimized allocation strategy when calling the Amazon EC2 instant Fleet API (shown in the following figure) and passes in a selection of compute instance types based on the Karpenter NodePool configuration. The Amazon EC2 Fleet API in instant mode is a synchronous API call that immediately returns a list of instances that launched and any instance that could not be launched. For any instances that could not be launched, Karpenter might request alternative capacity or remove any soft Kubernetes scheduling constraints for the workload.

Karpenter instance orchestration

Spot-to-Spot consolidation needed an approach that was different from On-Demand consolidation. For On-Demand consolidation, rightsizing and lowest price are the main metrics used. For Spot-to-Spot consolidation to take place, Karpenter requires a diversified instance configuration (see the example NodePool defined in the walkthrough) with at least 15 instances types. Without this constraint, there would be a risk of Karpenter selecting an instance that has lower availability and, therefore, higher frequency of interruption.

Prerequisites

The following prerequisites are required to complete the walkthrough:

Install an Amazon Elastic Kubernetes Service (Amazon EKS) cluster (version 1.29 or higher) with Karpenter (v0.34.0 or higher). The Karpenter Getting Started Guide provides steps for setting up an Amazon EKS cluster and adding Karpenter.
Enable replacement with Spot consolidation through the SpotToSpotConsolidation feature gate. This can be enabled during a helm install of the Karpenter chart by adding –-set settings.featureGates.spotToSpotConsolidation=true argument.
Install kubectl, the Kubernetes command line tool for communicating with the Kubernetes control plane API, and kubectl context configured with Cluster Operator and Cluster Developer permissions.

Walkthrough

The following walkthrough guides you through the steps for simulating Spot-to-Spot consolidation.

1. Create a Karpenter NodePool and EC2NodeClass

Create a Karpenter NodePool and EC2NodeClass. Replace the following with your own values. If you used the Karpenter Getting Started Guide to create your installation, then the value would be your cluster name.

Replace <karpenter-discovery-tag-value> with your subnet tag for Karpenter subnet and security group auto-discovery.
Replace <role-name> with the name of the AWS Identity and Access Management (IAM) role for node identity.

cat <<EOF > nodepool.yaml
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    metadata:
      labels:
        intent: apps
    spec:
      nodeClassRef:
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c","m","r"]
        - key: karpenter.k8s.aws/instance-size
          operator: NotIn
          values: ["nano","micro","small","medium"]
        - key: karpenter.k8s.aws/instance-hypervisor
          operator: In
          values: ["nitro"]
  limits:
    cpu: 100
    memory: 100Gi
  disruption:
    consolidationPolicy: WhenUnderutilized
---
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: Bottlerocket
  subnetSelectorTerms:          
    - tags:
        karpenter.sh/discovery: "<karpenter-discovery-tag-value>"
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: "<karpenter-discovery-tag-value>"
  role: "<role-name>"
  tags:
    Name: karpenter.sh/nodepool/default
    IntentLabel: "apps"
EOF

kubectl apply -f nodepool.yaml

The NodePool definition demonstrates a flexible configuration with instances from the C, M, or R EC2 instance families. The configuration is restricted to use smaller instance sizes but is still diversified as much as possible. For example, this might be needed in scenarios where you deploy observability DaemonSets. If your workload has specific requirements, then see the supported well-known labels in the Karpenter documentation.

2. Deploy a sample workload

Deploy a sample workload by running the following command. This command creates a Deployment with five pod replicas using the pause container image:

cat <<EOF > inflate.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 5
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      nodeSelector:
        intent: apps
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.2
          resources:
            requests:
              cpu: 1
              memory: 1.5Gi
EOF
kubectl apply -f inflate.yaml

Next, check the Kubernetes nodes by running a kubectl get nodes CLI command. The capacity pool (instance type and Availability Zone) selected depends on any Kubernetes scheduling constraints and spare capacity size. Therefore, it might differ from this example in the walkthrough. You can see Karpenter launched a new node of instance type c6g.2xlarge, an AWS Graviton2-based instance, in the eu-west-1c Region:

$ kubectl get nodes -L karpenter.sh/nodepool -L node.kubernetes.io/instance-type -L topology.kubernetes.io/zone -L karpenter.sh/capacity-type

NAME                                     STATUS   ROLES    AGE   VERSION               NODEPOOL   INSTANCE-TYPE   ZONE         CAPACITY-TYPE
ip-10-0-12-17.eu-west-1.compute.internal Ready    <none>   80s   v1.29.0-eks-a5ec690   default    c6g.2xlarge     eu-west-1c   spot

3. Scale in a sample workload to observe consolidation

To invoke a Karpenter consolidation event scale, inflate the deployment to 1. Run the following command:

kubectl scale --replicas=1 deployment/inflate

Tail the Karpenter logs by running the following command. If you installed Karpenter in a different Kubernetes namespace, then replace the name for the -n argument in the command:

kubectl -n karpenter logs -l app.kubernetes.io/name=karpenter --all-containers=true -f --tail=20

After a few seconds, you should see the following disruption via consolidation message in the Karpenter logs. The message indicates the c6g.2xlarge Spot node has been targeted for replacement and Karpenter has passed the following 15 instance types—m6gd.xlarge, m5dn.large, c7a.xlarge, r6g.large, r6a.xlarge and 10 other(s)—to the Amazon EC2 Fleet API:

{"level":"INFO","time":"2024-02-19T12:09:50.299Z","logger":"controller.disruption","message":"disrupting via consolidation replace, terminating 1 candidates ip-10-0-12-181.eu-west-1.compute.internal/c6g.2xlarge/spot and replacing with spot node from types m6gd.xlarge, m5dn.large, c7a.xlarge, r6g.large, r6a.xlarge and 10 other(s)","commit":"17d6c05","command-id":"60f27cb5-98fa-40fb-8231-05b31fd41892"}

Check the Kubernetes nodes by running the following kubectl get nodes CLI command. You can see that Karpenter launched a new node of instance type c6g.large:

$ kubectl get nodes -L karpenter.sh/nodepool -L node.kubernetes.io/instance-type -L topology.kubernetes.io/zone -L karpenter.sh/capacity-type

NAME                                      STATUS   ROLES    AGE   VERSION               NODEPOOL   INSTANCE-TYPE ZONE       CAPACITY-TYPE
ip-10-0-12-156.eu-west-1.compute.internal           Ready    <none>   2m1s   v1.29.0-eks-a5ec690   default    c6g.large       eu-west-1c   spot

Use kubectl get nodeclaims to list all objects of type NodeClaim and then describe the NodeClaim Kubernetes resource using kubectl get nodeclaim/<claim-name> -o yaml. In the NodeClaim .spec.requirements, you can also see the 15 instance types passed to the Amazon EC2 Fleet API:

apiVersion: karpenter.sh/v1beta1
kind: NodeClaim
...
spec:
  nodeClassRef:
    name: default
  requirements:
  ...
  - key: node.kubernetes.io/instance-type
    operator: In
    values:
    - c5.large
    - c5ad.large
    - c6g.large
    - c6gn.large
    - c6i.large
    - c6id.large
    - c7a.large
    - c7g.large
    - c7gd.large
    - m6a.large
    - m6g.large
    - m6gd.large
    - m7g.large
    - m7i-flex.large
    - r6g.large
...

What would happen if a Spot node could not be consolidated?

If a Spot node cannot be consolidated because there are not 15 instance types in the compute selection, then the following message will appear in the events for the NodeClaim object. You might get this event if you overly constrained your instance type selection:

Normal  Unconsolidatable   31s   karpenter  SpotToSpotConsolidation requires 15 cheaper instance type options than the current candidate to consolidate, got 1

Spot best practices with Karpenter

The following are some best practices to consider when using Spot Instances with Karpenter.

Avoid overly constraining instance type selection: Karpenter selects Spot Instances using the price-capacity-optimized allocation strategy, which balances the price and availability of AWS spare capacity. Although a minimum of 15 instances are needed, you should avoid constraining instance types as much as possible. By not constraining instance types, there is a higher chance of acquiring Spot capacity at large scales with a lower frequency of Spot Instance interruptions at a lower cost.
Gracefully handle Spot interruptions and consolidation actions: Karpenter natively handles Spot interruption notifications by consuming events from an Amazon Simple Queue Service (Amazon SQS) queue, which is populated with Spot interruption notifications through Amazon EventBridge. As soon as Karpenter receives a Spot interruption notification, it gracefully drains the interrupted node of any running pods while also provisioning a new node for which those pods can schedule. With Spot Instances, this process needs to complete within 2 minutes. For a pod with a termination period longer than 2 minutes, the old node will be interrupted prior to those pods being rescheduled. To test a replacement node, AWS Fault Injection Service (FIS) can be used to simulate Spot interruptions.
Carefully configure resource requests and limits for workloads: Rightsizing and optimizing your cluster is a shared responsibility. Karpenter effectively optimizes and scales infrastructure, but the end result depends on how well you have rightsized your pod requests and any other Kubernetes scheduling constraints. Karpenter does not consider limits or resource utilization. For most workloads with non-compressible resources, such as memory, it is generally recommended to set requests==limits because if a workload tries to burst beyond the available memory of the host, an out-of-memory (OOM) error occurs. Karpenter consolidation can increase the probability of this as it proactively tries to reduce total allocatable resources for a Kubernetes cluster. For help with rightsizing your Kubernetes pods, consider exploring Kubecost, Vertical Pod Autoscaler configured in recommendation mode, or an open source tool such as Goldilocks.
Configure metrics for Karpenter: Karpenter emits metrics in the Prometheus format, so consider using Amazon Managed Service for Prometheus to track interruptions caused by Karpenter Drift, consolidation, Spot interruptions, or other Amazon EC2 maintenance events. These metrics can be used to confirm that interruptions are not having a significant impact on your service’s availability and monitor NodePool usage and pod lifecycles. The Karpenter Getting Started Guide contains an example Grafana dashboard configuration.

You can learn more about other application best practices in the Reliability section of the Amazon EKS Best Practices Guide.

Cleanup

To avoid incurring future charges, delete any resources you created as part of this walkthrough. If you followed the Karpenter Getting Started Guide to set up a cluster and add Karpenter, follow the clean-up instructions in the Karpenter documentation to delete the cluster. Alternatively, if you already had a cluster with Karpenter, delete the resources created as part of this walkthrough:

kubectl delete -f inflate.yaml
kubectl delete -f nodepool.yaml

Conclusion

In this post, you learned how Karpenter can actively replace a Spot node with another more cost-efficient Spot node. Karpenter can consolidate Spot nodes that have the right balance between lower price and low-frequency interruptions when there are at least 15 selectable instances to balance price and availability.

To get started, check out the Karpenter documentation as well as Karpenter Blueprints, which is a repository including common workload scenarios following the best practices.

You can share your feedback on this feature by a raising a GitHub Issue.

AWS Weekly Roundup — Claude 3 Sonnet support in Bedrock, new instances, and more — March 11, 2024

2024-03-11 Marcia Villalba

Post Syndicated from Marcia Villalba original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-claude-3-sonnet-support-in-bedrock-new-instances-and-more-march-11-2024/

Last Friday was International Women’s Day (IWD), and I want to take a moment to appreciate the amazing ladies in the cloud computing space that are breaking the glass ceiling by reaching technical leadership positions and inspiring others to go and build, as our CTO Werner Vogels says.

Last week’s launches
Here are some launches that got my attention during the previous week.

Amazon Bedrock – Now supports Anthropic’s Claude 3 Sonnet foundational model. Claude 3 Sonnet is two times faster and has the same level of intelligence as Anthropic’s highest-performing models, Claude 2 and Claude 2.1. My favorite characteristic is that Sonnet is better at producing JSON outputs, making it simpler for developers to build applications. It also offers vision capabilities. You can learn more about this foundation model (FM) in the post that Channy wrote early last week.

AWS re:Post – Launched last week! AWS re:Post Live is a weekly Twitch livestream show that provides a way for the community to reach out to experts, ask questions, and improve their skills. The show livestreams every Monday at 11 AM PT.

Amazon CloudWatch – Now streams daily metrics on CloudWatch metric streams. You can use metric streams to send a stream of near real-time metrics to a destination of your choice.

Amazon Elastic Compute Cloud (Amazon EC2) – Announced the general availability of new metal instances, C7gd, M7gd, and R7gd. These instances have up to 3.8 TB of local NVMe-based SSD block-level storage and are built on top of the AWS Nitro System.

AWS WAF – Now supports configurable evaluation time windows for request aggregation with rate-based rules. Previously, AWS WAF was fixed to a 5-minute window when aggregating and evaluating the rules. Now you can select windows of 1, 2, 5 or 10 minutes, depending on your application use case.

AWS Partners – Last week, we announced the AWS Generative AI Competency Partners. This new specialization features AWS Partners that have shown technical proficiency and a track record of successful projects with generative artificial intelligence (AI) powered by AWS.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS news
Some other updates and news that you may have missed:

One of the articles that caught my attention recently compares different design approaches for building serverless microservices. This article, written by Luca Mezzalira and Matt Diamond, compares the three most common designs for serverless workloads and explains the benefits and challenges of using one over the other.

And if you are interested in the serverless space, you shouldn’t miss the Serverless Office Hours, which airs live every Tuesday at 10 AM PT. Join the AWS Serverless Developer Advocates for a weekly chat on the latest from the serverless space.

The Official AWS Podcast – Listen each week for updates on the latest AWS news and deep dives into exciting use cases. There are also official AWS podcasts in several languages. Check out the ones in French, German, Italian, and Spanish.

AWS Open Source News and Updates – This is a newsletter curated by my colleague Ricardo to bring you the latest open source projects, posts, events, and more.

Upcoming AWS events
Check your calendars and sign up for these AWS events:

AWS Summit season is about to start. The first ones are Paris (April 3), Amsterdam (April 9), and London (April 24). AWS Summits are free events that you can attend in person and learn about the latest in AWS technology.

GOTO x AWS EDA Day London 2024 – On May 14, AWS partners with GOTO bring to you the event-driven architecture (EDA) day conference. At this conference, you will get to meet experts in the EDA space and listen to very interesting talks from customers, experts, and AWS.

You can browse all upcoming in-person and virtual events here.

That’s all for this week. Check back next Monday for another Week in Review!

— Marcia

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

AWS Weekly Roundup — Amazon API Gateway, AWS Step Functions, Amazon ECS, Amazon EKS, Amazon LightSail, Amazon VPC, and more — January 29, 2024

2024-01-29 Sébastien Stormacq

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-amazon-api-gateway-aws-step-functions-amazon-ecs-amazon-eks-amazon-lightsail-amazon-vpc-and-more-january-29-2024/

This past week our service teams continue to innovate on your behalf, and a lot has happened in the Amazon Web Services (AWS) universe. I’ll also share about all the AWS Community events and initiatives that are happening around the world.

Let’s dive in!

Last week’s launches
Here are some launches that got my attention:

AWS Step Functions adds integration for 33 services including Amazon Q – AWS Step Functions is a visual workflow service capable of orchestrating over 11,000+ API actions from over 220 AWS services to help customers build distributed applications at scale. This week, AWS Step Functions expands its AWS SDK integrations with support for 33 additional AWS services, including Amazon Q, AWS B2B Data Interchange, and Amazon CloudFront KeyValueStore.

Amazon Elastic Container Service (Amazon ECS) Service Connect introduces support for automatic traffic encryption with TLS Certificates – Amazon ECS launches support for automatic traffic encryption with Transport Layer Security (TLS) certificates for its networking capability called ECS Service Connect. With this support, ECS Service Connect allows your applications to establish a secure connection by encrypting your network traffic.

Amazon Elastic Kubernetes Service (Amazon EKS) and Amazon EKS Distro support Kubernetes version 1.29 – Kubernetes version 1.29 introduced several new features and bug fixes. You can create new EKS clusters using v1.29 and upgrade your existing clusters to v1.29 using the Amazon EKS console, the eksctl command line interface, or through an infrastructure-as-code (IaC) tool.

IPv6 instance bundles on Amazon Lightsail – With these new instance bundles, you can get up and running quickly on IPv6-only without the need for a public IPv4 address with the ease of use and simplicity of Amazon Lightsail. If you have existing Lightsail instances with a public IPv4 address, you can migrate your instances to IPv6-only in a few simple steps.

Amazon Virtual Private Cloud (Amazon VPC) supports idempotency for route table and network ACL creation – Idempotent creation of route tables and network ACLs is intended for customers that use network orchestration systems or automation scripts that create route tables and network ACLs as part of a workflow. It allows you to safely retry creation without additional side effects.

Amazon Interactive Video Service (Amazon IVS) announces audio-only pricing for Low-Latency Streaming – Amazon IVS is a managed live streaming solution that is designed to make low-latency or real-time video available to viewers around the world. It now offers audio-only pricing for its Low-Latency Streaming capability at 1/10th of the existing HD video rate.

Sellers can resell third-party professional services in AWS Marketplace – AWS Marketplace sellers, including independent software vendors (ISVs), consulting partners, and channel partners, can now resell third-party professional services in AWS Marketplace. Services can include implementation, assessments, managed services, training, or premium support.

Introducing the AWS Small and Medium Business (SMB) Competency – This is the first go-to-market AWS Specialization designed for partners who deliver to small and medium-sized customers. The SMB Competency provides enhanced benefits for AWS Partners to invest and focus on SMB customer business, such as becoming the go-to standard for participation in new pilots and sales initiatives and receiving unique access to scale demand generation engines.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

X in Y – We launched existing services and instance types in additional Regions:

Amazon Q in QuickSight is now available in Europe (Frankfurt). With Amazon Q in QuickSight, business users can generate compelling data stories that provide narratives of data, create dashboard summaries to share key insights from data in seconds, and confidently answer questions not answered by dashboards and reports.
Amazon Launch Wizard is now available in Asia Pacific (Melbourne) and Europe (Spain, Zurich). AWS Launch Wizard provides a step-by-step guide to help size, configure, and deploy AWS resources for SAP HANA and SAP NetWeaver systems built on SAP HANA and Adaptive Server Enterprise (ASE) using application programming interfaces (APIs) or a console-based experience.
Amazon RDS Custom for Oracle is now available in Europe (Paris). By using Amazon RDS Custom for Oracle, you can benefit from the agility of a managed database service, with features such as automated backups and point-in-time recovery, and also help meet your database application’s customization requirements.
Amazon EC2 M7a, R7a instances now available in Asia Pacific (Tokyo). M7a and R7a instances, powered by 4th Gen AMD EPYC processors (code-named Genoa) with a maximum frequency of 3.7 GHz, deliver up to 50 percent higher performance compared to M6a and R6a instances, respectively.
Amazon EC2 C7i instances are now available in Europe (Frankfurt) and South America (São Paulo). C7i instances are powered by custom 4th Gen Intel Xeon Scalable processors only available on AWS. They offer up to 15 percent better performance over comparable x86-based Intel processors utilized by other cloud providers.
Amazon EC2 High Memory instances now available in Europe (Stockholm). Amazon EC2 High Memory instances are certified by SAP for running Business Suite on HANA, SAP S/4HANA, Data Mart Solutions on HANA, Business Warehouse on HANA, and SAP BW/4HANA in production environments.
Amazon Connect SMS is now available in Asia Pacific (Seoul, Sydney). Amazon Connect SMS makes it easy for you to resolve customer issues via text messaging.

Other AWS news
Here are some additional projects, programs, and news items that you might find interesting:

Export a Software Bill of Materials using Amazon Inspector – Generating an SBOM gives you critical security information that offers you visibility into specifics about your software supply chain, including the packages you use the most frequently and the related vulnerabilities that might affect your whole company. My colleague Varun Sharma in South Africa shows how to export a consolidated SBOM for the resources monitored by Amazon Inspector across your organization in industry standard formats, including CycloneDx and SPDX. It also shares insights and approaches for analyzing SBOM artifacts using Amazon Athena.

AWS open source news and updates – My colleague Ricardo writes this weekly open source newsletter in which he highlights new open source projects, tools, and demos from the AWS Community.

Upcoming AWS events
Check your calendars and sign up for these AWS events:

AWS Innovate: AI/ML and Data Edition – Register now for the Asia Pacific & Japan AWS Innovate online conference on February 22, 2024, to explore, discover, and learn how to innovate with artificial intelligence (AI) and machine learning (ML). Choose from over 50 sessions in three languages and get hands-on with technical demos aimed at generative AI builders.

AWS Summit Paris – The AWS Summit Paris is an annual event that is held in Paris, France. It is a great opportunity for cloud computing professionals from all over the world to learn about the latest AWS technologies, network with other professionals, and collaborate on projects. The Summit is free to attend and features keynote presentations, breakout sessions, and hands-on labs. Registrations are open!

AWS Community re:Invent re:Caps – Join a Community re:Cap event organized by volunteers from AWS User Groups and AWS Cloud Clubs around the world to learn about the latest announcements from AWS re:Invent.

You can browse all upcoming in-person and virtual events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— seb

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Behavior Driven Chaos with AWS Fault Injection Simulator

2024-01-29 Richard Whitworth

Post Syndicated from Richard Whitworth original https://aws.amazon.com/blogs/architecture/behavior-driven-chaos-with-aws-fault-injection-simulator/

A common challenge organizations face is how to gain confidence in and provide evidence for the continuous resilience of their workloads. Using modern chaos engineering principles can help in meeting this challenge, but the practice of chaos engineering can become complex. As a result, both the definition of the inputs and comprehension of the outputs of the process can become inaccessible to non-technical stakeholders.

In this post, we will explore a working example of how you can build chaos experiments using human readable language, AWS Fault Injection Simulator (FIS), and a framework familiar to Developers and Test Engineers. In turn, this will help you to produce auditable evidence of your workload’s continuous resilience in a way that is more engaging and understandable to a wider community of stakeholders.

If you are new to chaos engineering, including the process and benefits, a great place to start is with the Architecture Blog post on workload resiliency.

Chaos experiment attributes

For a chaos experiment to be considered complete, the experiment should exhibit the following attributes:

Defined steady state
Hypothesis
Defined variables and experiment actions to take
Verification of the hypothesis

Combining FIS and Behave

FIS enables you to create the experiment actions outlined in the list of chaos experiment attributes. You can use the actions in FIS to simulate the effect of disruptions on your workloads so that you can observe the resulting outcome and gain valuable insights into the workload’s resilience. However, there are additional attributes that should be defined when writing a fully featured chaos experiment.

This is what combining Python-style Behave with FIS enables you to do (other behavior-driven development frameworks exist for different languages). By approaching chaos experiments in this way, you get the benefit of codifying all of your chaos experiment attributes, such as the hypothesis, steady state and verification of the hypothesis using human readable Gherkin syntax, then automating the whole experiment in code.

Using Gherkin syntax enables non-technical stakeholders to review, validate, and contribute to chaos experiments, plus it helps to ensure the experiments can be driven by business outcomes and personas. If you have defined everything as code, then the whole process can be wrapped into the appropriate stage of your CI/CD pipelines to ensure existing experiments are always run to avoid regression. You can also iteratively add new chaos experiments as new business features are enabled in your workloads or you become aware of new potential disruptions. In addition, using a behavior-driven development (BDD) framework, like Behave, also enables developers and test engineers to deliver the capability quickly since they are likely already familiar with BDD and Behave.

The remainder of this blog post provides an example of this approach using an experiment that can be built on to create a set of experiments for your own workloads. The code and resources used throughout this blog are available in the AWS Samples aws-fis-behaviour-driven-chaos repository, which provides a CloudFormation template that builds the target workload for our chaos experiment.

The workload comprises an Amazon Virtual Private Cloud with a public subnet, an EC2 Auto-scaling Group and EC2 instances running NGINX. The CloudFormation template also creates an FIS experiment template, comprising a standard FIS Amazon Elastic Compute Cloud (Amazon EC2) action. For your own implementation, we recommend that you keep the CloudFormation for FIS separate to the CloudFormation, which builds the workload so that it can be maintained independently. Please note, for simplicity, they are together in the same repo for this blog.

Note: The Behave code in the repo is structured in a way we suggest you adopt for your own repo. It keeps the scenario definition separated from the Python-specific implementation of the steps and in turn the outline of the steps is separated from the step helper methods. This will allow you to build a set of re-usable step helper methods that can be dropped-into/called-from any Behave step. This can help keep your test codebase as DRY and efficient as possible as it grows. This can be very challenging for large test frameworks.

Figure 1 shows the AWS services and components we’re interacting with in this post.

Figure 1. Infrastructure for the chaos experiment

Defining and launching the chaos experiment

We start by defining our chaos experiment in Gherkin syntax with the Gherkin’s Scenario being used to articulate the hypothesis for our chaos experiment as follows:

Scenario: My website is resilient to infrastructure failure

Given My website is up and can serve 10 transactions per second
And I have an EC2 Auto-Scaling Group with at least 3 running EC2 instances
And I have an EC2 Auto-Scaling Group with instances distributed across at least 3 Availability Zones
When an EC2 instance is lost
Then I can continue to serve 10 transactions per second
And 90 percent of transactions to my website succeed

Our initial Given, And steps validate that the conditions and environment that we are launching the Scenario in are sufficient for the experiment to be successful (the steady state). Therefore, if the environment is already out of bounds (read: the website isn’t running) before we begin, then the test will fail anyway, and we don’t want a false positive result. Since the steps are articulated as code using Behave, the test report will demonstrate what caused the experiment to fail and be able to identify if it was an environmental issue (false positive) rather than a true positive failure (the workload didn’t respond as we anticipated) during our chaos experiment.

The Given, And steps are launched using steps like the following example. Steps, in turn, call the relevant step_helper functions. Note how the phrases from the scenario are represented in the decorator for the step_impl function; this is how you link the human readable language in the scenario to the Python code that initiates the test logic.

@step("My website is up and can serve {number} transactions per second")
def step_impl(context, number):

    target = f"http://{context.config.userdata['website_hostname']}"

    logger.info(f'Sending traffic to target website: {target} for the next 60 seconds, please wait....')
    send_traffic_to_website(target, 60, "before_chaos", int(number))

    assert verify_locust_run(int(number), "before_chaos") is True

Once the Given, And steps have initiated successfully, we are satisfied that the conditions for the experiment are appropriate. Next, we launch the chaos actions using the When step. Here, we interact with FIS using boto3 to start the experiment template that was created earlier using CloudFormation. The following code snippet shows the code, which begins this step:

@step("an EC2 instance is lost")
def step_impl(context):

    if "fis" not in context.clients:
        create_client("fis", context)

    state = start_experiment(
        context.clients["fis"], context.config.userdata["fis_experiment_id"]
    )["experiment"]["state"]["status"]
    logger.info(f"FIS experiment state is: {state}")

    assert state in ["running", "initiating"]

The experiment template being used here is intentionally a very simple, single-step experiment as an example for this blog. FIS enables you to create very elaborate multi-step experiments in a straightforward manner, for more information please refer to the AWS FIS actions reference.

The experiment is now in flight! We launched the Then, And steps to validate our hypothesis expressed in the Scenario. Now, we query the website endpoint to see if we get any failed requests:

@step("I can continue to serve {number} transactions per second")
def step_impl(context, number):

    target = f"http://{context.config.userdata['website_hostname']}"

    logger.info(f'Sending traffic to target website {target} for the next 60 seconds, please wait....')
    send_traffic_to_website(target, 60, "after_chaos", int(number))

    assert verify_locust_run(int(number), "after_chaos") is True


@step("{percentage} percent of transactions to my website succeed")
def step_impl(context, percentage):

    assert success_percent(int(percentage), "after_chaos") is True

You can add as many Given, When, Then steps to validate your Scenario (the experiment’s hypothesis) as you need; for example, you can use additional FIS actions to validate what happens if a network failure prevents traffic to a subnet. You can also code your own actions using AWS Systems Manager or boto3 calls of your choice.

In our experiment, the results have validated our hypothesis, as seen in Figure 2.

Figure 2. Hypothesis validation example

There are a few different ways to format your results when using Behave so that they are easier to pull into a report; Allure is a nice example.

To follow along, the steps in the Implementation Details section will help launch the chaos experiment at your CLI. As previously stated, if you were to use this approach in your development lifecycle, you would hoist this into your CI/CD pipeline and tooling and not launch it locally.

Implementation details

Prerequisites

To deploy the chaos experiment and test application, you will need:

An AWS account
Access to AWS CloudShell or have the AWS CLI installed with a configured profile and appropriate AWS Access Keys. Please ensure you have a recent and working version of Python (v3.9 or newer).

Note: Website availability tests are initiated from your CLI in the sample code used in this blog. If you are traversing a busy corporate proxy or a network connection that is not stable, then it may cause the experiment to fail.

Further, to keep the prerequisites as minimal and self-contained as possible for this blog, we are using Locust as a Python library, which is not a robust implementation of Locust. Using a Behave step implementation, we instantiate a local Locust runner to send traffic to the website we want to test before and after the step, which takes the chaos action. For a robust implementation in your own test suite, you could build a Locust implementation behind a REST API or use a load-testing suite with an existing REST API, like Blazemeter, which can be called from a Behave step and run for the full length of the experiment.

The CloudFormation that you will launch with this post creates some public facing EC2 instances. You should restrict access to just your public IP address using the instructions below. You can find your IP at https://checkip.amazonaws.com/. Use the IP address shown with a trailing /32 e.g. 1.2.3.4/32

Environment preparation

Clone the git repository aws-fis-behaviour-driven-chaos that contains the blog resources using the below command:

git clone https://github.com/aws-samples/aws-fis-behaviour-driven-chaos.git

We recommend creating a new, clean Python virtual environment and activating it:

python3 -m venv behavefisvenv
source behavefisvenv/bin/activate

Deployment steps

To be carried out from the root of the blog repo:

Install the Python dependencies into your Python environment:
pip install -r requirements.txt
Create the test stack and wait for completion (ensure you replace the parameter value for AllowedCidr with your public IP address):
aws cloudformation create-stack --stack-name my-chaos-stack --template-body file://cloudformation/infrastructure.yaml --region=eu-west-1 --parameters ParameterKey=AllowedCidr,ParameterValue=1.2.3.4/32 --capabilities CAPABILITY_IAM
aws cloudformation wait stack-create-complete --stack-name my-chaos-stack --region=eu-west-1
Once the deployment reaches a create-complete state, retrieve the stack outputs:
aws cloudformation describe-stacks --stack-name my-chaos-stack --region=eu-west-1
Copy the OutputValue of the stack Outputs for AlbHostname and FisExperimentId into the behave/userconfig.json file, replacing the placeholder values for website_hostname and fis_experiment_id, respectively.
Replace the region value in the behave/userconfig.json file with the region you built the stack in (if altered in Step 2).
Change directory into behave/.
cd behave/
Launch behave:
behave
Once completed, Locust results will appear inside the behave folder (Figure 3 is an example).

Figure 3. Example CLI output

Cleanup

If you used the CloudFormation templates that we provided to create AWS resources to follow along with this blog post, delete them now to avoid future recurring charges.

To delete the stack, run:

aws cloudformation delete-stack --stack-name my-chaos-stack --region=eu-west-1 &&
aws cloudformation wait stack-delete-complete --stack-name my-chaos-stack --region=eu-west-1

Conclusion

This blog post has given usable and actionable insights into how you can wrap FIS actions, plus experiment templates in a way that fully defines and automates a chaos experiment with language that will be accessible to stakeholders outside of the test engineering team. You can extend on what is presented here to test your own workloads with your own methods and metrics through a powerful suite of chaos experiments, which will build confidence in your workload’s continuous resilience and enable you to provide evidence of this to the wider organization.

Optimizing video encoding with FFmpeg using NVIDIA GPU-based Amazon EC2 instances

2024-01-04 Macey Neff

Post Syndicated from Macey Neff original https://aws.amazon.com/blogs/compute/optimizing-video-encoding-with-ffmpeg-using-nvidia-gpu-based-amazon-ec2-instances/

This post is written by Alejandro Gil, Solutions Architect and Joseba Echevarría, Solutions Architect.

Introduction

The purpose of this blog post is to compare video encoding performance between CPUs and Nvidia GPUs to determine the price/performance ratio in different scenarios while highlighting where it would be best to use a GPU.

Video encoding plays a critical role in modern media delivery, enabling efficient storage, delivery, and playback of high-quality video content across a wide range of devices and platforms.

Video encoding is frequently performed solely by the CPU because of its widespread availability and flexibility. Still, modern hardware includes specialized components designed specifically to obtain very high performance video encoding and decoding.

Nvidia GPUs, such as those found in the P and G Amazon EC2 instances, include this kind of built-in hardware in their NVENC (encoding) and NVDEC (decoding) accelerator engines, which can be used for real-time video encoding/decoding with minimal impact on the performance of the CPU or GPU.

Figure 1: NVIDIA NVDEC/NVENC architecture. Source https://developer.nvidia.com/video-codec-sdk

Scenario

Two main transcoding job types should be considered depending on the video delivery use case, 1) batch jobs for on demand video files and 2) streaming jobs for real-time, low latency use cases. In order to achieve optimal throughput and cost efficiency, it is a best practice to encode the videos in parallel using the same instance.

The utilized instance types in this benchmark can be found in figure 2 table (i.e g4dn and p3). For hardware comparison purposes, the p4d instance has been included in the table, showing the GPU specs and total number of NVDEC & NVENC cores in these EC2 instances. Based on the requirements, multiple GPU instances types are available in EC2.

Instance size	GPUs	GPU model	NVDEC generation	NVENC generation	NVDEC cores/GPU	NVENC cores/GPU
g4dn.xlarge	1	T4	4^th	7^th	2	1
p3.2xlarge	1	V100	3^rd	6^th	1	3
p4d.24xlarge	8	A100	4^th	N/A	5	0

Figure 2: GPU instances specifications

Benchmark

In order to determine which encoding strategy is the most convenient for each scenario, a benchmark will be conducted comparing CPU and GPU instances across different video settings. The results will be further presented using graphical representations of the performance indicators obtained.

The benchmark uses 3 input videos with different motion and detail levels (still, medium motion and high dynamic scene) in 4k resolution at 60 frames per second. The tests will show the average performance for encoding with FFmpeg 6.0 in batch (using Constant Rate Factor (CRF) mode) and streaming (using Constant Bit Rate (CBR)) with x264 and x265 codecs to five output resolutions (1080p, 720p, 480p, 360p and 160p).

The benchmark tests encoding the target videos into H.264 and H.265 using the x264 and x265 open-source libraries in FFmpeg 6.0 on the CPU and the NVENC accelerator when using the Nvidia GPU. The H.264 standard enjoys broad compatibility, with most consumer devices supporting accelerated decoding. The H.265 standard offers superior compression at a given level of quality than H.264 but hardware accelerated decoding is not as widely deployed. As a result, for most media delivery scenarios having more than one video format will be required in order to provide the best possible user experience.

Offline (batch) encoding

This test consists of a batch encoding with two different standard presets (ultrafast and medium for CPU-based encoding and p1 and medium presets for GPU-accelerated encoding) defined in the FFmpeg guide.

The following chart shows the relative cost of transcoding 1 million frames to the 5 different output resolutions in parallel for CPU-encoding EC2 instance (c6i.4xlarge) and two types of GPU-powered instances (g4dn.xlarge and p3.2xlarge). The results are normalized so that the cost of x264 ultrafast preset on c6i.4xlarge is equal to one.

Figure 3: Batch encoding performance for CPU and GPU instances.

The performance of batch encoding in the best GPU instance (g4dn.xlarge) shows around 73% better price/performance in x264 compared to the c6i.4xlarge and around 82% improvement in x265.

A relevent aspect to have in consideration is that the presets used are not exactly equivalent for each hardware because FFmpeg uses different operators depending on where the process runs (i.e CPU or GPU). As a consequence, the video outputs in each case have a noticeable difference between them. Generally, NVENC-based encoded videos (GPU) tend to have a higher quality in H.264, whereas CPU outputs present more encoding artifacts. The difference is more noticeable for lower quality cases (ultrafast/p1 presets or streaming use cases).

The following images compare the output quality for the medium motion video in the ultrafast/p1 and medium presets.

It is clearly seen in the following example, that the h264_nevenc (GPU) codec outperforms the libx264 codec (CPU) in terms of quality, showing less pixelation, especially in the ultrafast preset. For the medium preset, although the quality difference is less pronounced, the GPU output file is noticeably larger (refer to Figure 6 table).

Figure 4: Result comparison between GPU and CPU for h264, ultrafast

Figure 5: Result comparison between GPU and CPU for h264, medium

The output file sizes mainly depend on the preset, codec and input video. The different configurations can be found in the following table.

Figure 6: Sizes for output batch encoded videos. Streaming not represented because the size is the same (fixed bitrate)

Live stream encoding

For live streaming use cases, it is useful to measure how many streams a single instance can maintain transcoding to five output resolutions (1080p, 720p, 480p, 360p and 160p). The following results are the relative cost of each instance, which is the ratio of number of streams the instance was able to sustain divided by the cost per hour.

Figure 6: Streaming encoding performance for CPU and GPU instances.

The previous results show that a GPU-based instance family like g4dn is ideal for streaming use cases, where they can sustain up to 4 parallel encodings from 4K to 1080p, 720p, 480p, 360p & 160p simultaneously. Notice that the GPU-based p5 family performance is not compensating the cost increase.

On the other hand, the CPU-based instances can sustain 1 parallel stream (at most). If you want to sustain the same number of parallel streams in Intel-based instances, you’d have to opt for a much larger instance (c6i.12xlarge can almost sustain 3 simultaneous streams, but it struggles to keep up with the more dynamic scenes when encoding with x265) with a much higher cost ($2.1888 hourly for c6i.12xlarge vs $0.587 for g4dn.xlarge).

The price/performance difference is around 68% better in GPU for x264 and 79% for x265.

Conclusion

The results show that for the tested scenarios there can be a price-performance gain when transcoding with GPU compared to CPU. Also, GPU-encoded videos tend to have an equal or higher perceived quality level to CPU-encoded counterparts and there is no significant performance penalty for encoding to the more advanced H.265 format, which can make GPU-based encoding pipelines an attractive option.

Still, CPU-encoders do a particularly good job with containing output file sizes for most of the cases we tested, producing smaller output file sizes even when the perceived quality is simmilar. This is an important aspect to have into account since it can have a big impact in cost. Depending on the amount of media files distributed and consumed by final users, the data transfer and storage cost will noticeably increase if GPUs are used. With this in mind, it is important to weight the compute costs with the data transfer and storage costs for your use case when chosing to use CPU or GPU-based video encoding.

One additional point to be considered is pipeline flexibility. Whereas the GPU encoding pipeline is rigid, CPU-based pipelines can be modified to the customer’s needs, including additional FFmpeg filters to accommodate future needs as required.

The test did not include any specific quality measurements in the transcoded images, but it would be interesting to perform an analysis based on quantitative VMAF (or similar algorithm) metrics for the videos. We always recommend to make your own test to validate if the results obtained meet your requirements.

Benchmarking method

This blog post extends on the original work described in Optimized Video Encoding with FFmpeg on AWS Graviton Processors and the benchmarking process has been maintained in order to preserve consistency of the benchmark results. The original article analyzes in detail the price/performance advantages of AWS Graviton 3 compared to other processors.

Figure 7: Batch encoding workflow

Amazon Q brings generative AI-powered assistance to IT pros and developers (preview)

2023-11-28 Donnie Prakoso

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/amazon-q-brings-generative-ai-powered-assistance-to-it-pros-and-developers-preview/

Today, we are announcing the preview of Amazon Q, a new type of generative artificial intelligence (AI) powered assistant that is specifically for work and can be tailored to a customer’s business.

Amazon Q brings a set of capabilities to support developers and IT professionals. Now you can use Amazon Q to get started building applications on AWS, research best practices, resolve errors, and get assistance in coding new features for your applications. For example, Amazon Q Code Transformation can perform Java application upgrades now, from version 8 and 11 to version 17.

Amazon Q is available in multiple areas of AWS to provide quick access to answers and ideas wherever you work. Here’s a quick look at Amazon Q, including in integrated development environment (IDE):

Building applications together with Amazon Q
Application development is a journey. It involves a continuous cycle of researching, developing, deploying, optimizing, and maintaining. At each stage, there are many questions—from figuring out the right AWS services to use, to troubleshooting issues in the application code.

Trained on 17 years of AWS knowledge and best practices, Amazon Q is designed to help you at each stage of development with a new experience for building applications on AWS. With Amazon Q, you minimize the time and effort you need to gain the knowledge required to answer AWS questions, explore new AWS capabilities, learn unfamiliar technologies, and architect solutions that fuel innovation.

Let us show you some capabilities of Amazon Q.

1. Conversational Q&A capability
You can interact with the Amazon Q conversational Q&A capability to get started, learn new things, research best practices, and iterate on how to build applications on AWS without needing to shift focus away from the AWS console.

To start using this feature, you can select the Amazon Q icon on the right-hand side of the AWS Management Console.

For example, you can ask, “What are AWS serverless services to build serverless APIs?” Amazon Q provides concise explanations along with references you can use to follow up on your questions and validate the guidance. You can also use Amazon Q to follow up on and iterate your questions. Amazon Q will show more deep-dive answers for you with references.

There are times when we have questions for a use case with fairly specific requirements. With Amazon Q, you can elaborate on your use cases in more detail to provide context.

For example, you can ask Amazon Q, “I’m planning to create serverless APIs with 100k requests/day. Each request needs to lookup into the database. What are the best services for this workload?” Amazon Q responds with a list of AWS services you can use and tries to limit the answer results to those that are accurately referenceable and verified with best practices.

Here is some additional information that you might want to note:

Amazon Q conversational Q&A capability is currently in preview in all commercial AWS Regions.
This capability is integrated into the AWS Management Console, AWS Console Mobile Application, AWS Documentation, AWS websites, and Slack and Teams through AWS Chatbot, making it more convenient and easier to find what you need.

2. Optimize Amazon EC2 instance selection
Choosing the right Amazon Elastic Compute Cloud (Amazon EC2) instance type for your workload can be challenging with all the options available. Amazon Q aims to make this easier by providing personalized recommendations.

To use this feature, you can ask Amazon Q, “Which instance families should I use to deploy a Web App Server for hosting an application?” This feature is also available when you choose to launch an instance in the Amazon EC2 console. In Instance type, you can select Get advice on instance type selection. This will show a dialog to define your requirements.

Your requirements are automatically translated into a prompt on the Amazon Q chat panel. Amazon Q returns with a list of suggestions of EC2 instances that are suitable for your use cases. This capability helps you pick the right instance type and settings so your workloads will run smoothly and more cost-efficiently.

This capability to provide EC2 instance type recommendations based on your use case is available in preview in all commercial AWS Regions.

3. Troubleshoot and solve errors directly in the console
Amazon Q can also help you to solve errors for various AWS services directly in the console. With Amazon Q proposed solutions, you can avoid slow manual log checks or research.

Let’s say that you have an AWS Lambda function that tries to interact with an Amazon DynamoDB table. But, for an unknown reason (yet), it fails to run. Now, with Amazon Q, you can troubleshoot and resolve this issue faster by selecting Troubleshoot with Amazon Q.

Amazon Q provides concise analysis of the error which helps you to understand the root cause of the problem and the proposed resolution. With this information, you can follow the steps described by Amazon Q to fix the issue.

In just a few minutes, you will have the solution to solve your issues, saving significant time without disrupting your development workflow. The Amazon Q capability to help you troubleshoot errors in the console is available in preview in the US West (Oregon) for Amazon Elastic Compute Cloud (Amazon EC2), Amazon Simple Storage Service (Amazon S3), Amazon ECS, and AWS Lambda.

4. Network troubleshooting assistance
You can also ask Amazon Q to assist you in troubleshooting network connectivity issues caused by network misconfiguration in your current AWS account. For this capability, Amazon Q works with Amazon VPC Reachability Analyzer to check your connections and inspect your network configuration to identify potential issues.

This makes it easy to diagnose and resolve AWS networking problems, such as “Why can’t I SSH to my EC2 instance?” or “Why can’t I reach my web server from the Internet?” which you can ask Amazon Q.

Then, on the response text, you can select preview experience here, which will provide explanations to help you to troubleshoot network connectivity-related issues.

Here are a few things you need to know:

This capability is currently available in preview in the US East (N. Virginia).
To learn more about the feature and sample questions, see Getting started with Amazon Q network troubleshooting in the AWS Documentation.

5. Integration and conversational capabilities within your IDEs
As we mentioned, Amazon Q is also available in supported IDEs. This allows you to ask questions and get help within your IDE by chatting with Amazon Q or invoking actions by typing / in the chat box.

To get started, you need to install or update the latest AWS Toolkit and sign in to Amazon CodeWhisperer. Once you’re signed in to Amazon CodeWhisperer, it will automatically activate the Amazon Q conversational capability in the IDE. With Amazon Q enabled, you can now start chatting to get coding assistance.

You can ask Amazon Q to describe your source code file.

From here, you can improve your application, for example, by integrating it with Amazon DynamoDB. You can ask Amazon Q, “Generate code to save data into DynamoDB table called save_data() accepting data parameter and return boolean status if the operation successfully runs.”

Once you’ve reviewed the generated code, you can do a manual copy and paste into the editor. You can also select Insert at cursor to place the generated code into the source code directly.

This feature makes it really easy to help you focus on building applications because you don’t have to leave your IDE to get answers and context-specific coding guidance. You can try the preview of this feature in Visual Studio Code and JetBrains IDEs.

6. Feature development capability
Another exciting feature that Amazon Q provides is guiding you interactively from idea to building new features within your IDE and Amazon CodeCatalyst. You can go from a natural language prompt to application features in minutes, with interactive step-by-step instructions and best practices, right from your IDE. With a prompt, Amazon Q will attempt to understand your application structure and break down your prompt into logical, atomic implementation steps.

To use this capability, you can start by invoking an action command /dev in Amazon Q and describe the task you need Amazon Q to process.

Then, from here, you can review, collaborate and guide Amazon Q in the chat for specific areas that need to be implemented.

Additional capabilities to help you ship features faster with complete pull requests are available if you’re using Amazon CodeCatalyst. In Amazon CodeCatalyst, you can assign a new or an existing issue to Amazon Q, and it will process an end-to-end development workflow for you. Amazon Q will review the existing code, propose a solution approach, seek feedback from you on the approach, generate merge-ready code, and publish a pull request for review. All you need to do after is to review the proposed solutions from Amazon Q.

The following screenshots show a pull request created by Amazon Q in Amazon CodeCatalyst.

Here are a couple of things that you should know:

Amazon Q feature development capability is currently in preview in Visual Studio Code and Amazon CodeCatalyst
To use this capability in IDE, you need to have the Amazon CodeWhisperer Professional tier. Learn more on the Amazon CodeWhisperer pricing page.

7. Upgrade applications with Amazon Q Code Transformation
With Amazon Q, you can now upgrade an entire application within a few hours by starting a guided code transformation. This capability, called Amazon Q Code Transformation, simplifies maintaining, migrating, and upgrading your existing applications.

To start, navigate to the CodeWhisperer section and then select Transform. Amazon Q Code Transformation automatically analyzes your existing codebase, generates a transformation plan, and completes the key transformation tasks suggested by the plan.

Some additional information about this feature:

Amazon Q Code Transformation is available in preview today in the AWS Toolkit for IntelliJ IDEA and the AWS Toolkit for Visual Studio Code.
To use this capability, you need to have the Amazon CodeWhisperer Professional tier during the preview.
During preview, you can can upgrade Java 8 and 11 applications to version 17, a Java Long-Term Support (LTS) release.

Get started with Amazon Q today
With Amazon Q, you have an AI expert by your side to answer questions, write code faster, troubleshoot issues, optimize workloads, and even help you code new features. These capabilities simplify every phase of building applications on AWS.

Amazon Q lets you engage with AWS Support agents directly from the Q interface if additional assistance is required, eliminating any dead ends in the customer’s self-service experience. The integration with AWS Support is available in the console and will honor the entitlements of your AWS Support plan.

Learn more

— Donnie & Channy

Introducing Amazon EC2 high memory U7i Instances for large in-memory databases (preview)

2023-11-27 Jeff Barr

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/introducing-amazon-ec2-high-memory-u7i-instances-for-large-in-memory-databases-preview/

The new U7i instances are designed to support large, in-memory databases including SAP HANA, Oracle, and SQL Server. Powered by custom fourth generation Intel Xeon Scalable Processors (Sapphire Rapids), the instances are now available in multiple AWS regions in preview form, in the US West (Oregon), Asia Pacific (Seoul), and Europe (Frankfurt) AWS Regions, as follows:

Instance Name	vCPUs	Memory (DDR5)	EBS Bandwidth	Network Bandwidth
u7in-16tb.224xlarge	896	16,384 GiB	100 Gbps	100 Gbps
u7in-24tb.224xlarge	896	24,576 GiB	100 Gbps	100 Gbps
u7in-32tb.224xlarge	896	32,768 GiB	100 Gbps	100 Gbps

We are also working on a smaller instance:

Instance Name	vCPUs	Memory (DDR5)	EBS Bandwidth	Network Bandwidth
u7i-12tb.224xlarge	896	12,288 GiB	60 Gbps	100 Gbps

Here’s what 32 TiB of memory looks like:

And here are the 896 vCPUs (and lots of other info):

When compared to the first generation of High Memory instances, the U7i instances offer up to 125% more compute performance and up to 120% more memory performance. They also provide 2.5x as much EBS bandwidth, giving you the ability to hydrate in-memory databases at a rate of up to 44 terabytes per hour.

Each U7i instance supports attachment of up to 128 General Purpose (gp2 and gp3) or Provisioned IOPS (io1 and io2 Block Express) EBS volumes. Each io2 Block Express volume can be as big as 64 TiB and can deliver up to 256K IOPS at up to 32 Gbps, making them a great match for the U7i instance.

On the network side, the instances support ENA Express and deliver up to 25 Gbps of bandwidth per network flow.

Supported operating systems include Red Hat Enterprise Linux and SUSE Enterprise Linux Server.

Join the Preview
If you are ready to put the U7i instances to the test in your environment, join the preview.

— Jeff;

The attendee’s guide to the AWS re:Invent 2023 Compute track

2023-11-17 Chris Munns

Post Syndicated from Chris Munns original https://aws.amazon.com/blogs/compute/the-attendees-guide-to-the-aws-reinvent-2023-compute-track/

This post by Art Baudo – Principal Product Marketing Manager – AWS EC2, and Pranaya Anshu – Product Marketing Manager – AWS EC2

We are just a few weeks away from AWS re:Invent 2023, AWS’s biggest cloud computing event of the year. This event will be a great opportunity for you to meet other cloud enthusiasts, find productive solutions that can transform your company, and learn new skills through 2000+ learning sessions.

Even if you are not able to join in person, you can catch-up with many of the sessions on-demand and even watch the keynote and innovation sessions live.

If you’re able to join us, just a reminder we offer several types of sessions which can help maximize your learning in a variety of AWS topics. Breakout sessions are lecture-style 60-minute informative sessions presented by AWS experts, customers, or partners. These sessions are recorded and uploaded a few days after to the AWS Events YouTube channel.

re:Invent attendees can also choose to attend chalk-talks, builder sessions, workshops, or code talk sessions. Each of these are live non-recorded interactive sessions.

Chalk-talk sessions: Attendees will interact with presenters, asking questions and using a whiteboard in session.
Builder Sessions: Attendees participate in a one-hour session and build something.
Workshops sessions: Attendees join a two-hour interactive session where they work in a small team to solve a real problem using AWS services.
Code talk sessions: Attendees participate in engaging code-focused sessions where an expert leads a live coding session.

To start planning your re:Invent week, check-out some of the Compute track sessions below. If you find a session you’re interested in, be sure to reserve your seat for it through the AWS attendee portal.

Explore the latest compute innovations

This year AWS compute services have launched numerous innovations: From the launch of over 100 new Amazon EC2 instances, to the general availability of Amazon EC2 Trn1n instances powered by AWS Trainium and Amazon EC2 Inf2 instances powered by AWS Inferentia2, to a new way to reserve GPU capacity with Amazon EC2 Capacity Blocks for ML. There’s a lot of exciting launches to take in.

Explore some of these latest and greatest innovations in the following sessions:

CMP102 | What’s new with Amazon EC2
Provides an overview on the latest Amazon EC2 innovations. Hear about recent Amazon EC2 launches, learn how about differences between Amazon EC2 instances families, and how you can use a mix of instances to deliver on your cost, performance, and sustainability goals.
CMP217 | Select and launch the right instance for your workload and budget
Learn how to select the right instance for your workload and budget. This session will focus on innovations including Amazon EC2 Flex instances and the new generation of Intel, AMD, and AWS Graviton instances.
CMP219-INT | Compute innovation for any application, anywhere
Provides you with an understanding of the breadth and depth of AWS compute offerings and innovation. Discover how you can run any application, including enterprise applications, HPC, generative artificial intelligence (AI), containers, databases, and games, on AWS.

Customer experiences and applications with machine learning

Machine learning (ML) has been evolving for decades and has an inflection point with generative AI applications capturing widespread attention and imagination. More customers, across a diverse set of industries, choose AWS compared to any other major cloud provider to build, train, and deploy their ML applications. Learn about the generative AI infrastructure at Amazon or get hands-on experience building ML applications through our ML focused sessions, such as the following:

CMP206 | Behind-the-scenes look at generative AI infrastructure at Amazon
Learn how to power performant generative AI applications while keeping costs under control. Get a behind-the-scenes look at how purpose-built infrastructure from AWS including AWS Trainium and AWS Inferentia2 are used by Amazon teams.
CMP329-R | PyTorch best practices for generative AI & LLM inference architectures
Understand best practices for using PyTorch to deploy Large Language Models (LLMs) on a cluster of Amazon EC2 Inf2 instances managed by Amazon Elastic Kubernetes Service (EKS).
CMP402 | Build a generative AI chatbot using your own data with Amazon Titan
Build and deploy a generative AI-powered chatbot on AWS using AWS Inferentia, AWS Trainium, Amazon SageMaker, Amazon Bedrock, Amazon S3, and Amazon OpenSearch Service.

Discover what powers AWS compute

AWS has invested years designing custom silicon optimized for the cloud to deliver the best price performance for a wide range of applications and workloads using AWS services. Learn more about the AWS Nitro System, processors at AWS, and ML chips.

CMP306 | Deep dive into the AWS Nitro System
Delve further into the design, architecture, and new innovations to the AWS Nitro platform in this session.
CMP309 | Compute innovations enabled by the AWS Nitro System
Deep dive in to the AWS Nitro System, the underlying platform for modern EC2 instances, the Nitro System, which has allowed AWS to innovate faster, further reduce your costs, and deliver increased security.
CMP313 | AWS Graviton: The best price performance for your AWS workloads
Learn more about AWS Graviton, hear customer stories about recent Graviton adoptions, how you can migrate your workloads to Graviton while saving cost, and gaining sustainability benefits.

Optimize your compute costs

At AWS, we focus on delivering the best possible cost structure for our customers. Frugality is one of our founding leadership principles. Cost effective design continues to shape everything we do, from how we develop products to how we run our operations. Come learn of new ways to optimize your compute costs through AWS services, tools, and optimization strategies in the following sessions:

CMP207 | Capacity, availability, cost efficiency: Pick three
Use AWS strategies such as using attribute-based instance type selection, prioritizing operational flexibility, and Amazon EC2 Flex instances to help you get compute resources, when you need them, while keeping cost under control.
CMP211 | Smart savings: Amazon EC2 cost-optimization strategies
Discover the basics of cost optimization from AWS Savings Plans and Amazon EC2 Auto Scaling to more advanced strategies like Amazon EC2 Spot Instances, AWS Graviton, automation, and more.
CMP406 | Reduce costs and improve sustainability with AWS Graviton
Learn how using AWS Graviton based instances can provide up to 40% better price performance over comparable current-generation instances for a wide variety of workloads.

Check out workload-specific sessions

Amazon EC2 offers the broadest and deepest compute platform to help you best match the needs of your workload. More SAP, high performance computing (HPC), ML, and Windows workloads run on AWS than any other cloud. Join sessions focused around your specific workload to learn about how you can leverage AWS solutions to accelerate your innovations.

CMP103 | Bring your small business to the cloud with Amazon Lightsail
Join to learn how you can quickly deploy and configure a WordPress blog, point-of-sale system, ecommerce site, or remote Windows desktop on AWS for a simple monthly price.
CMP203 | Scaling SAP HANA workloads on AWS
Discover how AWS SAP HANA customers benefit from the flexibility and reliability of the AWS Cloud. learn how you can migrate your mission-critical SAP HANA workloads.
CMP213 | Confidently run your production HPC workloads on AWS
Explore the HPC portfolio of services and products available in the AWS Cloud. Learn how customers can scale simulations and modeling and other memory- and data-intensive technical workloads.
CMP318 | Build a spatial data lake with Visual Asset Management System
Get a hands-on look at how to use the open source Visual Asset Management System (VAMS), built by AWS, to deploy a foundation for spatial data lakes, helping business units better collaborate and use 3D data while avoiding redundancies.
CMP333 | Accelerate Apple application development with Amazon EC2 Mac instances
Dive deep into best practices to automate the provisioning, configuration, and scaling of your Amazon EC2 Mac instances.

Hear from AWS customers

AWS serves millions of customers of all sizes across thousands of use cases, every industry, and around the world. Hear customers dive into how AWS compute solutions have helped them transform their businesses.

CMP212 | Sustainable compute: Reducing costs and carbon emissions with AWS
Learn how to build sustainable and cost-efficient cloud operations that align with your business goals and see how Adobe is leveraging AWS Graviton processors to do the same.
CMP214 | HPC on AWS for semiconductors and healthcare life sciences
See a demo of how organizations like Arm are using AWS for demanding simulation workloads.
CMP320 | Powering Peloton’s journey to personalization with AWS
Learn how Peloton’s solution architecture and technology stack uses Amazon EC2, Amazon S3, and Amazon EKS to build and serve individualized class recommendations in real time with low latency for a great user experience.

Ready to unlock new possibilities?

The AWS Compute team looks forward to seeing you in Las Vegas. Come meet us at the Compute Booth in the Expo. And if you’re looking for more session recommendations, check-out additional re:Invent attendee guides curated by experts.

It’s About Time: Microsecond-Accurate Clocks on Amazon EC2 Instances

2023-11-16 Chris Munns

Post Syndicated from Chris Munns original https://aws.amazon.com/blogs/compute/its-about-time-microsecond-accurate-clocks-on-amazon-ec2-instances/

This post is written by Josh Levinson, AWS Principal Product Manager and Julien Ridoux, AWS Principal Software Engineer

Today, we announced that we improved the Amazon Time Sync Service to microsecond-level clock accuracy on supported Amazon EC2 instances. This new capability adds a local reference clock to your EC2 instance and is designed to deliver clock accuracy in the low double-digit microsecond range within your instance’s guest OS software. This post shows you how to connect to the improved clocks on your EC2 instances. This post also demonstrates how you can measure your clock accuracy and easily generate and compare timestamps from your EC2 instances with ClockBound, an open source daemon and library.

In general, it’s hard to achieve high-fidelity clock synchronization due to hardware limitations and network variability. While customers have depended on the Amazon Time Sync Service to provide one millisecond clock accuracy, workloads that need microsecond-range accuracy, such as financial trading and broadcasting, required customers to maintain their own time infrastructure, which is a significant operational burden, and expensive. Other clock-sensitive applications that run on the cloud, including distributed databases and storage, have to incorporate message exchange delays with wait periods, data locks, or transaction journaling to maintain consistency at scale.

With global and reliable microsecond-range clock accuracy, you can now migrate and modernize your most time-sensitive applications in the cloud and retire your burdensome on-premises time infrastructure. You can also simplify your applications and increase their throughput by leveraging the high-accuracy timestamps to determine the ordering of events and transactions on workloads across instances, Availability Zones, and Regions. Additionally, you can audit the improved Amazon Time Sync Service to measure and monitor the expected microsecond-range accuracy.

New improvements to Amazon Time Sync Service

The new local clock source can be accessed over the existing Amazon Time Sync Service’s Network Time Protocol (NTP) IPv4 and IPv6 endpoints, or by configuring a new Precision Time Protocol (PTP) reference clock device, to get the best accuracy possible. It’s important to note that both NTP and the new PTP Hardware Clock (PHC) device share the same highly accurate source of time. The new PHC device is part of the AWS Nitro System, so it is directly accessible on supported bare metal and virtualized Amazon EC2 instances without using any customer resources.

A quick note about Leap Seconds

Leap seconds, introduced in 1972, are occasional one-second adjustments to UTC time to factor in irregularities in Earth’s rotation to UTC time in order to accommodate differences between International Atomic Time (TAI) and solar time (Ut1). To manage leap seconds on behalf of customers, we designed leap second smearing within the Amazon Time Sync Service (details on smearing time in “Look Before You Leap”).

Leap seconds are going away, and we are in full support of the decision made at the 27th General Conference on Weights and Measures to abandon leap seconds by or before 2035.

To support this transition, we still plan on smearing time when accessing the Amazon Time Sync Service over the local NTP connection or our Public NTP pools (time.aws.com). The new PHC device, however, will not provide a smeared time option. In the event of a leap seconds, PHC would add the leap seconds following UTC standards. Leap smeared and leap second time sources are the same in most cases. But, since they differ during a leap second event, we do not recommend mixing smeared and non-smeared time sources in your time client configuration during a leap second event.

Connect using NTP (automatic for most customers)

You can connect to the new, microsecond-accurate clocks over NTP the same way you use the Amazon Time Sync Service today at the 169.254.169.123 IPv4 address or the fd00:ec2::123 IPv6 address. This is already the default configuration on all Amazon AMIs and many partner AMIs, including RHEL, Ubuntu, and SUSE. You can verify this connection in your NTP daemon. The below example, using the chrony daemon, verifies that chrony is using the 169.254.169.123 IPv4 address of the Amazon Time Sync Service to synchronize the time:

[ec2-user@ ~]$ chronyc sources
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^- pacific.latt.net              3  10   377    69  +5630us[+5632us] +/-   90ms
^- edge-lax.txryan.com           2   8   377   224   -691us[ -689us] +/-   33ms
^* 169.254.169.123               1   4   377     2  -4487ns[-5914ns] +/-   85us
^- blotch.image1tech.net         2   9   377   327  -1710us[-1720us] +/-   64ms
^- 44.190.40.123                 2   9   377   161  +3057us[+3060us] +/-   84ms

The 169.254.169.123 IPv4 address of the Amazon Time Sync Service is designated with a *, showing it is the source of synchronization on this instance. See the EC2 User Guide for more details on configuring the Amazon Time Sync Service if it is not already configured by default.

Connect using the PTP Hardware Clock

First, you need to install the latest Elastic Network Adapter (ENA) driver. This driver will allow you to connect directly to the PHC. Connect to your instance and install the Linux kernel driver for Elastic Network Adapter (ENA) version 2.10.0 or later. For the installation instructions, see Linux kernel driver for Elastic Network Adapter (ENA) family on GitHub. To enable PTP support in the driver follow the instructions in the section “PTP Hardware Clock (PHC)“.

Once the driver is installed, you need to configure your NTP daemon to connect to the PHC. Below is an example on how to change the configuration in chrony by adding the PHC to your chrony configuration file. Then restart chrony for the change to take place:

[ec2-user ~]$ sudo sh -c 'echo "refclock PHC /dev/ptp0 poll 0 delay 0.000010 prefer" >> /etc/chrony.conf'
[ec2-user ~]$ sudo systemctl restart chronyd

This example uses a +/-5 microsecond range in receiving the reference signal from the PHC. These 10 microseconds are needed to account for operating system latency.

After changing your configuration, you can validate your daemon is correctly syncing to the PHC. Below is an example of output from the chronyc command. An asterisk will appear next to the PHC0 source indicating that you are now syncing to the PHC:

[ec2-user@ ~]$ chronyc sources
MS Name/IP address         Stratum Poll Reach LastRx Last sample
=============================================================================
#* PHC0                           0   0   377     1   +18ns[  +20ns] +/- 5032ns

The PHC0 device of the Amazon Time Sync Service is designated with a *, showing it is the source of synchronization on this instance

Your chrony tracking information will also show that you are syncing to the PHC:

[ec2-user@ ~]$ chronyc tracking
Reference ID    : 50484330 (PHC0)
Stratum         : 1
Ref time (UTC)  : Mon Nov 13 18:43:09 2023
System time     : 0.000000004 seconds fast of NTP time
Last offset     : -0.000000010 seconds
RMS offset      : 0.000000012 seconds
Frequency       : 7.094 ppm fast
Residual freq   : -0.000 ppm
Skew            : 0.004 ppm
Root delay      : 0.000010000 seconds
Root dispersion : 0.000001912 seconds
Update interval : 1.0 seconds
Leap status     : Normal

See the EC2 User Guide for more details on configuring the PHC.

Measuring your clock accuracy

Clock accuracy is a measure of clock error, typically defined as the offset to UTC. This clock error is the difference between the observed time on the computer and the reference time (also known as true time). If your instance is configured to use the Amazon Time Sync Service where the microsecond-accurate enhancement is available, you will typically see a clock error bound of under 100us using the NTP connection. When configured and synchronized correctly with the new PHC connection, you will typically see a clock error bound of under 40us.

We previously published a blog on measuring and monitoring clock accuracy over NTP, which still applies to the improved NTP connection.

If you are connected to the PHC, your time daemon, such as chronyd, will underestimate the clock error bound. This is because inherently, a PTP hardware clock device in Linux does not pass any “error bound” information to chrony, the way the NTP would. As a result, your clock synchronization daemon assumes the clock itself is accurate to UTC and thus has an “error bound” of 0. To get around this issue, the Nitro System calculates the error bound of the PTP Hardware Clock itself, and exposes it to your EC2 instance over the ENA driver sysfs filesystem. You can read this directly as a value in nanoseconds with the command cat /sys/devices/pci0000:00/0000:00:05.0/phc_error_bound. To get your clock error bound at some instant, you would need to add the clock error bound from chrony or ClockBound at the time that chronyd polls the PTP Hardware Clock and add it to this phc_error_bound value.

Below is how you would calculate the clock error incorporating the PHC clock error to get your true clock error bound:

CLOCK ERROR BOUND = SYSTEM TIME + (.5 * ROOT DELAY) + ROOTDISPERSION + PHC Error Bound

For the values in the example:

PHC Error Bound = cat /sys/devices/pci0000:00/0000:00:05.0/phc_error_bound

The System Time, Root Delay, and Root Dispersion are values taken from the chrony tracking information.

ClockBound

However accurate, a clock is never perfect. Instead of providing an estimate of the clock error, ClockBound provides a reliable confidence interval by automatically calculating the clock accuracy, using the calculations in which the reference time (true time) does exist. The open source ClockBound daemon provides a convenient way to retrieve this confidence interval, and work is continuing to make it easier to integrate into high performance workloads.

Conclusion

The Amazon Time Sync Service’s new microsecond-accurate clocks can be leveraged to migrate and modernize your most clock-sensitive applications in the cloud. In this post, we showed you how to can connect to the improved clocks on supported Amazon EC2 instances, how to measure your clock accuracy, and how to easily generate and compare timestamps from your Amazon EC2 instances with ClockBound. Launch a supported instance and get started today to build using this new capability.

To learn more about the Amazon Time Sync Service, see the EC2 UserGuide for Linux and Windows.

If you have questions about this post, start a new thread on the AWS Compute re:Post or contact AWS Support.

Hear about the Amazon Time Sync Service at re:Invent

We will speak in more detail about the Amazon Time Sync Service during re:invent 2023. Look for Session ID CMP220 in the AWS re:Invent session catalog to register.

Introducing instance maintenance policy for Amazon EC2 Auto Scaling

2023-11-16 Macey Neff

Post Syndicated from Macey Neff original https://aws.amazon.com/blogs/compute/introducing-instance-maintenance-policy-for-amazon-ec2-auto-scaling/

This post is written by Ahmed Nada, Principal Solutions Architect, Flexible Compute and Kevin OConnor, Principal Product Manager, Amazon EC2 Auto Scaling.

Amazon Web Services (AWS) customers around the world trust Amazon EC2 Auto Scaling to provision, scale, and manage Amazon Elastic Compute Cloud (Amazon EC2) capacity for their workloads. Customers have come to rely on Amazon EC2 Auto Scaling instance refresh capabilities to drive deployments of new EC2 Amazon Machine Images (AMIs), change EC2 instance types, and make sure their code is up-to-date.

Currently, EC2 Auto Scaling uses a combination of ‘launch before terminate’ and ‘terminate and launch’ behaviors depending on the replacement cause. Customers have asked for more control over when new instances are launched, so they can minimize any potential disruptions created by replacing instances that are actively in use. This is why we’re excited to introduce instance maintenance policy for Amazon EC2 Auto Scaling, an enhancement that provides customers with greater control over the EC2 instance replacement processes to make sure instances are replaced in a way that aligns with performance priorities and operational efficiencies while minimizing Amazon EC2 costs.

This post dives into varying ways to configure an instance maintenance policy and gives you tools to use it in your Amazon EC2 Auto Scaling groups.

Background

AWS launched Amazon EC2 Auto Scaling in 2009 with the goal of simplifying the process of managing Amazon EC2 capacity. Since then, we’ve continued to innovate with advanced features like predictive scaling, attribute-based instance selection, and warm pools.

A fundamental Amazon EC2 Auto Scaling capability is replacing instances based on instance health, due to Amazon EC2 Spot Instance interruptions, or in response to an instance refresh operation. The instance refresh capability allows you to maintain a fleet of healthy and high-performing EC2 instances in your Amazon EC2 Auto Scaling group. In some situations, it’s possible that terminating instances before launching a replacement can impact performance, or in the worst case, cause downtime for your applications. No matter what your requirements are, instance maintenance policy allows you to fine-tune the instance replacement process to match your specific needs.

Overview

Instance maintenance policy adds two new Amazon EC2 Auto Scaling group settings: minimum healthy percentage (MinHealthyPercentage) and maximum healthy percentage (MaxHealthyPercentage). These values represent the percentage of the group’s desired capacity that must be in a healthy and running state during instance replacement. Values for MinHealthyPercentage can range from 0 to 100 percent and from 100 to 200 percent for MaxHealthyPercentage. These settings are applied to all events that lead to instance replacement, such as Health-check based replacement, Max Instance Lifetime, EC2 Spot Capacity Rebalancing, Availability Zone rebalancing, Instance Purchase Option Rebalancing, and Instance refresh. You can also override the group-level instance maintenance policy during instance refresh operations to meet specific deployment use cases.

Before launching instance maintenance policy, an Amazon EC2 Auto Scaling group would use the previously described behaviors when replacing instances. By setting the MinHealthyPercentage of the instance maintenance policy to 100% and the MaxHealthyPercentage to a value greater than 100%, the Amazon EC2 Auto Scaling group first launches replacement instances and waits for them to become available before terminating the instances being replaced.

Setting up instance maintenance policy

You can add an instance maintenance policy to new or existing Amazon EC2 Auto Scaling groups using the AWS Management Console, AWS Command Line Interface (AWS CLI), AWS SDK, AWS CloudFormation, and Terraform.

When creating or editing Amazon EC2 Auto Scaling groups in the Console, you are presented with four options to define the replacement behavior of your instance maintenance policy. These options include the No policy option, which allows you to maintain the default instance replacement settings that the Amazon EC2 Auto Scaling service uses today.

Image 1: The GUI for the instance maintenance policy feature within the “Create Auto Scaling group” wizard.

Using instance maintenance policy to increase application availability

The Launch before terminating policy is the right selection when you want to favor availability of your Amazon EC2 Auto Scaling group capacity. This policy setting temporarily increases the group’s capacity by launching new instances during replacement operations. In the Amazon EC2 console, you select the Launch before terminating replacement behavior, and then set your desired MaxHealthyPercentage value to determine how many more instances should be launched during instance replacement.

For example, if you are managing a workload that requires optimal availability during instance replacements, choose the Launch before terminating policy type with a MinHealthyPercentage set to 100%. If you set your MaxHealthyPercentage to 150%, then Amazon EC2 Auto Scaling launches replacement instances before terminating instances to be replaced. You should see the desired capacity increase by 50%, exceeding the group maximum capacity during the operation to provide you with the needed availability. The chart in the following figure illustrates what an instance refresh operation would behave like with a Launch before terminating policy.

Figure 1: A graph simulating the instance replacement process with a policy configured to launch before terminating.

Overriding a group’s instance maintenance policy during instance refresh

Instance maintenance policy settings apply to all instance replacement operations, but they can be overridden at the start of a new instance refresh operation. Overriding instance maintenance policy is helpful in situations like a bad code deployment that needs replacing without downtime. You could configure an instance maintenance policy to bring an entirely new group’s worth of instances into service before terminating the instances with the problematic code. In this situation, you set the MaxHealthyPercentage to 200% for the instance refresh operation and the replacement happens in a single cycle to promptly address the bad code issue. Setting the MaxHealthyPercentage to 200% will allow the replacement settings to breach the Auto Scaling Group’s Max capacity value, but would be constrained by any account level quotas, so be sure to factor these into application of this feature. See the following figure for a visualization of how this operation would behave.

Figure 2: A graph simulating the instance replacement process with a policy configured to accelerate a new deployment.

Controlling costs during replacements and deployments

The Terminate and launch policy option allows you to favor cost control during instance replacement. By configuring this policy type, Amazon EC2 Auto Scaling terminates existing instances and then launches new instances during the replacement process. To set a Terminate and launch policy, you must specify a MinHealthyPercentage to establish how low the capacity can drop, and keep your MaxHealthyPercentage set to 100%. This configuration keeps the Auto Scaling group’s capacity at or below the desired capacity setting.

The following figure shows behavior with the MinHealthyPercentage set to 80%. During the instance replacement process, the Auto Scaling group first terminates 20% of the instances and immediately launches replacement instances, temporarily reducing the group’s healthy capacity to 80%. The group waits for the new instances to pass its configured health checks and complete warm up before it moves on to replacing the remaining batches of instances.

Figure 3: A graph simulating the instance replacement process with a policy configured to terminate and launch.

Note that the difference between MinHealthyPercentage and MaxHealthyPercentage values impacts the speed of the instance replacement process. In the preceding figure, the Amazon EC2 Auto Scaling group replaces 20% of the instances in each cycle. The larger the gap between the MinHealthyPercentage and MaxHealthyPercentage, the faster the replacement process.

Using a custom policy for maximum flexibility

You can also choose to adopt a Custom behavior option, where you have the flexibility to set the MinHealthyPercentage and MinHealthyPercentage values to whatever you choose. Using this policy type allows you to fine-tune the replacement behavior and control the capacity of your instances within the Amazon EC2 Auto Scaling group to tailor the instance maintenance policy to meet your unique needs.

What about fractional replacement calculations?

Amazon EC2 Auto Scaling always favors availability when performing instance replacements. When instance maintenance policy is configured, Amazon EC2 Auto Scaling also prioritizes launching a new instance rather than going below the MinHealthyPercentage. For example, in an Amazon EC2 Auto Scaling group with a desired capacity of 10 instances and an instance maintenance policy with MinHealthyPercentage set to 99% and MaxHealthyPercentage set to 100%, your settings do not allow for a reduction in capacity of at least one instance. Therefore, Amazon EC2 Auto Scaling biases toward launch before terminating and launches one new instance before terminating any instances that need replacing.

Configuring an instance maintenance policy is not mandatory. If you don’t configure your Amazon EC2 Auto Scaling groups to use an instance maintenance policy, then there is no change in the behavior of your Amazon EC2 Auto Scaling groups’ existing instance replacement process.

You can set a group-level instance maintenance policy through your CloudFormation or Terraform templates. Within your templates, you must set values for both the MinHealthyPercentage and MaxHealthyPercentage settings to determine the instance replacement behavior that aligns with the specific requirements of your Amazon EC2 Auto Scaling group.

Conclusion

In this post, we introduced the new instance maintenance policy feature for Amazon EC2 Auto Scaling groups, explored its capabilities, and provided examples of how to use this new feature. Instance maintenance policy settings apply to all instance replacement processes with the option to override the settings on a per instance refresh basis. By configuring instance maintenance policies, you can control the launch and lifecycle of instances in your Amazon EC2 Auto Scaling groups, increase application availability, reduce manual intervention, and improve cost control for your Amazon EC2 usage.

To learn more about the feature and how to get started, refer to the Amazon EC2 Auto Scaling User Guide.

AWS Weekly Roundup—Reserve GPU capacity for short ML workloads, Finch is GA, and more—November 6, 2023

2023-11-06 Marcia Villalba

Post Syndicated from Marcia Villalba original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-reserve-gpu-capacity-for-short-ml-workloads-finch-is-ga-and-more-november-6-2023/

The year is coming to an end, and there are only 50 days until Christmas and 21 days to AWS re:Invent! If you are in Las Vegas, come and say hi to me. I will be around the Serverlesspresso booth most of the time.

Last week’s launches
Here are some launches that got my attention during the previous week.

Amazon EC2 – Amazon EC2 announced Capacity Blocks for ML. This means that you can now reserve GPU compute capacity for your short-duration ML workloads. Learn more about this launch on the feature page and announcement blog post.

Finch – Finch is now generally available. Finch is an open source tool for local container development on macOS (using Intel or Apple Silicon). It provides a command line developer tool for building, running, and publishing Linux containers on macOS. Learn more about Finch in this blog post written by Phil Estes or on the Finch website.

AWS X-Ray – AWS X-Ray now supports W3C format trace IDs for distributed tracing. AWS X-Ray supports trace IDs generated through OpenTelemetry or any other framework that conforms to the W3C Trace Context specification.

Amazon Translate – Amazon Translate introduces a brevity customization to reduce translation output length. This is a new feature that you can enable in your real-time translations where you need a shorter translation to meet caption size limits. This translation is not literal, but it will preserve the underlying message.

AWS IAM – IAM increased the actions last accessed to 60 more services. This functionality is very useful when fine-tuning the permissions of the roles, identifying unused permissions, and granting the least amount of permissions that your roles need.

AWS IAM Access Analyzer – IAM Access Analyzer policy generator expanded support to identify over 200 AWS services to help you create fine-grained policies based on your AWS CloudTrail access activity.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS news
Some other news and blog posts that you may have missed:

AWS Compute Blog – Daniel Wirjo and Justin Plock wrote a very interesting article about how you can send and receive webhooks on AWS using different AWS serverless services. This is a good read if you are working with webhooks on your application, as it not only shows you how to build these solutions but also what considerations you should have when building them.

AWS Storage Blog – Bimal Gajjar and Andrew Peace wrote a very useful blog post about how to handle event ordering and duplicate events with Amazon S3 Event Notifications. This is a common challenge for many customers.

Amazon Science Blog – David Fan wrote an article about how to build better foundation models for video representation. This article is based on a paper that Prime Video presented at a conference about this topic.

AWS open-source news and updates – This is a newsletter curated by my colleague Ricardo to bring you the latest open source projects, posts, events, and more.

Upcoming AWS events
Check your calendars and sign up for these AWS events:

AWS Community Days – Join a community-led conference run by AWS user group leaders in your region: Ecuador (November 7), Mexico (November 11), Montevideo (November 14), Central Asia (Kazakhstan, Uzbekistan, Kyrgyzstan, and Mongolia on November 17–18), and Guatemala (November 18).

AWS re:Invent (November 27–December 1) – Join us to hear the latest from AWS, learn from experts, and connect with the global cloud community. Browse the session catalog and attendee guides and check out the highlights for generative artificial intelligence (AI).

That’s all for this week. Check back next Monday for another Weekly Roundup!

— Marcia

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Amazon EC2 Instance Metadata Service IMDSv2 by default

2023-11-06 Jeff Barr

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-ec2-instance-metadata-service-imdsv2-by-default/

Effective mid-2024, newly released Amazon EC2 instance types will use only version 2 of the EC2 Instance Metadata Service (IMDSv2). We are also taking a series of steps to make IMDSv2 the default choice for AWS Management Console Quick Starts and other launch pathways.

Background
This service is accessible from within an EC2 instance at a fixed IP address (169.254.169.254 via IPv4 or fd00:ec2::254 via IPv6 on Nitro instances). It gives you (or the code running on the instance) access to a wealth of static and dynamic data including the ID of the AMI that was used to launch the instance, block device mappings, temporary IAM credentials for roles that are attached to the instance, network interface information, user data, and much more, as detailed in Instance Metadata Categories.

The v1 service uses a request/response access method and the v2 service uses a session-oriented method, as detailed in this blog post. Both services are fully secure, but v2 provides additional layers of protection for four types of vulnerabilities that could be used to try to access IMDS.

Many applications and instances are already using and benefiting from IMDSv2, but the full range of benefits become available only when IMDSv1 is disabled at the AWS account level.

Migration Plan
Here are the significant steps that we have taken, and those that plan to take, on the road to making IMDSv2 the default choice for new AWS infrastructure (allow a tiny bit of wiggle room on the 2023 and 2024 dates):

November 2019 – We launched IMDSv2 and showed you how to use it to add defense in depth.

February 2020 – We began to verify that all newly published products from AWS Marketplace sellers and AWS Partners support IMDSv2.

March 2023 – We launched Amazon Linux 2023, which uses IMDSv2 by default for all launches.

September 2023 – We published a blog post to show you how to Get the full benefits of IMDSv2 and disable IMDSv1 across your AWS infrastructure.

November 2023 – Starting today, all console Quick Start launches will use IMDSv2-only (all Amazon and Partner Quick Start AMIs support this). Here’s how this is specified in the EC2 Console within Advanced details when launching an instance:

February 2024 – We plan to introduce a new API function that will allow you to control the use of IMDSv1 as the default at the account level. You can already control IMDSv1 usage in an IAM policy (taking away and limiting existing permission), or as an SCP that is applied globally across an account, an organizational unit (OU), or an entire organization. For example IAM policies read Work with instance metadata.

Mid-2024 – Newly released Amazon EC2 instance types will use IMDSv2 only by default. For transition support, you will still be able to enable/turn on IMDSv1 at launch or after launch on an instance live without the need for a restart or stop/start.

What to Do
Now is the time to get started on your migration from IMDSv1 to IMDSv2 using the Get the full benefits.. blog post as a guide. You should also become familiar with the Tools for helping with the transition to IMDSv2, along with the recommended path on the same page. In addition to recommending tools, this page shows you how to set up an IAM policy that disables the use of IMDSv1 and shows you how to use the MetadataNoToken CloudWatch metric to detect any remaining usage:

Another helpful resource can be found on AWS re:Post: How can I use Systems Manager automation to enforce that only IMDSv2 is used to access instance metadata from my Amazon EC2 instance?

We want this transition to be as smooth as possible for you and for your customers. If you need any additional help, please contact AWS Support.

— Jeff;

Announcing Amazon EC2 Capacity Blocks for ML to reserve GPU capacity for your machine learning workloads

2023-11-01 Channy Yun

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/announcing-amazon-ec2-capacity-blocks-for-ml-to-reserve-gpu-capacity-for-your-machine-learning-workloads/

Recent advancements in machine learning (ML) have unlocked opportunities for customers across organizations of all sizes and industries to reinvent new products and transform their businesses. However, the growth in demand for GPU capacity to train, fine-tune, experiment, and inference these ML models has outpaced industry-wide supply, making GPUs a scarce resource. Access to GPU capacity is an obstacle for customers whose capacity needs fluctuate depending on the research and development phase they’re in.

Today, we are announcing Amazon Elastic Compute Cloud (Amazon EC2) Capacity Blocks for ML, a new Amazon EC2 usage model that further democratizes ML by making it easy to access GPU instances to train and deploy ML and generative AI models. With EC2 Capacity Blocks, you can reserve hundreds of GPUs collocated in EC2 UltraClusters designed for high-performance ML workloads, using Elastic Fabric Adapter (EFA) networking in a peta-bit scale non-blocking network, to deliver the best network performance available in Amazon EC2.

This is an innovative new way to schedule GPU instances where you can reserve the number of instances you need for a future date for just the amount of time you require. EC2 Capacity Blocks are currently available for Amazon EC2 P5 instances powered by NVIDIA H100 Tensor Core GPUs in the AWS US East (Ohio) Region. With EC2 Capacity Blocks, you can reserve GPU instances in just a few clicks and plan your ML development with confidence. EC2 Capacity Blocks make it easy for anyone to predictably access EC2 P5 instances that offer the highest performance in EC2 for ML training.

EC2 Capacity Block reservations work similarly to hotel room reservations. With a hotel reservation, you specify the date and duration you want your room for and the size of beds you’d like─a queen bed or king bed, for example. Likewise, with EC2 Capacity Block reservations, you select the date and duration you require GPU instances and the size of the reservation (the number of instances). On your reservation start date, you’ll be able to access your reserved EC2 Capacity Block and launch your P5 instances. At the end of the EC2 Capacity Block duration, any instances still running will be terminated.

You can use EC2 Capacity Blocks when you need capacity assurance to train or fine-tune ML models, run experiments, or plan for future surges in demand for ML applications. Alternatively, you can continue using On-Demand Capacity Reservations for all other workload types that require compute capacity assurance, such as business-critical applications, regulatory requirements, or disaster recovery.

Getting started with Amazon EC2 Capacity Blocks for ML
To reserve your Capacity Blocks, choose Capacity Reservations on the Amazon EC2 console in the US East (Ohio) Region. You can see two capacity reservation options. Select Purchase Capacity Blocks for ML and then Get started to start looking for an EC2 Capacity Block.

Choose your total capacity and specify how long you need the EC2 Capacity Block. You can reserve an EC2 Capacity Block in the following sizes: 1, 2, 4, 8, 16, 32, or 64 p5.48xlarge instances. The total number of days that you can reserve EC2 Capacity Blocks is 1– 14 days in 1-day increments. EC2 Capacity Blocks can be purchased up to 8 weeks in advance.

EC2 Capacity Block prices are dynamic and depend on total available supply and demand at the time you purchase the EC2 Capacity Block. You can adjust the size, duration, or date range in your specifications to search for other EC2 Capacity Block options. When you select Find Capacity Blocks, AWS returns the lowest-priced offering available that meets your specifications in the date range you have specified. At this point, you will be shown the price for the EC2 Capacity Block.

After reviewing EC2 Capacity Blocks details, tags, and total price information, choose Purchase. The total price of an EC2 Capacity Block is charged up front, and the price does not change after purchase. The payment will be billed to your account within 12 hours after you purchase the EC2 Capacity Blocks.

All EC2 Capacity Blocks reservations start at 11:30 AM Coordinated Universal Time (UTC). EC2 Capacity Blocks can’t be modified or canceled after purchase.

You can also use AWS Command Line Interface (AWS CLI) and AWS SDKs to purchase EC2 Capacity Blocks. Use the describe-capacity-block-offerings API to provide your cluster requirements and discover an available EC2 Capacity Block for purchase.

$ aws ec2 describe-capacity-block-offerings \
          --instance-type p5.48xlarge \
          --instance-count 4 \
          --start-date-range 2023-10-30T00:00:00Z \
          --end-date-range 2023-11-01T00:00:00Z \
          –-capacity-duration 48

After you find an available EC2 Capacity Block with the CapacityBlockOfferingId and capacity information from the preceding command, you can use purchase-capacity-block-reservation API to purchase it.

$ aws ec2 purchase-capacity-block-reservation \
          --capacity-block-offering-id cbr-0123456789abcdefg \
          –-instance-platform Linux/UNIX

For more information about new EC2 Capacity Blocks APIs, see the Amazon EC2 API documentation.

Your EC2 Capacity Block has now been scheduled successfully. On the scheduled start date, your EC2 Capacity Block will become active. To use an active EC2 Capacity Block on your starting date, choose the capacity reservation ID for your EC2 Capacity Block. You can see a breakdown of the reserved instance capacity, which shows how the capacity is currently being utilized in the Capacity details section.

To launch instances into your active EC2 Capacity Block, choose Launch instances and follow the normal process of launching EC2 instances and running your ML workloads.

In the Advanced details section, choose Capacity Blocks as the purchase option and select the capacity reservation ID of the EC2 Capacity Block you’re trying to target.

As your EC2 Capacity Block end time approaches, Amazon EC2 will emit an event through Amazon EventBridge, letting you know your reservation is ending soon so you can checkpoint your workload. Any instances running in the EC2 Capacity Block go into a shutting-down state 30 minutes before your reservation ends. The amount you were charged for your EC2 Capacity Block does not include this time period. When your EC2 Capacity Block expires, any instances still running will be terminated.

Now available
Amazon EC2 Capacity Blocks are now available for p5.48xlarge instances in the AWS US East (Ohio) Region. You can view the price of an EC2 Capacity Block before you reserve it, and the total price of an EC2 Capacity Block is charged up-front at the time of purchase. For more information, see the EC2 Capacity Blocks pricing page.

To learn more, see the EC2 Capacity Blocks documentation and send feedback to AWS re:Post for EC2 or through your usual AWS Support contacts.

— Channy

AWS Weekly Roundup – re:Post Selections, SNS and SQS FIFO improvements, multi-VPC ENI attachments, and more – October 30, 2023

2023-10-30 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-repost-selections-sns-and-sqs-fifo-improvements-multi-vpc-eni-attachments-and-more-october-30-2023/

It’s less than a month to AWS re:Invent, but interesting news doesn’t slow down in the meantime. This week is my turn to help keep you up to date!

Last week’s launches
Here are some of the launches that caught my attention last week:

AWS re:Post – With re:Post, you have access to a community of experts that helps you become even more successful on AWS. With Selections, community members can organize knowledge in an aggregated view to create learning paths or curated content sets.

Amazon SNS – First-in-First-out (FIFO) topics now support the option to store and replay messages without needing to provision a separate archival resource. This improves the durability of your event-driven applications and can help you recover from downstream failure scenarios. Find out more in this AWS Comput Blog post – Archiving and replaying messages with Amazon SNS FIFO. Also, you can now use custom data identifiers to protect not only common sensitive data (such as names, addresses, and credit card numbers) but also domain-specific sensitive data, such as your company’s employee IDs. You can find additional info on this feature in this AWS Security blog post – Mask and redact sensitive data published to Amazon SNS using managed and custom data identifiers.

Amazon SQS – With the increased throughput quota for FIFO high throughput mode, you can process up to 18,000 transactions per second, per API action. Note the throughput quota depends on the AWS Region.

Amazon OpenSearch Service – OpenSearch Serverless now supports automated time-based data deletion with new index lifecycle policies. To determine the best strategy to deliver accurate and low latency vector search queries, OpenSearch can now intelligently evaluate optimal filtering strategies, like pre-filtering with approximate nearest neighbor (ANN) or filtering with exact k-nearest neighbor (k-NN). Also, OpenSearch Service now supports Internet Protocol Version 6 (IPv6).

Amazon EC2 – With multi-VPC ENI attachments, you can launch an instance with a primary elastic network interface (ENI) in one virtual private cloud (VPC) and attach a secondary ENI from another VPC. This helps maintain network-level segregation, but still allows specific workloads (like centralized appliances and databases) to communicate between them.

AWS CodePipeline – With parameterized pipelines, you can dynamically pass input parameters to a pipeline execution. You can now start a pipeline execution when a specific git tag is applied to a commit in the source repository.

Amazon MemoryDB – Now supports Graviton3-based R7g nodes that deliver up to 28 percent increased throughput compared to R6g. These nodes also deliver higher networking bandwidth.

Other AWS news
Here are a few posts from some of the other AWS and cloud blogs that I follow:

Networking & Content Delivery Blog – Some of the technical management and hardware decisions we make when building AWS network infrastructure: A Continuous Improvement Model for Interconnects within AWS Data Centers

DevOps Blog – To help enterprise customers understand how many of developers use CodeWhisperer, how often they use it, and how often they accept suggestions: Introducing Amazon CodeWhisperer Dashboard and CloudWatch Metrics

Front-End Web & Mobile Blog – How to restrict access to your GraphQL APIs to consumers within a private network: Architecture Patterns for AWS AppSync Private APIs

Architecture Blog – Another post in this super interesting series: Let’s Architect! Designing systems for stream data processing

From Community.AWS: Load Testing WordPress Amazon Lightsail Instances and Future-proof Your .NET Apps With Foundation Model Choice and Amazon Bedrock.

Don’t miss the latest AWS open source newsletter by my colleague Ricardo.

Upcoming AWS events
Check your calendars and sign up for these AWS events

AWS Community Days – Join a community-led conference run by AWS user group leaders in your region: Jaipur (November 4), Vadodara (November 4), Brasil (November 4), Central Asia (Kazakhstan, Uzbekistan, Kyrgyzstan, and Mongolia on November 17-18), and Guatemala (November 18).

AWS re:Invent (November 27 – December 1) – Join us to hear the latest from AWS, learn from experts, and connect with the global cloud community. Browse the session catalog and attendee guides and check out the highlights for generative AI.

Here you can browse all upcoming AWS-led in-person and virtual events and developer-focused events.

And that’s all from me for this week. On to the next one!

— Danilo

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

AWS Weekly Roundup – CodeWhisperer, CodeCatalyst, RDS, Route53, and more – October 24, 2023

2023-10-24 Sébastien Stormacq

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-codewhisperer-codecatalyst-rds-route53-and-more-october-23-2023/

The entire AWS News Blog team is fully focused on writing posts to announce the new services and features during our annual customer conference in Las Vegas, AWS re:Invent! And while we prepare content for you to read, our services teams continue to innovate. Here is my summary of last week’s launches.

Last week’s launches
Here are some of the launches that captured my attention:

Amazon CodeCatalyst – You can now add a cron expression to trigger a CI/CD workflow, providing a way to start workflows at set times. CodeCatalyst is a unified development service that integrates a project’s collaboration tools, CI/CD pipelines, and development and deployment environments.

Amazon Route53 – You can now route your customer’s traffic to their closest AWS Local Zones to improve application performance for latency-sensitive workloads. Learn more about geoproximity routing in the Route53 documentation.

Amazon RDS – The root certificates we use to sign your databases’ TLS certificates will expire in 2024. You must generate new certificates for your databases before the expiration date. This blog post details the procedure step by step. The new root certificates we generated are valid for the next 40 years for RSA2048 and 100 years for the RSA4098 and ECC384. It is likely this is the last time in your professional career that you are obliged to renew your database certificates for AWS.

Amazon MSK – Replicating Kafka clusters at scale is difficult and often involves managing the infrastructure and the replication solution by yourself. We launched Amazon MSK Replicator, a fully managed replication solution for your Kafka clusters, in the same or across multiple AWS Regions.

Amazon CodeWhisperer – We launched a preview for an upcoming capability of Amazon CodeWhisperer Professional. You can now train CodeWhisperer on your private code base. It allows you to give your organization’s developers more relevant suggestions to better assist them in their day-to-day coding against your organization’s private libraries and frameworks.

Amazon EC2 – The seventh generation of memory-optimized EC2 instances is available (R7i). These instances use the 4th Generation Intel Xeon Scalable Processors (Sapphire Rapids). This family of instances provides up to 192 vCPU and 1,536 GB of memory. They are well-suited for memory-intensive applications such as in-memory databases or caches.

X in Y – We launched existing services and instance types in additional Regions:

Amazon Bedrock is now available in Europe (Frankfurt). This is important for customers in Europe because they often have to ensure their data stays in the European Union. You can now embed generative AI functionalities and access to large language models in your applications with the assurance that the prompts and customizations will stay in Europe.
Amazon EC2 extended its footprint for multiple families of instances: m6gd instances are now available in Canada (Central) and South America (São Paulo), c6a in Canada (Central), m6a in Canada (Central) and Europe (Milan), and r6a instances in US West (N. California) and Asia Pacific (Singapore). Finally, m6id instances are now available in Europe (Zurich).
Amazon EMR managed scaling is now available in Asia Pacific (Jakarta).

Other AWS news
Here are some other blog posts and news items that you might like:

The Community.AWS blog has new posts to teach you how to integrate Amazon Bedrock inside your Java and Go applications, and my colleague Brooke wrote a survival guide for re:Invent first-timers.

Some other great sources of AWS news include:

Upcoming AWS events
Check your calendars and sign up for these AWS events:

AWS Community Days – Join a community-led conference run by AWS user group leaders in your region: Jaipur (November 4), Vadodara (November 4), and Brasil (November 4).

AWS Innovate: Every Application Edition – Join our free online conference to explore cutting-edge ways to enhance security and reliability, optimize performance on a budget, speed up application development, and revolutionize your applications with generative AI. Register for AWS Innovate Online Asia Pacific & Japan on October 26.

AWS re:Invent (November 27 – December 1) – Join us to hear the latest from AWS, learn from experts, and connect with the global cloud community. Browse the session catalog and attendee guides and check out the re:Invent highlights for generative AI.

You can browse all upcoming in-person and virtual events.

And that’s all for me today. I’ll go back writing my re:Invent blog posts.

Check back next Monday for another Weekly Roundup!

— seb

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Maintaining a local copy of your data in AWS Local Zones

2023-10-19 Macey Neff

Post Syndicated from Macey Neff original https://aws.amazon.com/blogs/compute/maintaining-a-local-copy-of-your-data-in-aws-local-zones/

This post is written by Leonardo Solano, Senior Hybrid Cloud SA and Obed Gutierrez, Solutions Architect, Enterprise.

This post covers data replication strategies to back up your data into AWS Local Zones. These strategies include database replication, file based and object storage replication, and partner solutions for Amazon Elastic Compute Cloud (Amazon EC2).

Customers running workloads in AWS Regions are likely to require a copy of their data in their operational location for either their backup strategy or data residency requirements. To help with these requirements, you can use Local Zones.

Local Zones is an AWS infrastructure deployment that places compute, storage, database, and other select AWS services close to large population and industry centers. With Local Zones, customers can build and deploy workloads to comply with state and local data residency requirements in sectors such as healthcare, financial services, gaming, and government.

Solution overview

This post assumes the database source is Amazon Relational Database Service (Amazon RDS). To backup an Amazon RDS database to Local Zones, there are three options:

Figure 1. Amazon RDS replication to Local Zones with AWS DMS

To replicate data, AWS DMS needs a source and a target database. The source database should be your existing Amazon RDS database. The target database is placed in an EC2 instance in the Local Zone. A replication job is created in AWS DMS, which maintains the source and target databases in sync. The replicated database in the Local Zone can be accessed through a VPN. Your database administrator can directly connect to the database engine with your preferred tool.

With this architecture, you can maintain a locally accessible copy of your databases, allowing you to comply with regulatory requirements.

Prerequisites

The following prerequisites are required before continuing:

An AWS Account with Administrator permissions;
Installation of the latest version of AWS Command Line Interface (AWS CLI v2);
An Amazon RDS database.

Walkthrough

1. Enabling Local Zones

First, you must enable Local Zones. Make sure that the intended Local Zone is parented to the AWS Region where the environment is running. Edit the commands to match your parameters, group-name makes reference to your local zone group and region to the region identifier to use.

aws ec2 modify-availability-zone-group \
  --region us-east-1 \
  --group-name us-east-1-qro-1\
  --opt-in-status opted-in

If you have an error when calling the ModifyAvailabilityZoneGroup operation, you must sign up for the Local Zone.

After enabling the Local Zone, you must extend the VPC to the Local Zone by creating a subnet in the Local Zone:

aws ec2 create-subnet \
  --region us-east-1 \
  --availability-zone us-east-1-qro-1a \
  --vpc-id vpc-02a3eb6585example \
  --cidr-block my-subnet-cidr

If you need a step-by-step guide, refer to Getting started with AWS Local Zones. Enabling Local Zones is free of charge. Only deployed services in the Local Zone incur billing.

2. Set up your target database

Now that you have the Local Zone enabled with a subnet, set up your target database instance in the Local Zone subnet that you just created.

You can use AWS CLI to launch it as an EC2 instance:

aws ec2 run-instances \
  --region us-east-1 \
  --subnet-id subnet-08fc749671example \
  --instance-type t3.medium \
  --image-id ami-0abcdef123example \
  --security-group-ids sg-0b0384b66dexample \
  --key-name my-key-pair

You can verify that your EC2 instance is running with the following command:

aws ec2 describe-instances --filters "Name=availability-zone,Values=us-east-1-qro-1a" --query "Reservations[].Instances[].InstanceId"

Output:

$ ["i-0cda255374example"]

Note that not all instance types are available in Local Zones. You can verify it with the following AWS CLI command:

aws ec2 describe-instance-type-offerings --location-type "availability-zone" \
--filters Name=location,Values=us-east-1-qro-1a --region us-east-1

Once you have your instance running in the Local Zone, you can install the database engine matching your source database. Here is an example of how to install MariaDB:

Updates all packages to the latest OS versionsudo yum update -y
Install MySQL server on your instance, this also creates a systemd servicesudo yum install -y mariadb-server
Enable the service created in previous stepsudo systemctl enable mariadb
Start the MySQL server service on your Amazon Linux instancesudo systemctl start mariadb
Set root user password and improve your DB securitysudo mysql_secure_installation

You can confirm successful installation with these commands:

mysql -h localhost -u root -p
SHOW DATABASES;

3. Configure databases for replication

In order for AWS DMS to replicate ongoing changes, you must use change data capture (CDC), as well as set up your source and target database accordingly before replication:

Source database:

Make sure that the binary logs are available to AWS DMS:

call mysql.rds_set_configuration('binlog retention hours', 24);

Set the binlog_format parameter to “ROW“.
Set the binlog_row_image parameter to “Full“.
If you are using Read replica as source, then set the log_slave_updates parameter to TRUE.

For detailed information, refer to Using a MySQL-compatible database as a source for AWS DMS, or sources for your migration if your database engine is different.

Target database:

Create a user for AWS DMS that has read/write privileges to the MySQL-compatible database. To create the necessary privileges, run the following commands.

CREATE USER ''@'%' IDENTIFIED BY '';
GRANT ALTER, CREATE, DROP, INDEX, INSERT, UPDATE, DELETE, SELECT ON .* TO 
''@'%';
GRANT ALL PRIVILEGES ON awsdms_control.* TO ''@'%';

Disable foreign keys on target tables, by adding the next command in the Extra connection attributes section of the AWS DMS console for your target endpoint.

Initstmt=SET FOREIGN_KEY_CHECKS=0;

Set the database parameter local_infile = 1 to enable AWS DMS to load data into the target database.

4. Set up AWS DMS

Now that you have our Local Zone enabled with the target database ready and the source database configured, you can set up AWS DMS Replication instance.

Go to AWS DMS in the AWS Management Console, and under Migrate data select Replication Instances, then select the Create Replication button:

This shows the Create replication Instance, where you should fill up the parameters required:

Note that High Availability is set to Single-AZ, as this is a test workload, while Multi-AZ is recommended for Production workloads.

Refer to the AWS DMS replication instance documentation for details about how to size your replication instance.

Important note

To allow replication, make sure that you set up the replication instance in the VPC that your environment is running, and configure security groups from and to the source and target database.

Now you can create the DMS Source and Target endpoints:

5. Set up endpoints

Source endpoint:

In the AWS DMS console, select Endpoints, select the Create endpoint button, and select Source endpoint option. Then, fill the details required:

Make sure you select your RDS instance as Source by selecting the check box as show in the preceding figure. Moreover, include access to endpoint database details, such as user and password.

You can test your endpoint connectivity before creating it, as shown in the following figure:

If your test is successful, then you can select the Create endpoint button.

Target endpoint:

In the same way as the Source in the console, select Endpoints, select the Create endpoint button, and select Target endpoint option, then enter the details required, as shown in the following figure:

In the Access to endpoint database section, select Provide access information manually option, next add your Local Zone target database connection details as shown below. Notice that Server name value, should be the IP address of your target database.

Make sure you go to the bottom of the page and configure Extra connection attributes in the Endpoint settings, as described in the Configure databases for replication section of this post:

Like the source endpoint, you can test your endpoint connection before creating it.

6. Create the replication task

Once the endpoints are ready, you can create the migration task to start the replication. Under the Migrate Data section, select Database migration tasks, hit the Create task button, and configure your task:

Select Migrate existing data and replicate ongoing changes in the Migration type parameter.

Enable Task logs under Task Settings. This is recommended as it can help you with troubleshooting purposes.

In Table mappings, include the schema you want to replicate to the Local Zone database:

Once you have defined Task Configuration, Task Settings, and Table Mappings, you can proceed to create your database migration task.

This will trigger your migration task. Now wait until the migration task completes successfully.

7. Validate replicated database

After the replication job completes the Full Load, proceed to validate at your target database. Connect to your target database and run the following commands:

USE example;
SHOW TABLES;

As a result you should see the same tables as the source database.

MySQL [example]> SHOW TABLES;
+----------------------------+
| Tables_in_example          |
+----------------------------+
| actor                      |
| address                    |
| category                   |
| city                       |
| country                    |
| customer                   |
| customer_list              |
| film                       |
| film_actor                 |
| film_category              |
| film_list                  |
| film_text                  |
| inventory                  |
| language                   |
| nicer_but_slower_film_list |
| payment                    |
| rental                     |
| sales_by_film_category     |
| sales_by_store             |
| staff                      |
| staff_list                 |
| store                      |
+----------------------------+
22 rows in set (0.06 sec)

If you get the same tables from your source database, then congratulations, you’re set! Now you can maintain and navigate a live copy of database in the Local Zone for data residency purposes.

Clean up

When you have finished this tutorial, you can delete all the resources that have been deployed. You can do this in the Console or by running the following commands in the AWS CLI:

Delete target DB:

aws ec2 terminate-instances --instance-ids i-abcd1234

Decommision AWS DMS

Replication Task:

aws dms delete-replication-task --replication-task-arn arn:aws:dms:us-east-1:111111111111:task:K55IUCGBASJS5VHZJIIEXAMPLE

Endpoints:

aws dms delete-endpoint --endpoint-arn arn:aws:dms:us-east-1:111111111111:endpoint:OUJJVXO4XZ4CYTSEG5XEXAMPLE

Replication instance:

aws dms delete-replication-instance --replication-instance-arn us-east-1:111111111111:rep:T3OM7OUB5NM2LCVZF7JEXAMPLE

Delete Local Zone subnet

aws ec2 delete-subnet --subnet-id subnet-9example

Conclusion

Local Zones is a useful tool for running applications with low latency requirements or data residency regulations. In this post, you have learned how to use AWS DMS to seamlessly replicate your data to Local Zones. With this architecture you can efficiently maintain a local copy of your data in Local Zones and access it securly.

If you are interested on how to automate your workloads deployments in Local Zones, make sure you check this workshop.

Enabling highly available connectivity from on premises to AWS Local Zones

2023-10-18 Macey Neff

Post Syndicated from Macey Neff original https://aws.amazon.com/blogs/compute/enabling-highly-available-connectivity-from-on-premises-to-aws-local-zones/

This post is written by Leonardo Solano, Senior Hybrid Cloud SA and Robert Belson SA Developer Advocate.

Planning your network topology is a foundational requirement of the reliability pillar of the AWS Well-Architected Framework. REL02-BP02 defines how to provide redundant connectivity between private networks in the cloud and on-premises environments using AWS Direct Connect for resilient, redundant connections using AWS Site-to-Site VPN, or AWS Direct Connect failing over to AWS Site-to-Site VPN. As more customers use a combination of on-premises environments, Local Zones, and AWS Regions, they have asked for guidance on how to extend this pillar of the AWS Well-Architected Framework to include Local Zones. As an example, if you are on an application modernization journey, you may have existing Amazon EKS clusters that have dependencies on persistent on-premises data.

AWS Local Zones enables single-digit millisecond latency to power applications such as real-time gaming, live streaming, augmented and virtual reality (AR/VR), virtual workstations, and more. Local Zones can also help you meet data sovereignty requirements in regulated industries such as healthcare, financial services, and the public sector. Additionally, enterprises can leverage a hybrid architecture and seamlessly extend their on-premises environment to the cloud using Local Zones. In the example above, you could extend Amazon EKS clusters to include node groups in a Local Zone (or multiple Local Zones) or on premises using AWS Outpost rack.

To provide connectivity between private networks in Local Zones and on-premises environments, customers typically consider Direct Connect or software VPNs available in the AWS Marketplace. This post provides a reference implementation to eliminate single points of failure in connectivity while offering automatic network impairment detection and intelligent failover using both Direct Connect and software VPNs in AWS Market place. Moreover, this solution minimizes latency by ensuring traffic does not hairpin through the parent AWS Region to the Local Zone.

Solution overview

In Local Zones, all architectural patterns based on AWS Direct Connect follow the same architecture as in AWS Regions and can be deployed using the AWS Direct Connect Resiliency Toolkit. As of the date of publication, Local Zones do not support AWS managed Site-to-Site VPN (view latest Local Zones features). Thus, for customers that have access to only a single Direct Connect location or require resiliency beyond a single connection, this post will demonstrate a solution using an AWS Direct Connect failover strategy with a software VPN appliance. You can find a range of third-party software VPN appliances as well as the throughput per VPN tunnel that each offering provides in the AWS Marketplace.

Prerequisites:

To get started, make sure that your account is opt-in for Local Zones and configure the following:

Extend a Virtual Private Cloud (VPC) from the Region to the Local Zone, with at least 3 subnets. Use Getting Started with AWS Local Zones as a reference.
1. Public subnet in Local Zone (public-subnet-1)
2. Private subnets in Local Zone (private-subnet-1 and private-subnet-2)
3. Private subnet in the Region (private-subnet-3)
4. Modify DNS attributes in your VPC, including both “enableDnsSupport” and “enableDnsHostnames”;
Attach an Internet Gateway (IGW) to the VPC;
Attach a Virtual Private Gateway (VGW) to the VPC;
Create an ec2 vpc-endpoint attached to the private-subnet-3;
Define the following routing tables (RTB):
1. Private-subnet-1 RTB: enabling propagation for VGW;
2. Private-subnet-2 RTB: enabling propagation for VGW;
3. Public-subnet-1 RTB: with a default route with IGW-ID as the next hop;
Configure a Direct Connect Private Virtual Interface (VIF) from your on-premises environment to Local Zones Virtual Gateway’s VPC. For more details see this post: AWS Direct Connect and AWS Local Zones interoperability patterns;
Launch any software VPN appliance from AWS Marketplace on Public-subnet-1. In this blog post on simulating Site-to-Site VPN customer gateways using strongSwan, you can find an example that provides the steps to deploy a third-party software VPN in AWS Region;
Capture the following parameters from your environment:
1. Software VPN Elastic Network Interface (ENI) ID
2. Private-subnet-1 RTB ID
3. Probe IP, which must be an on-premises resource that can respond to Internet Control Message Protocol (ICMP) requests.

High level architecture

This architecture requires a utility Amazon Elastic Compute Cloud (Amazon EC2) instance in a private subnet (private-subnet-2), sending ICMP probes over the Direct Connect connection. Once the utility instance detects lost packets to on-premises network from the Local Zone it initiates a failover by adding a static route with the on-premises CIDR range as the destination and the VPN Appliance ENI-ID as the next hop in the production private subnet (private-subnet-1), taking priority over the Direct Connect propagated route. Once healthy, this utility will revert back to the default route to the original Direct Connect connection.

On-premises considerations

To add redundancy in the on-premises environment, you can use two routers using any First Hop Redundancy Protocol (FHRP) as Hot Standby Router Protocol (HSRP) or Virtual Router Redundancy Protocol (VRRP). The router connected to the Direct Connect link has the highest priority, taking the Primary role in the FHRP process while the VPN router remain the Secondary router. The failover mechanism in the FHRP relies on interface or protocol state as BGP, which triggers the failover mechanism.

Figure 1. High level HA architecture for Software VPN

Figure 2. Failover by modifying the production subnet RTB

Step-by-step deployment

Create IAM role with permissions to create and delete routes in your private-subnet-1 route table:

Create ec2-role-trust-policy.json file on your local machine:

cat > ec2-role-trust-policy.json <<EOF
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "ec2.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
EOF

Create your EC2 IAM role, such as my_ec2_role:

aws iam create-role --role-name my_ec2_role --assume-role-policy-document file://ec2-role-trust-policy.json

Create a file with the necessary permissions to attach to the EC2 IAM role. Name it ec2-role-iam-policy.json.

aws iam create-policy --policy-name my-ec2-policy --policy-document file://ec2-role-iam-policy.json

Create the IAM policy and attach the policy to the IAM role my_ec2_role that you previously created:

aws iam create-policy --policy-name my-ec2-policy --policy-document file://ec2-role-iam-policy.json

aws iam attach-role-policy --policy-arn arn:aws:iam::<account_id>:policy/my-ec2-policy --role-name my_ec2_role

Create an instance profile and attach the IAM role to it:

aws iam create-instance-profile –instance-profile-name my_ec2_instance_profile
aws iam add-role-to-instance-profile –instance-profile-name my_ec2_instance_profile –role-name my_ec2_role

Launch and configure your utility instance

Capture the Amazon Linux 2 AMI ID through CLI:

aws ec2 describe-images --filters "Name=name,Values=amzn2-ami-kernel-5.10-hvm-2.0.20230404.1-x86_64-gp2" | grep ImageId

Sample output:

"ImageId": "ami-069aabeee6f53e7bf",

Create an EC2 key for the utility instance:

aws ec2 create-key-pair --key-name MyKeyPair --query 'KeyMaterial' --output text > MyKeyPair.pem

Launch the utility instance in the Local Zone (replace the variables with your account and environment parameters):

aws ec2 run-instances --image-id ami-069aabeee6f53e7bf --key-name MyKeyPair --count 1 --instance-type t3.medium  --subnet-id <private-subnet-2-id> --iam-instance-profile Name=my_ec2_instance_profile_linux

Deploy failover automation shell script on the utility instance

Create the following shell script in your utility instance (replace the health check variables with your environment values):

cat > vpn_monitoring.sh <<EOF
// Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
// SPDX-License-Identifier: MIT-0
# Health Check variables
Wait_Between_Pings=2
RTB_ID=<private-subnet-1-rtb-id>
PROBE_IP=<probe-ip>
remote_cidr=<remote-cidr>
GW_ENI_ID=<software-vpn-eni_id>
Active_path=DX

echo `date` "-- Starting VPN monitor"

while [ . ]; do
  # Check health of main VPN Appliance path to remote probe ip
  pingresult=`ping -c 3 -W 1 $PROBE_IP | grep time= | wc -l`
  # Check to see if any of the health checks succeeded
  if ["$pingresult" == "0"]; then
    if ["$Active_path" == "DX"]; then
      echo `date` "-- Direct Connect failed. Failing over vpn"
      aws ec2 create-route --route-table-id $RTB_ID --destination-cidr-block $remote_cidr --network-interface-id $GW_ENI_ID --region us-east-1
      Active_path=VPN
      DX_tries=10
      echo "probe_ip: unreachable – active_path: vpn"
    else
      echo "probe_ip: unreachable – active_path: vpn"
    fi
  else     
    if ["$Active_path" == "VPN"]; then
      let DX_tries=DX_tries-1
      if ["$DX_tries" == "0"]; then
        echo `date` "-- failing back to Direct Connect"
        aws ec2 delete-route --route-table-id $RTB_ID --destination-cidr-block $remote_cidr --region us-east-1
        Active_path=DX
        echo "probe_ip: reachable – active_path: Direct Connect"
      else
        echo "probe_ip: reachable – active_path: vpn"
      fi
    else
      echo "probe:ip: reachable – active_path: Direct Connect"	    
    fi
  fi    
done EOF

Modify permissions to your shell script file:

chmod +x vpn_monitoring.sh

Start the shell script:

./vpn_monitoring.sh

Test the environment

Figure 3. Failover process between Direct Connect and software VPN

Simulate failure of the Direct Connect link, breaking the available path from the Local Zone to the on-premises environment. You can simulate the failure using the failure test feature in Direct Connect console.

Figure 4. Bringing BGP session down

Figure 5. Setting the failure time

In the utility instance you will see the following logs:

Thu Sep 21 14:39:34 UTC 2023 -- Direct Connect failed. Failing over vpn

The shell script in action will detect packet loss by ICMP probes against a probe IP destination on premises, triggering the failover process. As a result, it will make an API call (aws ec2 create-route) to AWS using the EC2 interface endpoint.

The script will create a static route in the private-subnet-1-RTB toward on-premises CIDR with the VPN Elastic-Network ID as the next hop.

Figure 6. private-subnet-1-RTB during the test

The FHRP mechanisms detect the failure in the Direct Connect Link and then reduce the FHRP priority on this path, which triggers the failover to the secondary link through the VPN path.

Once you cancel the test or the test finishes, the failback procedure will revert the private-subnet-1 route table to its initial state, resulting in the following logs to be emitted by the utility instance:

Thu Sep 21 14:42:34 UTC 2023 -- failing back to Direct Connect

Figure 7. private-subnet-1 route table initial state

Cleaning up

To clean up your AWS based resources, run following AWS CLI commands:

aws ec2 terminate-instances --instance-ids <your-utility-instance-id>
aws iam delete-instance-profile --instance-profile-name my_ec2_instance_profile
aws iam delete-role my_ec2_role

Conclusion

This post demonstrates how to create a failover strategy for Local Zones using the same resilience mechanisms already established in the AWS Regions. By leveraging Direct Connect and software VPNs, you can achieve high availability in scenarios where you are constrained to a single Direct Connect location due to geographical limitations. In the architectural pattern illustrated in this post, the failover strategy relies on a utility instance with least-privileged permissions. The utility instance identifies network impairment and dynamically modify your production route tables to keep the connectivity established from a Local Zone to your on-premises location. This same mechanism provides capabilities to automatically failback from the software VPN to Direct Connect once the utility instance validates that the Direct Connect Path is sufficiently reliable to avoid network flapping. To learn more about Local Zones, you can visit the AWS Local Zones user guide.

Training machine learning models on premises for data residency with AWS Outposts rack

2023-10-17 Macey Neff

Post Syndicated from Macey Neff original https://aws.amazon.com/blogs/compute/training-machine-learning-models-on-premises-for-data-residency-with-aws-outposts-rack/

This post is written by Sumit Menaria, Senior Hybrid Solutions Architect, and Boris Alexandrov, Senior Product Manager-Tech.

In this post, you will learn how to train machine learning (ML) models on premises using AWS Outposts rack and datasets stored locally in Amazon S3 on Outposts. With the rise in data sovereignty and privacy regulations, organizations are seeking flexible solutions that balance compliance with the agility of cloud services. Healthcare and financial sectors, for instance, harness machine learning for enhanced patient care and transaction safety, all while upholding strict confidentiality. Outposts rack provide a seamless hybrid solution by extending AWS capabilities to any on-premises or edge location, providing you the flexibility to store and process data wherever you choose. Data sovereignty regulations are highly nuanced and vary by country. This blog post addresses data sovereignty scenarios where training datasets need to be stored and processed in a geographic location without an AWS Region.

Amazon S3 on Outposts

As you prepare datasets for ML model training, a key component to consider is the storage and retrieval of your data, especially when adhering to data residency and regulatory requirements.

You can store training datasets as object data in local buckets with Amazon S3 on Outposts. In order to access S3 on Outposts buckets for data operations, you need to create access points and route the requests via an S3 on Outposts endpoint associated with your VPC. These endpoints are accessible both from within the VPC as well as on premises via the local gateway.

Solution overview

Using this sample architecture, you are going to train a YOLOv5 model on a subset of categories of the Common Objects in Context (COCO) dataset. The COCO dataset is a popular choice for object detection tasks offering a wide variety of image categories with rich annotations. It is also available under the AWS Open Data Sponsorship Program via fast.ai datasets.

This example is based on an architecture using an Amazon Elastic Compute Cloud (Amazon EC2) g4dn.8xlarge instance for model training on the Outposts rack. Depending on your Outposts rack compute configuration, you can use different instance sizes or types and make adjustments to training parameters, such as learning rate, augmentation, or model architecture accordingly. You will be using the AWS Deep Learning AMI to launch your EC2 instance, which comes with frameworks, dependencies, and tools to accelerate deep learning in the cloud.

For the training dataset storage, you are going to use an S3 on Outposts bucket and connect to it from your on-premises location via the Outposts local gateway. The local gateway routing mode can be direct VPC routing or Customer-owned IP (CoIP) depending on your workload’s requirements. Your local gateway routing mode will determine the S3 on Outposts endpoint configuration that you need to use.

1. Download and populate training dataset

You can download the training dataset to your local client machine using the following AWS CLI command:

aws s3 sync s3://fast-ai-coco/ .

After downloading, unzip annotations_trainval2017.zip, val2017.zip and train2017.zip files.

$ unzip annotations_trainval2017.zip
$ unzip val2017.zip
$ unzip train2017.zip

In the annotations folder, the files which you need to use are instances_train2017.json and instances_val2017.json, which contain the annotations corresponding to the images in the training and validation folders.

2. Filtering and preparing training dataset

You are going to use the training, validation, and annotation files from the COCO dataset. The dataset contains over 100K images across 80 categories, but to keep the training simple, you can focus on 10 specific categories of popular food items in supermarket shelves: banana, apple, sandwich, orange, broccoli, carrot, hot dog, pizza, donut, and cake. (Because who doesn’t like a bite after a model training.) Applications for training such models could be self-stock monitoring, automatic checkouts, or product placement optimization using computer vision in retail stores. Since YOLOv5 uses a specific annotations (labels) format, you need to convert the COCO dataset annotation to the target annotation.

3. Load training dataset to S3 on Outposts bucket

In order to load the training data on S3 on Outposts you need to first create a new bucket using the AWS Console or CLI, as well as an access point and endpoint for the VPC. You can use a bucket style access point alias to load the data, using the following CLI command:

$ cd /your/local/target/upload/path/
$ aws s3 sync . s3://trainingdata-o0a2b3c4d5e6d7f8g9h10f--op-s3

Replace the alias in the above CLI command with corresponding bucket alias name for your environment. The s3 sync command syncs the folders in the same structure containing the images and labels for the training and validation data, which you will be using later for loading it to the EC2 instance for model training.

4. Launch the EC2 instance

You can launch the EC2 instance with the Deep Learning AMI based on this getting started tutorial. For this exercise, the Deep Learning AMI GPU PyTorch 2.0.1 (Ubuntu 20.04) has been used.

5. Download YOLOv5 and install dependencies

Once you ssh into the EC2 instance, activate the pre-configured PyTorch environment and clone the YOLOv5 repository.

$ ssh -i /path/key-pair-name.pem ubuntu@instance-ip-address
$ conda activate pytorch
$ git clone https://github.com/ultralytics/yolov5.git
$ cd yolov5

Then, and install its necessary dependencies.

$ pip install -U -r requirements.txt

To ensure the compatibility between various packages, you may need to modify existing packages on your instance running the AWS Deep Learning AMI.

6. Load the training dataset from S3 on Outposts to the EC2 instance

For copying the training dataset to the EC2 instance, use the s3 sync CLI command and point it to your local workspace.

aws s3 sync s3://trainingdata-o0a2b3c4d5e6d7f8g9h10f--op-s3 .

7. Prepare the configuration files

Create the data configuration files to reflect your dataset’s structure, categories, and other parameters.
data.yml

train: /your/ec2/path/to/data/images/train 
val: /your/ec2/path/to/data/images/val 
nc: 10 # Number of classes in your dataset 
names: ['banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake']

Create the model training parameter file using the sample configuration file from the YOLOv5 repository. You will need to update the number of classes to 10, but you can also change other parameters as you fine tune the model for performance.

parameters.yml:

# Parameters
nc: 10 # number of classes in your dataset
depth_multiple: 0.33 # model depth multiple
width_multiple: 0.50 # layer channel multiple
anchors:
- [10,13, 16,30, 33,23] # P3/8
- [30,61, 62,45, 59,119] # P4/16
- [116,90, 156,198, 373,326] # P5/32

# Backbone
backbone:
[[-1, 1, Conv, [64, 6, 2, 2]], # 0-P1/2
[-1, 1, Conv, [128, 3, 2]], # 1-P2/4
[-1, 3, C3, [128]],
[-1, 1, Conv, [256, 3, 2]], # 3-P3/8
[-1, 6, C3, [256]],
[-1, 1, Conv, [512, 3, 2]], # 5-P4/16
[-1, 9, C3, [512]],
[-1, 1, Conv, [1024, 3, 2]], # 7-P5/32
[-1, 3, C3, [1024]],
[-1, 1, SPPF, [1024, 5]], # 9
]

# Head
head:
[[-1, 1, Conv, [512, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 6], 1, Concat, [1]], # cat backbone P4
[-1, 3, C3, [512, False]], # 13

[-1, 1, Conv, [256, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 4], 1, Concat, [1]], # cat backbone P3
[-1, 3, C3, [256, False]], # 17 (P3/8-small)

[-1, 1, Conv, [256, 3, 2]],
[[-1, 14], 1, Concat, [1]], # cat head P4
[-1, 3, C3, [512, False]], # 20 (P4/16-medium)

[-1, 1, Conv, [512, 3, 2]],
[[-1, 10], 1, Concat, [1]], # cat head P5
[-1, 3, C3, [1024, False]], # 23 (P5/32-large)

[[17, 20, 23], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5)

At this stage, the directory structure should look like below:

8. Train the model

You can run the following command to train the model. The batch-size and epochs can vary depending on your vCPU and GPU configuration and you can further modify these values or add weights as you try with additional rounds of training.

$ python3 train.py —img-size 640 —batch-size 32 —epochs 50 —data /your/path/to/configuation_files/dataconfig.yaml —cfg /your/path/to/configuation_files/parameters.yaml

You can monitor the model performance as it iterates through each epoch

Starting training for 50 epochs...

Epoch GPU_mem box_loss obj_loss cls_loss Instances Size
0/49 6.7G 0.08403 0.05 0.04359 129 640: 100%|██████████| 455/455 [06:14<00:00,
Class Images Instances P R mAP50 mAP50-95: 100%|██████████| 9/9 [00:05<0
all 575 2114 0.216 0.155 0.0995 0.0338

Epoch GPU_mem box_loss obj_loss cls_loss Instances Size
1/49 8.95G 0.07131 0.05091 0.02365 179 640: 100%|██████████| 455/455 [06:00<00:00,
Class Images Instances P R mAP50 mAP50-95: 100%|██████████| 9/9 [00:04<00:00, 1.97it/s]
all 575 2114 0.242 0.144 0.11 0.04

Epoch GPU_mem box_loss obj_loss cls_loss Instances Size
2/49 8.96G 0.07068 0.05331 0.02712 154 640: 100%|██████████| 455/455 [06:01<00:00, 1.26it/s]
Class Images Instances P R mAP50 mAP50-95: 100%|██████████| 9/9 [00:04<00:00, 2.23it/s]
all 575 2114 0.185 0.124 0.0732 0.0273

Once the model training finishes, you can see the validation results against the batch of validation dataset and evaluate the model’s performance using standard metrics.

Validating runs/train/exp/weights/best.pt...
Fusing layers... 
YOLOv5 summary: 157 layers, 7037095 parameters, 0 gradients, 15.8 GFLOPs
                 Class     Images  Instances          P          R      mAP50   mAP50-95: 100%|██████████| 9/9 [00:06<00:00,  1.48it/s]
                   all        575       2114      0.282      0.222       0.16     0.0653
                banana        575        280      0.189      0.143     0.0759      0.024
                 apple        575        186      0.206      0.085     0.0418     0.0151
              sandwich        575        146      0.368      0.404      0.343      0.146
                orange        575        188      0.265      0.149     0.0863     0.0362
              broccoli        575        226      0.239      0.226      0.138     0.0417
                carrot        575        310      0.182      0.203     0.0971     0.0267
               hot dog        575        108      0.242      0.111     0.0929     0.0311
                 pizza        575        208      0.405      0.418      0.333       0.15
                 donut        575        228      0.352      0.241       0.19     0.0973
                  cake        575        234      0.369      0.235      0.203     0.0853
Results saved to runs/train/exp

Use the model for inference

In order to test the model performance, you can test it by passing a new image which is from a shelf in a supermarket with some of the objects that you trained the model on.

(pytorch) ubuntu@ip-172-31-48-165:~/workspace/source/yolov5$ python3 detect.py --weights /home/ubuntu/workspace/source/yolov5/runs/train/exp/weights/best.pt —source /home/ubuntu/workspace/inference/Inference-image.jpg
<<omitted output>>
Fusing layers...
YOLOv5 summary: 157 layers, 7037095 parameters, 0 gradients, 15.8 GFLOPs
image 1/1 /home/ubuntu/workspace/inference/Inference-image.jpg: 640x640 4 apples, 6 oranges, 1 cake, 5.3ms
Speed: 0.6ms pre-process, 5.3ms inference, 1.1ms NMS per image at shape (1, 3, 640, 640)
Results saved to runs/detect/exp7

The response from the preceding model inference indicates that it predicted 4 apples, 6 oranges, and 1 cake in the image. The prediction may differ based on the image type used, and while a single sample image can give you a sense of the model’s performance, it will not provide a comprehensive understanding. For a more complete evaluation, it’s always recommended to test the model on a larger and more diverse set of validation images. Additional training and tuning of your parameters or datasets may be required to achieve better prediction.

Clean Up

You can terminate the following resources used in this tutorial after you have successfully trained and tested the model:

Terminate the EC2 instance used for the model training;
Delete the S3 on Outposts resources if you no longer need the training dataset.

Conclusion

The seamless integration of compute on AWS Outposts with S3 on Outposts, coupled with on-premises ML model training capabilities, offers organizations a robust solution to tackle data residency requirements. By setting up this environment, you can ensure that your datasets remain within desired geographies while still utilizing advanced machine learning models and cloud infrastructure. In addition to this, it remains essential to diligently review and fine-tune your implementation strategies and guard rails in place to ensure your data remains within the boundaries of your regulatory requirements. You can read more about architecting for data residency in this blog post.

Reference

COCO – Common Objects in Context – fast.ai datasets was accessed on 20-08-2023 from https://registry.opendata.aws/fast-ai-coco.