Tag Archives: AWS re:Invent

New – Site-to-Site Connectivity with AWS Direct Connect SiteLink

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/new-site-to-site-connectivity-with-aws-direct-connect-sitelink/

We are launching AWS Direct Connect SiteLink, a new capability of AWS Direct Connect that lets you create connections between your on-premises networks through the AWS global network backbone.

Until today, when you needed direct connectivity between your data centers or branch offices, you had to rely on the public internet or expensive, hard-to-deploy fixed networks. These are geographically constrained and can be tied to long-term contracts. This rigidity becomes a pain point as you expand your business globally. In turn, you’re required to create custom workarounds to interconnect networks from different providers, which increases your operating costs.

Starting today, you may connect your sites through Direct Connect locations, without sending your traffic through an AWS Region. We have 108 Direct Connect locations available in 32 countries as I am writing this post, located across Africa, the Americas, Asia-Pacific, Europe, and the Middle East. Traffic flows from one Direct Connect location to another following the shortest possible path. You no longer need to connect through the closest AWS Region and manage and configure an AWS Transit Gateway for site-to-site network connectivity.

You can take advantage of Direct Connect’s reliability and global footprint to build a network that grows with your business, with no long-term contracts, flexible pay-as-you-go pricing, and a wide range of port-speeds, from 50 Mbps to 100 Gbps. SiteLink also integrates with other AWS services, letting you reach your VPCs, other AWS services, and your on-premises networks from your Direct Connect connections.

When talking about network topology, a small diagram is always more descriptive than long phrases.

The following diagram shows the way that you use Direct Connect today. Direct Connect is currently optimized to let you reach your AWS Resources running in any Region as quickly as possible. Sending data from one Direct Connect location to another is not possible.

Once you connect your locations (NY1, AM3, Paris, and TY2 in the diagram) to a Direct Connect gateway, those connections can reach any AWS Region (except the two AWS China Regions). No peering between Regions is necessary, because Direct Connect gateways are global resources.

Site-to-site connectivity without SiteLink

The following diagram shows how you connect multiple sites using SiteLink. The data flows between Direct Connect locations without going through an AWS Region.

Site-to-site connectivity with SiteLink

How to Get Started?
Configuring these connections is very similar to what you do today. The first step is to connect my network to Direct Connect locations. After that, SiteLink can be enabled or disabled in minutes.

Using the AWS Management Console, I navigate to the Direct Connect section, and I select Create virtual interface to create a virtual interface. Under the Additional Settings section, I make sure the SiteLink switch is turned on. I repeat this on the other virtual interfaces, one per site that I want to connect.

SiteLink - enable sitelink for VIF
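If you prefer to automate this step, the sketch below shows how the same virtual interface might be created with boto3. Treat it as a hedged sketch rather than the documented procedure: it assumes your SDK version exposes the enableSiteLink flag, and the connection ID, gateway ID, VLAN, and ASN are placeholders.

import boto3

# Hypothetical sketch: create a private virtual interface with SiteLink enabled.
# All identifiers below are placeholders.
dx = boto3.client("directconnect")

response = dx.create_private_virtual_interface(
    connectionId="dxcon-EXAMPLE",
    newPrivateVirtualInterface={
        "virtualInterfaceName": "ny1-sitelink-vif",
        "vlan": 101,
        "asn": 65000,
        "directConnectGatewayId": "11111111-2222-3333-4444-555555555555",
        "enableSiteLink": True,  # the switch shown in the console screenshot above
    },
)
print(response["virtualInterfaceState"])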

I have access to similar monitoring dashboards and metrics published to CloudWatch. I select my virtual interface, and then navigate to the Monitoring tab (hopefully your VIF will have more data available than mine, which was created just for this post).

SiteLink VIF Monitoring

Availability and Pricing
You can connect your on-premises networks or branch offices to any of our Direct Connect locations available today, except in China.

Pricing is pay-as-you-go, with no commitment or recurring fees. In addition to existing Direct Connect charges, your monthly bill will include a price-per-hour for SiteLink virtual interfaces, as well as the cost of SiteLink data transfer. Check the pricing page to get the details.

Go ahead and start connecting your on-premises locations together with Direct Connect SiteLink!

— seb

New – Enhanced Dead-letter Queue Management Experience for Amazon SQS Standard Queues

Post Syndicated from Alex Casalboni original https://aws.amazon.com/blogs/aws/enhanced-dlq-management-sqs/

Hundreds of thousands of customers use Amazon Simple Queue Service (SQS) to build message-based applications to decouple and scale microservices, distributed systems, and serverless apps. When a message cannot be successfully processed by the queue consumer, you can configure SQS to store it in a dead-letter queue (DLQ).

As a software developer or architect, you’d like to examine and review unconsumed messages in your DLQs to figure out why they couldn’t be processed, identify patterns, resolve code errors, and ultimately reprocess these messages in the original queue. The life cycle of these unconsumed messages is part of your error-handling workflow, which is often manual and time consuming.

Today, I’m happy to announce the general availability of a new enhanced DLQ management experience for SQS standard queues that lets you easily redrive unconsumed messages from your DLQ to the source queue.

This new functionality is available in the SQS console and helps you focus on the important phase of your error handling workflow, which consists of identifying and resolving processing errors. With this new development experience, you can easily inspect a sample of the unconsumed messages and move them back to the original queue with a click, and without writing, maintaining, and securing any custom code. This new experience also takes care of redriving messages in batches, reducing overall costs.

DLQ and Lambda Processor Setup
If you’re already comfortable with the DLQ setup, then skip the setup and jump into the new DLQ redrive experience.

First, I create two queues: the source queue and the dead-letter queue.

I edit the source queue and configure the Dead-letter queue section. Here, I pick the DLQ and configure the Maximum receives, which is the maximum number of times a message can be received before it is sent to the DLQ. For this demonstration, I’ve set it to one. This means that every failed message goes to the DLQ immediately. In a real-world environment, you might want to set a higher number depending on your requirements and based on what a failure means with respect to your application.

I also edit the DLQ to make sure that only my source queue is allowed to use this DLQ. This configuration is optional: when this Redrive allow policy is disabled, any SQS queue can use this DLQ. There are cases where you want to reuse a single DLQ for multiple queues. But it’s usually considered a best practice to set up an independent DLQ per source queue, to simplify the redrive phase without affecting cost. Keep in mind that you’re charged based on the number of API calls, not the number of queues.
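For reference, here is a minimal sketch of the same setup with boto3, using the RedrivePolicy and RedriveAllowPolicy queue attributes. The queue names are placeholders, and the maxReceiveCount of 1 mirrors this demo rather than a production recommendation.

import json
import boto3

sqs = boto3.client("sqs")

source = sqs.create_queue(QueueName="source-queue")["QueueUrl"]
dlq = sqs.create_queue(QueueName="dead-letter-queue")["QueueUrl"]

dlq_arn = sqs.get_queue_attributes(
    QueueUrl=dlq, AttributeNames=["QueueArn"]
)["Attributes"]["QueueArn"]
source_arn = sqs.get_queue_attributes(
    QueueUrl=source, AttributeNames=["QueueArn"]
)["Attributes"]["QueueArn"]

# Send messages from the source queue to the DLQ after one failed receive.
sqs.set_queue_attributes(
    QueueUrl=source,
    Attributes={
        "RedrivePolicy": json.dumps(
            {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": 1}
        )
    },
)

# Restrict the DLQ so that only the source queue can use it
# (the "Redrive allow policy" discussed above).
sqs.set_queue_attributes(
    QueueUrl=dlq,
    Attributes={
        "RedriveAllowPolicy": json.dumps(
            {"redrivePermission": "byQueue", "sourceQueueArns": [source_arn]}
        )
    },
)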

Once the DLQ is correctly set up, I need a processor. Let’s implement a simple message consumer using AWS Lambda.

The Lambda function written in Python will iterate over the batch of incoming messages, fetch two values from the message body, and print the sum of these two values.

import json

def lambda_handler(event, context):
    for record in event['Records']:
        payload = json.loads(record['body'])

        value1 = payload['value1']
        value2 = payload['value2']

        value_sum = value1 + value2
        print("the sum is %s" % value_sum)
        
    return "OK"

The code above assumes that each message’s body contains two integer values that can be summed, without dealing with any validation or error handling. As you can imagine, this will lead to trouble later on.

Before processing any messages, you must grant this Lambda function enough permissions to read messages from SQS and configure its trigger. For the IAM permissions, I use the managed policy named AWSLambdaSQSQueueExecutionRole, which grants permissions to invoke sqs:ReceiveMessage, sqs:DeleteMessage, and sqs:GetQueueAttributes.

I use the Lambda console to set up the SQS trigger. I could achieve the same from the SQS console too.
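A roughly equivalent setup from code might look like the sketch below; the account ID, function name, and batch size are illustrative assumptions.

import boto3

lambda_client = boto3.client("lambda")

# Attach the source queue to the (hypothetical) processing function.
lambda_client.create_event_source_mapping(
    EventSourceArn="arn:aws:sqs:us-east-1:123456789012:source-queue",
    FunctionName="sqs-sum-processor",
    BatchSize=10,
)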

Now I’m ready to process new messages using Send and receive messages for my source queue in the SQS console. I write {"value1": 10, "value2": 5} in the message body, and select Send message.
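The same test message can also be sent with a couple of lines of boto3 (the queue URL is a placeholder):

import json
import boto3

sqs = boto3.client("sqs")

sqs.send_message(
    QueueUrl="https://sqs.us-east-1.amazonaws.com/123456789012/source-queue",
    MessageBody=json.dumps({"value1": 10, "value2": 5}),
)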

When I look at the CloudWatch logs of my Lambda function, I see a successful invocation.

START RequestId: 637888a3-c98b-5c20-8113-d2a74fd9edd1 Version: $LATEST
the sum is 15
END RequestId: 637888a3-c98b-5c20-8113-d2a74fd9edd1
REPORT RequestId: 637888a3-c98b-5c20-8113-d2a74fd9edd1	Duration: 1.31 ms	Billed Duration: 2 ms	Memory Size: 128 MB	Max Memory Used: 39 MB	Init Duration: 116.90 ms	

Troubleshooting powered by DLQ Redrive
Now what if a different producer starts publishing messages with the wrong format? For example, {"value1": "10", "value2": 5}. The first number is a string and this is quite likely to become a problem in my processor.

In fact, this is what I find in the CloudWatch logs:

START RequestId: 542ac2ca-1db3-5575-a1fb-98ce9b30f4b3 Version: $LATEST
[ERROR] TypeError: can only concatenate str (not "int") to str
Traceback (most recent call last):
  File "/var/task/lambda_function.py", line 8, in lambda_handler
    value_sum = value1 + value2
END RequestId: 542ac2ca-1db3-5575-a1fb-98ce9b30f4b3
REPORT RequestId: 542ac2ca-1db3-5575-a1fb-98ce9b30f4b3	Duration: 1.69 ms	Billed Duration: 2 ms	Memory Size: 128 MB	Max Memory Used: 39 MB	

To figure out what’s wrong in the offending message, I use the new SQS redrive functionality, selecting DLQ redrive in my dead-letter queue.

I use Poll for messages and fetch all unconsumed messages from the DLQ.

And then I inspect the unconsumed message by selecting it.

The problem is clear, and I decide to update my processing code to handle this case properly. In an ideal world, this is an upstream issue that should be fixed in the message producer. But let’s assume that I can’t control that system and that it’s critically important for the business that I process this new type of message.

Therefore, I update the processing logic as follows:

import json

def lambda_handler(event, context):
    for record in event['Records']:
        payload = json.loads(record['body'])
        value1 = int(payload['value1'])
        value2 = int(payload['value2'])
        value_sum = value1 + value2
        print("the sum is %s" % value_sum)
        # do some more stuff
        
    return "OK"

Now that my code is ready to process the unconsumed message, I start a new redrive task from the DLQ to the source queue.

By default, SQS will redrive unconsumed messages to the source queue. But you could also specify a different destination and provide a custom velocity to set the maximum number of messages per second.
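The console handles the redrive for you, but as a hedged sketch, newer SDK versions expose a StartMessageMoveTask operation that backs the same workflow; the ARN and rate below are placeholders.

import boto3

sqs = boto3.client("sqs")

# Move unconsumed messages out of the DLQ back to their original source queue.
sqs.start_message_move_task(
    SourceArn="arn:aws:sqs:us-east-1:123456789012:dead-letter-queue",
    # DestinationArn can point to a different queue; when omitted, messages
    # return to the queue they originally came from.
    MaxNumberOfMessagesPerSecond=10,  # the "custom velocity" mentioned above
)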

I wait for the redrive task to complete by monitoring the redrive status in the console. This new section always shows the status of the most recent redrive task.

The message has been moved back to the source queue and successfully processed by my Lambda function. Everything looks fine in my CloudWatch logs.

START RequestId: 637888a3-c98b-5c20-8113-d2a74fd9edd1 Version: $LATEST
the sum is 15
END RequestId: 637888a3-c98b-5c20-8113-d2a74fd9edd1
REPORT RequestId: 637888a3-c98b-5c20-8113-d2a74fd9edd1	Duration: 1.31 ms	Billed Duration: 2 ms	Memory Size: 128 MB	Max Memory Used: 39 MB	Init Duration: 116.90 ms	

Available Today at No Additional Cost
Today you can start leveraging the new DLQ redrive experience to simplify your development and troubleshooting workflows, without any additional cost. This new console experience is available in all AWS Regions where SQS is available, and we’re looking forward to hearing your feedback.

Alex

Network Address Management and Auditing at Scale with Amazon VPC IP Address Manager

Post Syndicated from Steve Roberts original https://aws.amazon.com/blogs/aws/network-address-management-and-auditing-at-scale-with-amazon-vpc-ip-address-manager/

Managing, monitoring, and auditing IP address allocation for at-scale networks, as the growth in cloud workloads and connected devices continues at a rapid pace, is a complex, time-consuming, and potentially error-prone task. Traditionally, network administrators have resorted to using combinations of spreadsheets, home-grown tools, and scripts to track address assignments across multiple accounts, virtual private clouds (VPCs), and Regions. Manually updating spreadsheets when application development teams request IP address assignments takes time, and care, to avoid errors. Errors that go unnoticed can lead to address conflicts and subsequent downtime, causing serious operational and business issues. In turn, the time taken to make these updates, sometimes several days, causes delays in onboarding new applications or expanding existing applications, impacting the velocity of development teams. The need to keep those home-grown tools and scripts up-to-date and error-free also results in taking staff hours away from more strategic and business-impacting projects.

Today, I’m happy to announce Amazon VPC IP Address Manager, a new feature that provides network administrators with an automated IP management workflow. IPAM makes it easier for network administrators to organize, assign, monitor, and audit IP addresses in at-scale networks, lowering the management and monitoring burden and eliminating the manual processes that can lead to delays and unintended errors.

Amazon VPC IP Address Manager dashboard homepage

Introducing Amazon VPC IP Address Manager
IPAM enables management and auditing of IP address assignments across an organization’s accounts, Amazon Virtual Private Clouds (VPCs), and AWS Regions, using a single operational dashboard. From this centralized view, you can manage your IP addresses across AWS.

In each Region in which you have resources needing IP addresses, you create a regional pool. Pools are collections of CIDRs and help you to organize your IP space. Unused address space from your top-level pools can be used to fill your regional pools. Further, if you have applications or environments with different security needs, you can create additional pools. For example, you could create different pools for ‘dev’ and ‘prod’ environments if they are subject to different connectivity requirements. The screenshots below illustrate the process of creating a global pool and, from it, three regional pools. Although my example stops after configuring regional pools, in production, you would continue subdividing the regional pools further as needed.

Creating the global IPAM pool

Next, I configure a set of regional pools. Below, I’m creating a regional pool for my US East (N. Virginia) Region resources, scoped within my global pool.

Creating a regional pool, step 1

As part of configuring a regional pool, I must specify the CIDRs to provision from the global pool and can optionally enable automatic discovery of resources and rules for allocation.

Configuring a regional pool

After repeating the process of creating and configuring regional pools for my two remaining Regions, US East (Ohio) and Europe (Ireland) in this example, this is my final pool hierarchy. As I noted above, this hierarchy ends at a regional set of pools but could be subdivided further.

IPAM pool hierarchy
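For those who prefer code over the console, the sketch below creates a similar hierarchy with boto3. It is a hedged example only: the CIDRs, Regions, and descriptions are illustrative assumptions, and the console flow shown above accomplishes the same thing.

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

ipam = ec2.create_ipam(
    Description="demo-ipam",
    OperatingRegions=[
        {"RegionName": r} for r in ("us-east-1", "us-east-2", "eu-west-1")
    ],
)["Ipam"]

# Top-level (global) pool inside the IPAM's private scope.
top_pool = ec2.create_ipam_pool(
    IpamScopeId=ipam["PrivateDefaultScopeId"],
    AddressFamily="ipv4",
    Description="global-pool",
)["IpamPool"]
ec2.provision_ipam_pool_cidr(IpamPoolId=top_pool["IpamPoolId"], Cidr="10.0.0.0/8")

# Regional pool for US East (N. Virginia), carved out of the global pool.
regional_pool = ec2.create_ipam_pool(
    IpamScopeId=ipam["PrivateDefaultScopeId"],
    SourceIpamPoolId=top_pool["IpamPoolId"],
    Locale="us-east-1",
    AddressFamily="ipv4",
    Description="us-east-1-pool",
)["IpamPool"]
ec2.provision_ipam_pool_cidr(
    IpamPoolId=regional_pool["IpamPoolId"], Cidr="10.0.0.0/16"
)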

Once the IPAM pools have been configured, development teams and resources needing new IP address assignments are able to make use of an automated, self-service process, unblocking the developers, and eliminating errors from using manual processes that can lead to connectivity issues. To govern IP address assignments, you can make use of automated and simple business rules. With IPAM‘s self-service model, developers can now directly create resources and receive IP addresses based on business rules in seconds, removing the delays in onboarding applications and improving the velocity of the development team. In the screenshot below, I’m referencing my pools to set the address ranges to be used when creating a new VPC.

Assigning address ranges for a new VPC from IPAM pools
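Programmatically, the equivalent is passing an IPAM pool to the VPC creation call, as in this sketch (the pool ID and netmask length are hypothetical):

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

vpc = ec2.create_vpc(
    Ipv4IpamPoolId="ipam-pool-EXAMPLE",  # hypothetical regional pool ID
    Ipv4NetmaskLength=24,                # IPAM allocates a free /24 from the pool
)
print(vpc["Vpc"]["CidrBlock"])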

You can also share your IPAM across your organization, created using AWS Organizations, by using AWS Resource Access Manager (RAM). When you share your IPAM, you gain fully automated CIDR allocation to your Amazon VPCs across member accounts in your organization and Regions.

For network administrators, IPAM provides observability and auditing capabilities, helping to speed up troubleshooting, and providing oversight and monitoring of the used and unused addresses across an organization’s global network address pool using a single dashboard. For each assigned address, IPAM tracks critical information, for example, the AWS account, the VPC, routing, and the security domain, eliminating the bookkeeping work that burdens administrators. Having used IPAM to eliminate IP assignment errors, customers can use IPAM to monitor assigned addresses and receive alerts when potential issues are detected – for example, depleting IP addresses that can stall their network’s growth or overlapping IP addresses that can result in erroneous routing. You can proactively act on those alerts and fix issues before they can become major outages.

The screenshot below illustrates monitoring pool utilization across a set of VPCs.

Monitoring an IPAM pool

Utilization of address space within a pool can also be monitored. You can add Amazon CloudWatch Alarms that you can configure to trigger at your chosen utilization percentage value so that you can take proactive action before the address space is exhausted.

Monitoring pool utilization with alarms

Overlapping address spaces are another headache that network administrators need to manage, usually discovered after the fact during an outage. IPAM can help lower the burden here, too, providing a view of resources that warns of overlapping address ranges.

Detecting overlapping address spaces

To further help troubleshoot network issues and audits of network security and routing policies, network administrators can also take advantage of the current and historical data that IPAM makes available to gain usage insights.

IPAM historical insights

IPAM works with any VPC resource where an IP address needs to be assigned, including public and private addresses and Elastic IP Addresses (EIP), and also supports bring your own IP (BYOIP) for both IPv4 and IPv6 addresses.

Start managing and auditing your IP addresses at scale today
Amazon VPC IP Address Manager is available today in all commercial AWS Regions. Get started today, first creating your IPAM for all Regions and accounts, then creating your pools, and finally setting application policy. Then, you can take advantage of IPAM to automate IP address assignment, and to monitor, troubleshoot, and audit your network address assignments.

For those of you with existing VPCs, after you create IPAM it will start monitoring, without any action on your part, to create an inventory of all your VPCs and EIPs. Once you create pools, IPAM will then backfill your VPCs into the pool. This means you can create VPCs today, using your existing workflow, and use IPAM for monitoring and audit only. Later on, you can switch your workflow to IPAM-based automated VPC assignment.

— Steve

New – Amazon VPC Network Access Analyzer

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-amazon-vpc-network-access-analyzer/

If you are a member of your organization’s networking, cloud operations, or security teams, you are going to love this new feature. The new Amazon VPC Network Access Analyzer helps you identify network configurations that lead to unintended network access. As you will see in a moment, it will point out ways that you can improve your security posture while still letting you and your organization be agile and flexible. In contrast to manual checking of network configurations, which is error prone and hard to scale, this tool lets you analyze your AWS networks of any size and complexity.

Introducing Network Access Analyzer
Network Access Analyzer takes advantage of our automated reasoning technology that already powers AWS IAM Access Analyzer, Amazon VPC Reachability Analyzer, Amazon Inspector Network Reachability, and other provable security tools.

This new tool uses Network Access Scopes to specify the desired connectivity between your AWS resources. You can get started with a set of Amazon-created scopes, and then either copy & customize them, or create your own from scratch. The scopes are high-level and independent of any particular network architecture or configuration, and can be thought of as a language for specifying the proper level of access & connectivity for your network. You can, for example, create a scope to verify that all web apps use a firewall to access Internet resources, or to indicate that AWS resources used by your Finance team are separate, distinct, and unreachable from the resources used by your Development team.

To evaluate your network against a particular scope, you select it and initiate an analysis. It runs for a few minutes and then generates a set of findings, each of which indicates an unexpected network path between the AWS resources defined in the scope. You can analyze the findings, adjust your configuration or modify the scope in response to the findings, and re-run the analysis, all in just a few minutes.

The analysis process examines a very wide range of AWS resources including Security Groups, CIDR blocks, prefix lists, Elastic Network Interfaces, EC2 instances, Load Balancers, VPC, VPC subnets, VPC endpoints, VPC endpoint services, Transit Gateways, NAT Gateways, Internet Gateways, VPN Gateways, Peering Connections, and Network Firewalls. Your scopes can use Resource Groups to reference all resources that are tagged in a particular way.

Using Network Access Analyzer
To get started, I open the VPC Console, find the Network Analysis section on the left-side navigation, and click Network Access Analyzer:

I can see all of my scopes. Initially, I have four, all created by Amazon and ready to use:

To conduct an analysis, I select a scope (AWS-VPC-Ingress (Amazon created)) and click Analyze. The scope’s description reads:

“Identify ingress paths into your VPCs from Internet Gateways, Peering Connections, VPC Service Endpoints, VPN and Transit Gateways.”

The analysis runs for a couple of minutes and displays the findings as soon as it is done:

There’s a lot of very useful information here! The spectrum chart provides an overview of the resources that are in the findings. I can hover my mouse over any of the segments to learn more, or click on one in order to filter the findings and show only those that reference a particular resource or resource type:

For example, I click VPC Peering Connections and I can see all of the findings that reference the VPC peering connection:

As you can see, the Path details highlight the VPC peering connection in the path! The next step is to examine the findings, decide which ones are expected, and to add them to the scope so that they are excluded from future findings (more on that in a bit).

Inside a Network Access Scope
Let’s take a quick look inside of the Network Access Scope that I used above, and then build another scope from scratch using the visual builder. Each scope is represented in JSON format, and indicates what is considered in-scope (acceptable) traffic between sources and destinations:

{
  "networkInsightsAccessScopeId": "nis-070dc1d37ca315e86",
  "matchPaths": [
    {
      "source": {
        "resourceStatement": {
          "resources": [],
          "resourceTypes": [
            "AWS::EC2::InternetGateway",
            "AWS::EC2::VPCPeeringConnection",
            "AWS::EC2::VPCEndpointService",
            "AWS::EC2::TransitGatewayAttachment",
            "AWS::EC2::VPNGateway"
          ]
        }
      },
      "destination": {
        "resourceStatement": {
          "resources": [],
          "resourceTypes": [
            "AWS::EC2::NetworkInterface"
          ]
        }
      }
    }
  ],
  "excludePaths": []
}

The matchPaths element contains source and destination elements. Each of these elements, in turn, identifies AWS resource types and specific resources. While not shown here, scopes can also contain source and destination IP addresses, ports, prefix lists, and traffic types (TCP or UDP). The excludePaths can contain resource types, specific resources, and so forth. I could, for example, define sources and destinations that match all Internet Gateway ingress traffic, but exclude traffic that flows through a Load Balancer, or I could exclude SSH traffic destined for my bastion instances.
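If you work outside the console, a hedged boto3 sketch of creating an equivalent scope and starting an analysis is shown below. It assumes the SDK parameter shapes mirror the JSON above with UpperCamelCase keys; treat it as an illustration rather than definitive API usage.

import boto3

ec2 = boto3.client("ec2")

scope = ec2.create_network_insights_access_scope(
    MatchPaths=[
        {
            "Source": {
                "ResourceStatement": {
                    "ResourceTypes": [
                        "AWS::EC2::InternetGateway",
                        "AWS::EC2::VPCPeeringConnection",
                        "AWS::EC2::VPCEndpointService",
                        "AWS::EC2::TransitGatewayAttachment",
                        "AWS::EC2::VPNGateway",
                    ]
                }
            },
            "Destination": {
                "ResourceStatement": {
                    "ResourceTypes": ["AWS::EC2::NetworkInterface"]
                }
            },
        }
    ],
    ExcludePaths=[],
)["NetworkInsightsAccessScope"]

# Kick off an analysis against the new scope and print its identifier.
analysis = ec2.start_network_insights_access_scope_analysis(
    NetworkInsightsAccessScopeId=scope["NetworkInsightsAccessScopeId"]
)["NetworkInsightsAccessScopeAnalysis"]
print(analysis["NetworkInsightsAccessScopeAnalysisId"])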

Building a Network Access Scope
I can build a new scope in three ways. I can Duplicate and modify an existing one, I can start from scratch and use the visual builder, or I can write my own JSON and use either the CLI or the API to create a scope. I click Create Network Access Scope to use the builder:

I can start with one of five predefined templates, or I can build my own:

I enter a name and a description:

Then I define the source and destinations by resource type, id, traffic type, and so forth:

I have many options for matching the traffic type. This allows me to create scopes for very specific purposes:

I can use a similar interface to add any optional exclusions.

Things to Know
This is a very powerful tool and one that I think you are going to love. Here are a few things to know about it:

Pricing – You pay $0.002 for each Elastic Network Interface (ENI) analyzed as part of an assessment.

Regions – Network Access Analyzer is available in the US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon), Africa (Cape Town), Asia Pacific (Hong Kong), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Milan), Europe (Paris), Europe (Stockholm), South America (São Paulo), and Middle East (Bahrain) Regions.

In the Works – We have lots of additional features on the product roadmap including support for AWS Organizations, the ability to run your analyses on a regular schedule, and support for IPv6 address ranges and resources.

Jeff;

AWS Shield Advanced Update – Automatic Application Layer DDoS Mitigation

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/aws-shield-advanced-update-automatic-application-layer-ddos-mitigation/

In 2016, we launched AWS Shield, a managed Distributed Denial of Service (DDoS) protection service that safeguards applications running on AWS. AWS Shield provides always-on detection and automatic inline mitigations that minimize application downtime and latency without needing to contact AWS Support.

There are two tiers of AWS Shield: Standard and Advanced. All AWS customers benefit from the automatic network layer protections of AWS Shield Standard and at no cost. AWS Shield Standard defends against the most common, frequently occurring network and transport layer (Layer 3 and 4) DDoS attacks to maximize the availability of AWS services.

For customized protection against sophisticated (Layer 3 to 7) threats targeting your applications, you can subscribe to AWS Shield Advanced. AWS Shield Advanced provides more sensitive detection and tailored mitigations against large and complex DDoS attacks, near real-time visibility into attacks, and integration with AWS WAF, a web application firewall for defense against Layer 7 attacks. AWS Shield Advanced also gives you 24-7 access to the AWS Shield Response Team (SRT) and cost protection against scaling costs stemming from DDoS attacks.

AWS Shield Advanced establishes a traffic baseline for each protected resource. Significant deviations from this baseline are flagged as DDoS events and trigger alerts through Amazon CloudWatch. However, mitigating these events still requires manually crafting an AWS WAF rule that isolates the malicious traffic, deploying it through the AWS WAF console or API, and evaluating the rule’s effectiveness. AWS Shield Advanced customers can utilize the SRT to create such AWS WAF rules or rely on their own expertise, but the process is time-consuming, which increases the time it takes to mitigate a DDoS attack and prevent availability impact to applications.

Today, we are announcing Automatic Application Layer DDoS Mitigation for AWS Shield Advanced. This is a new set of capabilities included for all Shield Advanced customers that automatically mitigate malicious web traffic that threatens to impact application availability. This feature automatically creates, tests, and deploys AWS WAF rules to mitigate layer 7 DDoS events on behalf of customers.

Enabling Automatic Application Layer DDoS Mitigation
Visit the AWS Shield console to get started with automatic application layer DDoS mitigation. To get the benefits of Shield Advanced, you must purchase a Shield Advanced subscription, which requires a 1-year commitment.

After you subscribe to AWS Shield Advanced, you specify the resources that you want to protect, configure layer 7 DDoS mitigations, choose whether to engage AWS SRT support, and set up a CloudWatch dashboard to monitor DDoS events. To learn more, see Getting started with AWS Shield Advanced in the AWS documentation.

To enable Shield Advanced automatic application layer DDoS mitigation, select your layer 7 AWS resources (e.g., a CloudFront distribution), and choose Configure protections from the drop-down list.

Next, in Edit protection, choose whether you would like to enable automatic mitigation of layer 7 events and select whether WAF rules should be created in Count or Block mode in Automatic response. Placing WAF rules in Count mode allows you to observe how resource traffic would be affected before deploying them in Block mode. Please note that a WebACL must be associated with a Shield protected resource in order to enable automatic layer 7 mitigation.

Mitigation actions can be changed to count or block mode at any time. Navigate to the Events tab of the console to view detected DDoS events, and select a detected event to see detection, mitigation, and top contributor metrics.
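For teams that automate their protections, here is a minimal sketch of enabling automatic mitigation and later switching its action with boto3, assuming the Shield enable/update application layer automatic response APIs; the CloudFront distribution ARN is a placeholder.

import boto3

shield = boto3.client("shield")

DISTRIBUTION_ARN = "arn:aws:cloudfront::123456789012:distribution/EXAMPLE"  # placeholder

# Start in Count mode to observe the effect of the generated WAF rules...
shield.enable_application_layer_automatic_response(
    ResourceArn=DISTRIBUTION_ARN,
    Action={"Count": {}},
)

# ...then switch to Block mode once you are confident in the rules.
shield.update_application_layer_automatic_response(
    ResourceArn=DISTRIBUTION_ARN,
    Action={"Block": {}},
)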

How to Mitigate Application Layer DDoS Automatically
When you want to protect layer 7 resources, such as CloudFront distributions, AWS Shield Advanced will establish a 30-day traffic baseline for each protected resource.

A Shield-managed rule group is created only when automatic mitigation is enabled; in that rule group, AWS Shield Advanced will create AWS WAF rules in response to DDoS events.

Traffic that significantly deviates from the established baseline will be flagged as a potential DDoS event. After an event is detected, Shield Advanced will attempt to identify a signature based on offending request patterns. If a signature is identified, WAF rules will be created to mitigate traffic with that signature.

Once rules are confirmed to be safe, they will be added to the Shield-managed rule group, and customers can choose whether the rules are deployed in count or block mode. Customers can also create CloudWatch alerts based on when requests are being blocked or counted.

Customers can change the action that automatic mitigation takes (count or block) or disable it entirely at any time. Shield Advanced will automatically remove AWS WAF rules after it has determined that an event has fully subsided. To learn more, see Shield Advanced automatic application layer DDoS mitigation in the AWS Shield Developer Guide.

Available Now
Automatic Application Layer DDoS Mitigation is now available in all AWS Regions where AWS Shield Advanced is available, and it can be enabled at no additional cost.

You can send feedback to the AWS forum for AWS Shield or through your usual AWS Support contacts.

Channy

New AWS Scholarship Program Helps Underrepresented and Underserved Students Prep for Careers in AI and ML

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/new-aws-scholarship-program-helps-underrepresented-and-underserved-students-prep-for-careers-in-ai-and-ml/

As a woman working in information technology (IT) for many years, it has always been close to my heart to challenge long-standing gender stereotypes and inspire more young learners to consider a career in tech. With artificial intelligence (AI) and machine learning (ML) defining the future of technology, this future also depends on diverse representation.

The World Economic Forum estimates that technological advances and automation will create 97 million new technology jobs by 2025, including in the field of AI and ML. Yet, according to their research, women make up just 32% of AI jobs globally. The Pew Research Center found that Black and Hispanic workers in the U.S. comprise just 9% and 8% of workers in the science, technology, engineering, and mathematics (STEM) careers respectively.

At Amazon, we believe that technology should be built in a way that’s inclusive, diverse, and equitable. We already offer STEM programs to enable underrepresented groups to build or advance their technical skills and open up new career possibilities. Programs such as We Power Tech, Amazon Future Engineer, AWS Girls’ Tech Day, and AWS GetIT aim to build a future of tech that is inclusive, diverse, and accessible.

Today, I am excited to announce the launch of the AWS AI & ML Scholarship program in collaboration with Intel and Udacity, designed to prepare underrepresented and underserved students globally for careers in ML.

AWS AI & ML Scholarship Program Debuts with AWS DeepRacer Student
The AWS AI & ML Scholarship program is launching as part of the all-new AWS DeepRacer Student service and Student League. This is a new student division of the popular AWS DeepRacer program, a cloud-based 3D car racing simulator that provides a fun way to learn about ML and reinforcement learning (RL). Through DeepRacer Student, you have access to free online trainings to learn the ML and RL basics. You also have access to 10 hours of model training and 5 GB of storage per month to participate in the DeepRacer Student League, a global autonomous racing competition exclusively for AWS AI & ML students.

As part of DeepRacer Student, the AWS AI & ML Scholarship program in collaboration with Intel and Udacity is geared toward underserved and underrepresented high school and college students globally who are at least 16 years old. These are students who may have faced financial barriers growing up or are part of underrepresented groups, including women, people with disabilities, LGBTQ+ persons, as well as people of color. Students in these groups who take part in the DeepRacer Student League are eligible for a chance to win one or two of 2,500 annual scholarships from Udacity, an online learning platform focused on technology skills.

How to Qualify for the AWS AI & ML Scholarship
AWS DeepRacer Student League Logo

To be considered for a scholarship, you must successfully finish all AWS DeepRacer Student learning modules and achieve a score of at least 80% on all course quizzes, reach a certain lap time performance with your DeepRacer car in the Student League, and submit an essay.

Each year, 2,000 students will win a scholarship to the AI Programming with Python Udacity Nanodegree program. Udacity Nanodegrees are massive open online courses (MOOCs) designed to bridge the gap between learning and career goals. The program aims to equip students with programming and ML fundamentals to solve real-world problems with ML. The top 500 participants in this first Nanodegree will be eligible to join a second customized Nanodegree program curated specifically for AWS AI & ML Scholarship program recipients.

Coaching and Mentorship Opportunities for Scholarship Recipients
Scholarship recipients not only get access to the educational content, but also receive up to 85 hours of support from Udacity instructors to make sure that students successfully learn and progress through the Nanodegree. Instructors and students meet weekly in small groups, review the content for the week, answer questions, and work on a case study as a group exercise.

The top 500 participants who advance into the second Nanodegree will receive 12 months of mentorship from tenured technology leaders at Amazon and Intel to help prepare for a tech career.

All scholarship recipients will be given exclusive access to Ask-Me-Anythings (AMAs), fireside chats, and office hours with AI/ML professionals and diversity experts from Amazon, Intel, and AWS collaborators such as Girls in Tech. These will help familiarize students with different job functions in the AI/ML field.

Enroll via AWS DeepRacer Student
To enroll in the AWS AI & ML Scholarship program, first sign up at the AWS DeepRacer Student service with a valid email address. Note that this student player account is separate from an AWS account and doesn’t require you to provide any billing or credit card information.

AWS DeepRacer Student Sign Up

You will receive a verification code via email. Once you have verified your email address, log in to the DeepRacer Student service and complete the sign-up process.

AWS DeepRacer Student Complete Sign Up

The form will ask for some additional information, including your school, major, and planned year of graduation. You must also self-certify that you are a student enrolled in either high school, university, or community college.

AWS DeepRacer Student Complete Sign Up with Personal Information

Next, enroll in the AWS AI & ML Scholarship program by selecting the corresponding checkbox. The scholarship qualification will start on March 1, 2022.

AWS DeepRacer Student Opt In To Scholarship

Welcome to AWS DeepRacer Student! Start learning the fundamentals of ML through the provided online trainings. You can also participate in the pre-season DeepRacer Student League before the qualifying races start on March 1, 2022.

AWS DeepRacer Student Dashboard

Available Now
Start preparing for the AWS AI & ML Scholarship program by signing up for AWS DeepRacer Student, available globally today, and opt in to the scholarship program. Start learning with learning modules, train your first AWS DeepRacer model, and see how it performs in AWS DeepRacer Student League Pre-season happening now. Join the AWS Slack community to connect with experts and ask questions.

Sign up for AWS DeepRacer Student and the AWS AI & ML Scholarship program today.

Antje

Now in Preview – Amazon SageMaker Studio Lab, a Free Service to Learn and Experiment with ML

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/now-in-preview-amazon-sagemaker-studio-lab-a-free-service-to-learn-and-experiment-with-ml/

Our mission at AWS is to make machine learning (ML) more accessible. Through many conversations over the past years, I learned about barriers that many ML beginners face. Existing ML environments are often too complex for beginners, or too limited to support modern ML experimentation. Beginners want to quickly start learning and not worry about spinning up infrastructure, configuring services, or implementing billing alarms to avoid going over budget. This emphasizes another barrier for many people: the need to provide billing and credit card information at sign-up.

What if you could have a predictable and controlled environment for hosting your Jupyter notebooks in which you can’t accidentally run up a big bill? One that doesn’t require billing and credit card information at all at sign-up?

Today, I am extremely happy to announce the public preview of Amazon SageMaker Studio Lab, a free service that enables anyone to learn and experiment with ML without needing an AWS account, credit card, or cloud configuration knowledge.

At AWS, we believe technology has the power to solve the world’s most pressing issues. And, we proudly support the new and innovative ways that our customers are using these technologies to deliver social impacts.

This is why I am also excited to announce the launch of the AWS Disaster Response Hackathon using Amazon SageMaker Studio Lab. The hackathon, starting today and running through February 7, 2022, is a great way to start learning ML while doing good in the world. I will share more details on how to get involved at the end of the post.

Getting Started with Amazon SageMaker Studio Lab
Studio Lab is based on open-source JupyterLab and gives you free access to AWS compute resources to quickly start learning and experimenting with ML. Studio Lab is also simple to set up. In fact, the only configuration you have to do is one click to choose whether you need a CPU or GPU instance for your project. Let me show you.

The first step is to request a free Studio Lab account here.

Amazon SageMaker Studio Lab

When your account request is approved, you will receive an email with a link to the Studio Lab account registration page. You can now create your account with your approved email address and set a password and your username. This account is separate from an AWS account and doesn’t require you to provide any billing information.

Amazon SageMaker Studio Lab - Create Account

Once you have created your account and verified your email address, you can sign in to Studio Lab. Now, you can select the compute type for your project. You can choose between 12 hours of CPU or 4 hours of GPU per user session, with an unlimited number of user sessions available to you. Furthermore, you get a minimum of 15 GB of persistent storage per project. When your session expires, Studio Lab will take a snapshot of your environment. This enables you to pick up right where you left off. Let’s select CPU for this demo, and choose Start runtime.

Amazon SageMaker Studio Lab - Select Compute

Once the instance is running, select Open project to go to your free Studio Lab environment and start building. No additional configuration is required.

Amazon SageMaker Studio Lab - Open Project

Amazon SageMaker Studio Lab Environment

Customize your environment
Studio Lab comes with a Python base image to get you started. The image only has a few libraries pre-installed to save the available space for the frameworks and libraries that you actually need.

Amazon SageMaker Studio Lab - Select Kernel

You can customize the Conda environment and install additional packages using the %conda install <package> or %pip install <package> command right from within your notebook. You can also create entirely new, custom Conda environments, or install open-source JupyterLab and Jupyter Server extensions. For detailed instructions, see the Studio Lab documentation.

GitHub integration
Studio Lab is tightly integrated with GitHub and offers full support for the Git command line. This lets you easily clone, copy, and save your projects. Moreover, you can add an Open in Studio Lab badge to the README.md file or notebooks in your public GitHub repo to share your work with others.

Open in Amazon SageMaker Studio Lab Badge

This will let everyone open and view the notebook in Studio Lab. If they have a Studio Lab account, then they can also run the notebook. Add the following markdown to the top of your README.md file or notebook to add the Open in Studio Lab badge:

[![Open In Studio Lab](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/org/repo/blob/master/path/to/notebook.ipynb)

Replace org, repo, path and the notebook filename with those for your repo. Then, when you click the Open in Studio Lab badge, it will preview the notebook in Studio Lab. If your repo is private within a GitHub account or organization and you would like other people to use it, then you must additionally install the Amazon SageMaker GitHub App at the GitHub account or organization level.

Amazon SageMaker Studio Lab Notebook Preview

If you have a Studio Lab account, you can click Copy to project and choose to either copy just the notebook or to clone the entire repo into your Studio Lab account. Moreover, Studio Lab can check if the repository contains a Conda environment file and build the custom Conda environment for you.

Learn the Fundamentals of ML
If you are new to ML, then Studio Lab provides access to free, educational content to get you started. Dive into Deep Learning (D2L) is a free interactive book that teaches the ideas, the math, and the code behind ML and DL. The AWS Machine Learning University (MLU) gives you access to the same ML courses used to train Amazon’s own developers on ML. Hugging Face is a large open-source community and a hub for pre-trained deep learning (DL) models, mainly aimed at natural language processing. In just a few clicks, you can import the relevant notebooks from D2L, MLU, and Hugging Face into your Studio Lab environment.

Join the AWS Disaster Response Hackathon using Amazon SageMaker Studio Lab
The frequency and severity of natural disasters are increasing. This year alone, we have seen significant wildfires across the Western United States and in countries like Greece and Turkey; major floods across Europe; and Hurricane Ida’s impact to the coast of Louisiana. In response, governments, businesses, nonprofits, and international organizations are placing more emphasis on disaster preparedness and response than ever before.

AWS Disaster Response Hackathon

Through the AWS Disaster Response Hackathon, offering a total of $54,000 USD in prizes, we hope to stimulate ways of applying ML to solve pressing challenges in natural disaster preparedness and response.

Join the hackathon today, start building, and don’t forget to submit your project before February 7, 2022. This hackathon is also an attempt to set the Guinness World Record for the “largest machine learning competition.”

Join the Preview
You can request a free Amazon SageMaker Studio Lab account starting today. The number of new account registrations will be limited to ensure a high quality of experience for all customers. You can find sample notebooks in the Studio Lab GitHub repository. Give it a try and let us know your feedback.

Request a free Amazon SageMaker Studio Lab account.

Antje

Announcing Amazon SageMaker Inference Recommender

Post Syndicated from Sean M. Tracey original https://aws.amazon.com/blogs/aws/announcing-amazon-sagemaker-inference-recommender/

Today, we’re pleased to announce Amazon SageMaker Inference Recommender — a brand-new Amazon SageMaker Studio capability that automates load testing and optimizes model performance across machine learning (ML) instances. Ultimately, it reduces the time it takes to get ML models from development to production and optimizes the costs associated with their operation.

SageMaker Inference Recommender Banner Image

Until now, no service has provided MLOps Engineers with a means to pick the optimal ML instances for their model. To optimize costs and maximize instance utilization, MLOps Engineers would have to use their experience and intuition to select an ML instance type that would serve them and their model well, given the requirements to run them. Moreover, given the vast array of ML instances available, and the practically infinite nuances of each model, choosing the right instance type could take more than a few attempts to get it right. SageMaker Inference Recommender now lets MLOps Engineers get recommendations for the best available instance type to run their model. Once an instance has been selected, their model can be instantly deployed to the selected instance type with only a few clicks. Gone are the days of writing custom scripts to run performance benchmarks and load tests.

For MLOps Engineers who want to get data on how their model will perform ahead of pushing to a production environment, SageMaker Inference Recommender also lets them run a load test against their model in a simulated environment. Ahead of deployment, they can specify parameters, such as required throughput, sample payloads, and latency constraints, and test their model against these constraints on a selected set of instances. This lets MLOps Engineers gather data on how well their model will perform in the real world, thereby enabling them to feel confident in pushing it to production—or highlighting potential issues that must be addressed before putting it out into the world.

SageMaker Inference Recommender has even more tricks up its sleeve to make the lives of MLOps Engineers easier and make sure that their models continue to operate optimally. MLOps Engineers can use SageMaker Inference Recommender benchmarking features to perform custom load tests that estimate model performance when accessed under load in a production environment given certain requirements. Results from these tests can be viewed in SageMaker Studio or retrieved with the AWS SDK or AWS CLI, giving MLOps Engineers an overview of model performance, comparisons of numerous configurations, and the ability to share the results with any stakeholders.
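As a hedged illustration of the programmatic path, the sketch below starts a default recommendation job with boto3 and checks its status; the job name, role, and model package ARN are placeholders.

import boto3

sm = boto3.client("sagemaker")

sm.create_inference_recommendations_job(
    JobName="demo-inference-recommender-job",
    JobType="Default",  # "Advanced" runs a custom load test instead
    RoleArn="arn:aws:iam::123456789012:role/SageMakerExecutionRole",
    InputConfig={
        "ModelPackageVersionArn": (
            "arn:aws:sagemaker:us-east-1:123456789012:"
            "model-package/demo-package/1"
        )
    },
)

# Poll the job until the recommendations are ready.
status = sm.describe_inference_recommendations_job(
    JobName="demo-inference-recommender-job"
)["Status"]
print(status)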

Find Out More
MLOps Engineers can get started with Amazon SageMaker Inference Recommender through Amazon SageMaker Studio, the AWS SDKs, and the AWS CLI. Amazon SageMaker Inference Recommender is available in all AWS commercial Regions where SageMaker is available (except for the Asia Pacific (Osaka) Region). To find out more information, you can visit the Amazon SageMaker Inference Recommender landing page.

To get started, see the SageMaker Inference Recommender documentation.

New – Introducing SageMaker Training Compiler

Post Syndicated from Sean M. Tracey original https://aws.amazon.com/blogs/aws/new-introducing-sagemaker-training-compiler/

An image explaining the benefits of using Amazon SageMaker Training Compiler

Today, we’re pleased to announce Amazon SageMaker Training Compiler, a new Amazon SageMaker capability that can accelerate the training of deep learning (DL) models by up to 50%.

As DL models grow in complexity, so too does the time it can take to optimize and train them. For example, it can take 25,000 GPU-hours to train popular natural language processing (NLP) model “RoBERTa”. Although there are techniques and optimizations that customers can apply to reduce the time it can take to train a model, these also take time to implement and require a rare skillset. This can impede innovation and progress in the wider adoption of artificial intelligence (AI).

How has this been done to date?
Typically, there are three ways to speed up training:

  1. Using more powerful, individual machines to process the calculations
  2. Distributing compute across a cluster of GPU instances to train the model in parallel
  3. Optimizing model code to run more efficiently on GPUs by utilizing less memory and compute.

In practice, optimizing machine learning (ML) code is difficult, time-consuming, and a rare skill set to acquire. Data scientists typically write their training code in a Python-based ML framework, such as TensorFlow or PyTorch, relying on ML frameworks to convert their Python code into mathematical functions that can run on GPUs, commonly known as kernels. However, this translation from the Python code of a user is often inefficient because ML frameworks use pre-built, generic GPU kernels, instead of creating kernels specific to the code and model of the user.

It can take even the most skilled GPU programmers months to create custom kernels for each new model and optimize them. We built SageMaker Training Compiler to solve this problem.

Today’s launch lets SageMaker Training Compiler automatically compile your Python training code and generate GPU kernels specifically for your model. Consequently, the training code will use less memory and compute, and therefore train faster. For example, when fine-tuning Hugging Face’s GPT-2 model, SageMaker Training Compiler reduced training time from nearly 3 hours to 90 minutes.

Automatically Optimizing Deep Learning Models
So, how have we achieved this acceleration? SageMaker Training Compiler accelerates training jobs by converting DL models from their high-level language representation to hardware-optimized instructions that train faster than jobs with off-the-shelf frameworks. Under the hood, SageMaker Training Compiler makes incremental optimizations beyond what the native PyTorch and TensorFlow frameworks offer to maximize compute and memory utilization on SageMaker GPU instances.

More specifically, SageMaker Training Compiler uses graph-level optimization (operator fusion, memory planning, and algebraic simplification), data flow-level optimizations (layout transformation, common sub-expression elimination), and back-end optimizations (memory latency hiding, loop oriented optimizations) to produce an optimized model that efficiently uses hardware resources. As a result, training is accelerated by up to 50%, and the returned model is the same as if SageMaker Training Compiler had not been used.

But how do you use SageMaker Training Compiler with your models? It can be as simple as adding two lines of code!

SageMaker Training Compiler Code Changes
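To make the “two lines” concrete, here is a sketch using the SageMaker Python SDK’s Hugging Face estimator: import TrainingCompilerConfig and pass it as compiler_config. The training script, instance type, role, and framework versions are illustrative assumptions, not a prescribed configuration.

from sagemaker.huggingface import HuggingFace, TrainingCompilerConfig

# Hypothetical fine-tuning job; train.py and the role ARN are placeholders.
estimator = HuggingFace(
    entry_point="train.py",
    instance_type="ml.p3.2xlarge",
    instance_count=1,
    role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",
    transformers_version="4.11.0",
    pytorch_version="1.9.0",
    py_version="py38",
    compiler_config=TrainingCompilerConfig(),  # enables SageMaker Training Compiler
)
estimator.fit()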

The shortened training times mean that customers gain more time for innovating and deploying their newly-trained models at a reduced cost and a greater ability to experiment with larger models and more data.

Getting the most from SageMaker Training Compiler
Although many DL models can benefit from SageMaker Training Compiler, larger models with longer training will realize the greatest time and cost savings. For example, training time and costs fell by 30% on a long-running RoBERTa-base fine-tuning exercise.

Jorge Lopez Grisman, a Senior Data Scientist at Quantum Health – an organization on a mission to “make healthcare navigation smarter, simpler, and more cost-effective for everyone” – said:

“Iterating with NLP models can be a challenge because of their size: long training times bog down workflows and high costs can discourage our team from trying larger models that might offer better performance. Amazon SageMaker Training Compiler is exciting because it has the potential to alleviate these frictions. Achieving a speedup with SageMaker Training Compiler is a real win for our team that will make us more agile and innovative moving forward.”

Further Resources
To learn more about how Amazon SageMaker Training Compiler can benefit you, you can visit our page here. And to get started see our technical documentation here.

New – Create and Manage EMR Clusters and Spark Jobs with Amazon SageMaker Studio

Post Syndicated from Sean M. Tracey original https://aws.amazon.com/blogs/aws/new-create-and-manage-emr-clusters-and-spark-jobs-with-amazon-sagemaker-studio/

Today, we’re very excited to offer three new enhancements to our Amazon SageMaker Studio service.

As of now, users of SageMaker Studio can create, terminate, manage, discover, and connect to Amazon EMR clusters running within a single AWS account and in shared accounts across an organization—all directly from SageMaker Studio. Furthermore, SageMaker Studio Notebook users are able to use the Spark UI to monitor and debug Spark jobs running on an Amazon EMR cluster—directly from the SageMaker Studio Notebooks!

The story so far…
Before today, SageMaker Studio users had some ability to find and connect with EMR clusters, provided that they were running in the same account as SageMaker Studio. While useful in many circumstances, if a cluster did not exist that would suit the requirements of the model or analysis being run, then data scientists would have to leave their development environment and manually configure a cluster that suited their needs. As well as being disruptive to the workflow of data scientists, there are no guarantees that the data scientists would have either the permissions or depth of knowledge required to provision a cluster that would enable them to continue with their work. Additionally, being restricted to creating and managing clusters in a single account could become prohibitive in organizations working across many AWS accounts.

What’s new?
Data scientists can:

  • Discover, manage, create, terminate, and connect to Amazon EMR clusters from within SageMaker Studio
  • Utilize “templates” – a new way to configure and provision clusters for your workload needs with support from seasoned DevOps practitioners
  • Connect to, debug, and monitor Spark jobs running on an Amazon EMR cluster from within a SageMaker Studio Notebook

Creating, Connecting to, and Managing EMR Clusters

Connecting to an EMR Cluster from a SageMaker Studio Notebook

With the ability to connect to and manage EMR clusters from within SageMaker Studio, data scientists no longer have to leave their familiar environment to create, configure and provision the EMR clusters where they run their workloads.

Introducing Templates
A template is a collection of off-the-shelf cluster configurations optimized for numerous workloads. Templates can be created and managed by DevOps administrators and made available through the AWS Service Catalog to data scientists within SageMaker Studio. This lets them quickly spin up a cluster to meet their needs, all while safe in the knowledge that a trusted DevOps admin has correctly configured a cluster per the project’s requirements. Furthermore, this lets data scientists get on with the work they do best, and it gives DevOps administrators within these teams greater ability to manage the types of provisioned infrastructure.

Managing EMR Clusters from within SageMaker Studio Notebooks

Directly Connect to and monitor Spark Jobs
Finally, to make the job of data scientists even simpler, we’ve built the ability to connect to, debug, and monitor Spark jobs running on an Amazon EMR cluster from within a SageMaker Studio Notebook. Before now, to access the monitoring UI of a Spark job, one needed to configure secure tunnels and web proxies to gain direct access to currently executing jobs, adding friction to the workflow of a data scientist trying to observe and debug their workloads. Now, with these new features, users have one-click access directly from the interface that they already know. This lets them build and run their workloads, rather than spending time configuring infrastructure.

Connecting to a Spark Job from within a SageMaker Studio Notebook

These new features let data scientists use a simple, consistent UI to provision and manage infrastructure as needed without ever having to leave SageMaker Studio or dive into the minutiae of provisioning such hardware. Moreover, they won’t have to spend time configuring proxies and SSH tunnels to debug and monitor ongoing Spark jobs.

Find out more
These features are generally available in all AWS Regions where SageMaker Studio is available, and there are no additional charges to use this capability. For complete information on pricing and regional availability, please refer to the SageMaker Studio pricing page.

To learn more, see our documentation.

Announcing Amazon SageMaker Ground Truth Plus

Post Syndicated from Sean M. Tracey original https://aws.amazon.com/blogs/aws/announcing-amazon-sagemaker-ground-truth-plus/

Today, we’re pleased to announce the latest service in the Amazon SageMaker suite that will make labeling datasets easier than ever before. Ground Truth Plus is a turn-key service that uses an expert workforce to deliver high-quality training datasets fast, and reduces costs by up to 40 percent.

The Challenges of Machine Learning Model Creation
One of the biggest challenges in building and training machine learning (ML) models is sourcing enough high-quality, labeled data at scale to feed into and train those models so that they can make an accurate prediction.

On the face of it, labeling data might seem like a fairly straightforward task…

  • Step 1: Get data
  • Step 2: Label it

…but this is far from the reality.

Even before you have labelers begin annotations, you need a custom labeling workflow and user interface specific to your project so that you get a high-quality dataset. This relies on a combination of robust tooling and skilled workers, and the effort spent can be significant.

Once the data labeling workflow and user interface have been constructed, a workforce to use those systems must be organized and trained – and this is all before a single data point has been labeled!

Finally, once the labeling systems have been built, the workflows designed, and the workforce trained and deployed, the process of passing data through that system must be monitored and checked to ensure a consistent, high-quality output. After enough data has been passed through and labeled by the system, you have arrived at the point you’ve been trying to get to all along: you finally have enough data to train the ML model.

Each of these steps represents a significant investment in time, costs, and energy. You could be spending these resources building ML models instead of labeling and managing data, and using Ground Truth Plus can help free you up to do just that.

Introducing Amazon SageMaker Ground Truth Plus
Amazon SageMaker Ground Truth Plus enables you to easily create high-quality training datasets without having to build labeling applications and manage the labeling workforce on your own, which means that you don’t even need deep ML expertise or extensive knowledge of workflow design and quality management. You simply provide data along with labeling requirements, and Ground Truth Plus sets up the data labeling workflows and manages them on your behalf in accordance with your requirements.

For example, if you need medical experts to label radiology images, you can specify that in the guidelines you provide to Ground Truth Plus. The service will then automatically select labelers trained in radiology to label your data, and from there an expert workforce that is trained on a variety of ML tasks will start labeling the data. Ground Truth Plus brings ML-powered automation to data labeling, which increases the quality of the output dataset and decreases the data labeling costs.

Amazon SageMaker Ground Truth Plus uses a multi-step labeling workflow including ML techniques for active learning, pre-labeling, and machine validation. This reduces the time required to label datasets for a variety of use cases including computer vision and natural language processing. Finally, Ground Truth Plus provides transparency into data labeling operations and quality management through interactive dashboards and user interfaces. This lets you monitor the progress of training datasets across multiple projects, track project metrics such as daily throughput, inspect labels for quality, and provide feedback on the labeled data.

How Does It Work?
A SageMaker Ground Truth Plus screenshot showing the request form

First, let’s head to the new Ground Truth Plus console and fill out a form outlining the requirements for the data labeling project. Following that, our team of AWS Experts will schedule a call to discuss your data labeling project.

After the call, you simply upload your data to an Amazon Simple Storage Service (Amazon S3) bucket for labeling.
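For example, a minimal boto3 sketch for that upload could look like the following; the file name, bucket, and prefix are placeholders for the bucket you share with the Ground Truth Plus team.

import boto3

s3 = boto3.client("s3")

# Placeholders: the local file, the bucket shared with Ground Truth Plus, and the prefix.
s3.upload_file(
    Filename="street_scene_0001.jpg",
    Bucket="my-labeling-input-bucket",
    Key="raw-images/street_scene_0001.jpg",
)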

Once the data has been uploaded, our experts will set up the data labeling workflow per your requirements and create a team of labelers with the expertise necessary to label your data effectively. This helps make sure that you have the best people possible working on your projects.

These expert labelers use the Ground Truth Plus tools we’ve built to label these datasets quickly and effectively.

Initially, labelers will annotate the data you’ve uploaded, much like the following example image that we’ve uploaded from the CBCL StreetScenes dataset. However, as the labelers start to submit examples of labeled data, something cool begins happening: our ML systems kick in and start to pre-label the images on behalf of the expert workforce!

An example of the raw dataset used to demonstrate Amazon SageMaker Ground Truth Plus functionality

As more and more data is labeled by the expert workforce, the ML model becomes better at pre-labeling those images. This means that there’s less need for a human to spend as much time creating each individual label for every object of interest in a dataset. Less time spent on labeling means lower costs for you, and it also means a quicker turnaround in creating a dataset that can be used for training a model – all without sacrificing quality.

A screenshot showing one of the labelling interfaces for SageMaker Ground Truth Plus

As the process continues, these ML models will also start to highlight potential areas of interest that the labeling workforce may have missed or incorrectly labeled through machine validation (indicated below by the purple box). Once an area of interest has been highlighted, a human labeler can view and either confirm or delete the suggestion that the model has made. This iteratively improves the pre-labeling and machine validation stages, further reducing the time needed by a labeler to manually label the data, and ensures a high-quality output throughout the process.

A screenshot showing one of the labeling interfaces filled in by a machine learning model for SageMaker Ground Truth Plus

While this is all going on, you can monitor the progress and output of the project using the Ground Truth Plus Project Portal. Within this portal, you can track the amount of data labeled on a day-by-day basis, and make sure that the project is progressing at an acceptable rate.

A screenshot showing the metrics dashboard enabling users to track the progress of their labelling jobs in SageMaker Ground Truth Plus

With each batch of images uploaded and labeled, you can decide whether to accept them or send them back for relabeling if something has been missed.

Finally, when the labeling process has completed, you can retrieve the labeled data from a secure S3 bucket and get to the business of training models.

Find out more
Today, Amazon SageMaker Ground Truth Plus is available in the US East (N. Virginia) Region.

To learn more:

New DynamoDB Table Class – Save Up To 60% in Your DynamoDB Costs

Post Syndicated from Marcia Villalba original https://aws.amazon.com/blogs/aws/new-dynamodb-table-class-save-up-to-60-in-your-dynamodb-costs/

Today we are announcing Amazon DynamoDB Standard-Infrequent Access (DynamoDB Standard-IA), a new table class for DynamoDB that reduces storage costs by 60 percent compared to existing DynamoDB Standard tables and delivers the same performance, durability, and scaling.

Nowadays, many customers are moving their infrequently accessed data between DynamoDB and Amazon Simple Storage Service (Amazon S3). This means that customers are developing a process to migrate the data and build complex applications that must support two different APIs—one for DynamoDB and another for Amazon S3.

DynamoDB Standard-IA table class is designed for customers who want a cost-optimized solution for storing infrequently accessed data in DynamoDB without changing any application code. Using this new table class, you get the single-digit millisecond read and write performance from DynamoDB and use all of the same APIs.

When you use DynamoDB Standard-IA table class, you will save up to 60 percent in storage costs as compared to using the DynamoDB Standard table class. However, DynamoDB reads and writes for this new table class are priced higher than the Standard tables. Therefore, it is important to understand your use cases before applying this new table class to your tables.

DynamoDB Standard-IA is a great solution if you must store terabytes of data for several years where the data must be highly available, but it is not frequently accessed. An example is social media applications where end users rarely access their old posts. However, these posts remain stored, because if someone scrolls on a profile to see an old photo from 2009, they should be able to retrieve it as fast as if it was a newer post.

E-commerce sites are another good use case. These sites might have a lot of products that are not frequently accessed, but administrators of the site still want to have them available in their store just in case someone wants to buy them. Furthermore, this is a good solution for storing a customer’s previous orders. DynamoDB Standard-IA table offers the ability to retain historical orders at a lower cost.

Get started using DynamoDB Standard-IA
Get started using DynamoDB Standard-IA by evaluating the best class for your existing tables.

Go to the table page and select Update the table class in the Actions dropdown to change the table class. Then, choose the new table class and save the changes. You can change the table class for an existing table to Standard-IA or Standard twice every 30 days with no impact on performance or availability. All of the features of DynamoDB are available when using a table in the Standard-IA table class.

Moreover, you can also create a new table with the DynamoDB Standard-IA table class.

Update table class
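If you prefer to script the change, here is a minimal sketch using boto3; the table names and key schema are placeholders, and the same twice-every-30-days limit described above applies.

import boto3

dynamodb = boto3.client("dynamodb")

# Switch an existing table to the Standard-IA table class (table name is a placeholder).
dynamodb.update_table(
    TableName="my-orders-archive",
    TableClass="STANDARD_INFREQUENT_ACCESS",
)

# Or create a new table directly in the Standard-IA table class
# (the key schema here is a placeholder).
dynamodb.create_table(
    TableName="my-new-archive",
    AttributeDefinitions=[{"AttributeName": "pk", "AttributeType": "S"}],
    KeySchema=[{"AttributeName": "pk", "KeyType": "HASH"}],
    BillingMode="PAY_PER_REQUEST",
    TableClass="STANDARD_INFREQUENT_ACCESS",
)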

Availability and Pricing
DynamoDB Standard-IA is available in all of the AWS Regions, except the China Regions and AWS GovCloud.

For example, DynamoDB Standard-IA storage pricing in US East (N. Virginia) is now $0.10 per GB (60 percent less than DynamoDB Standard), while reads and writes are 25 percent higher.

For more information about this feature and its pricing, see the DynamoDB Standard-IA Feature page and the DynamoDB pricing page.

Marcia

New – Amazon RDS Custom for SQL Server Is Generally Available

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/new-amazon-rds-custom-for-sql-server-is-generally-available/

On October 26, 2021, we launched Amazon RDS Custom for Oracle, a managed database service for applications that require customization of the underlying operating system and database environment. RDS Custom lets you access and customize your database server host and operating system, for example, by applying special patches and changing the database software settings to support third-party applications that require privileged access.

Today, I am happy to announce the general availability of Amazon RDS Custom for SQL Server to support applications that have dependencies on specific configurations and third-party applications that require customizations in corporate, e-commerce, and content management systems, such as Microsoft SharePoint.

With RDS Custom for SQL Server, you can enable features that require elevated privileges like SQL Common Language Runtime (CLR), install specific drivers to enable heterogeneous linked servers, or have more than 100 databases per instance.

Through the time-saving benefits of a managed service, RDS Custom for SQL Server frees you up to focus on more business-impacting, strategic activities. Automated backups and other operational tasks let you rest easy, knowing that your data is safe and ready to be recovered if needed.

Getting Started with RDS Custom for SQL Server
Get started by creating a DB instance of RDS Custom for SQL Server from an orderable engine version offered by RDS Custom. You can optionally access the server host to customize your software via AWS Systems Manager or a remote desktop client. Your application connects to the RDS Custom DB instance endpoint.

Before creating and connecting your custom DB instance for SQL Server, make sure that you meet some prerequisites, such as configuring the AWS Identity and Access Management (IAM) role and Amazon Virtual Private Cloud (Amazon VPC).

Choose Create database in the Databases menu to create your custom DB instance for SQL Server in the RDS Console. When you choose a database creation method, select Standard create. You can set Engine options to Microsoft SQL Server and choose Amazon RDS Custom in the database management type.

For Edition, choose the DB engine edition that you want to use from Enterprise, Standard, and Web, with the default Version of SQL Server 2019.

For Settings, enter your favorite unique name for the DB instance identifier and your master username and password. By default, the new instance uses an automatically generated password for the master user.

In DB instance size, choose a DB instance class optimized for each DB engine edition.

SQL Server edition    RDS Custom support
Enterprise Edition    db.r5.xlarge – db.r5.24xlarge, db.m5.xlarge – db.m5.24xlarge
Standard Edition      db.r5.large – db.r5.24xlarge, db.m5.large – db.m5.24xlarge
Web Edition           db.r5.large – db.r5.4xlarge, db.m5.large – db.m5.4xlarge

See Settings for DB instances in the Amazon RDS User Guide to learn more about the remaining settings. Choose Create database. After creating the DB instance, the details for the new RDS Custom DB instance appear on the RDS console.

Alternatively, you can create an RDS Custom DB instance by using the create-db-instance command in the AWS Command Line Interface (AWS CLI).

$ aws rds create-db-instance \
	--engine custom-sqlserver-se \
	--engine-version 15.00.4073.23.v1 \
	--db-instance-identifier channy-custom-db \
	--db-instance-class db.m5.xlarge \
	--allocated-storage 20 \
	--db-subnet-group mydbsubnetgroup \
	--master-username myuser \
	--master-user-password mypassword \
	--backup-retention-period 3 \
	--no-multi-az \
	--port 8200 \
	--kms-key-id mykmskey \
	--custom-iam-instance-profile AWSRDSCustomInstanceProfile

After you create your RDS Custom DB instance, you can connect to it using AWS Systems Manager Session Manager or an RDP client. Make sure that the Amazon VPC security group associated with your DB instance permits inbound connections on port 3389 for TCP to allow RDP connections.

You need the key pair associated with the instance to connect to the custom DB instance via RDP. RDS Custom creates the key pair for you. The pair name uses the prefix do-not-delete-rds-custom-DBInstanceIdentifier. AWS Secrets Manager stores your private key as a secret. Choose the secret that has the same name as your key pair and retrieve the secret value to decrypt the password later.
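If you prefer to retrieve the private key programmatically, a minimal boto3 sketch could look like this; the secret name is a placeholder that follows the prefix described above.

import boto3

secrets = boto3.client("secretsmanager")

# The secret name is a placeholder following the
# do-not-delete-rds-custom-<DBInstanceIdentifier> prefix.
secret = secrets.get_secret_value(
    SecretId="do-not-delete-rds-custom-channy-custom-db-key-pair"
)

# Assuming the private key material is returned in SecretString; use it with the
# EC2 console's "Get password" flow (or the GetPasswordData API) to decrypt the
# Windows administrator password for RDP.
private_key = secret["SecretString"]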

In the EC2 console, look for the name of your EC2 instance, and then choose the instance ID associated with your DB instance ID, for example, channy-custom-db-*. Select your custom DB instance, and then choose Connect. On the Connect to instance page, choose the RDP client tab, and then choose Get password with your private key as a secret.

When you connect an RDP client with a downloaded remote desktop file and decrypted password, you can log in to the Windows Server and customize your SQL Server.

You can use AWS Systems Manager Session Manager to start a session with an instance in your account. After the session is started, you can run PowerShell commands as you would for any other connection type. See Connect to your Windows instance in the Amazon EC2 User Guide for more information.

Things to Know
Here are a couple of things to keep in mind about managing your DB instance:

Pausing RDS Custom Automation: RDS Custom for SQL Server automatically provides monitoring and instance recovery for your RDS Custom DB instance. If you need to customize the instance, then pause RDS Custom automation for a specified period. The pause makes sure that your customizations don’t interfere with RDS Custom automation. To pause or resume RDS Custom automation, set the RDS Custom automation mode to Paused with the pause duration that you want (in minutes; the default is 60 minutes, up to a maximum of 1,440 minutes).

High Availability (HA): To support replication between RDS Custom for SQL Server instances, you can configure HA with Always On Availability Groups (AGs). We recommend that you set up the primary DB instance to synchronously replicate data to the standby instances in different Availability Zones (AZs) to be resilient to AZ failures. Moreover, you can migrate data by configuring HA for your on-premises instance and then failing over or switching over to the RDS Custom standby database.

Custom DB Management: Just like Amazon RDS, RDS Custom for SQL Server creates automated backups by taking a snapshot of an Amazon RDS DB instance. Incremental snapshots are used to restore DB instances to a specific point in time. Furthermore, all changes and customizations to the underlying operating system are automatically logged for audit purposes using Systems Manager and AWS CloudTrail. See Troubleshooting an Amazon RDS Custom DB instance in the Amazon RDS User Guide to learn more.

Available Now
Amazon RDS Custom for SQL Server is now available in the US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), EU (Frankfurt), EU (Ireland), and EU (Stockholm) Regions.

Look at the product page and documentation of Amazon RDS Custom to learn more. Please send us feedback either in the AWS forum for Amazon RDS or through your usual AWS support contacts.

Channy

New – Amazon DevOps Guru for RDS to Detect, Diagnose, and Resolve Amazon Aurora-Related Issues using ML

Post Syndicated from Marcia Villalba original https://aws.amazon.com/blogs/aws/new-amazon-devops-guru-for-rds-to-detect-diagnose-and-resolve-amazon-aurora-related-issues-using-ml/

Today we are announcing Amazon DevOps Guru for RDS, a new capability for Amazon DevOps Guru. It allows developers to easily detect, diagnose, and resolve performance and operational issues in Amazon Aurora.

Hundreds of thousands of customers nowadays are using Amazon Aurora because it is highly available, scalable, and durable. But as applications grow in size and complexity, it becomes more challenging for these customers to detect and resolve operational and performance issues quickly.

During last year’s re:Invent, we announced DevOps Guru, a service that uses machine learning (ML) to automatically detect and alert customers of application issues, including database problems. Today we are announcing DevOps Guru for RDS to help developers using Amazon Aurora databases to detect, diagnose, and resolve database performance issues fast and at scale. Now developers will have enough information to determine the exact cause for a database performance issue. This launch will save developers and engineers many hours of work trying to uncover and remediate the performance-related database issues.

DevOps Guru for RDS uses ML to automatically identify and analyze a wide range of performance-related database issues, such as over-utilization of host resources, database bottlenecks, or misbehavior of SQL queries. It also recommends solutions to remediate the issues it finds. To use this capability, you don’t need to be a database or ML expert.

When an issue is detected, DevOps Guru for RDS displays the finding in the DevOps Guru console and sends notifications using Amazon EventBridge or Amazon Simple Notification Service (SNS). This allows developers to automatically manage and take real-time action on the issues.

How DevOps Guru for RDS Works
DevOps Guru for RDS uses anomaly detection on the database load (DB load) performance metric to detect issues. DB load is measured in units of Average Active Sessions (AAS). DB load measures the level of activity in your database, making it a great metric to understand the health of your database. If the DB load is high, this can result in performance issues. This metric can be compared to the number of virtual CPUs (vCPUs), and if the DB load is higher than that number, issues can arise.

The most useful dimensions for this metric are the wait events and the top SQL. A wait event describes the system condition that currently running SQL statements are waiting on. The most common reasons why a statement is waiting are that it is waiting for the CPU, waiting for a read or write, or waiting for a locked resource. The top SQL dimension shows which queries are contributing the most to DB load.
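If you want to explore the same DB load dimensions yourself, outside of the DevOps Guru console, here is a sketch that queries the Performance Insights API with boto3; the Identifier is a placeholder for your DB instance’s DbiResourceId.

import boto3
from datetime import datetime, timedelta, timezone

pi = boto3.client("pi")
now = datetime.now(timezone.utc)

# DB load (average active sessions) for the last hour, grouped by wait event.
response = pi.get_resource_metrics(
    ServiceType="RDS",
    Identifier="db-ABCDEFGHIJKLMNOPQRSTUVWXY",
    MetricQueries=[{"Metric": "db.load.avg", "GroupBy": {"Group": "db.wait_event"}}],
    StartTime=now - timedelta(hours=1),
    EndTime=now,
    PeriodInSeconds=60,
)

for metric in response["MetricList"]:
    print(metric["Key"], metric["DataPoints"][:3])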

The following image is an example of a finding that DevOps Guru for RDS reported. The graph shows that, of the average active sessions, most were waiting for access to a table or for CPU.

Example of anomaly detection
If you continue scrolling on the DevOps Guru for RDS analysis page, you can discover the cause for the problem and some recommendations to fix it. In this particular example, two problems were detected: high-load wait events and CPU capacity exceeded.

DevOps Guru for RDS looks more in-depth into these problems. First, it looks at the high-load wait events, where there were 27 AAS for the IO and CPU wait types, which is 99 percent of the total DB load.

Second, it tells us that the running tasks exceeded six processes. This database only has two vCPUs, and the recommended number of running processes should be a maximum of four (2x vCPUs). DevOps Guru for RDS also makes recommendations to fix these issues.

Recommendations

In another anomaly, the graph shows that there was a high load of wait events, and one SQL query was found to require further investigation. You can even see the exact SQL query if you click on the SQL digest IDs. The insight’s analysis and recommendation section is full of information on how to investigate further and fix the issue. You can get a lot of detailed information by clicking on the wait event, for example, on the wait event wait/io/table/sql/handler or in the View troubleshooting doc link.

Analysis and recommendations

Get started with DevOps Guru for RDS
To get started with this new capability of DevOps Guru, make sure that Performance Insights is enabled for your Amazon Aurora DB instances. It supports Amazon Aurora with MySQL- and PostgreSQL-compatibility. For instructions on how to enable Performance Insights, see Enabling and disabling Performance Insights.
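If you prefer to enable Performance Insights from code rather than the console, a minimal boto3 sketch could look like this; the DB instance identifier is a placeholder.

import boto3

rds = boto3.client("rds")

# The DB instance identifier is a placeholder; 7 days is the free retention period.
rds.modify_db_instance(
    DBInstanceIdentifier="my-aurora-instance-1",
    EnablePerformanceInsights=True,
    PerformanceInsightsRetentionPeriod=7,
    ApplyImmediately=True,
)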

The next step is to enable DevOps Guru to start monitoring your AWS resources. You can specify the resources you want to be covered by DevOps Guru.

If you are already using DevOps Guru, whenever there is a new insight for an Amazon Aurora database resource, you will see it in the console.

To see the detailed database analysis, navigate to the Insight page and select the new View analysis button under the DB load aggregated metric. That button will take you to the detailed analysis by DevOps Guru for RDS.

View analysis

Pricing and Availability
DevOps Guru for RDS is offered to customers at no additional charge, as part of the existing price that DevOps Guru charges customers for RDS resources.

DevOps Guru for RDS is available in all Regions where DevOps Guru is available, US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (Ireland), and Europe (Stockholm).

Learn more about DevOps Guru for RDS and check out the talk at AWS re:Invent “Automatically detect and resolve performance issues with Amazon DevOps Guru for RDS” (Session Id 15877).

Marcia

Enhanced Amazon S3 Integration for Amazon FSx for Lustre

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/enhanced-amazon-s3-integration-for-amazon-fsx-for-lustre/

Today, we are announcing two additional capabilities of Amazon FSx for Lustre. First, a full bi-directional synchronization of your file systems with Amazon Simple Storage Service (Amazon S3), including deleted files and objects. Second, the ability to synchronize your file systems with multiple S3 buckets or prefixes.

Lustre is a large-scale, distributed parallel file system powering the workloads of most of the largest supercomputers. It is popular among AWS customers for high-performance computing workloads, such as meteorology, life-science, and engineering simulations. It is also used in media and entertainment, as well as the financial services industry.

I had my first hands-on experience with Lustre file systems when I was working for Sun Microsystems. I was a pre-sales engineer and worked on some deals to sell multimillion-dollar compute and storage infrastructure to financial services companies. Back then, having access to a Lustre file system was a luxury. It required expensive compute, storage, and network hardware. We had to wait weeks for delivery. Furthermore, it required days to install and configure a cluster.

Fast forward to 2021: I may create a petabyte-scale Lustre cluster and attach the file system to compute resources running in the AWS cloud, on-demand, and only pay for what I use. There is no need to know about Storage Area Networks (SAN), Fiber Channel (FC) fabric, and other underlying technologies.

Modern applications use different storage options for different workloads. It is common to use S3 object storage for data transformation, preparation, or import/export tasks. Other workloads may require POSIX file-systems to access the data. FSx for Lustre lets you synchronize objects stored on S3 with the Lustre file system to meet these requirements.

When you link your S3 bucket to your file system, FSx for Lustre transparently presents S3 objects as files and lets you write results back to S3.

Full Bi-Directional Synchronization with Multiple S3 Buckets
If your workloads require fast, POSIX-compliant file system access to your S3 buckets, then you can use FSx for Lustre to link your S3 buckets to a file system and keep data synchronized between the file system and S3 in both directions. However, until today, there were a few limitations. First, you had to manually configure a task to export data back from FSx for Lustre to S3. Second, deleted files on S3 were not automatically deleted from the file system. And third, an FSx for Lustre file system could be synchronized with only one S3 bucket. We are addressing these three challenges with this launch.

Starting today, when you configure an automatic export policy for your data repository association, files on your FSx for Lustre file system are automatically exported to your data repository on S3. Next, deleted objects on S3 are now deleted from the FSx for Lustre file system. The opposite is also available: deleting files on FSx for Lustre triggers the deletion of corresponding objects on S3. Finally, you may now synchronize your FSx for Lustre file system with multiple S3 buckets. Each bucket has a different path at the root of your Lustre file system. For example, your S3 bucket logs may be mapped to /fsx/logs and your other financial_data bucket may be mapped to /fsx/finance.

These new capabilities are useful when you must concurrently process data in S3 buckets using both a file-based and an object-based workflow, as well as share results in near real time between these workflows. For example, an application that accesses file data can do so by using an FSx for Lustre file system linked to your S3 bucket, while another application running on Amazon EMR may process the same files from S3.

Moreover, you may link multiple S3 buckets or prefixes to a single FSx for Lustre file system, thereby enabling a unified view across multiple datasets. Now you can create a single FSx for Lustre file system and easily link multiple S3 data repositories (S3 buckets or prefixes). This is convenient when you use multiple S3 buckets or prefixes to organize and manage access to your data lake, access files from a public S3 bucket (such as these hundreds of public datasets) and write job outputs to a different S3 bucket, or when you want to use a larger FSx for Lustre file system linked to multiple S3 datasets to achieve greater scale-out performance.

How It Works
Let’s create an FSx for Lustre file system and attach it to an Amazon Elastic Compute Cloud (Amazon EC2) instance. I make sure that the file system and instance are in the same VPC subnet to minimize data transfer costs. The file system security group must authorize access from the instance.

I open the AWS Management Console, navigate to FSx, and select Create file system. Then, I select Amazon FSx for Lustre. I am not going through all of the options to create a file system here; you can refer to the documentation to learn how to create a file system. I make sure that Import data from and export data to S3 is selected.

Lustre - enable S3 synchronization

It takes a few minutes to create the file system. Once the status is ✅ Available, I navigate to the Data repository tab, and then select Create data repository association.

I choose a Data Repository path (my source S3 bucket) and a file system path (where in the file system that bucket will be imported).

FsX Lustre Data repository

Then, I choose the Import policy and Export policy. I may synchronize the creation of file/objects, their updates, and when they are deleted. I select Create.

FsX Lustre Data repository import policies

When I use automatic import, I also make sure to provide an S3 bucket in the same AWS Region as the FSx for Lustre cluster. FSx for Lustre supports linking to an S3 bucket in a different AWS Region for automatic export and all other capabilities.
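The same data repository association can also be created programmatically. Here is a sketch using boto3; the file system ID, file system path, and bucket are placeholders, and the import and export events mirror the policies I selected above.

import boto3

fsx = boto3.client("fsx")

# The file system ID, file system path, and bucket are placeholders.
fsx.create_data_repository_association(
    FileSystemId="fs-0123456789abcdef0",
    FileSystemPath="/fsx/logs",                # where the bucket appears in the file system
    DataRepositoryPath="s3://my-logs-bucket",  # the linked S3 bucket or prefix
    S3={
        "AutoImportPolicy": {"Events": ["NEW", "CHANGED", "DELETED"]},
        "AutoExportPolicy": {"Events": ["NEW", "CHANGED", "DELETED"]},
    },
)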

Using the console, I see the list of Data repository associations. I wait for the import task status to become ✅ Succeeded. If I link the file system to an S3 bucket with a large number of objects, then I may choose to skip Importing metadata from repository while creating the data repository association, and then load metadata from selected prefixes in my S3 buckets that are required for my workload using an Import task.

FsX for lustre - meta data repository tasks

I create an EC2 instance in the same VPC subnet. Furthermore, I make sure that the FSx for Lustre cluster security group authorizes ingress traffic from the EC2 instance. I use SSH to connect to the instance, and then type the following commands (commands are prefixed with the $ sign that is part of my shell prompt).

# check kernel version, minimum version 4.14.104-95.84 is required 
$ uname -r
4.14.252-195.483.amzn2.aarch64

# install lustre client 
$ sudo amazon-linux-extras install -y lustre2.10
Installing lustre-client
...
Installed:
  lustre-client.aarch64 0:2.10.8-5.amzn2                                                                                                                        

Complete!

# create a mount point 
$ sudo mkdir /fsx

# mount the file system 
$ sudo mount -t lustre -o noatime,flock fs-00...9d.fsx.us-east-1.amazonaws.com@tcp:/ny345bmv /fsx

# verify mount succeeded
$ mount 
...
172.0.0.0@tcp:/ny345bmv on /fsx type lustre (rw,noatime,flock,lazystatfs)

Then, I verify that the file system contains the S3 objects, and I create a new file using the touch command.

Fsx Lustre - check file system

I switch to the AWS Console, under S3 and then my bucket name, and I verify that the file has been synchronized.

Fsx Lustre - check s3

Using the console, I delete the file from S3. And, unsurprisingly, after a few seconds, the file is also deleted from the FSx file system.

Fsx Lustre - check file systems - deleted

Pricing and Availability
These new capabilities are available at no additional cost on Amazon FSx for Lustre file systems. Automatic export and multiple repositories are only available on Persistent 2 file systems in US East (N. Virginia), US East (Ohio), US West (Oregon), Canada (Central), Asia Pacific (Tokyo), Europe (Frankfurt), and Europe (Ireland). Automatic import with support for deleted and moved objects in S3 is available on file systems created after July 23, 2020 in all regions where FSx for Lustre is available.

You can configure your file system to automatically import S3 updates by using the AWS Management Console, the AWS Command Line Interface (CLI), and AWS SDKs.

Learn more about using S3 data repositories with Amazon FSx for Lustre file systems.

One More Thing
One more thing while you are reading. Today, we also launched the next generation of FSx for Lustre file systems. FSx for Lustre next-gen file systems are built on AWS Graviton processors. They are designed to provide you with up to 5x higher throughput per terabyte (up to 1 GB/s per terabyte) and reduce your cost of throughput by up to 60% as compared to previous generation file systems. Give it a try today!

— seb

PS: my colleague Michael recorded a demo video to show you the enhanced S3 integration for FSx for Lustre in action. Check it out today.

New – Offline Tape Migration Using AWS Snowball Edge

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-offline-tape-migration-using-aws-snowball-edge/

Over the years, we have given you a succession of increasingly powerful tools to help you migrate your data to the AWS Cloud. Starting with AWS Import/Export back in 2009, followed by Snowball in 2015, Snowmobile and Snowball Edge in 2016, and Snowcone in 2020, each new device has given you additional features to simplify and expedite the migration process. All of the devices are designed to operate in environments that suffer from network constraints such as limited bandwidth, high connection costs, or high latency.

Offline Tape Migration
Today, we are taking another step forward by making it easier for you to migrate data stored offline on physical tapes. You can get rid of your large and expensive storage facility, send your tape robots out to pasture, and eliminate all of the time & effort involved in moving archived data to new formats and mediums every few years, all while retaining your existing tape-centric backup & recovery utilities and workflows.

This launch brings a tape migration capability to AWS Snowball Edge devices, and allows you to migrate up to 80 TB of data per device, making it suitable for your petabyte-scale migration efforts. Tapes can be stored in the Amazon S3 Glacier Flexible Retrieval or Amazon S3 Glacier Deep Archive storage classes, and then accessed from on-premises and cloud-based backup and recovery utilities.

Back in 2013 I showed you how to Create a Virtual Tape Library Using the AWS Storage Gateway. Today’s launch builds on that capability in two different ways. First, you create a Virtual Tape Library (VTL) on a Snowball Edge and copy your physical tapes to it. Second, after your tapes are in the cloud, you create a VTL on a Storage Gateway and use it to access your virtual tapes.

Getting Started
To get started, I open the Snow Family Console and create a new job. Then I select Import virtual tapes into AWS Storage Gateway and click Next:

Then I go through the remainder of the ordering sequence (enter my shipping address, name my job, choose a KMS key, and set up notification preferences), and place my order. I can track the status of the job in the console:

When my device arrives I tell the somewhat perplexed delivery person about data transfer, carry it down to my basement office, and ask Luna to check it out:

Back in the Snow Family console, I download the manifest file and copy the unlock code:

I connect the Snowball Edge to my “corporate” network:

Then I install AWS OpsHub for Snow Family on my laptop, power on the Snowball Edge, and wait for it to obtain & display an IP address:

I launch OpsHub, sign in, and accept the default name for my device:

I confirm that OpsHub has access to my device, and that the device is unlocked:

I view the list of services running on the device, and note that Tape Gateway is not running:

Before I start Tape Gateway, I create a Virtual Network Interface (VNI):

And then I start the Tape Gateway service on the Snow device:

Now that the service is running on the device, I am ready to create the Storage Gateway. I click Open Storage Gateway console from within OpsHub:

I select Snowball Edge as my host platform:

Then I give my gateway a name (MyTapeGateway), select my backup application (Veeam Backup & Replication in this case), and click Activate Gateway:

Then I configure CloudWatch logging:

And finally, I review the settings and click Finish to activate my new gateway:

The activation process takes a few minutes, just enough time to take Luna for a quick walk. When I return, the console shows that the gateway is activated and running, and I am all set:

Creating Tapes
The next step is to create some virtual tapes. I click Create tapes and enter the requested information, including the pool (Deep Archive or Glacier), and click Create tapes:

The next step is to copy data from my physical tapes to the Snowball Edge. I don’t have a data center and I don’t have any tapes, so I can’t show you how to do this part. The data is stored on the device, and my Internet connection is used only for management traffic between the Snowball Edge device and AWS. To learn more about this part of the process, check out our new animated explainer.

After I have copied the desired tapes to the device, I prepare it for shipment to AWS. I make sure that all of the virtual tapes in the Storage Gateway Console have the status In Transit to VTS (Virtual Tape Shelf), and then I power down the device.

The display on the device updates to show the shipping address, and I wait for the shipping company to pick up the device.

When the device arrives at AWS, the virtual tapes are imported, stored in the S3 storage class associated with the pool that I chose earlier, and can be accessed by retrieving them using an online tape gateway. The gateway can be deployed as a virtual machine or a hardware appliance.

Now Available
You can use AWS Snowball Edge for offline tape migration in the US East (N. Virginia), US East (Ohio), US West (Oregon), US West (N. California), Europe (Ireland), Europe (Frankfurt), Europe (London), and Asia Pacific (Sydney) Regions. Start migrating petabytes of your physical tape data to AWS, today!

Jeff;

Preview – AWS Backup Adds Support for Amazon S3

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/preview-aws-backup-adds-support-for-amazon-s3/

Starting today, you can preview AWS Backup for Amazon Simple Storage Service (Amazon S3).

AWS Backup is a fully managed, policy-based service that lets you centralize and automate the backup and restore of your applications spanning across 12 AWS services: Amazon Elastic Compute Cloud (Amazon EC2) instances, Amazon Elastic Block Store (EBS) volumes, Amazon Relational Database Service (RDS) databases (including Amazon Aurora clusters), Amazon DynamoDB tables, Amazon Neptune databases, Amazon DocumentDB (with MongoDB compatibility) databases, Amazon Elastic File System (Amazon EFS) file systems, Amazon FSx for Lustre file systems, Amazon FSx for Windows File Server file systems, AWS Storage Gateway volumes, and now Amazon S3 (in preview).

Modern workloads and systems are leveraging different storage options for different functionalities. In the 21st century, it is normal to build applications relying on non-relational and relational databases, shared file storage, and object storage, just to name a few. When operating and managing these applications, you told us that you wanted centralized protection and provable compliance for application data stored in S3 alongside other AWS services for storage, compute, and databases.

I can see three benefits when integrating Amazon Simple Storage Service (Amazon S3) with your data protection policies in AWS Backup.

First, it lets you centrally manage your applications backups: AWS Backup provides an automated solution to centrally configure backup policies, thereby helping you simplify backup lifecycle management. This also makes it easy to ensure that your application data across AWS services (including S3) is centrally backed up.

Second, it lets you easily restore your data: AWS Backup provides a single-click-restore experience for your S3 data. This lets you perform point-in-time restores of your S3 buckets and objects to a new or existing S3 bucket.

Finally, it improves backup compliance: AWS Backup provides built-in dashboards that let you track backup and restore operations for S3.

AWS Backup for S3 (Preview) lets you create continuous point-in-time backups along with periodic backups of S3 buckets, including object data, object tags, access control lists (ACLs), and user-defined metadata. The first backup is a full snapshot, while subsequent backups are incremental. If there is a data disruption event, then you choose a backup from the backup vault, and restore an S3 bucket (or individual S3 objects) to a new or existing S3 bucket. AWS Backup is integrated with AWS Organizations, which let you use a single policy across AWS accounts (within your Organizations) to automate backup creation and backup access management.

Furthermore, you can turn on AWS Backup Vault Lock to enable delete protection of the data that you protect with AWS Backup, thereby improving protection of your immutable backups from accidental deletion or malicious re-encryption.

How to Get Started
AWS Backup works with versioned S3 buckets. Before you get started, turn on S3 Versioning on the buckets that you want to back up.
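If versioning is not enabled yet on a bucket, a minimal boto3 sketch to turn it on could look like this; the bucket name is a placeholder.

import boto3

s3 = boto3.client("s3")

# The bucket name is a placeholder.
s3.put_bucket_versioning(
    Bucket="my-application-bucket",
    VersioningConfiguration={"Status": "Enabled"},
)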

I must enable S3 in AWS Backup Settings when I use this feature for the first time. Using the AWS Management Console, I navigate to AWS Backup, then select Settings and Configure resources. I enable S3, and select Confirm. This is a one-time operation.

AWS Backup - optin S3

For this demo, I already have an existing backup plan, and I want to add an S3 bucket to this plan. If you want to create a new backup plan, then you can refer to AWS Backup‘s technical documentation.

To start including my S3 objects in my backup plan, I open the AWS Management Console, navigate to Backup plans, and select Assign resources.

AWS Backup Add Resources

I give a name to my Resource assignment. I select Include specific resource types, then I select S3 as the Resource type and one or several S3 Bucket names. When I am done, I select Assign resources.

Alternatively, I may use tags or resource IDs to assign S3 resources.

If you have thousands of S3 buckets, I recommend using tags to assign the S3 buckets to a backup plan. AWS Backup matches the tags in S3 buckets to the ones assigned to the backup plan, and it centrally backs up the S3 resources along with other AWS services that your application uses.

The other options are not different from what you know already.

AWS Backup - backup plan for S3

The Bucket names list in the previous screenshot only shows the S3 buckets in the same Region.

Alternatively, I may also create on-demand backups. I navigate to the Protected resources section, and select Create on-demand backup.

I select S3 as the Resource type, and select the Bucket name. As per usual, I choose a Backup Window, a Retention period, a Backup vault, and an IAM role. Then, I select Create on-demand backup.

AWS Backup - on-demand backup for S3

After a while, depending on the size of my bucket, the backup is ✅ Completed.

AWS Backup for S3 - Backup completed
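The same on-demand backup can be started programmatically. Here is a sketch using boto3; the vault name, bucket ARN, and IAM role ARN are placeholders, and the role must have the permissions required to back up S3.

import boto3

backup = boto3.client("backup")

# The vault name, bucket ARN, and role ARN are placeholders.
job = backup.start_backup_job(
    BackupVaultName="Default",
    ResourceArn="arn:aws:s3:::my-application-bucket",
    IamRoleArn="arn:aws:iam::111122223333:role/service-role/AWSBackupDefaultServiceRole",
)
print(job["BackupJobId"])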

All of the backups are encrypted and stored securely in a backup vault that I selected in the backup plan.

A backup vault (or backup storage vault) is an encrypted logical construct in my AWS account that stores and organizes my backups (recovery points). I may create new backup vaults in every AWS Region where AWS Backup is available. I may enable AWS Backup Vault Lock (delete-protection capability) on the backup vault to avoid accidental deletions and prevent malicious actors from re-encrypting my data. AWS Backup stores my continuous backups and periodic snapshots in the backup vault of my preference, and it lets me browse and restore as per my requirements.

How to Restore Objects
Let’s try to restore this backup.

The restore operation is very flexible. I may restore entire S3 buckets or individual S3 objects. I may restore the backups to the source S3 bucket, or to another existing bucket. Furthermore, I may create a new S3 bucket during restore. The S3 buckets must have Versioning enabled. Also, I may change the encryption key during restore.

I navigate to Backup vaults to restore the S3 bucket I just backed up. In the Backups section, I select the Recovery point ID that I want to restore, and I select Restore from the Actions menu.

AWS Backup for S3 - restore

Before starting the restore, I may select a few options:

  • The Restore time: I may restore my continuous backup to a point-in-time in the last 35 days, while I can restore my periodic backups to their original state.
  • The Restore type: I may choose to restore the entire bucket or a subset of objects within it.
  • The Restore destination: I may choose to restore on the same bucket, on another one, or create a new bucket during restore.
  • The Restored object encryption: this lets me select the key I want to use to encrypt the restored objects in the bucket.

I select Restore backup to start the restore.

AWS Backup for S3 - restore options

I can monitor the progress in the Jobs section, under the Restore jobs tab.

AWS Backup S3 - restore Jobs

When the status turns to ✅ Completed, my objects are ready to use!

Generally, the most comprehensive data-protection strategies include regular testing and validation of your restore procedures before you need them. Testing your restores also helps to prepare and maintain recovery runbooks. In turn, that ensures operational readiness during a disaster recovery exercise, or an actual data loss scenario.

Availability and Pricing
The preview is available in the US West (Oregon) Region only.

During the preview, there are no charges for creating and storing backups. You will pay the AWS charges for underlying resources, such as S3 storage, API usage, and versioning.

Send us an email at [email protected] including your AWS account ID to register for the preview.

Go ahead and apply to the preview program today.

— seb

Amazon S3 Glacier is the Best Place to Archive Your Data – Introducing the S3 Glacier Instant Retrieval Storage Class

Post Syndicated from Marcia Villalba original https://aws.amazon.com/blogs/aws/amazon-s3-glacier-is-the-best-place-to-archive-your-data-introducing-the-s3-glacier-instant-retrieval-storage-class/

Today we are announcing the Amazon S3 Glacier Instant Retrieval storage class. This new archive storage class delivers the lowest cost storage for long-lived data that is rarely accessed and requires millisecond retrieval.

We are also excited to announce that S3 Intelligent-Tiering now automatically optimizes storage costs for rarely accessed data that needs immediate retrieval with the new Archive Instant Access tier, which is ideal for data with unknown or changing access patterns. For existing customers, this will provide an immediate savings of 68 percent for data that hasn’t been accessed for more than 90 days, with no action needed. The Frequent, Infrequent, and now Archive Instant Access tiers are designed for the same milliseconds access time and high-throughput performance.

In addition, we are announcing the new name for the existing Amazon S3 Glacier storage class and several price reductions.

Amazon S3 Glacier Instant Retrieval
The Amazon S3 Glacier storage classes are extremely low-cost and built for data archiving. They are secure and durable, and they are designed to provide the lowest cost for data that does not require immediate access, with retrieval options from minutes to hours.

Many customers need to store rarely accessed data for several years. However, the data must be highly available and immediately accessible. Today, these customers use the S3 Standard-Infrequent Access (S3 Standard-IA) storage class. This storage class offers low cost for storage and allows customers to retrieve their data instantly.

S3 Glacier Instant Retrieval is a new storage class that delivers the fastest access to archive storage, with the same low latency and high-throughput performance as the S3 Standard and S3 Standard-IA storage classes. You can save up to 68 percent on storage costs as compared with using the S3 Standard-IA storage class when you use the S3 Glacier Instant Retrieval storage class and pay a low price to retrieve data. For example, in the US East (N. Virginia) Region, S3 Glacier Instant Retrieval storage pricing is $0.004 per GB-month and data retrieval is $0.03 per GB. Learn more about pricing for your Region.

Media archives, medical images, or user-generated content are just a few examples of ideal use cases for S3 Glacier Instant Retrieval. Once created, this content is rarely accessed, but when it is needed it must be available in milliseconds.

To get started using the new storage class from the Amazon S3 console, upload an object as you would normally, and select the S3 Glacier Instant Retrieval storage class.

Upload object with the new storage class

This feature is available programmatically from AWS SDKs, AWS Command Line Interface (CLI), and AWS CloudFormation.

In my opinion, the easiest way to store data in S3 Glacier Instant Retrieval is to use the S3 PUT API using the CLI. When using this API, set the storage class to GLACIER_IR.

aws s3api put-object --bucket <bucket-name> --key <object-key> --body <file-name> --storage-class GLACIER_IR

When the object is uploaded to Amazon S3, verify the storage class in the list of objects or on the object details page.

Storage classes

For data that already exists in Amazon S3, you can use S3 Lifecycle to transition data from the S3 Standard and S3 Standard-IA storage classes into S3 Glacier Instant Retrieval.
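For example, here is a sketch of a lifecycle rule, expressed with boto3, that transitions objects under a prefix to S3 Glacier Instant Retrieval after 90 days; the bucket name and prefix are placeholders.

import boto3

s3 = boto3.client("s3")

# The bucket name and prefix are placeholders.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-media-archive",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "move-to-glacier-instant-retrieval",
                "Status": "Enabled",
                "Filter": {"Prefix": "archives/"},
                "Transitions": [{"Days": 90, "StorageClass": "GLACIER_IR"}],
            }
        ]
    },
)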

New Archive Instant Access Tier in S3 Intelligent-Tiering
S3 Intelligent-Tiering is a storage class that automatically moves objects between access tiers to optimize costs. This is the recommended storage class for data with unpredictable or changing access patterns, such as in data lakes, analytics, or user-generated content.

Until today, there were two low latency access tiers optimized for frequent and infrequent access, and two optional archive access tiers designed for asynchronous access optimized for rare access at a low cost.

Beginning today, the Archive Instant Access tier is added as a new access tier in the S3 Intelligent-Tiering storage class. You will start seeing automatic costs savings for your storage in S3 Intelligent-Tiering for rarely accessed objects.

The Archive Instant Access tier joins the group of low latency access tiers. This new tier is optimized for data that is not accessed for months at a time but, when it is needed, is available within milliseconds.

S3 Intelligent-Tiering automatically stores objects in three access tiers that deliver the same performance as the S3 Standard storage class:

  • Frequent Access tier
  • Infrequent Access tier
  • Archive Instant Access (new)

For a small monitoring and automation charge, S3 Intelligent-Tiering monitors access patterns and moves objects between the different access tiers. Objects that have not been accessed for 30 consecutive days are moved from the Frequent Access tier to the Infrequent Access tier for savings of 40 percent. When an object hasn’t been accessed for 90 consecutive days, S3 Intelligent-Tiering will move the object from the Infrequent Access tier to the Archive Instant Access tier, with a savings of 68 percent. If the data is accessed later, it is automatically moved back to the Frequent Access tier. No tiering charges apply when objects are moved between access tiers within the S3 Intelligent-Tiering storage class.

S3 Intelligent-Tiering access tiers

To get started with this new access tier, select Intelligent-Tiering as the storage class for an object when uploading an object using the S3 console. After 90 days of inactivity (30 days in Frequent Access tier and 60 days in Infrequent Access tier), S3 Intelligent-Tiering will automatically move the object to the Archive Instant Access tier. The introduction of the new Archive Instant Access tier has no impact on performance when you retrieve objects.
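Programmatically, you simply upload objects with the INTELLIGENT_TIERING storage class and let the service handle the tiering. Here is a minimal boto3 sketch; the bucket, key, and body are placeholders.

import boto3

s3 = boto3.client("s3")

# The bucket, key, and body are placeholders.
s3.put_object(
    Bucket="my-data-lake",
    Key="clickstream/2021/11/30/events.json",
    Body=b'{"event": "page_view"}',
    StorageClass="INTELLIGENT_TIERING",
)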

New name for the Amazon S3 Glacier storage class – S3 Glacier Flexible Retrieval
The existing Amazon S3 Glacier storage class is now named S3 Glacier Flexible Retrieval. This storage class now has free bulk retrievals in 5 to 12 hours, and the storage price has been reduced by 10 percent in all Regions, effective December 1, 2021. S3 Glacier Flexible Retrieval is now even more cost-effective, and the free bulk retrievals make it ideal for retrieving large data volumes.

These are the Amazon S3 archive storage classes:

  • S3 Glacier Instant Retrieval: The newest storage class is optimized for long-lived data that is rarely accessed (typically once per quarter). However when data is needed, it is available within milliseconds. For example, medical images and news media assets are perfect for this storage class.
  • S3 Glacier Flexible Retrieval: This newly renamed storage class is optimized for archiving data that can be retrieved in minutes or with free bulk retrievals in 5 to 12 hours. This storage class is ideal for backups and disaster recovery use cases, where you have large amounts of long-term, rarely accessed data, and you don’t want to worry about retrieval costs when you need the data.
  • S3 Glacier Deep Archive: This storage class is the lowest-cost storage in the cloud and is optimized for archiving data that can be restored in at least 12 hours. It’s great for storing your compliance archives or for digital media preservation.
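
If you prefer to upload directly into S3 Glacier Instant Retrieval from code, the sketch below shows one way to do it with boto3. The bucket, key, and file names are placeholders, and GLACIER_IR is the storage class value the S3 API uses for this class at the time of writing; check the current S3 API reference for the value your SDK version accepts.

    import boto3

    s3 = boto3.client("s3")

    # Store an archival object in S3 Glacier Instant Retrieval.
    # Bucket, key, and file name are placeholders.
    with open("scan-0001.dcm", "rb") as body:
        s3.put_object(
            Bucket="my-example-bucket",
            Key="medical-images/scan-0001.dcm",
            Body=body,
            StorageClass="GLACIER_IR",
        )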

Amazon S3 has reduced storage prices!
We are excited to announce that Amazon S3 has reduced storage prices by up to 31 percent in the S3 Standard-IA and S3 One Zone-IA storage classes across 9 AWS Regions: US West (N. California), Asia Pacific (Hong Kong), Asia Pacific (Mumbai), Asia Pacific (Osaka), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), and South America (São Paulo). These price reductions are effective December 1, 2021.

Learn more about price reduction details.

Available Now
The new storage class, S3 Glacier Instant Retrieval, and the new Archive Instant Access tier in S3 Intelligent-Tiering are available today (November 30, 2021) in all AWS Regions.

The S3 Glacier Flexible Retrieval price reduction and free bulk retrievals apply in all AWS Regions, and the S3 Standard-Infrequent Access and S3 One Zone-Infrequent Access price reductions apply in the nine Regions listed above. All of these changes are effective December 1, 2021.

Learn more about the storage class changes and all the storage classes.

Marcia

New – Simplify Access Management for Data Stored in Amazon S3

Post Syndicated from Marcia Villalba original https://aws.amazon.com/blogs/aws/new-simplify-access-management-for-data-stored-in-amazon-s3/

Today, we are introducing two new features that simplify access management for data stored in Amazon Simple Storage Service (Amazon S3). First, a new Amazon S3 Object Ownership setting lets you disable access control lists (ACLs) to simplify access management for data stored in Amazon S3. Second, the Amazon S3 console policy editor now reports security warnings, errors, and suggestions powered by IAM Access Analyzer as you author your S3 policies.

Since Amazon S3 launched 15 years ago, buckets have been private by default. At first, the only way to grant access to objects was using ACLs. In 2011, AWS Identity and Access Management (IAM) was announced, which allowed the use of policies to define permissions and control access to buckets and objects in Amazon S3. Nowadays, you have several ways to control access to your data in Amazon S3, including IAM policies, S3 bucket policies, S3 Access Points policies, S3 Block Public Access, and ACLs.

ACLs are an access control mechanism in which each bucket and object has an ACL attached to it. ACLs define which AWS accounts or groups are granted access and the type of access. When an object is created, it is owned by its creator, and this ownership information is embedded in the object's ACL. When you upload an object to a bucket owned by another AWS account and you want the bucket owner to access the object, you need to grant that permission in the ACL. In many cases, ACLs and other kinds of policies are used within the same bucket.

The new Amazon S3 Object Ownership setting, Bucket owner enforced, lets you disable all of the ACLs associated with a bucket and the objects in it. When you apply this bucket-level setting, all of the objects in the bucket become owned by the AWS account that created the bucket, and ACLs are no longer used to grant access. Once the setting is applied, ownership changes automatically, and applications that write data to the bucket no longer need to specify any ACL. As a result, access to your data is based entirely on policies, which simplifies access management for data stored in Amazon S3.

With this launch, when you create a new bucket in the Amazon S3 console, you can choose whether ACLs are enabled or disabled; the default selection is ACLs disabled. If you wish to keep ACLs enabled, you can choose one of the other Object Ownership settings:

  • Bucket owner preferred: All new objects written to this bucket with the bucket-owner-full-control canned ACL will be owned by the bucket owner. ACLs are still used for access control.
  • Object writer: The object writer remains the object owner. ACLs are still used for access control.

Options for object ownership

For existing buckets, you can view and manage this setting in the Permissions tab.

Before enabling the Bucket owner enforced setting for Object Ownership on an existing bucket, you must migrate any access granted to other AWS accounts from the bucket ACL to the bucket policy. Otherwise, you will receive an error when enabling the setting. This helps ensure that applications writing data to your bucket are not interrupted. Make sure to test your applications after you migrate the access.
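
You can also apply the setting programmatically on an existing bucket. The sketch below uses the S3 PutBucketOwnershipControls API via boto3; the bucket name is a placeholder.

    import boto3

    s3 = boto3.client("s3")

    # Disable ACLs on an existing bucket by applying the Bucket owner enforced
    # setting. Migrate any ACL grants to the bucket policy first, as described
    # above. "my-example-bucket" is a placeholder.
    s3.put_bucket_ownership_controls(
        Bucket="my-example-bucket",
        OwnershipControls={
            "Rules": [{"ObjectOwnership": "BucketOwnerEnforced"}]
        },
    )

    # Confirm the setting that is now in effect.
    response = s3.get_bucket_ownership_controls(Bucket="my-example-bucket")
    print(response["OwnershipControls"]["Rules"])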

Policy validation in the Amazon S3 console
We are also introducing policy validation in the Amazon S3 console to help you out when writing resource-based policies for Amazon S3. This simplifies authoring access control policies for Amazon S3 buckets and access points with over 100 actionable policy checks powered by IAM Access Analyzer.

To access policy validation in the Amazon S3 console, first go to the detail page for a bucket. Then, go to the Permissions tab and edit the bucket policy.

Accessing the IAM Policy Validation in S3 console

When you start writing your policy, you see that, as you type, different findings appear at the bottom of the screen. Policy checks from IAM Access Analyzer are designed to validate your policies and report security warnings, errors, and suggestions as findings based on their impact to help you make your policy more secure.

You can also perform these checks and validations using the IAM Access Analyzer’s ValidatePolicy API.

Example of policy suggestion
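
As a sketch of what that looks like from your own tooling, here is one way to validate a resource policy with boto3 and the ValidatePolicy API; the policy document and bucket ARN are placeholders chosen to trigger a finding.

    import json
    import boto3

    analyzer = boto3.client("accessanalyzer")

    # A deliberately broad bucket policy used only to illustrate the checks;
    # the bucket ARN is a placeholder.
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::my-example-bucket/*",
        }],
    }

    response = analyzer.validate_policy(
        policyDocument=json.dumps(policy),
        policyType="RESOURCE_POLICY",
    )

    # Print each finding's type (ERROR, SECURITY_WARNING, SUGGESTION, WARNING)
    # and its issue code.
    for finding in response["findings"]:
        print(finding["findingType"], "-", finding["issueCode"])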

Availability
Amazon S3 Object Ownership is available at no additional cost in all AWS Regions, excluding the AWS China Regions and AWS GovCloud Regions. IAM Access Analyzer policy validation in the Amazon S3 console is available at no additional cost in all AWS Regions, including the AWS China Regions and AWS GovCloud Regions.

Get started with Amazon S3 Object Ownership through the Amazon S3 console, AWS Command Line Interface (CLI), Amazon S3 REST API, AWS SDKs, or AWS CloudFormation. Learn more about this feature on the documentation page.

And to learn more and get started with policy validation in the Amazon S3 console, see the Access Analyzer policy validation documentation.

Marcia

New for AWS Backup – Support for VMware and VMware Cloud on AWS

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-for-aws-backup-support-for-vmware-and-vmware-cloud-on-aws/

Today, I am happy to announce AWS Backup support for VMware, a new capability that enables you to centralize and automate data protection of virtual machines (VMs) running on VMware on premises and VMware Cloud™ on AWS. You can now use a single, centrally managed policy in AWS Backup to protect these VMware environments together with the 12 AWS compute, storage, and database services already supported by AWS Backup. You can then use AWS Backup to restore VMware workloads to on-premises data centers and VMware Cloud on AWS.

At the same time, AWS Backup Audit Manager lets you consistently demonstrate compliance by monitoring backup, copy, and restore operations and by generating auditor-ready reports to satisfy your data governance and regulatory requirements.

Let’s see how this works in practice.

Using AWS Backup Support for VMware
There are three steps to back up VMware virtual machines (VMs) with AWS Backup:

  1. Create a gateway to connect AWS Backup to your hypervisor.
  2. Connect to your hypervisor through the gateway.
  3. Assign virtual machines managed by your hypervisor to a backup plan.

AWS Backup Support for VMware diagram

On the left pane of the AWS Backup console, there is a new External resources section. There, I choose Gateways and then Create gateway. This AWS Backup gateway helps with discovery of the on-premises VMware environment and acts as a cloud gateway to send and receive data.

I download the Open Virtualization Format (OVF) file of the AWS Backup gateway and follow the instructions to deploy the gateway using the VMware vSphere client. I am using an internal test and development VMware environment for this walkthrough.

VMware vCenter screenshot.

After deploying the gateway in my VMware environment, I come back to the AWS Backup console. I enter a name for the gateway (for simplicity, I use the same name as the gateway VM) and the IP address of the gateway VM. Optionally, I can add tags to help organize and track my setup. I go on and create the gateway.

Console screenshot.

Now, I choose Add hypervisor. I write a name for the hypervisor and the IP address of the VMware vCenter server host.

Console screenshot.

I enter the username and password of a service account that I created for AWS Backup on the Active Directory domain. The username should include the domain (for example, username@domain). Then, I choose the encryption key to protect the service account credentials. If I don’t choose my own AWS Key Management Service (KMS) key, AWS Backup encrypts the username and password using a key that AWS owns and manages.

Console screenshot.

I select the gateway to connect to the hypervisor and choose Test gateway connection. This test helps ensure that the gateway can communicate with the hypervisor before I complete the configuration. Optionally, I can add tags to help organize and track my setup. I go on and add the hypervisor.

Console screenshot.

After a few minutes, the hypervisor is online, and I see the VMs managed by vCenter in the AWS Backup console. I can now use these virtual machines as resources in my backup plans in the same way as the other AWS compute, storage, and database resources supported by AWS Backup.

Console screenshot.

I create a new backup plan and start with a template. The rules of the template enforce daily backups with five weeks of retention and monthly backups with one year of retention. I can customize these rules based on my requirements.

Console screenshot.
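
The equivalent plan can also be defined with the AWS Backup API. Below is a minimal sketch using boto3; the plan name, vault name, schedule, and retention are assumptions that only loosely mirror the template described above.

    import boto3

    backup = boto3.client("backup")

    # A minimal backup plan with a single daily rule and 35-day retention.
    # Plan name, rule name, vault name, and schedule are placeholders.
    plan = backup.create_backup_plan(
        BackupPlan={
            "BackupPlanName": "vmware-daily",
            "Rules": [{
                "RuleName": "daily-35-day-retention",
                "TargetBackupVaultName": "Default",
                "ScheduleExpression": "cron(0 5 ? * * *)",
                "Lifecycle": {"DeleteAfterDays": 35},
            }],
        }
    )
    print(plan["BackupPlanId"])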

Then, I choose to assign resources to the backup plan, and I select three VMs.

Console screenshot.
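
Resource assignment can likewise be scripted. In this sketch, the backup plan ID, IAM role ARN, and VM ARN are placeholders, and the ARN format shown for a VMware VM is only illustrative; copy the actual ARNs from the AWS Backup console.

    import boto3

    backup = boto3.client("backup")

    # Assign VMs to the backup plan by ARN. All identifiers below are
    # placeholders for this example.
    backup.create_backup_selection(
        BackupPlanId="your-backup-plan-id",
        BackupSelection={
            "SelectionName": "vmware-test-vms",
            "IamRoleArn": "arn:aws:iam::123456789012:role/service-role/AWSBackupDefaultServiceRole",
            "Resources": [
                "arn:aws:backup-gateway:us-east-1:123456789012:vm/vm-0123456789abcdef",
            ],
        },
    )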

If needed, you can create an on-demand backup in the Protected resources section of the console. For example, here I am starting an on-demand backup for one of the VMs.

Console screenshot.
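
The same on-demand backup can be started from code. In this sketch the vault name, VM ARN, and role ARN are placeholders.

    import boto3

    backup = boto3.client("backup")

    # Start an on-demand backup of a single VM. Copy the real VM ARN from the
    # console; the values below are placeholders.
    job = backup.start_backup_job(
        BackupVaultName="Default",
        ResourceArn="arn:aws:backup-gateway:us-east-1:123456789012:vm/vm-0123456789abcdef",
        IamRoleArn="arn:aws:iam::123456789012:role/service-role/AWSBackupDefaultServiceRole",
    )
    print(job["BackupJobId"])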

When a backup is complete, VMs are added to the list of protected resources, and I can initiate a restore.

Console screenshot.

I select the backup and choose Restore. Then, I enter the restore location, which can be the same VMware environment I used for the backup or another one (for example, VMware Cloud on AWS). Below, I specify the name, path, compute resource name, and datastore to use for the restore. Then, I choose Restore backup.

Console screenshot.
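
A restore can also be started through the AWS Backup API. The sketch below fetches the restore metadata stored with the recovery point and passes it back to StartRestoreJob. The vault name, recovery point ARN, and role ARN are placeholders, and the exact metadata keys for VMware restores are resource-specific, so inspect what the metadata call returns rather than hard-coding them.

    import boto3

    backup = boto3.client("backup")

    vault = "Default"
    recovery_point_arn = "arn:aws:backup:us-east-1:123456789012:recovery-point:example"  # placeholder

    # Fetch the restore metadata recorded with the recovery point, adjust the
    # target location keys as needed, and start the restore.
    meta = backup.get_recovery_point_restore_metadata(
        BackupVaultName=vault,
        RecoveryPointArn=recovery_point_arn,
    )["RestoreMetadata"]

    job = backup.start_restore_job(
        RecoveryPointArn=recovery_point_arn,
        Metadata=meta,
        IamRoleArn="arn:aws:iam::123456789012:role/service-role/AWSBackupDefaultServiceRole",
    )
    print(job["RestoreJobId"])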

I monitor the status of my backup and restore jobs from the AWS Backup console. To monitor backup and restore metrics over a period of time, I can use Amazon CloudWatch metrics, logs, and alarms. I can also send events to Amazon EventBridge to receive notifications once a job completes or fails.
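
For example, a simple way to get notified is an EventBridge rule that matches AWS Backup job state changes and forwards them to an SNS topic. The detail-type string and the topic ARN below are assumptions to verify against the current AWS Backup event reference.

    import json
    import boto3

    events = boto3.client("events")

    # Match AWS Backup job state-change events. The detail-type string is an
    # assumption; verify it against the AWS Backup event documentation.
    events.put_rule(
        Name="backup-job-state-change",
        EventPattern=json.dumps({
            "source": ["aws.backup"],
            "detail-type": ["Backup Job State Change"],
        }),
    )

    # Forward matching events to an SNS topic (placeholder ARN).
    events.put_targets(
        Rule="backup-job-state-change",
        Targets=[{
            "Id": "notify-sns",
            "Arn": "arn:aws:sns:us-east-1:123456789012:backup-notifications",
        }],
    )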

Availability and Pricing
AWS Backup support for VMware is available in the US East (N. Virginia, Ohio), US West (N. California, Oregon), GovCloud (US-East, US-West), Canada (Central), Europe (Frankfurt, Ireland, London, Milan, Paris, Stockholm), South America (São Paulo), Asia Pacific (Hong Kong, Mumbai, Seoul, Singapore, Sydney, Tokyo, Osaka), Middle East (Bahrain), and Africa (Cape Town) Regions. Please see the AWS Regional Services List for more information.

AWS Backup supports VMware ESXi 6.7.x and 7.0.x VMs running on NFS, VMFS, and vSAN datastores on premises and in VMware Cloud on AWS. In addition, AWS Backup supports both SCSI Hot-Add and Network Block Device (NBD) transport modes for copying data from source VMs to AWS.

With AWS Backup support for VMware, you pay using the same dimensions that AWS Backup uses today: backup storage, restore, and cross-region data transfer. For more information, see the AWS Backup pricing page.

Your VM backups are stored in a backup vault. All backups stored and managed by AWS Backup are replicated across 3 Availability Zones (AZs) in the Region and designed for 99.999999999 percent (11 9s) durability and 99.99 percent (4 9s) service availability.

AWS Backup takes a first full backup of each VM and then incremental-forever backups, which you can create on demand or on a schedule configured in your backup plan. AWS Backup always performs full restores even though backups are stored incrementally, so you benefit from storage-efficiency cost savings while easily performing restores.

Centrally protect your VMware environments and your AWS compute, storage, and database resources with AWS Backup.

Danilo