Tag Archives: Networking & Content Delivery*

Scale AWS Glue jobs by optimizing IP address consumption and expanding network capacity using a private NAT gateway

2024-03-19 Sushanth Kothapally

Post Syndicated from Sushanth Kothapally original https://aws.amazon.com/blogs/big-data/scale-aws-glue-jobs-by-optimizing-ip-address-consumption-and-expanding-network-capacity-using-a-private-nat-gateway/

As businesses expand, the demand for IP addresses within the corporate network often exceeds the supply. An organization’s network is often designed with some anticipation of future requirements, but as enterprises evolve, their information technology (IT) needs surpass the previously designed network. Companies may find themselves challenged to manage the limited pool of IP addresses.

For data engineering workloads when AWS Glue is used in such a constrained network configuration, your team may sometimes face hurdles running many jobs simultaneously. This happens because you may not have enough IP addresses to support the required connections to databases. To overcome this shortage, the team may get more IP addresses from your corporate network pool. These obtained IP addresses can be unique (non-overlapping) or overlapping, when the IP addresses are reused in your corporate network.

When you use overlapping IP addresses, you need an additional network management to establish connectivity. Networking solutions can include options like private Network Address Translation (NAT) gateways, AWS PrivateLink, or self-managed NAT appliances to translate IP addresses.

In this post, we will discuss two strategies to scale AWS Glue jobs:

Optimizing the IP address consumption by right-sizing Data Processing Units (DPUs), using the Auto Scaling feature of AWS Glue, and fine-tuning of the jobs.
Expanding the network capacity using additional non-routable Classless Inter-Domain Routing (CIDR) range with a private NAT gateway.

Before we dive deep into these solutions, let us understand how AWS Glue uses Elastic Network Interface (ENI) for establishing connectivity. To enable access to data stores inside a VPC, you need to create an AWS Glue connection that is attached to your VPC. When an AWS Glue job runs in your VPC, the job creates an ENI inside the configured VPC for each data connection, and that ENI uses an IP address in the specified VPC. These ENIs are short-lived and active until job is complete.

Now let us look at the first solution that explains optimizing the AWS Glue IP address consumption.

Strategies for efficient IP address consumption

In AWS Glue, the number of workers a job uses determines the count of IP addresses used from your VPC subnet. This is because each worker requires one IP address that maps to one ENI. When you don’t have enough CIDR range allocated to the AWS Glue subnet, you may observe IP address exhaustion errors. The following are some best practices to optimize AWS Glue IP address consumption:

Right-sizing the job’s DPUs – AWS Glue is a distributed processing engine. It works efficiently when it can run tasks in parallel. If a job has more than the required DPUs, it doesn’t always run quicker. So, finding the right number of DPUs will make sure you use IP addresses optimally. By building observability in the system and analyzing the job performance, you can get insights into ENI consumption trends and then configure the appropriate capacity on the job for the right size. For more details, refer to Monitoring for DPU capacity planning. The Spark UI is a helpful tool to monitor AWS Glue jobs’ workers usage. For more details, refer to Monitoring jobs using the Apache Spark web UI.
AWS Glue Auto Scaling – It’s often difficult to predict a job’s capacity requirements upfront. Enabling the Auto Scaling feature of AWS Glue will offload some of this responsibility to AWS. At runtime based on the workload requirements, the job automatically scales worker nodes upto the defined maximum configuration. If there is no additional need, AWS Glue will not overprovision workers, thereby saving resources and reducing cost. The Auto Scaling feature is available in AWS Glue 3.0 and later. For more information, refer to Introducing AWS Glue Auto Scaling: Automatically resize serverless computing resources for lower cost with optimized Apache Spark.
Job-level optimization – Identify job-level optimizations by using AWS Glue job metrics , and apply best practices from Best practices for performance tuning AWS Glue for Apache Spark jobs.

Next let us look at the second solution that elaborates network capacity expansion.

Solutions for network size (IP address) expansion

In this section, we will discuss two possible solutions to expand network size in more detail.

Expand VPC CIDR ranges with routable addresses

One solution is to add more private IPv4 CIDR ranges from RFC 1918 to your VPC. Theoretically, each AWS account can be assigned to some or all these IP address CIDRs. Your IP Address Management (IPAM) team often manages the allocation of IP addresses that each business unit can use from RFC1918 to avoid overlapping IP addresses across multiple AWS accounts or business units. If your current routable IP address quota allocated by the IPAM team is not sufficient, then you can request for more.

If your IPAM team issues you an additional non-overlapping CIDR range, then you can either add it as a secondary CIDR to your existing VPC or create a new VPC with it. If you are planning to create a new VPC, then you can inter-connect the VPCs via VPC peering or AWS Transit Gateway.

If this additional capacity is sufficient to run all your jobs within defined the timeframe, then it is a simple and cost-effective solution. Otherwise, you can consider adopting overlapping IP addresses with a private NAT gateway, as described in the following section. With the following solution you must use Transit Gateway to connect VPCs as VPC peering is not possible when there are overlapping CIDR ranges in those two VPCs.

Configure non-routable CIDR with a private NAT gateway

As described in the AWS whitepaper Building a Scalable and Secure Multi-VPC AWS Network Infrastructure, you can expand your network capacity by creating a non-routable IP address subnet and using a private NAT gateway that is located in a routable IP address space (non-overlapping) to route traffic. A private NAT gateway translates and routes traffic between non-routable IP addresses and routable IP addresses. The following diagram demonstrates the solution with reference to AWS Glue.

High level architecture

As you can see in the above diagram, VPC A (ETL) has two CIDR ranges attached. The smaller CIDR range 172.33.0.0/24 is routable because it not reused anywhere, whereas the larger CIDR range 100.64.0.0/16 is non-routable because it is reused in the database VPC.

In VPC B (Database), we have hosted two databases in routable subnets 172.30.0.0/26 and 172.30.0.64/26. These two subnets are in two separate Availability Zones for high availability. We also have two additional unused subnet 100.64.0.0/24 and 100.64.1.0/24 to simulate a non-routable setup.

You can choose the size of the non-routable CIDR range based on your capacity requirements. Since you can reuse IP addresses, you can create a very large subnet as needed. For example, a CIDR mask of /16 would give you approximately 65,000 IPv4 addresses. You can work with your network engineering team and size the subnets.

In short, you can configure AWS Glue jobs to use both routable and non-routable subnets in your VPC to maximize the available IP address pool.

Now let us understand how Glue ENIs that are in a non-routable subnet communicate with data sources in another VPC.

Call flow

The data flow for the use case demonstrated here is as follows (referring to the numbered steps in figure above):

When an AWS Glue job needs to access a data source, it first uses the AWS Glue connection on the job and creates the ENIs in the non-routable subnet 100.64.0.0/24 in VPC A. Later AWS Glue uses the database connection configuration and attempts to connect to the database in VPC B 172.30.0.0/24.
As per the route table VPCA-Non-Routable-RouteTable the destination 172.30.0.0/24 is configured for a private NAT gateway. The request is sent to the NAT gateway, which then translates the source IP address from a non-routable IP address to a routable IP address. Traffic is then sent to the transit gateway attachment in VPC A because it’s associated with the VPCA-Routable-RouteTable route table in VPC A.
Transit Gateway uses the 172.30.0.0/24 route and sends the traffic to the VPC B transit gateway attachment.
The transit gateway ENI in VPC B uses VPC B’s local route to connect to the database endpoint and query the data.
When the query is complete, the response is sent back to VPC A. The response traffic is routed to the transit gateway attachment in VPC B, then Transit Gateway uses the 172.33.0.0/24 route and sends traffic to the VPC A transit gateway attachment.
The transit gateway ENI in VPC A uses the local route to forward the traffic to the private NAT gateway, which translates the destination IP address to that of ENIs in non-routable subnet.
Finally, the AWS Glue job receives the data and continues processing.

The private NAT gateway solution is an option if you need extra IP addresses when you can’t obtain them from a routable network in your organization. Sometimes with each additional service there is an additional cost incurred, and this trade-off is necessary to meet your goals. Refer to the NAT Gateway pricing section on the Amazon VPC pricing page for more information.

Prerequisites

To complete the walk-through of the private NAT gateway solution, you need the following:

An AWS account with a role that has sufficient access to provision the required resources.
An AWS Identity and Access Management (IAM) user with access to AWS resources, including Amazon Virtual Private Cloud (Amazon VPC), Transit Gateway, a private NAT gateway, and an Amazon Relational Database Service (Amazon RDS) for MySQL database.

Deploy the solution

To implement the solution, complete the following steps:

Sign in to your AWS management console.
Deploy the solution by clicking . This stack defaults to us-east-1, you can select your desired Region.
Click next and then specify the stack details. You can retain the input parameters to the prepopulated default values or change them as needed.
For DatabaseUserPassword, enter an alphanumeric password of your choice and ensure to note it down for further use.
For S3BucketName, enter a unique Amazon Simple Storage Service (Amazon S3) bucket name. This bucket stores the AWS Glue job script that will be copied from an AWS public code repository.
Click next.
Leave the default values and click next again.
Review the details, acknowledge the creation of IAM resources, and click submit to start the deployment.

You can monitor the events to see resources being created on the AWS CloudFormation console. It may take around 20 minutes for the stack resources to be created.

After the stack creation is complete, go to the Outputs tab on the AWS CloudFormation console and note the following values for later use:

DBSource
DBTarget
SourceCrawler
TargetCrawler

Connect to an AWS Cloud9 instance

Next, we need to prepare the source and target Amazon RDS for MySQL tables using an AWS Cloud9 instance. Complete the following steps:

On the AWS Cloud9 console page, locate the aws-glue-cloud9 environment.
In the Cloud9 IDE column, click on Open to launch your AWS Cloud9 instance in a new web browser.

Prepare the source MySQL table

Complete the following steps to prepare your source table:

From the AWS Cloud9 terminal, install the MySQL client using the following command: sudo yum update -y && sudo yum install -y mysql
Connect to the source database using the following command. Replace the source hostname with the DBSource value you captured earlier. When prompted, enter the database password that you specified during the stack creation. mysql -h <Source Hostname> -P 3306 -u admin -p

Run the following scripts to create the source emp table, and load the test data:

-- connect to source database
USE srcdb;
-- Drop emp table if it exists
DROP TABLE IF EXISTS emp;
-- Create the emp table
CREATE TABLE emp (empid INT AUTO_INCREMENT,
                  ename VARCHAR(100) NOT NULL,
                  edept VARCHAR(100) NOT NULL,
                  PRIMARY KEY (empid));
-- Create a stored procedure to load sample records into emp table
DELIMITER $$
CREATE PROCEDURE sp_load_emp_source_data()
BEGIN
DECLARE empid INT;
DECLARE ename VARCHAR(100);
DECLARE edept VARCHAR(50);
DECLARE cnt INT DEFAULT 1; -- Initialize counter to 1 to auto-increment the PK
DECLARE rec_count INT DEFAULT 1000; -- Initialize sample records counter
TRUNCATE TABLE emp; -- Truncate the emp table
WHILE cnt <= rec_count DO -- Loop and load the required number of sample records
SET ename = CONCAT('Employee_', FLOOR(RAND() * 100) + 1); -- Generate random employee name
SET edept = CONCAT('Dept_', FLOOR(RAND() * 100) + 1); -- Generate random employee department
-- Insert record with auto-incrementing empid
INSERT INTO emp (ename, edept) VALUES (ename, edept);
-- Increment counter for next record
SET cnt = cnt + 1;
END WHILE;
COMMIT;
END$$
DELIMITER ;
-- Call the above stored procedure to load sample records into emp table
CALL sp_load_emp_source_data();

Check the source emp table’s count using the below SQL query (you need this at later step for verification). select count(*) from emp;
Run the following command to exit from the MySQL client utility and return to the AWS Cloud9 instance’s terminal: quit;

Prepare the target MySQL table

Complete the following steps to prepare the target table:

Connect to the target database using the following command. Replace the target hostname with the DBTarget value you captured earlier. When prompted enter the database password that you specified during the stack creation. mysql -h <Target Hostname> -P 3306 -u admin -p

Run the following scripts to create the target emp table. This table will be loaded by the AWS Glue job in the subsequent step.

-- connect to the target database
USE targetdb;
-- Drop emp table if it exists 
DROP TABLE IF EXISTS emp;
-- Create the emp table
CREATE TABLE emp (empid INT AUTO_INCREMENT,
                  ename VARCHAR(100) NOT NULL,
                  edept VARCHAR(100) NOT NULL,
                  PRIMARY KEY (empid)
);

Verify the networking setup (Optional)

The following steps are useful to understand NAT gateway, route tables, and the transit gateway configurations of private NAT gateway solution. These components were created during the CloudFormation stack creation.

On the Amazon VPC console page, navigate to Virtual private cloud section and locate NAT gateways.
Search for NAT Gateway with name Glue-OverlappingCIDR-NATGW and explore it further. As you can see in the following screenshot, the NAT gateway was created in VPC A (ETL) on the routable subnet.
In the left side navigation pane, navigate to Route tables under virtual private cloud section.
Search for VPCA-Non-Routable-RouteTable and explore it further. You can see that the route table is configured to translate traffic from overlapping CIDR using the NAT gateway.
In the left side navigation pane, navigate to Transit gateways section and click on Transit gateway attachments. Enter VPC- in the search box and locate the two newly created transit gateway attachments.
You can explore these attachments further to learn their configurations.

Run the AWS Glue crawlers

Complete the following steps to run the AWS Glue crawlers that are required to catalog the source and target emp tables. This is a prerequisite step for running the AWS Glue job.

On the AWS Glue Console page, under Data Catalog section in the navigation pane, click on Crawlers.
Locate the source and target crawlers that you noted earlier.
Select these crawlers and click Run to create the respective AWS Glue Data Catalog tables.
You can monitor the AWS Glue crawlers for the successful completion. It may take around 3–4 minutes for both crawlers to complete. When they’re done, the last run status of the job changes to Succeeded, and you can also see there are two AWS Glue catalog tables created from this run.

Run the AWS Glue ETL job

After you set up the tables and complete the prerequisite steps, you are now ready to run the AWS Glue job that you created using the CloudFormation template. This job connects to the source RDS for MySQL database, extracts the data, and loads the data into the target RDS for MySQL database. This job reads data from a source MySQL table and loads it to the target MySQL table using private NAT gateway solution. To run the AWS Glue job, complete the following steps:

On the AWS Glue console, click on ETL jobs in the navigation pane.
Click on the job glue-private-nat-job.
Click Run to start it.

The following is the PySpark script for this ETL job:

import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args["JOB_NAME"], args)

# Script generated for node AWS Glue Data Catalog
AWSGlueDataCatalog_node = glueContext.create_dynamic_frame.from_catalog(
    database="glue_cat_db_source",
    table_name="srcdb_emp",
    transformation_ctx="AWSGlueDataCatalog_node",
)

# Script generated for node Change Schema
ChangeSchema_node = ApplyMapping.apply(
    frame=AWSGlueDataCatalog_node,
    mappings=[
        ("empid", "int", "empid", "int"),
        ("ename", "string", "ename", "string"),
        ("edept", "string", "edept", "string"),
    ],
    transformation_ctx="ChangeSchema_node",
)

# Script generated for node AWS Glue Data Catalog
AWSGlueDataCatalog_node = glueContext.write_dynamic_frame.from_catalog(
    frame=ChangeSchema_node,
    database="glue_cat_db_target",
    table_name="targetdb_emp",
    transformation_ctx="AWSGlueDataCatalog_node",
)

job.commit()

Based on the job’s DPU configuration, AWS Glue creates a set of ENIs in the non-routable subnet that is configured on the AWS Glue connection. You can monitor these ENIs on the Network Interfaces page of the Amazon Elastic Compute Cloud (Amazon EC2) console.

The below screenshot shows the 10 ENIs that were created for the job run to match the requested number of workers configured on the job parameters. As expected, the ENIs were created in the non-routable subnet of VPC A, enabling scalability of IP addresses. After the job is complete, these ENIs will be automatically released by AWS Glue. Execution ENIs

When the AWS Glue job is running, you can monitor its status. Upon successful completion, the job’s status changes to Succeeded. Job successful completition

Verify the results

After the AWS Glue job is complete, connect to the target MySQL database. Verify if the target record count matches to the source. You can use the below SQL query in AWS Cloud9 terminal.

USE targetdb;
SELECT count(*) from emp;

Finally, exit from the MySQL client utility using the following command and return to the AWS Cloud9 terminal: quit;

You can now confirm that AWS Glue has successfully completed a job to load data to a target database using the IP addresses from a non-routable subnet. This concludes end to end testing of the private NAT gateway solution.

Clean up

To avoid incurring future charges, delete the resource created via CloudFormation stack by completing the following steps:

On the AWS CloudFormation console, click Stacks in the navigation pane.
Select the stack AWSGluePrivateNATStack.
Click on Delete to delete the stack. When prompted confirm the stack deletion.

Conclusion

In this post, we demonstrated how you can scale AWS Glue jobs by optimizing IP addresses consumption and expanding your network capacity by using a private NAT gateway solution. This two-fold approach helps you to get unblocked in an environment that has IP address capacity constraints. The options discussed in the AWS Glue IP address optimization section are complimentary to the IP address expansion solutions, and you can iteratively build to mature your data platform.

Learn more about AWS Glue job optimization techniques from Monitor and optimize cost on AWS Glue for Apache Spark and Best practices to scale Apache Spark jobs and partition data with AWS Glue.

About the authors

Sushanth Kothapally is a Solutions Architect at Amazon Web Services supporting Automotive and Manufacturing customers. He is passionate about designing technology solutions to meet business goals and has keen interest in serverless and event-driven architectures.

Author2 Senthil Kamala Rathinam is a Solutions Architect at Amazon Web Services specializing in Data and Analytics. He is passionate about helping customers to design and build modern data platforms. In his free time, Senthil loves to spend time with his family and play badminton.

Free data transfer out to internet when moving out of AWS

2024-03-05 Sébastien Stormacq

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/free-data-transfer-out-to-internet-when-moving-out-of-aws/

You told us one of the primary reasons to adopt Amazon Web Services (AWS) is the broad choice of services we offer, enabling you to innovate, build, deploy, and monitor your workloads. AWS has continuously expanded its services to support virtually any cloud workload. It now offers over 200 fully featured services for compute, storage, databases, networking, analytics, machine learning (ML) and artificial intelligence (AI), and many more. For example, Amazon Elastic Compute Cloud (Amazon EC2) offers over 750 generally available instances—more than any other major cloud provider—and you can choose from numerous relational, analytics, key-value, document, or graph databases.

We believe this choice must include the one to migrate your data to another cloud provider or on-premises. That’s why, starting today, we’re waiving data transfer out to the internet (DTO) charges when you want to move outside of AWS.

Over 90 percent of our customers already incur no data transfer expenses out of AWS because we provide 100 gigabytes per month free from AWS Regions to the internet. This includes traffic from Amazon EC2, Amazon Simple Storage Service (Amazon S3), Application Load Balancer, among others. In addition, we offer one terabyte of free data transfer out of Amazon CloudFront every month.

If you need more than 100 gigabytes of data transfer out per month while transitioning, you can contact AWS Support to ask for free DTO rates for the additional data. It’s necessary to go through support because you make hundreds of millions of data transfers each day, and we generally do not know if the data transferred out to the internet is a normal part of your business or a one-time transfer as part of a switch to another cloud provider or on premises.

We will review requests at the AWS account level. Once approved, we will provide credits for the data being migrated. We don’t require you to close your account or change your relationship with AWS in any way. You’re welcome to come back at any time. We will, of course, apply additional scrutiny if the same AWS account applies multiple times for free DTO.

We believe in customer choice, including the choice to move your data out of AWS. The waiver on data transfer out to the internet charges also follows the direction set by the European Data Act and is available to all AWS customers around the world and from any AWS Region.

Freedom of choice is not limited to data transfer rates. AWS also supports Fair Software Licensing Principles, which make it easy to use software with other IT providers of your choice. You can read this blog post for more details.

You can check the FAQ for more information, or you can contact AWS Customer Support to request credits for DTO while switching.

But I sincerely hope you will not.

— seb

Generative AI Infrastructure at AWS

2024-01-31 Betsy Chernoff

Post Syndicated from Betsy Chernoff original https://aws.amazon.com/blogs/compute/generative-ai-infrastructure-at-aws/

Building and training generative artificial intelligence (AI) models, as well as predicting and providing accurate and insightful outputs requires a significant amount of infrastructure.

There’s a lot of data that goes into generating the high-quality synthetic text, images, and other media outputs that large-language models (LLMs), as well as foundational models (FMs), create. To start, the data set generally has somewhere around one billion variables present in the model that it was trained on (also known as parameters). To process that massive amount of data (think: petabytes), it can take hundreds of hardware accelerators (which are incorporated into purpose-built ML silicon or GPUs).

Given how much data is required for an effective LLM, it becomes costly and inefficient if an organization can’t access the data for these models as quickly as their GPUs/ML silicon are processing it. Selecting infrastructure for generative AI workloads impacts everything from cost to performance to sustainability goals to the ease of use. To successfully run training and inference for FMs organizations need:

Price-performant accelerated computing (including the latest GPUs and dedicated ML Silicon) to power large generative AI workloads.
High-performance and low-latency cloud storage that’s built to keep accelerators highly utilized.
The most performant and cutting-edge technologies, networking, and systems to support the infrastructure for a generative AI workload.
The ability to build with cloud services that can provide seamless integration across generative AI applications, tools, and infrastructure.

Overview of compute, storage, & networking for generative AI

Amazon Elastic Compute Cloud (Amazon EC2) accelerated computing portfolio (including instances powered by GPUs and purpose-built ML silicon) offers the broadest choice of accelerators to power generative AI workloads.

To keep the accelerators highly utilized, they need constant access to data for processing. AWS provides this fast data transfer from storage (up to hundreds of GBs/TBs of data throughput) with Amazon FSx for Lustre and Amazon S3.

Accelerated computing instances combined with differentiated AWS technologies such as the AWS Nitro System, up to 3,200 Gbps of Elastic Fabric Adapter (EFA) networking, as well as exascale computing with Amazon EC2 UltraClusters helps to deliver the most performant infrastructure for generative AI workloads.

Coupled with other managed services such as Amazon SageMaker HyperPod and Amazon Elastic Kubernetes Service (Amazon EKS), these instances provide developers with the industry’s best platform for building and deploying generative AI applications.

This blog post will focus on highlighting announcements across Amazon EC2 instances, storage, and networking that are centered around generative AI.

AWS compute enhancements for generative AI workloads

Training large FMs requires extensive compute resources and because every project is different, a broad set of options are needed so that organization of all sizes can iterate faster, train more models, and increase accuracy. In 2023, there were a lot of launches across the AWS compute category that supported both training and inference workloads for generative AI.

One of those launches, Amazon EC2 Trn1n instances, doubled the network bandwidth (compared to Trn1 instances) to 1600 Gbps of Elastic Fabric Adapter (EFA). That increased bandwidth delivers up to 20% faster time-to-train relative to Trn1 for training network-intensive generative AI models, such as LLMs and mixture of experts (MoE).

Watashiha offers an innovative and interactive AI chatbot service, “OGIRI AI,” which uses LLMs to incorporate humor and offer a more relevant and conversational experience to their customers. “This requires us to pre-train and fine-tune these models frequently. We pre-trained a GPT-based Japanese model on the EC2 Trn1.32xlarge instance, leveraging tensor and data parallelism,” said Yohei Kobashi, CTO, Watashiha, K.K. “The training was completed within 28 days at a 33% cost reduction over our previous GPU based infrastructure. As our models rapidly continue to grow in complexity, we are looking forward to Trn1n instances which has double the network bandwidth of Trn1 to speed up training of larger models.”

AWS continues to advance its infrastructure for generative AI workloads, and recently announced that Trainium2 accelerators are also coming soon. These accelerators are designed to deliver up to 4x faster training than first generation Trainium chips and will be able to be deployed in EC2 UltraClusters of up to 100,000 chips, making it possible to train FMs and LLMs in a fraction of the time, while improving energy efficiency up to 2x.

AWS has continued to invest in GPU infrastructure over the years, too. To date, NVIDIA has deployed 2 million GPUs on AWS, across the Ampere and Grace Hopper GPU generations. That’s 3 zetaflops, or 3,000 exascale super computers. Most recently, AWS announced the Amazon EC2 P5 Instances that are designed for time-sensitive, large-scale training workloads that use NVIDIA CUDA or CuDNN and are powered by NVIDIA H100 Tensor Core GPUs. They help you accelerate your time to solution by up to 4x compared to previous-generation GPU-based EC2 instances, and reduce cost to train ML models by up to 40%. P5 instances help you iterate on your solutions at a faster pace and get to market more quickly.

And to offer easy and predictable access to highly sought-after GPU compute capacity, AWS launched Amazon EC2 Capacity Blocks for ML. This is the first consumption model from a major cloud provider that lets you reserve GPUs for future use (up to 500 deployed in EC2 UltraClusters) to run short duration ML workloads.

AWS is also simplifying training with Amazon SageMaker HyperPod, which automates more of the processes required for high-scale fault-tolerant distributed training (e.g., configuring distributed training libraries, scaling training workloads across thousands of accelerators, detecting and repairing faulty instances), speeding up training by as much as 40%. Customers like Perplexity AI elastically scale beyond hundreds of GPUs and minimize their downtime with SageMaker HyperPod.

Deep-learning inference is another example of how AWS is continuing its cloud infrastructure innovations, including the low-cost, high-performance Amazon EC2 Inf2 instances powered by AWS Inferentia2. These instances are designed to run high-performance deep-learning inference applications at scale globally. They are the most cost-effective and energy-efficient option on Amazon EC2 for deploying the latest innovations in generative AI.

Another example is with Amazon SageMaker, which helps you deploy multiple models to the same instance so you can share compute resources—reducing inference cost by 50%. SageMaker also actively monitors instances that are processing inference requests and intelligently routes requests based on which instances are available—achieving 20% lower inference latency (on average).

AWS invests heavily in the tools for generative AI workloads. For AWS ML silicon, AWS has focused on AWS Neuron, the software development kit (SDK) that helps customers get the maximum performance from Trainium and Inferentia. Neuron supports the most popular publicly available models, including Llama 2 from Meta, MPT from Databricks, Mistral from mistral.ai, and Stable Diffusion from Stability AI, as well as 93 of the top 100 models on the popular model repository Hugging Face. It plugs into ML frameworks like PyTorch and TensorFlow, and support for JAX is coming early this year. It’s designed to make it easy for AWS customers to switch from their existing model training and inference pipelines to Trainium and Inferentia with just a few lines of code.

Cloud storage on AWS enhancements for generative AI

Another way AWS is accelerating the training and inference pipelines is with improvements to storage performance—which is not only critical when thinking about the most common ML tasks (like loading training data into a large cluster of GPUs/accelerators), but also for checkpointing and serving inference requests. AWS announced several improvements to accelerate the speed of storage requests and reduce the idle time of your compute resources—which allows you to run generative AI workloads faster and more efficiently.

To gather more accurate predictions, generative AI workloads are using larger and larger datasets that require high-performant storage at scale to handle the sheer volume in of data.

With Amazon S3 Express One Zone a new storage class purpose-built to high-performance and low-latency object storage for an organizations most frequently accessed data, making it ideal for request-intensive operations like ML training and inference. Amazon S3 Express One Zone is the lowest-latency cloud object storage available, with data access speed up to 10x faster and request costs up to 50% lower than Amazon S3 Standard, from any AWS Availability Zone within an AWS Region.

AWS continues to optimize data access speeds for ML frameworks too. Recently, Amazon S3 Connector for PyTorch launched, which loads training data up to 40% faster than with the existing PyTorch connectors to Amazon S3. While most customers can meet their training and inference requirements using Mountpoint for Amazon S3 or Amazon S3 Connector for PyTorch, some are also building and managing their own custom data loaders. To deliver the fastest data transfer speeds between Amazon S3, and Amazon EC2 Trn1, P4d, and P5 instances, AWS recently announced the ability to automatically accelerate Amazon S3 data transfer in the AWS Command Line Interface (AWS CLI) and Python SDK. Now, training jobs download training data from Amazon S3 up to 3x faster and customers like Scenario are already seeing great results, with a 5x throughput improvement to model download times without writing a single line of code.

To meet the changing performance requirements that training generative AI workloads can require, Amazon FSx for Lustre announced throughput scaling on-demand. This is particularly useful for model training because it enables you to adjust the throughput tier of your file systems to meet these requirements with greater agility and lower cost.

EC2 networking enhancements for generative AI

Last year, AWS introduced EC2 UltraCluster 2.0, a flatter and wider network fabric that’s optimized specifically for the P5 instance and future ML accelerators. It allows us to reduce latency by 16% and supports up to 20,000 GPUs, with up to 10x the overall bandwidth. In a traditional cluster architecture, as clusters get physically bigger, latency will also generally increase. But, with UltraCluster 2.0, AWS is increasing the size while reducing latency, and that’s exciting.

AWS is also continuing to help you make your network more efficient. Take for example a recent launch with Amazon EC2 Instance Topology API. It gives you an inside look at the proximity between your instances, so you can place jobs strategically. Optimized job scheduling means faster processing for distributed workloads. Moving jobs that exchange data the most frequently to the same physical location in a cluster can eliminate multiple hops in the data path. As models push boundaries, this type of software innovation is key to getting the most out of your hardware.

In addition to Amazon Q (a generative AI powered assistant from AWS), AWS also launched Amazon Q networking troubleshooting (preview).

You can ask Amazon Q to assist you in troubleshooting network connectivity issues caused by network misconfiguration in your current AWS account. For this capability, Amazon Q works with Amazon VPC Reachability Analyzer to check your connections and inspect your network configuration to identify potential issues. With Amazon Q network troubleshooting, you can ask questions about your network in conversational English—for example, you can ask, “why can’t I SSH to my server,” or “why is my website not accessible”.

Conclusion

AWS is bringing customers even more choice for their infrastructure, including price-performant, sustainability focused, and ease-of-use options. Last year, AWS capabilities across this stack solidified our commitment to meeting the customer focus and goal of: Making generative AI accessible to customers of all sizes and technical abilities so they can get to reinventing and transforming what is possible.

Additional resources

For more information on AWS generative AI Infrastructure, go to the AWS Machine Learning Infrastructure page.
For more information on how AWS is building in the cloud with Generative AI across applications, tools, and infrastructure, go to the blog, “Welcome to a New Era of Building in the Cloud with Generative AI on AWS”.

DNS over HTTPS is now available in Amazon Route 53 Resolver

2023-12-20 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/dns-over-https-is-now-available-in-amazon-route-53-resolver/

Starting today, Amazon Route 53 Resolver supports using the DNS over HTTPS (DoH) protocol for both inbound and outbound Resolver endpoints. As the name suggests, DoH supports HTTP or HTTP/2 over TLS to encrypt the data exchanged for Domain Name System (DNS) resolutions.

Using TLS encryption, DoH increases privacy and security by preventing eavesdropping and manipulation of DNS data as it is exchanged between a DoH client and the DoH-based DNS resolver.

This helps you implement a zero-trust architecture where no actor, system, network, or service operating outside or within your security perimeter is trusted and all network traffic is encrypted. Using DoH also helps follow recommendations such as those described in this memorandum of the US Office of Management and Budget (OMB).

DNS over HTTPS support in Amazon Route 53 Resolver
You can use Amazon Route 53 Resolver to resolve DNS queries in hybrid cloud environments. For example, it allows AWS services access for DNS requests from anywhere within your hybrid network. To do so, you can set up inbound and outbound Resolver endpoints:

Inbound Resolver endpoints allow DNS queries to your VPC from your on-premises network or another VPC.
Outbound Resolver endpoints allow DNS queries from your VPC to your on-premises network or another VPC.

After you configure the Resolver endpoints, you can set up rules that specify the name of the domains for which you want to forward DNS queries from your VPC to an on-premises DNS resolver (outbound) and from on-premises to your VPC (inbound).

Now, when you create or update an inbound or outbound Resolver endpoint, you can specify which protocols to use:

DNS over port 53 (Do53), which is using either UDP or TCP to send the packets.
DNS over HTTPS (DoH), which is using TLS to encrypt the data.
Both, depending on which one is used by the DNS client.
For FIPS compliance, there is a specific implementation (DoH-FIPS) for inbound endpoints.

Let’s see how this works in practice.

Using DNS over HTTPS with Amazon Route 53 Resolver
In the Route 53 console, I choose Inbound endpoints from the Resolver section of the navigation pane. There, I choose Create inbound endpoint.

I enter a name for the endpoint, select the VPC, the security group, and the endpoint type (IPv4, IPv6, or dual-stack). To allow using both encrypted and unencrypted DNS resolutions, I select Do53, DoH, and DoH-FIPS in the Protocols for this endpoint option.

After that, I configure the IP addresses for DNS queries. I select two Availability Zones and, for each, a subnet. For this setup, I use the option to have the IP addresses automatically selected from those available in the subnet.

After I complete the creation of the inbound endpoint, I configure the DNS server in my network to forward requests for the amazonaws.com domain (used by AWS service endpoints) to the inbound endpoint IP addresses.

Similarly, I create an outbound Resolver endpoint and and select both Do53 and DoH as protocols. Then, I create forwarding rules that tell for which domains the outbound Resolver endpoint should forward requests to the DNS servers in my network.

Now, when the DNS clients in my hybrid environment use DNS over HTTPS in their requests, DNS resolutions are encrypted. Optionally, I can enforce encryption and select only DoH in the configuration of inbound and outbound endpoints.

Things to know
DNS over HTTPS support for Amazon Route 53 Resolver is available today in all AWS Regions where Route 53 Resolver is offered, including GovCloud Regions and Regions based in China.

DNS over port 53 continues to be the default for inbound or outbound Resolver endpoints. In this way, you don’t need to update your existing automation tooling unless you want to adopt DNS over HTTPS.

There is no additional cost for using DNS over HTTPS with Resolver endpoints. For more information, see Route 53 pricing.

Start using DNS over HTTPS with Amazon Route 53 Resolver to increase privacy and security for your hybrid cloud environments.

— Danilo

Amazon Q brings generative AI-powered assistance to IT pros and developers (preview)

2023-11-28 Donnie Prakoso

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/amazon-q-brings-generative-ai-powered-assistance-to-it-pros-and-developers-preview/

Today, we are announcing the preview of Amazon Q, a new type of generative artificial intelligence (AI) powered assistant that is specifically for work and can be tailored to a customer’s business.

Amazon Q brings a set of capabilities to support developers and IT professionals. Now you can use Amazon Q to get started building applications on AWS, research best practices, resolve errors, and get assistance in coding new features for your applications. For example, Amazon Q Code Transformation can perform Java application upgrades now, from version 8 and 11 to version 17.

Amazon Q is available in multiple areas of AWS to provide quick access to answers and ideas wherever you work. Here’s a quick look at Amazon Q, including in integrated development environment (IDE):

Building applications together with Amazon Q
Application development is a journey. It involves a continuous cycle of researching, developing, deploying, optimizing, and maintaining. At each stage, there are many questions—from figuring out the right AWS services to use, to troubleshooting issues in the application code.

Trained on 17 years of AWS knowledge and best practices, Amazon Q is designed to help you at each stage of development with a new experience for building applications on AWS. With Amazon Q, you minimize the time and effort you need to gain the knowledge required to answer AWS questions, explore new AWS capabilities, learn unfamiliar technologies, and architect solutions that fuel innovation.

Let us show you some capabilities of Amazon Q.

1. Conversational Q&A capability
You can interact with the Amazon Q conversational Q&A capability to get started, learn new things, research best practices, and iterate on how to build applications on AWS without needing to shift focus away from the AWS console.

To start using this feature, you can select the Amazon Q icon on the right-hand side of the AWS Management Console.

For example, you can ask, “What are AWS serverless services to build serverless APIs?” Amazon Q provides concise explanations along with references you can use to follow up on your questions and validate the guidance. You can also use Amazon Q to follow up on and iterate your questions. Amazon Q will show more deep-dive answers for you with references.

There are times when we have questions for a use case with fairly specific requirements. With Amazon Q, you can elaborate on your use cases in more detail to provide context.

For example, you can ask Amazon Q, “I’m planning to create serverless APIs with 100k requests/day. Each request needs to lookup into the database. What are the best services for this workload?” Amazon Q responds with a list of AWS services you can use and tries to limit the answer results to those that are accurately referenceable and verified with best practices.

Here is some additional information that you might want to note:

Amazon Q conversational Q&A capability is currently in preview in all commercial AWS Regions.
This capability is integrated into the AWS Management Console, AWS Console Mobile Application, AWS Documentation, AWS websites, and Slack and Teams through AWS Chatbot, making it more convenient and easier to find what you need.

2. Optimize Amazon EC2 instance selection
Choosing the right Amazon Elastic Compute Cloud (Amazon EC2) instance type for your workload can be challenging with all the options available. Amazon Q aims to make this easier by providing personalized recommendations.

To use this feature, you can ask Amazon Q, “Which instance families should I use to deploy a Web App Server for hosting an application?” This feature is also available when you choose to launch an instance in the Amazon EC2 console. In Instance type, you can select Get advice on instance type selection. This will show a dialog to define your requirements.

Your requirements are automatically translated into a prompt on the Amazon Q chat panel. Amazon Q returns with a list of suggestions of EC2 instances that are suitable for your use cases. This capability helps you pick the right instance type and settings so your workloads will run smoothly and more cost-efficiently.

This capability to provide EC2 instance type recommendations based on your use case is available in preview in all commercial AWS Regions.

3. Troubleshoot and solve errors directly in the console
Amazon Q can also help you to solve errors for various AWS services directly in the console. With Amazon Q proposed solutions, you can avoid slow manual log checks or research.

Let’s say that you have an AWS Lambda function that tries to interact with an Amazon DynamoDB table. But, for an unknown reason (yet), it fails to run. Now, with Amazon Q, you can troubleshoot and resolve this issue faster by selecting Troubleshoot with Amazon Q.

Amazon Q provides concise analysis of the error which helps you to understand the root cause of the problem and the proposed resolution. With this information, you can follow the steps described by Amazon Q to fix the issue.

In just a few minutes, you will have the solution to solve your issues, saving significant time without disrupting your development workflow. The Amazon Q capability to help you troubleshoot errors in the console is available in preview in the US West (Oregon) for Amazon Elastic Compute Cloud (Amazon EC2), Amazon Simple Storage Service (Amazon S3), Amazon ECS, and AWS Lambda.

4. Network troubleshooting assistance
You can also ask Amazon Q to assist you in troubleshooting network connectivity issues caused by network misconfiguration in your current AWS account. For this capability, Amazon Q works with Amazon VPC Reachability Analyzer to check your connections and inspect your network configuration to identify potential issues.

This makes it easy to diagnose and resolve AWS networking problems, such as “Why can’t I SSH to my EC2 instance?” or “Why can’t I reach my web server from the Internet?” which you can ask Amazon Q.

Then, on the response text, you can select preview experience here, which will provide explanations to help you to troubleshoot network connectivity-related issues.

Here are a few things you need to know:

This capability is currently available in preview in the US East (N. Virginia).
To learn more about the feature and sample questions, see Getting started with Amazon Q network troubleshooting in the AWS Documentation.

5. Integration and conversational capabilities within your IDEs
As we mentioned, Amazon Q is also available in supported IDEs. This allows you to ask questions and get help within your IDE by chatting with Amazon Q or invoking actions by typing / in the chat box.

To get started, you need to install or update the latest AWS Toolkit and sign in to Amazon CodeWhisperer. Once you’re signed in to Amazon CodeWhisperer, it will automatically activate the Amazon Q conversational capability in the IDE. With Amazon Q enabled, you can now start chatting to get coding assistance.

You can ask Amazon Q to describe your source code file.

From here, you can improve your application, for example, by integrating it with Amazon DynamoDB. You can ask Amazon Q, “Generate code to save data into DynamoDB table called save_data() accepting data parameter and return boolean status if the operation successfully runs.”

Once you’ve reviewed the generated code, you can do a manual copy and paste into the editor. You can also select Insert at cursor to place the generated code into the source code directly.

This feature makes it really easy to help you focus on building applications because you don’t have to leave your IDE to get answers and context-specific coding guidance. You can try the preview of this feature in Visual Studio Code and JetBrains IDEs.

6. Feature development capability
Another exciting feature that Amazon Q provides is guiding you interactively from idea to building new features within your IDE and Amazon CodeCatalyst. You can go from a natural language prompt to application features in minutes, with interactive step-by-step instructions and best practices, right from your IDE. With a prompt, Amazon Q will attempt to understand your application structure and break down your prompt into logical, atomic implementation steps.

To use this capability, you can start by invoking an action command /dev in Amazon Q and describe the task you need Amazon Q to process.

Then, from here, you can review, collaborate and guide Amazon Q in the chat for specific areas that need to be implemented.

Additional capabilities to help you ship features faster with complete pull requests are available if you’re using Amazon CodeCatalyst. In Amazon CodeCatalyst, you can assign a new or an existing issue to Amazon Q, and it will process an end-to-end development workflow for you. Amazon Q will review the existing code, propose a solution approach, seek feedback from you on the approach, generate merge-ready code, and publish a pull request for review. All you need to do after is to review the proposed solutions from Amazon Q.

The following screenshots show a pull request created by Amazon Q in Amazon CodeCatalyst.

Here are a couple of things that you should know:

Amazon Q feature development capability is currently in preview in Visual Studio Code and Amazon CodeCatalyst
To use this capability in IDE, you need to have the Amazon CodeWhisperer Professional tier. Learn more on the Amazon CodeWhisperer pricing page.

7. Upgrade applications with Amazon Q Code Transformation
With Amazon Q, you can now upgrade an entire application within a few hours by starting a guided code transformation. This capability, called Amazon Q Code Transformation, simplifies maintaining, migrating, and upgrading your existing applications.

To start, navigate to the CodeWhisperer section and then select Transform. Amazon Q Code Transformation automatically analyzes your existing codebase, generates a transformation plan, and completes the key transformation tasks suggested by the plan.

Some additional information about this feature:

Amazon Q Code Transformation is available in preview today in the AWS Toolkit for IntelliJ IDEA and the AWS Toolkit for Visual Studio Code.
To use this capability, you need to have the Amazon CodeWhisperer Professional tier during the preview.
During preview, you can can upgrade Java 8 and 11 applications to version 17, a Java Long-Term Support (LTS) release.

Get started with Amazon Q today
With Amazon Q, you have an AI expert by your side to answer questions, write code faster, troubleshoot issues, optimize workloads, and even help you code new features. These capabilities simplify every phase of building applications on AWS.

Amazon Q lets you engage with AWS Support agents directly from the Q interface if additional assistance is required, eliminating any dead ends in the customer’s self-service experience. The integration with AWS Support is available in the console and will honor the entitlements of your AWS Support plan.

Learn more

— Donnie & Channy

Introducing Amazon CloudFront KeyValueStore: A low-latency datastore for CloudFront Functions

2023-11-21 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/introducing-amazon-cloudfront-keyvaluestore-a-low-latency-datastore-for-cloudfront-functions/

Amazon CloudFront allows you to securely deliver static and dynamic content with low latency and high transfer speeds. With CloudFront Functions, you can perform latency-sensitive customizations for millions of requests per second. For example, you can use CloudFront Functions to modify headers, normalize cache keys, rewrite URLs, or authorize requests.

Today, we are introducing CloudFront KeyValueStore, a secure global low-latency key value datastore that allows read access from within CloudFront Functions, enabling advanced customizable logic at the CloudFront edge locations.

Previously, you had to embed configuration data inside the function code. For example, data for determining if a URL should be redirected and which URL to redirect the viewer to. When embedding configuration data with the function code, every small change in configuration requires a code change and a redeployment of the function code. Updating and deploying code for every new lookup addition introduces the risk of making inadvertent changes to code. Also, the maximum function size is 10 KB, making it difficult for many use cases to fit all the data within the code.

With CloudFront KeyValueStore, you can now update the data associated with a function and the function code independently from each other. This simplifies function code and makes it easy to update data without the need to deploy code changes.

Let’s see how this works in practice.

Creating a CloudFront key value store
In the CloudFront console, I choose Functions from the navigation pane. In the KeyValueStores tab, I choose Create KeyValueStore.

Here, I have the option to import key value pairs from a JSON file in an Amazon Simple Storage Service (Amazon S3) bucket. I am not doing that now because I want to start with no keys. I enter a name and description and complete the creation of the key value store.

When the key value store has been created, I choose Edit in the Key value pairs section and then Add pair. I type hello for the key and Hello World for the value and save the changes. I can add more keys and values, but one key is enough for now.

When I update a key value store, changes are propagated to all CloudFront edge locations in a few seconds so that it can be used with low latency by the functions that are associated with the key value store. Let’s see how that works.

Using CloudFront KeyValueStore from CloudFront Functions
In the CloudFront console, I choose Functions in the navigation pane and then Create function. I type a name for the function, select the cloudfront-js-2.0 runtime, and complete the creation of the function. Then, I use the new option to associate the key value store with this function.

I copy the key value store ID from the console to use it in the following function code:

import cf from 'cloudfront';

const kvsId = '<KEY_VALUE_STORE_ID>';

// This fails if the key value store is not associated with the function
const kvsHandle = cf.kvs(kvsId);

async function handler(event) {
    // Use the first part of the pathname as key, for example http(s)://domain/<key>/something/else
    const key = event.request.uri.split('/')[1]
    let value = "Not found" // Default value
    try {
        value = await kvsHandle.get(key);
    } catch (err) {
        console.log(`Kvs key lookup failed for ${key}: ${err}`);
    }
    var response = {
        statusCode: 200,
        statusDescription: 'OK',
        body: {
            encoding: 'text',
            data: `Key: ${key} Value: ${value}\n`
        }
    };
    return response;
}

This function uses the first part of the path of the request as key and responds with the name of the key and its value.

I save the changes and publish the function. In the Publish tab of the function, I associate the function with a CloudFront distribution that I created before. I use the Viewer Request event type and Default (*) cache behavior to intercept all requests to the distribution.

In the console, I go back to the list of functions and wait for the function to be deployed. Then, I use curl from the command line to download content from the distribution and test the result of the function.

First, I try with a couple of paths that invoke the function and look up the key I created before (hello):

curl https://distribution-domain.cloudfront.net/hello
Key: hello Value: Hello World

curl https://distribution-domain.cloudfront.net/hello/world
Key: hello Value: Hello World

It works! Then, I try with a different path to see that the default value I use in the code is returned when the key is not found.

curl https://distribution-domain.cloudfront.net/hi
Key: hi Value: Not found

Now that this first example works, let’s try something more advanced and useful.

Rewriting the URL using configuration data in CloudFront KeyValueStore
Let’s build a function that uses the content of the URL in the HTTP request to look up in a key value store the custom path that CloudFront should use to make the actual request. This function can help manage the multiple services that are part of a website.

For example, I want to update the blog platform I use for my website. The old blog has origin path /blog-v1 while the new blog has origin path /blog-v2.

At first, I am still using the old blog. In the CloudFormation console, I add the blog key to the key value store with value blog-v1.

Then, I create the following function and associate it with the distribution using Viewer Request event and Default (*) cache behavior to intercept all requests to the distribution.

import cf from 'cloudfront';

const kvsId = "<KEY_VALUE_STORE_ID>";

// This fails if the key value store is not associated with the function
const kvsHandle = cf.kvs(kvsId);

async function handler(event) {
    const request = event.request;
    // Use the first segment of the pathname as key
    // For example http(s)://domain/<key>/something/else
    const pathSegments = request.uri.split('/')
    const key = pathSegments[1]
    try {
        // Replace the first path of the pathname with the value of the key
        // For example http(s)://domain/<value>/something/else
        pathSegments[1] = await kvsHandle.get(key);
        const newUri = pathSegments.join('/');
        console.log(`${request.uri} -> ${newUri}`)
        request.uri = newUri;
    } catch (err) {
        // No change to the pathname if the key is not found
        console.log(`${request.uri} | ${err}`);
    }
    return request;
}

Now, when I type blog at the beginning of the URL path, the request will actually go to the blog-v1 path. CloudFront will make the HTTP request to the old blog because blog-v1 is the origin path used by the old blog.

For example, if I type https://distribution-domain.cloudfront.net/blog/index.html in a browser, I see the old blog (V1).

In the console, I update the blog key with value blog-v2. I access the same URL after a few seconds, and now I reach the new blog (V2).

As you can see, the public URL is the same, but the content has changed. More generally, this function assumes that URLs do not change between the two blog versions.

I can now add more keys for the different services that are part of my website (blog, support, help, commerce, and so on) and set their values to use the correct URL path for each of them. When I add a new version for one of them (for example, I migrate to a new commerce platform), I can configure a new origin and update the corresponding key to use the new origin path.

This is just an example of the flexibility you get when you separate configuration data from code. If you are already using CloudFront Functions, you can simplify your code by using CloudFront KeyValueStore.

Things to know
CloudFront KeyValueStore is available today in all edge locations globally. With CloudFront KeyValueStore, you pay only for what you use based on the read/write operations from the public API and the read operations from within CloudFront Functions. For more information, see CloudFront pricing.

You can manage a key value store using the AWS Management Console, AWS Command Line Interface (AWS CLI), and AWS SDKs. AWS CloudFormation support is coming soon. The maximum size of a key value store is 5 MB, and you can associate a single key value store to each function. The maximum size of a key is 512 bytes. Values can be up to 1KB in size. When creating a key value store, you can import key/value data during creation using a source file on Amazon S3 with this JSON structure:

{
  "data":[
    {
      "key":"key1",
      "value":"val1"
    },
    {
      "key":"key2",
      "value":"val2"
    }
  ]
}

Importing key/value data at creation can help automate the setup of a new environment (such as test or dev) and easily replicate the configuration from one environment to another (such as preproduction to production).

Simplify the way you add custom logic at the edge using CloudFront KeyValueStore.

— Danilo

Happy anniversary, Amazon CloudFront: 15 years of evolution and internet advancements

2023-11-16 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/happy-anniversary-amazon-cloudfront-15-years-of-evolution-and-internet-advancements/

I can’t believe it’s been 15 years since Amazon CloudFront was launched! When Amazon S3 became available in 2006, developers loved the flexibility and started to build a new kind of globally distributed applications where storage was not a bottleneck. These applications needed to be performant, reliable, and cost-efficient for every user on the planet. So in 2008 a small team (a “two-pizza team“) launched CloudFront in just 200 days. Jeff Barr hinted at the new and yet unnamed service in September and introduced CloudFront two months later.

Since the beginning, CloudFront has provided an easy way to distribute content to end users with low latency, high data transfer speeds, and no long-term commitments. What started as a simple cache for Amazon S3 quickly evolved into a fully featured content delivery network. Now CloudFront delivers applications at blazing speeds across the globe, supporting live sporting events such as NFL, Cricket World Cup, and FIFA World Cup.

At the same time, we also want to provide you with the best tools to secure applications. In 2015, we announced AWS WAF integration with CloudFront to provide fast and secure access control at the edge. Then, we focused on developing robust threat intelligence by combining signals across services. This threat intelligence integrates with CloudFront, adding AWS Shield to protect applications from common exploits and distributed denial of service (DDoS) attacks. For example, we recently detected an unusual spike in HTTP/2 requests to Amazon CloudFront. We quickly realized that CloudFront had automatically mitigated a new type of HTTP request flood DDoS event.

A lot also happens at lower levels than HTTP. For example, when you serve your application with CloudFront, all of the packets received by the application are inspected by a fully inline DDoS mitigation system which doesn’t introduce any observable latency. In this way, L3/L4 DDoS attacks against CloudFront distributions are mitigated in real time.

We also made under-the-hood improvements like s2n-tls (short for “signal to noise”), an open-source implementation of the TLS protocol that has been designed to be small and fast with simplicity as a priority. Another similar improvement is s2n-quic, an open-source QUIC protocol implementation written in Rust.

With CloudFront, you can also control access to content through a number of capabilities. You can restrict access to only authenticated viewers or, through geo-restriction capability, configure the specific geographic locations that can access content.

Security is always important, but not every organization has dedicated security experts on staff. To make robust security more accessible, CloudFront now includes built-in protections such as one-click web application firewall setup, security recommendations, and an intuitive security dashboard. With these integrated security features, teams can put critical safeguards in place without deep security expertise. Our goal is to empower all customers to easily implement security best practices.

Web applications delivery
During the past 15 years, web applications have become much more advanced and essential to end users. When CloudFront launched, our focus was helping deliver content stored in S3 buckets. Dynamic content was introduced to optimize web applications where portions of a website change for each user. Dynamic content also improves access to APIs that need to be delivered globally.

As applications become more distributed, we looked at ways to help developers make efficient use of its global footprint and resources at the edge. To allow customization and personalization of content close to end users and minimize latency, Lambda@Edge was introduced.

When fewer compute resources are needed, CloudFront Functions can run lightweight JavaScript functions across edge locations for low-latency HTTP manipulations and personalized content delivery. Recently, CloudFront Functions expanded to further customize responses, including modifying HTTP status codes and response bodies.

Today, CloudFront handles over 3 trillion HTTP requests daily and uses a global network of more than 600 points of presence and 13 Regional edge caches in more than 100 cities across 50 countries. This scale helps power the most demanding online events. For example, during the 2023 Amazon Prime Day, CloudFront handled peak loads of over 500 million HTTP requests per minute, totaling over 1 trillion HTTP requests.

Amazon CloudFront has more than 600,000 active developers building and delivering applications to end users. To help teams work at their full speed, CloudFront introduced continuous deployment so developers can test and validate configuration changes on a portion of traffic before full deployment.

Media and entertainment
It’s now common to stream music, movies, and TV series to our homes, but 15 years ago, renting DVDs was still the norm. Running streaming servers was technically complex, requiring long-term contracts to access the global infrastructure needed for high performance.

First, we added support for audio and video streaming capabilities using custom protocols since technical standards were still evolving. To handle large audiences and simplify cost-effective delivery of live events, CloudFront launched live HTTP streaming and, shortly after, improved support for both Flash-based (popular at the time) and Apple iOS devices.

As the media industry continued moving to internet-based delivery, AWS acquired Elemental, a pioneer in software-defined video solutions. Integrating Elemental offerings helped provide services, software, and appliances that efficiently and economically scale video infrastructures for use cases such as broadcast and content production.

The evolution of technologies and infrastructure allows for new ways of communication to become possible, such as when NASA did the first-ever live 4K stream from space using CloudFront.

Today, the world’s largest events and leading video platforms rely on CloudFront to deliver massive video catalogs and live stream content to millions. For example, CloudFront delivered streams for the FIFA World Cup 2022 on behalf of more than 19 major broadcasters globally. More recently, CloudFront handled over 120 Tbps of peak data transfer during one of the Thursday Night Football games of the NFL season on Prime Video and helped deliver the Cricket World Cup to millions of viewers across the globe.

What’s next?
Many things have changed during these 15 years but the focus on security, performance, and scalability stays the same. At AWS, it’s always Day 1, and the CloudFront team is constantly looking for ways to improve based on your feedback.

The rise of botnets is driving an ever-evolving, highly dynamic, and shifting threat landscape. Layer 7 DDoS attacks are becoming increasingly prevalent. The pervasiveness of bot traffic is increasing exponentially. As this occurs, we are evolving how we mitigate threats at the network border, at the edge, and in the Region, making it simpler for customers to configure the right security options.

Web applications are becoming more complex and interactive, and viewer expectations on latency and resiliency are even more stringent. This will drive new innovation. As new applications use generative artificial intelligence (AI), needs will evolve. These trends are will continue growing, so our investments will be focused on improving security and edge compute capabilities to support these new use cases.

With the current macroeconomic environment, many customers, especially small and medium-sized businesses and startups, look at how they can reduce their costs. Providing optimal price-performance has always been a priority for CloudFront. Cacheable data transferred to CloudFront edge locations from AWS resources does not incur additional fees. Also, 1 TB of data transfer from CloudFront to the internet per month is included in the free tier. CloudFront operates on a pay-as-you-go model with no upfront costs or minimum usage requirements. For more info, see CloudFront pricing.

As we approach AWS re:Invent, take note of these sessions that can help you learn about the latest innovations and connect with experts:

To learn more on how to speed up your websites and APIs and keep them protected, see the Application Security and Performance section of the AWS Developer Center.

Reduce latency and improve the security for your applications with Amazon CloudFront.

— Danilo

Automatically detect and block low-volume network floods

2023-09-07 Bryan Van Hook

Post Syndicated from Bryan Van Hook original https://aws.amazon.com/blogs/security/automatically-detect-and-block-low-volume-network-floods/

In this blog post, I show you how to deploy a solution that uses AWS Lambda to automatically manage the lifecycle of Amazon VPC Network Access Control List (ACL) rules to mitigate network floods detected using Amazon CloudWatch Logs Insights and Amazon Timestream.

Application teams should consider the impact unexpected traffic floods can have on an application’s availability. Internet-facing applications can be susceptible to traffic that some distributed denial of service (DDoS) mitigation systems can’t detect. For example, hit-and-run events are a popular approach that use short-lived floods that reoccur at random intervals. Each burst is small enough to go unnoticed by mitigation systems, but still occur often enough and are large enough to be disruptive. Automatically detecting and blocking temporary sources of invalid traffic, combined with other best practices, can strengthen the resiliency of your applications and maintain customer trust.

Use resilient architectures

AWS customers can use prescriptive guidance to improve DDoS resiliency by reviewing the AWS Best Practices for DDoS Resiliency. It describes a DDoS-resilient reference architecture as a guide to help you protect your application’s availability.

The best practices above address the needs of most AWS customers; however, in this blog we cover a few outlier examples that fall outside normal guidance. Here are a few examples that might describe your situation:

You need to operate functionality that isn’t yet fully supported by an AWS managed service that takes on the responsibility of DDoS mitigation.
Migrating to an AWS managed service such as Amazon Route 53 isn’t immediately possible and you need an interim solution that mitigates risks.
Network ingress must be allowed from a wide public IP space that can’t be restricted.
You’re using public IP addresses assigned from the Amazon pool of public IPv4 addresses (which can’t be protected by AWS Shield) rather than Elastic IP addresses.
The application’s technology stack has limited or no support for horizontal scaling to absorb traffic floods.
Your HTTP workload sits behind a Network Load Balancer and can’t be protected by AWS WAF.
Network floods are disruptive but not significant enough (too infrequent or too low volume) to be detected by your managed DDoS mitigation systems.

For these situations, VPC network ACLs can be used to deny invalid traffic. Normally, the limit on rules per network ACL makes them unsuitable for handling truly distributed network floods. However, they can be effective at mitigating network floods that aren’t distributed enough or large enough to be detected by DDoS mitigation systems.

Given the dynamic nature of network traffic and the limited size of network ACLs, it helps to automate the lifecycle of network ACL rules. In the following sections, I show you a solution that uses network ACL rules to automatically detect and block infrastructure layer traffic within 2–5 minutes and automatically removes the rules when they’re no longer needed.

Detecting anomalies in network traffic

You need a way to block disruptive traffic while not impacting legitimate traffic. Anomaly detection can isolate the right traffic to block. Every workload is unique, so you need a way to automatically detect anomalies in the workload’s traffic pattern. You can determine what is normal (a baseline) and then detect statistical anomalies that deviate from the baseline. This baseline can change over time, so it needs to be calculated based on a rolling window of recent activity.

Z-scores are a common way to detect anomalies in time-series data. The process for creating a Z-score is to first calculate the average and standard deviation (a measure of how much the values are spread out) across all values over a span of time. Then for each value in the time window calculate the Z-score as follows:

Z-score = (value – average) / standard deviation

A Z-score exceeding 3.0 indicates the value is an outlier that is greater than 99.7 percent of all other values.

To calculate the Z-score for detecting network anomalies, you need to establish a time series for network traffic. This solution uses VPC flow logs to capture information about the IP traffic in your VPC. Each VPC flow log record provides a packet count that’s aggregated over a time interval. Each flow log record aggregates the number of packets over an interval of 60 seconds or less. There isn’t a consistent time boundary for each log record. This means raw flow log records aren’t a predictable way to build a time series. To address this, the solution processes flow logs into packet bins for time series values. A packet bin is the number of packets sent by a unique source IP address within a specific time window. A source IP address is considered an anomaly if any of its packet bins over the past hour exceed the Z-score threshold (default is 3.0).

When overall traffic levels are low, there might be source IP addresses with a high Z-score that aren’t a risk. To mitigate against false positives, source IP addresses are only considered to be an anomaly if the packet bin exceeds a minimum threshold (default is 12,000 packets).

Let’s review the overall solution architecture.

Solution overview

This solution, shown in Figure 1, uses VPC flow logs to capture information about the traffic reaching the network interfaces in your public subnets. CloudWatch Logs Insights queries are used to summarize the most recent IP traffic into packet bins that are stored in Timestream. The time series table is queried to identify source IP addresses responsible for traffic that meets the anomaly threshold. Anomalous source IP addresses are published to an Amazon Simple Notification Service (Amazon SNS) topic. A Lambda function receives the SNS message and decides how to update the network ACL.

Figure 1: Automating the detection and mitigation of traffic floods using network ACLs

How it works

The numbered steps that follow correspond to the numbers in Figure 1.

Capture VPC flow logs. Your VPC is configured to stream flow logs to CloudWatch Logs. To minimize cost, the flow logs are limited to particular subnets and only include log fields required by the CloudWatch query. When protecting an endpoint that spans multiple subnets (such as a Network Load Balancer using multiple availability zones), each subnet shares the same network ACL and is configured with a flow log that shares the same CloudWatch log group.
Scheduled flow log analysis. Amazon EventBridge starts an AWS Step Functions state machine on a time interval (60 seconds by default). The state machine starts a Lambda function immediately, and then again after 30 seconds. The Lambda function performs steps 3–6.
Summarize recent network traffic. The Lambda function runs a CloudWatch Logs Insights query. The query scans the most recent flow logs (5-minute window) to summarize packet frequency grouped by source IP. These groupings are called packet bins, where each bin represents the number of packets sent by a source IP within a given minute of time.
Update time series database. A time series database in Timestream is updated with the most recent packet bins.
Use statistical analysis to detect abusive source IPs. A Timestream query is used to perform several calculations. The query calculates the average bin size over the past hour, along with the standard deviation. These two values are then used to calculate the maximum Z-score for all source IPs over the past hour. This means an abusive IP will remain flagged for one hour even if it stopped sending traffic. Z-scores are sorted so that the most abusive source IPs are prioritized. If a source IP meets these two criteria, it is considered abusive.
1. Maximum Z-score exceeds a threshold (defaults to 3.0).
2. Packet bin exceeds a threshold (defaults to 12,000). This avoids flagging source IPs during periods of overall low traffic when there is no need to block traffic.
Publish anomalous source IPs. Publish a message to an Amazon SNS topic with a list of anomalous source IPs. The function also publishes CloudWatch metrics to help you track the number of unique and abusive source IPs over time. At this point, the flow log summarizer function has finished its job until the next time it’s invoked from EventBridge.
Receive anomalous source IPs. The network ACL updater function is subscribed to the SNS topic. It receives the list of anomalous source IPs.
Update the network ACL. The network ACL updater function uses two network ACLs called blue and green. This verifies that the active rules remain in place while updating the rules in the inactive network ACL. When the inactive network ACL rules are updated, the function swaps network ACLs on each subnet. By default, each network ACL has a limit of 20 rules. If the number of anomalous source IPs exceeds the network ACL limit, the source IPs with the highest Z-score are prioritized. CloudWatch metrics are provided to help you track the number of source IPs blocked, and how many source IPs couldn’t be blocked due to network ACL limits.

Prerequisites

This solution assumes you have one or more public subnets used to operate an internet-facing endpoint.

Deploy the solution

Follow these steps to deploy and validate the solution.

Download the latest release from GitHub.
Upload the AWS CloudFormation templates and Python code to an S3 bucket.
Gather the information needed for the CloudFormation template parameters.
Create the CloudFormation stack.
Monitor traffic mitigation activity using the CloudWatch dashboard.

Let’s review the steps I followed in my environment.

Step 1. Download the latest release

I create a new directory on my computer named auto-nacl-deploy. I review the releases on GitHub and choose the latest version. I download auto-nacl.zip into the auto-nacl-deploy directory. Now it’s time to stage this code in Amazon Simple Storage Service (Amazon S3).

Figure 2: Save auto-nacl.zip to the auto-nacl-deploy directory

Step 2. Upload the CloudFormation templates and Python code to an S3 bucket

I extract the auto-nacl.zip file into my auto-nacl-deploy directory.

Figure 3: Expand auto-nacl.zip into the auto-nacl-deploy directory

The template.yaml file is used to create a CloudFormation stack with four nested stacks. You copy all files to an S3 bucket prior to creating the stacks.

To stage these files in Amazon S3, use an existing bucket or create a new one. For this example, I used an existing S3 bucket named auto-nacl-us-east-1. Using the Amazon S3 console, I created a folder named artifacts and then uploaded the extracted files to it. My bucket now looks like Figure 4.

Figure 4: Upload the extracted files to Amazon S3

Step 3. Gather information needed for the CloudFormation template parameters

There are six parameters required by the CloudFormation template.

Template parameter	Parameter description
VpcId	The ID of the VPC that runs your application.
SubnetIds	A comma-delimited list of public subnet IDs used by your endpoint.
ListenerPort	The IP port number for your endpoint’s listener.
ListenerProtocol	The Internet Protocol (TCP or UDP) used by your endpoint.
SourceCodeS3Bucket	The S3 bucket that contains the files you uploaded in Step 2. This bucket must be in the same AWS Region as the CloudFormation stack.
SourceCodeS3Prefix	The S3 prefix (folder) of the files you uploaded in Step 2.

For the VpcId parameter, I use the VPC console to find the VPC ID for my application.

Figure 5: Find the VPC ID

For the SubnetIds parameter, I use the VPC console to find the subnet IDs for my application. My VPC has public and private subnets. For this solution, you only need the public subnets.

Figure 6: Find the subnet IDs

My application uses a Network Load Balancer that listens on port 80 to handle TCP traffic. I use 80 for ListenerPort and TCP for ListenerProtocol.

The next two parameters are based on the Amazon S3 location I used earlier. I use auto-nacl-us-east-1 for SourceCodeS3Bucket and artifacts for SourceCodeS3Prefix.

Step 4. Create the CloudFormation stack

I use the CloudFormation console to create a stack. The Amazon S3 URL format is https://<bucket>.s3.<region>.amazonaws.com/<prefix>/template.yaml. I enter the Amazon S3 URL for my environment, then choose Next.

Figure 7: Specify the CloudFormation template

I enter a name for my stack (for example, auto-nacl-1) along with the parameter values I gathered in Step 3. I leave all optional parameters as they are, then choose Next.

Figure 8: Provide the required parameters

I review the stack options, then scroll to the bottom and choose Next.

Figure 9: Review the default stack options

I scroll down to the Capabilities section and acknowledge the capabilities required by CloudFormation, then choose Submit.

Figure 10: Acknowledge the capabilities required by CloudFormation

I wait for the stack to reach CREATE_COMPLETE status. It takes 10–15 minutes to create all of the nested stacks.

Figure 11: Wait for the stacks to complete

Step 5. Monitor traffic mitigation activity using the CloudWatch dashboard

After the CloudFormation stacks are complete, I navigate to the CloudWatch console to open the dashboard. In my environment, the dashboard is named auto-nacl-1-MitigationDashboard-YS697LIEHKGJ.

Figure 12: Find the CloudWatch dashboard

Initially, the dashboard, shown in Figure 13, has little information to display. After an hour, I can see the following metrics from my sample environment:

The Network Traffic graph shows how many packets are allowed and rejected by network ACL rules. No anomalies have been detected yet, so this only shows allowed traffic.
The All Source IPs graph shows how many total unique source IP addresses are sending traffic.
The Anomalous Source Networks graph shows how many anomalous source networks are being blocked by network ACL rules (or not blocked due to network ACL rule limit). This graph is blank unless anomalies have been detected in the last hour.
The Anomalous Source IPs graph shows how many anomalous source IP addresses are being blocked (or not blocked) by network ACL rules. This graph is blank unless anomalies have been detected in the last hour.
The Packet Statistics graph can help you determine if the sensitivity should be adjusted. This graph shows the average packets-per-minute and the associated standard deviation over the past hour. It also shows the anomaly threshold, which represents the minimum number of packets-per-minute for a source IP address to be considered an anomaly. The anomaly threshold is calculated based on the CloudFormation parameter MinZScore.
anomaly threshold = (MinZScore * standard deviation) + average

Increasing the MinZScore parameter raises the threshold and reduces sensitivity. You can also adjust the CloudFormation parameter MinPacketsPerBin to mitigate against blocking traffic during periods of low volume, even if a source IP address exceeds the minimum Z-score.
The Blocked IPs grid shows which source IP addresses are being blocked during each hour, along with the corresponding packet bin size and Z-score. This grid is blank unless anomalies have been detected in the last hour.

Figure 13: Observe the dashboard after one hour

Let’s review a scenario to see what happens when my endpoint sees two waves of anomalous traffic.

By default, my network ACL allows a maximum of 20 inbound rules. The two default rules count toward this limit, so I only have room for 18 more inbound rules. My application sees a spike of network traffic from 20 unique source IP addresses. When the traffic spike begins, the anomaly is detected in less than five minutes. Network ACL rules are created to block the top 18 source IP addresses (sorted by Z-score). Traffic is blocked for about 5 minutes until the flood subsides. The rules remain in place for 1 hour by default. When the same 20 source IP addresses send another traffic flood a few minutes later, most traffic is immediately blocked. Some traffic is still allowed from two source IP addresses that can’t be blocked due to the limit of 18 rules.

Figure 14: Observe traffic blocked from anomalous source IP addresses

Customize the solution

You can customize the behavior of this solution to fit your use case.

Block many IP addresses per network ACL rule. To enable blocking more source IP addresses than your network ACL rule limit, change the CloudFormation parameter NaclRuleNetworkMask (default is 32). This sets the network mask used in network ACL rules and lets you block IP address ranges instead of individual IP addresses. By default, the IP address 192.0.2.1 is blocked by a network ACL rule for 192.0.2.1/32. Setting this parameter to 24 results in a network ACL rule that blocks 192.0.2.0/24. As a reminder, address ranges that are too wide might result in blocking legitimate traffic.
Only block source IPs that exceed a packet volume threshold. Use the CloudFormation parameter MinPacketsPerBin (default is 12,000) to set the minimum packets per minute. This mitigates against blocking source IPs (even if their Z-score is high) during periods of overall low traffic when there is no need to block traffic.
Adjust the sensitivity of anomaly detection. Use the CloudFormation parameter MinZScore to set the minimum Z-score for a source IP to be considered an anomaly. The default is 3.0, which only blocks source IPs with packet volume that exceeds 99.7 percent of all other source IPs.
Exclude trusted source IPs from anomaly detection. Specify an allow list object in Amazon S3 that contains a list of IP addresses or CIDRs that you want to exclude from network ACL rules. The network ACL updater function reads the allow list every time it handles an SNS message.

Limitations

As covered in the preceding sections, this solution has a few limitations to be aware of:

CloudWatch Logs queries can only return up to 10,000 records. This means the traffic baseline can only be calculated based on the observation of 10,000 unique source IP addresses per minute.
The traffic baseline is based on a rolling 1-hour window. You might need to increase this if a 1-hour window results in a baseline that allows false positives. For example, you might need a longer baseline window if your service normally handles abrupt spikes that occur hourly or daily.
By default, a network ACL can only hold 20 inbound rules. This includes the default allow and deny rules, so there’s room for 18 deny rules. You can increase this limit from 20 to 40 with a support case; however, it means that a maximum of 18 (or 38) source IP addresses can be blocked at one time.
The speed of anomaly detection is dependent on how quickly VPC flow logs are delivered to CloudWatch. This usually takes 2–4 minutes but can take over 6 minutes.

Cost considerations

CloudWatch Logs Insights queries are the main element of cost for this solution. See CloudWatch pricing for more information. The cost is about 7.70 USD per GB of flow logs generated per month.

To optimize the cost of CloudWatch queries, the VPC flow log record format only includes the fields required for anomaly detection. The CloudWatch log group is configured with a retention of 1 day. You can tune your cost by adjusting the anomaly detector function to run less frequently (the default is twice per minute). The tradeoff is that the network ACL rules won’t be updated as frequently. This can lead to the solution taking longer to mitigate a traffic flood.

Conclusion

Maintaining high availability and responsiveness is important to keeping the trust of your customers. The solution described above can help you automatically mitigate a variety of network floods that can impact the availability of your application even if you’ve followed all the applicable best practices for DDoS resiliency. There are limitations to this solution, but it can quickly detect and mitigate disruptive sources of traffic in a cost-effective manner. Your feedback is important. You can share comments below and report issues on GitHub.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security news? Follow us on Twitter.

Simplify Service-to-Service Connectivity, Security, and Monitoring with Amazon VPC Lattice – Now Generally Available

2023-03-31 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/simplify-service-to-service-connectivity-security-and-monitoring-with-amazon-vpc-lattice-now-generally-available/

At AWS re:Invent 2022, we introduced in preview Amazon VPC Lattice, a new capability of Amazon Virtual Private Cloud (Amazon VPC) that gives you a consistent way to connect, secure, and monitor communication between your services. With VPC Lattice, you can define policies for network access, traffic management, and monitoring to connect compute services across instances, containers, and serverless applications.

Today, I am happy to share that VPC Lattice is now generally available. Compared to the preview, you have access to new capabilities:

Services can use a custom domain name in addition to the domain name automatically generated by VPC Lattice. When using HTTPS, you can configure an SSL/TLS certificate that matches the custom domain name.
You can deploy the open-source AWS Gateway API Controller to use VPC Lattice with a Kubernetes-native experience. It uses the Kubernetes Gateway API to let you connect services across multiple Kubernetes clusters and services running on EC2 instances, containers, and serverless functions.
You can use an Application Load Balancer (ALB) or a Network Load Balancer (NLB) as a target for a service.
The IP address target type now supports IPv6 connectivity.

Let’s see some of these new features in practice.

Using Amazon VPC Lattice for Service-to-Service Connectivity
In my previous post introducing VPC Lattice, I show how to create a service network, associate multiple VPCs and services, and configure target groups for EC2 instances and Lambda functions. There, I also show how to route traffic based on request characteristics and how to use weighted routing. Weighted routing is really handy for blue/green and canary-style deployments or for migrating from one compute platform to another.

Now, let’s see how to use VPC Lattice to allow the services of an e-commerce application to communicate with each other. For simplicity, I only consider four services:

The Order service, running as a Lambda function.
The Inventory service, deployed as an Amazon Elastic Container Service (Amazon ECS) service in a dual-stack VPC supporting IPv6.
The Delivery service, deployed as an ECS service using an ALB to distribute traffic to the service tasks.
The Payment service, running on an EC2 instance.

First, I create a service network. The Order service needs to call the Inventory service (to check if an item is available for purchase), the Delivery service (to organize the delivery of the item), and the Payment service (to transfer the funds). The following diagram shows the service-to-service communication from the perspective of the service network.

These services run in different AWS accounts and multiple VPCs. VPC Lattice handles the complexity of setting up connectivity across VPC boundaries and permission across accounts so that service-to-service communication is as simple as an HTTP/HTTPS call.

The following diagram shows how the communication flows from an implementation point of view.

The Order service runs in a Lambda function connected to a VPC. Because all the VPCs in the diagram are associated with the service network, the Order service is able to call the other services (Inventory, Delivery, and Payment) even if they are deployed in different AWS accounts and in VPCs with overlapping IP addresses.

Using a Network Load Balancer (NLB) as Target
The Inventory service runs in a dual-stack VPC. It’s deployed as an ECS service with an NLB to distribute traffic to the tasks in the service. To get the IPv6 addresses of the NLB, I look for the network interfaces used by the NLB in the EC2 console.

When creating the target group for the Inventory service, under Basic configuration, I choose IP addresses as the target type. Then, I select IPv6 for the IP Address type.

In the next step, I enter the IPv6 addresses of the NLB as targets. After the target group is created, the health checks test the targets to see if they are responding as expected.

Using an Application Load Balancer (ALB) as Target
Using an ALB as a target is even easier. When creating a target group for the Delivery service, under Basic configuration, I choose the new Application Load Balancer target type.

I select the VPC in which to look for the ALB and choose the Protocol version.

In the next step, I choose Register now and select the ALB from the dropdown. I use the default port used by the target group. VPC Lattice does not provide additional health checks for ALBs. However, load balancers already have their own health checks configured.

Using Custom Domain Names for Services
To call these services, I use custom domain names. For example, when I create the Payment service in the VPC console, I choose to Specify a custom domain configuration, enter a Custom domain name, and select an SSL/TLS certificate for the HTTPS listener. The Custom SSL/TLS certificate dropdown shows available certificates from AWS Certificate Manager (ACM).

Securing Service-to-Service Communications
Now that the target groups have been created, let’s see how I can secure the way services communicate with each other. To implement zero-trust authentication and authorization, I use AWS Identity and Access Management (IAM). When creating a service, I select the AWS IAM as Auth type.

I select the Allow only authenticated access policy template so that requests to services need to be signed using Signature Version 4, the same signing protocol used by AWS APIs. In this way, requests between services are authenticated by their IAM credentials, and I don’t have to manage secrets to secure their communications.

Optionally, I can be more precise and use an auth policy that only gives access to some services or specific URL paths of a service. For example, I can apply the following auth policy to the Order service to give to the Lambda function these permissions:

Read-only access (GET method) to the Inventory service /stock URL path.
Full access (any HTTP method) to the Delivery service /delivery URL path.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "<Order Service Lambda Function IAM Role ARN>"
            },
            "Action": "vpc-lattice-svcs:Invoke",
            "Resource": "<Inventory Service ARN>/stock",
            "Condition": {
                "StringEquals": {
                    "vpc-lattice-svcs:RequestMethod": "GET"
                }
            }
        },
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "<Order Service Lambda Function IAM Role ARN>"
            },
            "Action": "vpc-lattice-svcs:Invoke",
            "Resource": "<Delivery Service ARN>/delivery"
        }
    ]
}

Using VPC Lattice, I quickly configured the communication between the services of my e-commerce application, including security and monitoring. Now, I can focus on the business logic instead of managing how services communicate with each other.

Availability and Pricing
Amazon VPC Lattice is available today in the following AWS Regions: US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), and Europe (Ireland).

With VPC Lattice, you pay for the time a service is provisioned, the amount of data transferred through each service, and the number of requests. There is no charge for the first 300,000 requests every hour, and you only pay for requests above this threshold. For more information, see VPC Lattice pricing.

We designed VPC Lattice to allow incremental opt-in over time. Each team in your organization can choose if and when to use VPC Lattice. Other applications can connect to VPC Lattice services using standard protocols such as HTTP and HTTPS. By using VPC Lattice, you can focus on your application logic and improve productivity and deployment flexibility with consistent support for instances, containers, and serverless computing.

Simplify the way you connect, secure, and monitor your services with VPC Lattice.

— Danilo

New – Use Amazon S3 Object Lambda with Amazon CloudFront to Tailor Content for End Users

2023-03-14 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-use-amazon-s3-object-lambda-with-amazon-cloudfront-to-tailor-content-for-end-users/

With S3 Object Lambda, you can use your own code to process data retrieved from Amazon S3 as it is returned to an application. Over time, we added new capabilities to S3 Object Lambda, like the ability to add your own code to S3 HEAD and LIST API requests, in addition to the support for S3 GET requests that was available at launch.

Today, we are launching aliases for S3 Object Lambda Access Points. Aliases are now automatically generated when S3 Object Lambda Access Points are created and are interchangeable with bucket names anywhere you use a bucket name to access data stored in Amazon S3. Therefore, your applications don’t need to know about S3 Object Lambda and can consider the alias to be a bucket name.

You can now use an S3 Object Lambda Access Point alias as an origin for your Amazon CloudFront distribution to tailor or customize data for end users. You can use this to implement automatic image resizing or to tag or annotate content as it is downloaded. Many images still use older formats like JPEG or PNG, and you can use a transcoding function to deliver images in more efficient formats like WebP, BPG, or HEIC. Digital images contain metadata, and you can implement a function that strips metadata to help satisfy data privacy requirements.

Let’s see how this works in practice. First, I’ll show a simple example using text that you can follow along by just using the AWS Management Console. After that, I’ll implement a more advanced use case processing images.

Using an S3 Object Lambda Access Point as the Origin of a CloudFront Distribution
For simplicity, I am using the same application in the launch post that changes all text in the original file to uppercase. This time, I use the S3 Object Lambda Access Point alias to set up a public distribution with CloudFront.

I follow the same steps as in the launch post to create the S3 Object Lambda Access Point and the Lambda function. Because the Lambda runtimes for Python 3.8 and later do not include the requests module, I update the function code to use urlopen from the Python Standard Library:

import boto3
from urllib.request import urlopen

s3 = boto3.client('s3')

def lambda_handler(event, context):
  print(event)

  object_get_context = event['getObjectContext']
  request_route = object_get_context['outputRoute']
  request_token = object_get_context['outputToken']
  s3_url = object_get_context['inputS3Url']

  # Get object from S3
  response = urlopen(s3_url)
  original_object = response.read().decode('utf-8')

  # Transform object
  transformed_object = original_object.upper()

  # Write object back to S3 Object Lambda
  s3.write_get_object_response(
    Body=transformed_object,
    RequestRoute=request_route,
    RequestToken=request_token)

  return

To test that this is working, I open the same file from the bucket and through the S3 Object Lambda Access Point. In the S3 console, I select the bucket and a sample file (called s3.txt) that I uploaded earlier and choose Open.

A new browser tab is opened (you might need to disable the pop-up blocker in your browser), and its content is the original file with mixed-case text:

Amazon Simple Storage Service (Amazon S3) is an object storage service that offers...

I choose Object Lambda Access Points from the navigation pane and select the AWS Region I used before from the dropdown. Then, I search for the S3 Object Lambda Access Point that I just created. I select the same file as before and choose Open.

In the new tab, the text has been processed by the Lambda function and is now all in uppercase:

AMAZON SIMPLE STORAGE SERVICE (AMAZON S3) IS AN OBJECT STORAGE SERVICE THAT OFFERS...

Now that the S3 Object Lambda Access Point is correctly configured, I can create the CloudFront distribution. Before I do that, in the list of S3 Object Lambda Access Points in the S3 console, I copy the Object Lambda Access Point alias that has been automatically created:

In the CloudFront console, I choose Distributions in the navigation pane and then Create distribution. In the Origin domain, I use the S3 Object Lambda Access Point alias and the Region. The full syntax of the domain is:

ALIAS.s3.REGION.amazonaws.com

S3 Object Lambda Access Points cannot be public, and I use CloudFront origin access control (OAC) to authenticate requests to the origin. For Origin access, I select Origin access control settings and choose Create control setting. I write a name for the control setting and select Sign requests and S3 in the Origin type dropdown.

Now, my Origin access control settings use the configuration I just created.

To reduce the number of requests going through S3 Object Lambda, I enable Origin Shield and choose the closest Origin Shield Region to the Region I am using. Then, I select the CachingOptimized cache policy and create the distribution. As the distribution is being deployed, I update permissions for the resources used by the distribution.

Setting Up Permissions to Use an S3 Object Lambda Access Point as the Origin of a CloudFront Distribution
First, the S3 Object Lambda Access Point needs to give access to the CloudFront distribution. In the S3 console, I select the S3 Object Lambda Access Point and, in the Permissions tab, I update the policy with the following:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "cloudfront.amazonaws.com"
            },
            "Action": "s3-object-lambda:Get*",
            "Resource": "arn:aws:s3-object-lambda:REGION:ACCOUNT:accesspoint/NAME",
            "Condition": {
                "StringEquals": {
                    "aws:SourceArn": "arn:aws:cloudfront::ACCOUNT:distribution/DISTRIBUTION-ID"
                }
            }
        }
    ]
}

The supporting access point also needs to allow access to CloudFront when called via S3 Object Lambda. I select the access point and update the policy in the Permissions tab:

{
    "Version": "2012-10-17",
    "Id": "default",
    "Statement": [
        {
            "Sid": "s3objlambda",
            "Effect": "Allow",
            "Principal": {
                "Service": "cloudfront.amazonaws.com"
            },
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:REGION:ACCOUNT:accesspoint/NAME",
                "arn:aws:s3:REGION:ACCOUNT:accesspoint/NAME/object/*"
            ],
            "Condition": {
                "ForAnyValue:StringEquals": {
                    "aws:CalledVia": "s3-object-lambda.amazonaws.com"
                }
            }
        }
    ]
}

The S3 bucket needs to allow access to the supporting access point. I select the bucket and update the policy in the Permissions tab:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "*"
            },
            "Action": "*",
            "Resource": [
                "arn:aws:s3:::BUCKET",
                "arn:aws:s3:::BUCKET/*"
            ],
            "Condition": {
                "StringEquals": {
                    "s3:DataAccessPointAccount": "ACCOUNT"
                }
            }
        }
    ]
}

Finally, CloudFront needs to be able to invoke the Lambda function. In the Lambda console, I choose the Lambda function used by S3 Object Lambda, and then, in the Configuration tab, I choose Permissions. In the Resource-based policy statements section, I choose Add permissions and select AWS Account. I enter a unique Statement ID. Then, I enter cloudfront.amazonaws.com as Principal and select lambda:InvokeFunction from the Action dropdown and Save. We are working to simplify this step in the future. I’ll update this post when that’s available.

Testing the CloudFront Distribution
When the distribution has been deployed, I test that the setup is working with the same sample file I used before. In the CloudFront console, I select the distribution and copy the Distribution domain name. I can use the browser and enter https://DISTRIBUTION_DOMAIN_NAME/s3.txt in the navigation bar to send a request to CloudFront and get the file processed by S3 Object Lambda. To quickly get all the info, I use curl with the -i option to see the HTTP status and the headers in the response:

curl -i https://DISTRIBUTION_DOMAIN_NAME/s3.txt

HTTP/2 200 
content-type: text/plain
content-length: 427
x-amzn-requestid: a85fe537-3502-4592-b2a9-a09261c8c00c
date: Mon, 06 Mar 2023 10:23:02 GMT
x-cache: Miss from cloudfront
via: 1.1 a2df4ad642d78d6dac65038e06ad10d2.cloudfront.net (CloudFront)
x-amz-cf-pop: DUB56-P1
x-amz-cf-id: KIiljCzYJBUVVxmNkl3EP2PMh96OBVoTyFSMYDupMd4muLGNm2AmgA==

AMAZON SIMPLE STORAGE SERVICE (AMAZON S3) IS AN OBJECT STORAGE SERVICE THAT OFFERS...

It works! As expected, the content processed by the Lambda function is all uppercase. Because this is the first invocation for the distribution, it has not been returned from the cache (x-cache: Miss from cloudfront). The request went through S3 Object Lambda to process the file using the Lambda function I provided.

Let’s try the same request again:

curl -i https://DISTRIBUTION_DOMAIN_NAME/s3.txt

HTTP/2 200 
content-type: text/plain
content-length: 427
x-amzn-requestid: a85fe537-3502-4592-b2a9-a09261c8c00c
date: Mon, 06 Mar 2023 10:23:02 GMT
x-cache: Hit from cloudfront
via: 1.1 145b7e87a6273078e52d178985ceaa5e.cloudfront.net (CloudFront)
x-amz-cf-pop: DUB56-P1
x-amz-cf-id: HEx9Fodp184mnxLQZuW62U11Fr1bA-W1aIkWjeqpC9yHbd0Rg4eM3A==
age: 3

AMAZON SIMPLE STORAGE SERVICE (AMAZON S3) IS AN OBJECT STORAGE SERVICE THAT OFFERS...

This time the content is returned from the CloudFront cache (x-cache: Hit from cloudfront), and there was no further processing by S3 Object Lambda. By using S3 Object Lambda as the origin, the CloudFront distribution serves content that has been processed by a Lambda function and can be cached to reduce latency and optimize costs.

Resizing Images Using S3 Object Lambda and CloudFront
As I mentioned at the beginning of this post, one of the use cases that can be implemented using S3 Object Lambda and CloudFront is image transformation. Let’s create a CloudFront distribution that can dynamically resize an image by passing the desired width and height as query parameters (w and h respectively). For example:

https://DISTRIBUTION_DOMAIN_NAME/image.jpg?w=200&h=150

For this setup to work, I need to make two changes to the CloudFront distribution. First, I create a new cache policy to include query parameters in the cache key. In the CloudFront console, I choose Policies in the navigation pane. In the Cache tab, I choose Create cache policy. Then, I enter a name for the cache policy.

In the Query settings of the Cache key settings, I select the option to Include the following query parameters and add w (for the width) and h (for the height).

Then, in the Behaviors tab of the distribution, I select the default behavior and choose Edit.

There, I update the Cache key and origin requests section:

In the Cache policy, I use the new cache policy to include the w and h query parameters in the cache key.
In the Origin request policy, use the AllViewerExceptHostHeader managed policy to forward query parameters to the origin.

Now I can update the Lambda function code. To resize images, this function uses the Pillow module that needs to be packaged with the function when it is uploaded to Lambda. You can deploy the function using a tool like the AWS SAM CLI or the AWS CDK. Compared to the previous example, this function also handles and returns HTTP errors, such as when content is not found in the bucket.

import io
import boto3
from urllib.request import urlopen, HTTPError
from PIL import Image

from urllib.parse import urlparse, parse_qs

s3 = boto3.client('s3')

def lambda_handler(event, context):
    print(event)

    object_get_context = event['getObjectContext']
    request_route = object_get_context['outputRoute']
    request_token = object_get_context['outputToken']
    s3_url = object_get_context['inputS3Url']

    # Get object from S3
    try:
        original_image = Image.open(urlopen(s3_url))
    except HTTPError as err:
        s3.write_get_object_response(
            StatusCode=err.code,
            ErrorCode='HTTPError',
            ErrorMessage=err.reason,
            RequestRoute=request_route,
            RequestToken=request_token)
        return

    # Get width and height from query parameters
    user_request = event['userRequest']
    url = user_request['url']
    parsed_url = urlparse(url)
    query_parameters = parse_qs(parsed_url.query)

    try:
        width, height = int(query_parameters['w'][0]), int(query_parameters['h'][0])
    except (KeyError, ValueError):
        width, height = 0, 0

    # Transform object
    if width > 0 and height > 0:
        transformed_image = original_image.resize((width, height), Image.ANTIALIAS)
    else:
        transformed_image = original_image

    transformed_bytes = io.BytesIO()
    transformed_image.save(transformed_bytes, format='JPEG')

    # Write object back to S3 Object Lambda
    s3.write_get_object_response(
        Body=transformed_bytes.getvalue(),
        RequestRoute=request_route,
        RequestToken=request_token)

    return

I upload a picture I took of the Trevi Fountain in the source bucket. To start, I generate a small thumbnail (200 by 150 pixels).

https://DISTRIBUTION_DOMAIN_NAME/trevi-fountain.jpeg?w=200&h=150

Now, I ask for a slightly larger version (400 by 300 pixels):

https://DISTRIBUTION_DOMAIN_NAME/trevi-fountain.jpeg?w=400&h=300

It works as expected. The first invocation with a specific size is processed by the Lambda function. Further requests with the same width and height are served from the CloudFront cache.

Availability and Pricing
Aliases for S3 Object Lambda Access Points are available today in all commercial AWS Regions. There is no additional cost for aliases. With S3 Object Lambda, you pay for the Lambda compute and request charges required to process the data, and for the data S3 Object Lambda returns to your application. You also pay for the S3 requests that are invoked by your Lambda function. For more information, see Amazon S3 Pricing.

Aliases are now automatically generated when an S3 Object Lambda Access Point is created. For existing S3 Object Lambda Access Points, aliases are automatically assigned and ready for use.

It’s now easier to use S3 Object Lambda with existing applications, and aliases open many new possibilities. For example, you can use aliases with CloudFront to create a website that converts content in Markdown to HTML, resizes and watermarks images, or masks personally identifiable information (PII) from text, images, and documents.

Customize content for your end users using S3 Object Lambda with CloudFront.

— Danilo

New – Visualize Your VPC Resources from Amazon VPC Creation Experience

2023-02-06 Channy Yun

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/new-visualize-your-vpc-resources-from-amazon-vpc-creation-experience/

Today we are announcing Amazon Virtual Private Cloud (Amazon VPC) resource map, a new feature that simplifies the VPC creation experience in the AWS Management Console. This feature displays your existing VPC resources and their routing visually on a single page, allowing you to quickly understand the architectural layout of the VPC.

A year ago, in March 2022, we launched a new VPC creation experience that streamlines the process of creating and connecting VPC resources. With just one click, even across multiple Availability Zones (AZs), you can create and connect VPC resources, eliminating more than 90 percent of the manual steps required in the past. The new creation experience is centered around an interactive diagram that displays a preview of the VPC architecture and updates as options are selected, providing a visual representation of the resources and their relationships within the VPC that you are about to create.

However, after the creation of the VPC, the diagram that was available during the creation experience that many of our customers loved was no longer available. Today we are changing that! With VPC resource map, you can quickly understand the architectural layout of the VPC, including the number of subnets, which subnets are associated with the public route table, and which route tables have routes to the NAT Gateway.

You can also get to the specific resource details by clicking on the resource. This eliminates the need for you to map out resource relationships mentally and hold the information in your head while working with your VPC, making the process much more efficient and less prone to mistakes.

Getting Started with VPC Resource Map
To get started, choose an existing VPC in the VPC console. In the details section, select the Resource map tab. Here, you can see the resources in your VPC and the relationships between those resources.

As you hover over a resource, you can see the related resources and the connected lines highlighted. If you click to select the resource, you can see a few lines of details and a link to see the details of the selected resource.

Getting Started with VPC Creation Experience
I want to explain how to use the VPC creation experience to improve your workflow to create a new VPC to make a high-availability three-tier VPC easily.

Choose Create VPC and select VPC and more in the VPC console. You can preview the VPC resources that you are about to create all on the same page.

In Name tag auto-generation, you can specify a prefix value for Name tags. This value is used to generate Name tags for all VPC resources in the preview. If I change the default value, which is project to channy, the Name tag in the preview changes to channy- something, such as channy-vpc. You can customize a Name tag per resource in the preview by clicking each resource and making changes.

You can easily change the default CIDR value (10.0.0.0/16) when you click the IPv4 CIDR block field to reveal the CIDR joystick. Use the left or right arrow to move to the previous (9.255.0.0/16) or next (10.0.1.0/16) CIDR block within the /16 network mask. You can also change the subnet mask to /17 by using the down arrow, or go back to /16 using the up arrow.

Choose the number of Availability Zones (AZs) up to 3. The number of public and private subnet types changes based on the number of AZs and shows the total number of each subnet type it will create.

I want a high-availability VPC in three AZs and select 6 for the number of private subnets. In the preview panel, you can see that there are 9 subnets. When I hover over channy-rtb-public, I can visually confirm that this route table is connected to three public subnets and also routed to the internet gateway (channy-igw). The dotted lines indicate routes to network node, and the solid lines indicate relationships such as implicit or explicit associations.

Adding NAT gateways and VPC endpoints is easy. You can simply change the number of NAT gateways in or per Availability Zone (AZ). Note that there is a charge for each NAT gateway. We always recommend having one NAT gateway per AZ and route traffic from subnets in an AZ to the NAT gateway in the same AZ for high availability and to avoid inter-AZ data charges.

To route traffic to Amazon Simple Storage Service (Amazon S3) buckets more securely, you can choose the S3 Gateway endpoint by default. The S3 Gateway endpoint is free of charge and does not use NAT gateways when moving data from private subnets.

You can create additional tags and assign them to all resources in the VPC in no time. I select Add new tag and enter environment for the Key and test for the Value. This key-value pair will be added to every resource here.

Choose Create VPC at the bottom of the page and see the resources and the IDs of those resources that are being created. Before creating, please validate resources from the preview.

Once all the resources are created, choose View VPC at the bottom. The button takes you directly to the VPC resource map, where you can see a visual representation of what you created.

Now Available
Amazon VPC resource map is now available in all AWS Regions where Amazon VPC is available, and you can start using it today.

The VPC resource map and creation experience now only displays VPC, subnets, route tables, internet gateway, NAT gateways, and Amazon S3 gateway. The Amazon VPC console teams and user experience teams will continue to improve the console experience using customer feedback.

To learn more, see the Amazon VPC User Guide, and please send feedback to AWS re:Post for Amazon VPC or through your usual AWS support contacts.

– Channy

AWS Verified Access Preview — VPN-less Secure Network Access to Corporate Applications

2022-11-30 Sébastien Stormacq

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/aws-verified-access-preview-vpn-less-secure-network-access-to-corporate-applications/

Today, we announced the preview of AWS Verified Access, a new secure connectivity service that allows enterprises to enable local or remote secure access for their corporate applications without requiring a VPN.

Traditionally, remote access to applications when on the road or working from home is granted by a VPN. Once the remote workforce is authenticated on the VPN, they have access to a broad range of applications depending on multiple policies defined in siloed systems, such as the VPN gateway, the firewalls, the identity provider, the enterprise device management solution, etc. These policies are typically managed by different teams, potentially creating overlaps, making it difficult to diagnose application access issues. Internal applications often rely on older authentication protocols, like Kerberos, that were built with the LAN in mind, instead of modern protocols, like OIDC, that are better tuned to modern enterprise patterns. Customers told us that policy updates can take months to roll out.

Verified Access is built using the AWS Zero Trust security principles. Zero Trust is a conceptual model and an associated set of mechanisms that focus on providing security controls around digital assets that do not solely or fundamentally depend on traditional network controls or network perimeters.

Verified Access improves your organization’s security posture by leveraging multiple security inputs to grant access to applications. It grants access to applications only when users and their devices meet the specified security requirements. Examples of inputs are the user identity and role or the device security posture, among others. Verified Access validates each application request, regardless of user or network, before granting access. Having each application access request evaluated allows Verified Access to adapt the security posture based on changing conditions. For example, if the device security signals that your device posture is out of compliance, then Verified Access will not allow you to access the application anymore.

In my opinion, there are three main benefits when adopting Verified Access:

It is easy to use for IT administrators. As an IT Administrator, you can now easily set up applications for secure remote access. It provides a single configuration point to manage and enforce a multisystem security policy to allow or deny access to your corporate applications.

It provides an open ecosystem that allows you to retain your existing identity provider and device management system. I listed all our partners at the end of this post.

It is easy to use for end users. This is my preferred one. Your workforce is not required to use a VPN client anymore. A simple browser plugin is enough to securely grant access when the user and the device are identified and verified. As of today, we support Chrome and Firefox web browsers. This is something about which I can share my personal experience. Amazon adopted a VPN-less strategy a few years ago. It’s been a relief for my colleagues and me to be able to access most of our internal web applications without having to start a VPN client and keep it connected all day long.

Let’s See It in Action
I deployed a web server in a private VPC and exposed it to my end users through a private application load balancer (https://demo.seb.go-aws.com). I created a TLS certificate for the application external endpoint (secured.seb.go-aws.com). I also set up AWS Identity Center (successor of AWS SSO). In this demo, I will use it as a source for user identities. Now I am ready to expose this application to my remote workforce.

Creating a Verified Access endpoint is a four-step process. To get started, I navigate to the VPC page of the AWS Management Console. I first create the trust provider. A trust provider maintains and manages identity information for users and devices. When an application request is made, the identity information sent by the trust provider will be evaluated by Verified Access before allowing or denying the application request. I select Verified Access trust provider on the left-side navigation pane.

On the Create Verified Access trust provider page, I enter a Name and an optional Description. I enter the Policy reference name, an identifier that will be used when working with policy rules. I select the source of trust: User trust provider. For this demo, I select IAM Identity Center as the source of trust for user identities. Verified Access also works with other OpenID Connect-compliant providers. Finally, I select Create Verified Access trust provider.

I may repeat the operation when I have multiple trust providers. For example, I might have an identity-based trust provider to verify the identity of my end users and a device-based trust provider to verify the security posture of their devices.

I then create the Verified Identity instance. A Verified Access instance is a Regional AWS entity that evaluates application requests and grants access only when your security requirements are met.

On the Create Verified Access instance page, I enter a Name and an optional Description. I select the trust provider I just created. I can add additional trust provider types once the Verified Access instance is created.

Third, I create a Verified Access group.

A Verified Access group is a collection of applications that have similar security requirements. Each application within a Verified Access group shares a group-level policy. For example, you can group together all applications for “finance” users and use one common policy. This simplifies your policy management. You can use a single policy for a group of applications with similar access needs.

On the Create Verified Access group page, I enter a Name only. I will enter a policy at a later stage.

The fourth and last step before testing my setup is to create the endpoint.

A Verified Access endpoint is a regional resource that specifies the application that Verified Access will be providing access to. This is where your end users connect to. Each endpoint has its own DNS name and TLS certificate. After having evaluated incoming requests, the endpoint forwards authorized requests to your internal application, either an internal load balancer or a network interface. Verified Access supports network-level and application-level load balancers.

On the Create Verified Access endpoint page, I enter a Name and Description. I reference the Verified Access group that I just created.

In the Application details section, under Application domain, I enter the DNS name end users will use to access the application. For this demo, I use secured.seb.go-aws.com. Under Domain certificate ARN, I select a TLS certificate matching the DNS name. I created the certificate using AWS Certificate Manager.

On the Endpoint details section, I select VPC as Attachment type. I select one or multiple Security groups to attach to this endpoint. I enter awsnewsblog as Endpoint domain prefix. I select load balancer as Endpoint type. I select the Protocol (HTTP), then I enter the Port (80). I select the Load balancer ARN and the private Subnets where my load balancer is deployed.

Again, I leave the Policy details section empty. I will define a policy in the group instead. When I am done, I select Create Verified Access endpoint. It might take a few minutes to create.

Now it is time to grab a coffee and stretch my legs. When I return, I see the Verified Access endpoint is Active. I copy the Endpoint domain and add it as a CNAME record to my application DNS name (secured.seb.go-aws.com). I use Amazon Route 53 for this, but you can use your existing DNS server as well.

Then, I point my favorite browser to https://secured.seb.go-aws.com. The browser is redirected to IAM Identity Center (formerly AWS SSO). I enter the username and password of my test user. I am not adding a screenshot for this. After the redirection, I receive the error message : Unauthorized. This is expected because there is no policy defined on the Verified Access endpoint. It denies every request by default.

On the Verified Access groups page, I select the Policy tab. Then I select the Modify Verified Access endpoint policy button to create an access policy.

I enter a policy allowing anybody authenticated and having an email address ending with @amazon.com. This is the email address I used for the user defined in AWS Identity Center. Note that the name after context is the name I entered as Policy reference name when I created the Verified Access trust provider. The documentation page has the details of the policy syntax, the attributes, and the operators I can use.

permit(principal, action, resource)
when {
    context.awsnewsblog.user.email.address like "*@amazon.com"
};

After a few minutes, Verified Access updates the policy and becomes Active again. I force my browser to refresh, and I see the internal application now available to my authenticated user.

Pricing and Availability

AWS Verified Access is now available in preview in 10 AWS Regions: US East (Ohio, N. Virginia), US West (N. California, Oregon), Asia Pacific (Sydney), Canada (Central), Europe (Ireland, London, Paris), and South America (São Paulo).

As usual, pricing is based on your usage. There is no upfront or fixed price. We charge per application (Verified Access endpoint) per hour, with tiers depending on the number of applications. Prices start in US East (N. Virginia) Region at $0.27 per verified Access endpoint and per hour. This price goes down to $0.20 per endpoint per hour when you have more than 200 applications.

On top of this, there is a charge of $0.02 per GB for data processed by Verified Access. You also incur standard AWS data transfer charges for all data transferred using Verified Access.

This billing model makes it easy to start small and then grow at your own pace.

Go and configure your first Verified Access access point today.

— seb

Introducing VPC Lattice – Simplify Networking for Service-to-Service Communication (Preview)

2022-11-30 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/introducing-vpc-lattice-simplify-networking-for-service-to-service-communication-preview/

Modern applications are built using modular and distributed components. Each component is a service that implements its own subset of functionalities. To make these services communicate with each other, you need a way to let them discover where they are, authorize access, and route traffic. When troubleshooting issues, you need to keep communication configurations under control so that you can quickly understand what is happening at the application, service, and network levels. This can take a lot of your time.

Today, we are making available in preview Amazon VPC Lattice, a new capability of Amazon Virtual Private Cloud (Amazon VPC) that gives you a consistent way to connect, secure, and monitor communication between your services. With VPC Lattice, you can define policies for traffic management, network access, and monitoring so you can connect applications in a simple and consistent way across AWS compute services (instances, containers, and serverless functions). VPC Lattice automatically handles network connectivity between VPCs and accounts and network address translation between IPv4, IPv6, and overlapping IP addresses. VPC Lattice integrates with AWS Identity and Access Management (IAM) to give you the same authentication and authorization capabilities you are familiar with when interacting with AWS services today, but for your own service-to-service communication. With VPC Lattice, you have common controls to route traffic based on request characteristics and weighted routing for blue/green and canary-style deployments. For example, VPC Lattice allows you to mix and match compute types for a given service, which helps you modernize a monolith application architecture to microservices.

VPC Lattice is designed to be noninvasive, allowing teams across your organization to incrementally opt in over time. In this way, you are able to deliver applications faster by focusing on your application logic, while VPC Lattice handles service-to-service networking, security, and monitoring requirements.

How Amazon VPC Lattice Works
With VPC Lattice, you create a logical application layer network, called a service network, that connects clients and services across different VPCs and accounts, abstracting network complexity. A service network is a logical boundary that is used to automatically implement service discovery and connectivity as well as apply access and observability policies to a collection of services. It offers inter-application connectivity over HTTP/HTTPS and gRPC protocols within a VPC.

Once a VPC has been enabled for a service network, clients in the VPC will automatically be able to discover the services in the service network through DNS and will direct all inter-application traffic through VPC Lattice. You can use AWS Resource Access Manager (RAM) to control which accounts, VPCs, and applications can establish communication via VPC Lattice.

A service is an independently deployable unit of software that delivers a specific task or function. In VPC Lattice, a service is a logical component that can live in any VPC or account and can run on a mixture of compute types (virtual machines, containers, and serverless functions). A service configuration consists of:

One or two listeners that define the port and protocol that the service is expecting traffic on. Supported protocols are HTTP/1.1, HTTP/2, and gRPC, including HTTPS for TLS-enabled services.
Listeners have rules that consist of a priority, which specifies the order in which rules should be processed, one or more conditions that define when to apply the rule, and actions that forward traffic to target groups. Each listener has a default rule that takes effect when no additional rules are configured, or no conditions are met.
A target group is a collection of targets, or compute resources, that are running a specific workload you are trying to route toward. Targets can be Amazon Elastic Compute Cloud (Amazon EC2) instances, IP addresses, and Lambda functions. For Kubernetes workloads, VPC Lattice can target services and pods via the AWS Gateway Controller for Kubernetes. To have access to the AWS Gateway Controller for Kubernetes, you can join the preview.

To configure service access controls, you can use access policies. An access policy is an IAM resource policy that can be associated with a service network and individual services. With access policies, you can use the “PARC” (principal, action, resource, and condition) model to enforce context-specific access controls for services. For example, you can use an access policy to define which services can access a service you own. If you use AWS Organizations, you can limit access to a service network to a specific organization.

VPC Lattice also provides a service directory, a centralized view of the services that you own or have been shared with you via AWS RAM.

Using Amazon VPC Lattice
We expect people with different roles can use VPC Lattice. For example:

The service network administrator can:
- Create and manage a service network.
- Define access and monitoring for the service network.
- Associate client and services.
- Share the service network with other AWS accounts.
The service owner can:
- Create and manage a service, including access and monitoring.
- Define routing, for example, configuring listeners and rules that point to the target groups where the service is running.
- Associate a service to service networks.

Let’s see how this works in practice. In this quick walkthrough, I am covering both roles.

Creating Two Backend Services
There is nothing specific to VPC Lattice in this section. I am just creating a couple of services, one running on Amazon EC2 and one on AWS Lambda, that I’ll use later when I configure networking with VPC Lattice.

In an Amazon Linux EC2 instance, I create a web app that replies “Hello from the instance” to HTTP requests. To allow access to the instance from clients coming via VPC Lattice, I add an inbound rule to the security group to allow TCP traffic on port 8080 from the VPC Lattice AWS-managed prefix list.

Here’s the app.py file. I am using Python and Flask for this app, but you don’t need to know them to follow along with the post.

from flask import Flask

app = Flask(__name__)

@app.route('/')
def index():
  return 'Hello from the instance'

@app.route('/<path>')
def somePath(path):
  return 'Hello from the instance at path "{}"'.format(path)

app.run(host='0.0.0.0', port=8080)

Here’s the requirements.txt file with the Python dependencies. There’s only one line because the only module I need is flask:

flask

I install the dependencies:

pip3 install -r requirements.txt

Then, I start the web app using the nohup command to keep it running in case I log out of the instance:

nohup flask run --host=0.0.0.0 --port 8080 &

On the EC2 instance, the web service is now listening to HTTP traffic on port 8080.

In the Lambda console, I create a simple function using the Node.js 18.x runtime that replies “Hello from the function” to all invocations.

exports.handler = async (event) => {
    const response = {
        statusCode: 200,
        body: JSON.stringify('Hello from the function'),
    };
    return response;
};

The two services are now both ready. Let’s use VPC Lattice to configure networking.

Creating VPC Lattice Target Groups
I start by creating two target groups, one for the EC2 instance and one for the Lambda function. In the VPC console, there is a new VPC Lattice section in the navigation pane. There, I choose Target groups and then Create target group.

For the first target group, I choose the Instances target type and enter a name.

I choose the protocol (HTTP) and port (8080) used by the web app running on the instance. I select the VPC where the instance is running and the protocol version (HTTP1).

Now I can configure the health check that will be used to test the target status. In this case, I use the default values proposed by the console.

In the next step, I can register the targets. I select the instance on which the web app is running from the list and choose to include it.

I review the selected targets (one instance in this case) and choose Submit.

In a similar way, I create a target group for the Lambda function. This time, I select the function from the list. I can choose which function version or function alias to use. For simplicity, I use the $LATEST version.

Creating VPC Lattice Services
Now that the target groups are ready, I choose Services in the navigation pane and then Create service. I enter a name and a description.

Now, I can choose the authentication type. If I choose None, the service network does not authenticate or authorize client access, and the auth policy, if present, is not used. I select AWS IAM and then, from the Apply policy template dropdown, the template that allows both authenticated and unauthenticated access.

In the Monitoring section, I turn on Access logs. As the destination for the access logs, I use an Amazon CloudWatch Log group that I created before. I also have the option to use an Amazon Simple Storage Service (Amazon S3) bucket or a Amazon Kinesis Data Firehose delivery stream.

In the next step, I define routing for the service. I choose Add listener. For the protocol, I configure the service to listen using HTTPS. In the default action, I choose to send two-thirds (Weight 20) of the requests to the instance target group and one-third (Weight 10) to the function target group.

Then, I add two additional rules. The first rule (Priority 10) sends all requests where the path is /to-instance to the instance target group.

The second rule (Priority 20) sends all traffic where the path is /to-function to the function target group.

In the next step, I am asked to associate the service with one or more service networks. I didn’t create a service network yet, so I skip this step for now and choose Next. I review the configuration and create the service.

Creating VPC Lattice Service Networks
Now, I create the service network so that I can associate the service and the VPCs I want to use. I choose Service network from the navigation pane and then Create service network. I enter a name and a description for the service network.

In the Associate services, I select the service I just created.

In the VPC associations, I select the VPC used by the instance where the web app runs. This can help in the future because it allows the web app to call other services associated with the service network.

Then, I select a second VPC where I have another EC2 instance that I want to use to run some tests.

For simplicity, in the Access section, I select the None auth type.

In the Monitoring section, I choose to send the access logs for the whole service network to an S3 bucket.

I review the summary of the configuration and create the service network. After a few seconds all service and VPC associations are active, and I can start using the service.

I write down the domain name of the service from the list of service associations.

Testing Access to the Service Using VPC Lattice
I look at the Routing tab of the service to find a nice recap of how the listener is handling routing towards the different target groups.

Then, I log into the EC2 instance in my second VPC and use curl to call the service domain name. As expected, I get about two-thirds of the responses from the instance and one-third from the function.

curl https://my-service-03e92ee54968d87ca.7d67968.vpc-lattice-svcs.us-west-2.on.aws
Hello from the instance

curl https://my-service-03e92ee54968d87ca.7d67968.vpc-lattice-svcs.us-west-2.on.aws
Hello from the instance

curl https://my-service-03e92ee54968d87ca.7d67968.vpc-lattice-svcs.us-west-2.on.aws
"Hello from the function"

When I call the /to-instance and /to-function paths, the additional rules forward the requests to the instance and the function, respectively.

curl https://my-service-03e92ee54968d87ca.7d67968.vpc-lattice-svcs.us-west-2.on.aws/to-instance
Hello from the instance "to-instance" path

curl https://my-service-03e92ee54968d87ca.7d67968.vpc-lattice-svcs.us-west-2.on.aws/to-function
"Hello from the function"

I can now review access to my service using the access log subscriptions I configured before.

For the service, I look in the CloudWatch Log group. There, I find a log stream containing detailed access information about the service.

The access log for all services associated with the service network is on the S3 bucket. I have only one service for now, but more are coming.

Available in Preview
Amazon VPC Lattice is available in preview in the US West (Oregon) Region.

VPC Lattice provides deployment consistency across AWS compute types so that you can connect your services across instances, containers, and serverless functions. You can use VPC Lattice to apply granular and rich traffic controls, such as policy-based routing and weighted targets to support blue/green and canary-style deployments.

VPC Lattice allows monitoring and troubleshooting service-to-service communication with detailed access logs and metrics that capture request type, volume of traffic, error rates, response time, and more. In this blog post, I only scratched the surface of what you can do with VPC Lattice.

Simplify the way you connect, secure, and monitor service-to-service communication with Amazon VPC Lattice.

New – ENA Express: Improved Network Latency and Per-Flow Performance on EC2

2022-11-29 Jeff Barr

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-ena-express-improved-network-latency-and-per-flow-performance-on-ec2/

We know that you can always make great use of all available network bandwidth and network performance, and have done our best to supply it to you. Over the years, network bandwidth has grown from the 250 Mbps on the original m1 instance to 200 Gbps on the newest m6in instances. In addition to raw bandwidth, we have also introduced advanced networking features including Enhanced Networking, Elastic Network Adapters (ENAs), and (for tightly coupled HPC workloads) Elastic Fabric Adapters (EFAs).

Introducing ENA Express
Today we are launching ENA Express. Building on the Scalable Reliable Datagram (SRD) protocol that already powers Elastic Fabric Adapters, ENA Express reduces P99 latency of traffic flows by up to 50% and P99.9 latency by up to 85% (in comparison to TCP), while also increasing the maximum single-flow bandwidth from 5 Gbps to 25 Gbps. Bottom line, you get a lot more per-flow bandwidth and a lot less variability.

You can enable ENA Express on new and existing ENAs and take advantage of this performance right away for TCP and UDP traffic between c6gn instances running in the same Availability Zone.

Using ENA Express
I used a pair of c6gn instances to set up and test ENA Express. After I launched the instances I used the AWS Management Console to enable ENA Express for both instances. I find each ENI, select it, and choose Manage ENA Express from the Actions menu:

I enable ENA Express and ENA Express UDP and click Save:

Then I set the Maximum Transmission Unit (MTU) to 8900 on both instances:

$ sudo /sbin/ifconfig eth0 mtu 8900

I install iperf3 on both instances, and start the first one in server mode:

$ iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------

Then I run the second one in client mode and observe the results:

$ iperf3 -c 10.0.178.46
Connecting to host 10.0.178.46, port 5201
[  4] local 10.0.187.74 port 35622 connected to 10.0.178.46 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec  2.80 GBytes  24.1 Gbits/sec    0   1.43 MBytes
[  4]   1.00-2.00   sec  2.81 GBytes  24.1 Gbits/sec    0   1.43 MBytes
[  4]   2.00-3.00   sec  2.80 GBytes  24.1 Gbits/sec    0   1.43 MBytes
[  4]   3.00-4.00   sec  2.81 GBytes  24.1 Gbits/sec    0   1.43 MBytes
[  4]   4.00-5.00   sec  2.81 GBytes  24.1 Gbits/sec    0   1.43 MBytes
[  4]   5.00-6.00   sec  2.80 GBytes  24.1 Gbits/sec    0   1.43 MBytes
[  4]   6.00-7.00   sec  2.80 GBytes  24.1 Gbits/sec    0   1.43 MBytes
[  4]   7.00-8.00   sec  2.81 GBytes  24.1 Gbits/sec    0   1.43 MBytes
[  4]   8.00-9.00   sec  2.81 GBytes  24.1 Gbits/sec    0   1.43 MBytes
[  4]   9.00-10.00  sec  2.81 GBytes  24.1 Gbits/sec    0   1.43 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  28.0 GBytes  24.1 Gbits/sec    0             sender
[  4]   0.00-10.00  sec  28.0 GBytes  24.1 Gbits/sec                  receiver

The ENA driver reports on metrics that I can review to confirm the use of SRD:

ethtool -S eth0 | grep ena_srd
     ena_srd_mode: 3
     ena_srd_tx_pkts: 25858313
     ena_srd_eligible_tx_pkts: 25858323
     ena_srd_rx_pkts: 2831267
     ena_srd_resource_utilization: 0

The metrics work as follows:

ena_srd_mode indicates that SRD is enabled for TCP and UDP.
ena_srd_tx_pkts denotes the number of packets that have been transmitted via SRD.
ena_srd_eligible_pkts denotes the number of packets that were eligible for transmission via SRD. A packet is eligible for SRD if ENA-SRD is enabled on both ends of the connection, both connections reside in the same Availability Zone, and the packet is using either UDP or TCP.
ena_srd_rx_pkts denotes the number of packets that have been received via SRD.
ena_srd_resource_utilization denotes the percent of allocated Nitro network card resources that are in use, and is proportional to the number of open SRD connections. If this value is consistently approaching 100%, scaling out to more instances or scaling up to a larger instance size may be warranted.

Thing to Know
Here are a couple of things to know about ENA Express and SRD:

Access – I used the Management Console to enable and test ENA Express; CLI, API, CloudFormation and CDK support is also available.

Fallback – If a TCP or UDP packet is not eligible for transmission via SRD, it will simply be transmitted in the usual way.

UDP – SRD takes advantage of multiple network paths and “sprays” packets across them. This would normally present a challenge for applications that expect packets to arrive more or less in order, but ENA Express helps out by putting the UDP packets back into order before delivering them to you, taking the burden off of your application. If you have built your own reliability layer over UDP, or if your application does not require packets to arrive in order, you can enable ENA Express for TCP but not for UDP.

Instance Types and Sizes – We are launching with support for the 16xlarge size of the c6gn instances, with additional instance families and sizes in the works.

Resource Utilization – As I hinted at above, ENA Express uses some Nitro card resources to process packets. This processing also adds a few microseconds of latency per packet processed, and also has a moderate but measurable effect on the maximum number of packets that a particular instance can process per second. In situations where high packet rates are coupled with small packet sizes, ENA Express may not be appropriate. In all other cases you can simply enable SRD to enjoy higher per-flow bandwidth and consistent latency.

Pricing – There is no additional charge for the use of ENA Express.

Regions – ENA Express is available in all commercial AWS Regions.

All About SRD
I could write an entire blog post about SRD, but my colleagues beat me to it! Here are some great resources to help you to learn more:

A Cloud-Optimized Transport for Elastic and Scalable HPC – This paper reviews the challenges that arise when trying to run HPC traffic across a TCP-based network, and points out that the variability (latency outliers) can have a profound effect on scaling efficiency, and includes a succinct overview of SRD:

Scalable reliable datagram (SRD) is optimized for hyper-scale datacenters: it provides load balancing across multiple paths and fast recovery from packet drops or link failures. It utilizes standard ECMP functionality on the commodity Ethernet switches and works around its limitations: the sender controls the ECMP path selection by manipulating packet encapsulation.

There’s a lot of interesting detail in the full paper, and it is well worth reading!

In the Search for Performance, There’s More Than One Way to Build a Network – This 2021 blog post reviews our decision to build the Elastic Fabric Adapter, and includes some important data (and cool graphics) to demonstrate the impact of packet loss on overall application performance. One of the interesting things about SRD is that it keeps track of the availability and performance of multiple network paths between transmitter and receiver, and sprays packets across up to 64 paths at a time in order to take advantage of as much bandwidth as possible and to recover quickly in case of packet loss.

— Jeff;

Our guide to AWS Compute at re:Invent 2022

2022-11-22 Sheila Busser

Post Syndicated from Sheila Busser original https://aws.amazon.com/blogs/compute/our-guide-to-aws-compute-at-reinvent-2022/

This blog post is written by Shruti Koparkar, Senior Product Marketing Manager, Amazon EC2.

AWS re:Invent is the most transformative event in cloud computing and it is starting on November 28, 2022. AWS Compute team has many exciting sessions planned for you covering everything from foundational content, to technology deep dives, customer stories, and even hands on workshops. To help you build out your calendar for this year’s re:Invent, let’s look at some highlights from the AWS Compute track in this blog. Please visit the session catalog for a full list of AWS Compute sessions.

Learn what powers AWS Compute

AWS offers the broadest and deepest functionality for compute. Amazon Elastic Cloud Compute (Amazon EC2) offers granular control for managing your infrastructure with the choice of processors, storage, and networking.

One of the best ways to start is with the AWS Compute Leadership Session: “CMP223-L: Compute Innovation to enable any application in the cloud”. Join Dave Brown, VP of Amazon EC2, to hear about the innovations AWS is delivering for millions of organizations.
If you are interested in learning about the diversity of compute options available, along with our latest offerings, attend our “CMP225: What’s New with Amazon EC2” session.
Attend “CMP221: Modernize your iOS build pipelines with Amazon EC2 Mac instances” to learn how you can extend the flexibility, scalability, and cost benefits of AWS to your iOS application development pipelines.

The AWS Nitro System is the underlying platform for our all our modern EC2 instances. It enables AWS to innovate faster, further reduce cost for our customers, and deliver added benefits like increased security and new instance types.

To learn more about the AWS Nitro System, attend “CMP 301: Powering Amazon EC2: Deep dive on the AWS Nitro system”.
We often say that security is job zero at AWS. And Nitro System provides enhanced security that continuously monitors, protects, and verifies the instance hardware and firmware. Attend “CMP302: Confidential Computing with AWS compute” to learn how Nitro System helps AWS provide a secure environment for customer workloads.

Discover the benefits of AWS Silicon

AWS has invested years designing custom silicon optimized for the cloud. This investment helps us deliver high performance at lower costs for a wide range of applications and workloads using AWS services.

Explore the AWS journey into silicon innovation with our “CMP201: Silicon Innovation at AWS” session. We will cover some of the thought processes, learnings, and results from our experience building silicon for AWS Graviton, AWS Nitro System, and AWS Inferentia.
To learn about customer-proven strategies to help you make the move to AWS Graviton quickly and confidently while minimizing uncertainty and risk, attend “CMP410: Framework for adopting AWS Graviton-based instances”.

Explore different use cases

Amazon EC2 provides secure and resizable compute capacity for several different use-cases including general purpose computing for cloud native and enterprise applications, and accelerated computing for machine learning and high performance computing (HPC) applications.

High performance computing

HPC on AWS can help you design your products faster with simulations, predict the weather, detect seismic activity with greater precision, and more. To learn how to solve world’s toughest problems with extreme-scale compute come join us for “CMP205: HPC on AWS: Solve complex problems with pay-as-you-go infrastructure”.
Single on-premises general-purpose supercomputers can fall short when solving increasingly complex problems. Attend “CMP222: Redefining supercomputing on AWS” to learn how AWS is reimagining supercomputing to provide scientists and engineers with more access to world-class facilities and technology.
AWS offers many solutions to design, simulate, and verify the advanced semiconductor devices that are the foundation of modern technology. Attend “CMP320: Accelerating semiconductor design, simulation, and verification” to hear from ARM and Marvel about how they are using AWS to accelerate EDA workloads.

Machine Learning

Machine learning training is a complex problem requiring significant compute capacity and can benefit from specialized silicon. To hear more about the latest AWS designed silicon for deep learning training, attend “CMP313: Accelerate deep learning and innovate faster with AWS Trainium”. You will learn how Trainium delivers high performance and up to 50% savings in training costs over comparable GPU-based instances.
Attend “CMP330: AWS managed services to run ML training workloads at scale with AWS Trainium” to hear about AWS managed services you can use to run your machine learning (ML) training workloads at scale on Trn1 instances.
Attend “CMP209: AI parallelism explained: How Amazon Search scales deep-learning training” to learn more about Amazon Search trains large language models using various parallelism strategies and deploys them into production at scale.
To learn about how to choose the right Amazon EC2 instance for ML training and inference, attend “CMP207: Choosing the right accelerator for training and inference”.

Cost Optimization

Attend “CMP319: Optimize for cost and availability with capacity management” and learn how to use Amazon EC2 Capacity Reservations, On-Demand Capacity Reservations, and Savings Plans to lower costs so that you can focus on innovating.
Do you want to host simple applications such phone systems, blogs, ecommerce sites etc. on the cloud? Amazon Lightsail is a cost-effective, easy-to-use cloud platform that’s ideal for simpler workloads and quick deployments. Attend “CMP218: Set your small business up for success with Amazon Lightsail” to learn how you can quickly deploy and configure virtually any application at a predictable monthly rate.
Amazon EC2 Spot Instances are spare compute capacity available to you for less than On-Demand prices. To learn about the APIs and commands used to create Spot Instances, attend “CMP324: Spot the savings: Use Amazon EC2 Spot to optimize cloud deployments”.

Hear from our customers

We have several sessions this year where AWS customers are taking the stage to share their stories and details of exciting innovations made possible by AWS.

Are you a machine learning (ML) startup trying to optimize your infrastructure costs? Attend “CMP226: How four customers reduced ML inference costs and drove innovation” to hear from Screening Eagle, Money Forward, Dataminr, and Actuate about how they have used Amazon EC2 Inf1 instances to get better performance at a lower cost per inference.
Want to learn about the techniques Standard Chartered Bank uses to reduce waste and optimize cost and performance at scale? Attend “CMP213: Building a budget-conscious culture at Standard Chartered Bank” to learn how they’ve used a combination of AWS Savings Plan and Amazon EC2 spot instances to build critical large systems such as scaling applications, container platforms, and their grid for calculating risk and analytics.
Come join us at “CMP208: Run large-scale graphics workloads with AWS, featuring Mircom and Snap” to learn about how Snap personalizes Bitmoji avatars and how Mircom creates digital twins of their fire alarm control panels and mission-critical smart building technologies. Learn how they leverage GPU-based instances in Amazon EC2 to unlock scalable graphics applications.
To hear about metaverse experiences, from a panel including Epic Games, Surreal Events, and Cavrnus Inc., attend “CMP212: Metaverse experiences come to life with Amazon EC2 accelerated computing”.

Get started with hands-on sessions

Nothing like a hands-on session where you can learn by doing and get started easily with AWS compute. Our speakers and workshop assistants will help you every step of the way. Just bring your laptop to get started!

Are you a developer or IT practitioner running Linux-based workloads in Amazon EC2 and looking for better price performance? Attend “CMP303: Get hands-on with AWS Graviton-based EC2 instances” to learn how to move a workload to AWS-Graviton based instances including containerized applications.
Hugging Face has democratized the use of BERT-based natural language processing (NLP) models, which are a common foundation for many NLP applications. Attend “CMP206: Train & deploy a Hugging Face NLP model with AWS Trainium & AWS Inferentia” to learn how to use AWS ML silicon to train and run these popular models.
Would you like to gain a solid skills foundation to run common HPC workloads using cloud technologies? Join us at “CMP305: Best practices for high performance computing in the cloud” to get hands-on experience setting up your own cluster in the cloud and running a sample application.

You’ll get to meet the global cloud community at AWS re:Invent and get an opportunity to learn, get inspired, and rethink what’s possible. So build your schedule in the re:Invent portal and get ready to hit the ground running. We invite you to stop by the AWS Compute booth and chat with our experts. We look forward to seeing you in Las Vegas!

New – Direct VPC Routing Between On-Premises Networks and AWS Outposts Rack

2022-09-15 Steve Roberts

Post Syndicated from Steve Roberts original https://aws.amazon.com/blogs/aws/new-direct-vpc-routing-between-on-premises-networks-and-aws-outposts-rack/

Today, we announced direct VPC routing for AWS Outposts rack. This enables you to connect Outposts racks and on-premises networks using simplified IP address management. Direct VPC routing automatically advertises Amazon Virtual Private Cloud (Amazon VPC) subnet CIDR addresses to on-premises networks. This enables you to use the private IP addresses of resources in your VPC when communicating with your on-premises network. Furthermore, you can enable direct VPC routing using a self-serve process without needing to contact AWS.

If you’re unfamiliar, AWS Outposts rack, a part of the Outposts family, is a fully-managed service that offers the same AWS infrastructure, AWS services, APIs, and tools to virtually any on-premises datacenter or co-location space for a consistent hybrid experience. They’re ideal for workloads that require low-latency access to on-premises systems, local data processing, data residency, and migration of applications with local system interdependencies. Once installed, your Outposts rack becomes an extension of your VPC, and it’s managed using the same APIs, tools, and management controls that you already use in the cloud.

With direct VPC routing, you now have two options to configure and connect your Outposts rack to your on-premises networks. Previously, to configure network routing between an on-premises network and an Outposts rack, you needed to use Customer-owned IP addresses (CoIP). During an Outposts rack installation, this involved providing a separate IP address range/CIDR from your on-premises network for AWS to create an address pool, which is known as a CoIP pool. When an Amazon Elastic Compute Cloud (Amazon EC2) instance on your Outposts rack needed to communicate with your on-premises network, Outposts rack would perform a 1:1 network address translation (NAT) from the VPC private IP address to a CoIP address in the CoIP pool. Using CoIP means that you must manage both VPC and CoIP address pools, without overlap, and configure route propagation between the two sets of addresses. When adding a subnet to a VPC, you also needed to follow several steps to update route propagation between your networks to recognize the new subnet addresses.

Managing IP address ranges for AWS cloud and onsite resources, as well as dealing with CoIP ranges on Outposts rack, can be an operational burden. Although the option to use CoIP is still available and will continue to be fully supported, the new direct VPC routing option simplifies your IP address management. Automatic advertisement of CIDR addresses for subnets, including new subnets added in the future, between the VPC and your Outposts rack, removes the need for you to reconfigure IP addresses. This also keeps route propagation up-to-date, thereby saving you time and effort. Furthermore, as mentioned earlier, you can enable all of this with a self-serve option.

Enabling Direct VPC Routing
You can select either CoIP or direct VPC routing approaches and utilize a new self-service API, CreateLocalGatewayRouteTable, to configure direct VPC routing for both new and existing Outposts racks. This eliminates the need to contact AWS to enable the configuration. To enable direct VPC routing, simply set the mode property in the CreateLocalGatewayRouteTable API’s request parameters to the value direct-vpc-routing. If you’re already using CoIP, then you must delete and recreate the route table that’s propagating traffic between the Outposts rack and your on-premises network.

The following example diagram, taken from the user guide, illustrates the setup for an Outposts rack running several Amazon EC2 instances and connected to an on-premises network, with automatic address advertisement. Note that private IP address ranges are utilized across the Outposts rack resources and the on-premises network.

Get started with Direct VPC Routing today
The option to enable direct VPC routing is available now for both new and existing Outposts racks. As mentioned earlier, the option to use CoIP will continue to be supported, but now you can choose between direct VPC routing and CoIP based on your on-premises networking needs. Direct VPC routing is available in all AWS Regions where Outposts rack is supported.

Find more information on this topic in the AWS Outposts User Guide. More information on AWS Outposts rack is available here.

— Steve

Let’s Architect! Architecting for the edge

2022-08-24 Luca Mezzalira

Post Syndicated from Luca Mezzalira original https://aws.amazon.com/blogs/architecture/lets-architect-architecting-for-the-edge/

Edge computing comprises elements of geography and networking and brings computing closer to the end users of the application.

For example, using a content delivery network (CDN) such as AWS CloudFront can help video streaming providers reduce latency for distributing their material by taking advantage of caching at the edge. Another example might look like an Internet of Things (IoT) solution that helps a company run business logic in remote areas or with low latency.

IoT is a challenging field because there are multiple aspects to consider as architects, like hardware, protocols, networking, and software. All of these aspects must be designed to interact together and be fault tolerant.

In this edition of Let’s Architect!, we share resources that are helpful for teams that are approaching or expanding their workloads for edge computing We cover macro topics such as security, best practices for IoT, patterns for machine learning (ML), and scenarios with strict latency requirements.

Build Machine Learning at the edge applications

In Let’s Architect! Architecting for Machine Learning, we touched on some of the most relevant aspects to consider while putting ML into production. However, in many scenarios, you may also have specific constraints like latency or a lack of connectivity that require you to design a deployment at the edge.

This blog post considers a solution based on ML applied to agriculture, where a reliable connection to the Internet is not always available. You can learn from this scenario, which includes information from model training to deployment, to design your ML workflows for the edge. The solution uses Amazon SageMaker in the cloud to explore, train, package, and deploy the model to AWS IoT Greengrass, which is used for inference at the edge.

High-level architecture of the components that reside on the farm and how they interact with the cloud environment

Security at the edge

Security is one of the fundamental pillars described in the AWS Well-Architected Framework. In all organizations, security is a major concern both for the business and the technical stakeholders. It impacts the products they are building and the perception that customers have.

We covered security in Let’s Architect! Architecting for Security, but we didn’t focus specifically on edge technologies. This whitepaper shows approaches for implementing a security strategy at the edge, with a focus on describing how AWS services can be used. You can learn how to secure workloads designed for content delivery, as well as how to implement network protection to defend against DDoS attacks and protect your IoT solutions.

The AWS Well-Architected Tool is designed to help you review the state of your applications and workloads. It provides a central place for architectural best practices and guidance

AWS Outposts High Availability Design and Architecture Considerations

AWS Outposts allows companies to run some AWS services on-premises, which may be crucial to comply with strict data residency or low latency requirements. With Outposts, you can deploy servers and racks from AWS directly into your data center.

This whitepaper introduces architectural patterns, anti-patterns, and recommended practices for building highly available systems based on Outposts. You will learn how to manage your Outposts capacity and use networking and data center facility services to set up highly available solutions. Moreover, you can learn from mental models that AWS engineers adopted to consider the different failure modes and the corresponding mitigations, and apply the same models to your architectural challenges.

An Outpost deployed in a customer data center and connected back to its anchor Availability Zone and parent Region

AWS IoT Lens

The AWS Well-Architected Lenses are designed for specific industry or technology scenarios. When approaching the IoT domain, the AWS IoT Lens is a key resource to learn the best practices to adopt for IoT. This whitepaper breaks down the IoT workloads into the different subdomains (for example, communication, ingestion) and maps the AWS services for IoT with each specific challenge in the corresponding subdomain.

As architects and developers, we tend to automate and reduce the risk of human errors, so the IoT Lens Checklist is a great resource to review your workloads by following a structured approach.

Workload context checklist from the IoT Lens Checklist

See you next time!

Thanks for joining our discussion on architecting for the edge! See you in two weeks when we talk about database architectures on AWS.

Looking for more architecture content?

AWS Architecture Center provides reference architecture diagrams, vetted architecture solutions, Well-Architected best practices, patterns, icons, and more!

New – HTTP/3 Support for Amazon CloudFront

2022-08-16 Channy Yun

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/new-http-3-support-for-amazon-cloudfront/

Amazon CloudFront is a content delivery network (CDN) service, a network of interconnected servers that is geographically closer to the users and reaches their computers much faster. Amazon CloudFront reduces latency by delivering data through 410+ globally dispersed Points of Presence (PoPs) with automated network mapping and intelligent routing.

With Amazon CloudFront, content, API requests and responses or applications can be delivered over Hypertext Transfer Protocol (HTTP) version 1.1, and 2.0 over the latest version of Transport Layer Security (TLS) to encrypt and secure communication between the user client and CloudFront.

Today we are adding HTTP version 3.0 (HTTP/3) support for Amazon CloudFront. HTTP/3 uses QUIC, a user datagram protocol-based, stream-multiplexed, and secure transport protocol that combines and improves upon the capabilities of existing transmission control protocol (TCP), TLS, and HTTP/2. Now, you can enable HTTP/3 for end user connections in all new and existing CloudFront distributions on all edge locations worldwide, and there is no additional charge for using this feature.

What is HTTP/3?
HTTP/3 uses QUIC and overcomes many of TCP’s limitations and bring those benefits to HTTP. When using existing HTTP/2 over TCP and TLS, TCP needs a handshake to establish a session between a client and server, and TLS also needs its own handshake to ensure that the session is secured. Each handshake has to make the full round trip between client and server, which can take a long time when client and server and far apart, network-wise. But, QUIC only needs a single handshake to establish a secure session.

Also, TCP is understood and manipulated by a myriad of different middleboxes, such as firewalls and network address translation (NAT) devices. QUIC uses UDP as its basis to allow packet flows in an enterprise or public network and is fully encrypted, including the metadata, which makes middleboxes unable to inspect or manipulate its details.

HTTP/3 streams are multiplexed independently to eliminate head-of-line blocking between requests and responses. This is possible because stream multiplexing occurs in the transport layer as opposed to the application layer like HTTP/2 over TCP. This enables web applications to perform faster, especially over slow networks and latency-sensitive connections.

Benefits of HTTP/3 on CloudFront
Our customers always want to provide faster, more responsive and secure experience on the web for end users. HTTP/3 provides benefits to all CloudFront customers in the form of faster connection times, stream multiplexing, client-side connection migration, and fewer round trips in the handshake process to reduce error rates.

QUIC connections over UDP support connection reuse with a connection ID independent from IP address/port tuples so users have no interruption or impact. Customers operating in countries with low network connectivity will see improved performance from their applications.

CloudFront’s HTTP/3 support provides enhanced security built on top of s2n-quic, an open-source Rust implementation of the QUIC protocol added to our set of AWS encryption open-source libraries, both with a strong emphasis on efficiency and performance.

If you enable HTTP/3 in CloudFront distributions, the users can make HTTP/3 viewer request to CloudFront edge locations. Past the edge location, we have highly reliable networks within AWS Cloud and CloudFront will continue to use HTTP/1.1 for origin fetches. So, you don’t need to make any server-side changes in order to make your content accessible via HTTP/3.

For some types of applications, like those requiring an HTTP client library to make HTTP requests, customers may need to update their HTTP client library to a version that supports HTTP/3. But if for some operational reason clients cannot establish a QUIC connection, they can fall back to another supported protocol such as HTTP/1.1 or HTTP/2.

How to Enable HTTP/3
To enable HTTP/3 connection, you can edit the distribution configuration through the CloudFront console. You can select HTTP/3 in Supported HTTP versions on an existing distribution or create a new distribution without any changes to origin. You can use the UpdateDistribution API or use the CloudFormation template.

After deploying your distribution, you can connect with a browser that supports HTTP/3, such as the latest version of Google Chrome, Mozilla Firefox, and Microsoft Edge, and Apple Safari after turning it on manually. To learn more about web browser support, see the Can I Use – HTTP/3 Support page.

From web developer tools in your browser, you can see the HTTP/3 requests made when a page is loaded from the CloudFront. The image below is an example of Mozilla Firefox.

You can also add HTTP/3 support to Curl and test from the command line:

$ curl --http3 -i https://d1e0fmnut9xxxxx.cloudfront.net/speed.html
HTTP/3 200
content-type: text/html
content-length: 9286
date: Fri, 05 Aug 2022 15:49:52 GMT
last-modified: Thu, 28 Jul 2022 00:50:38 GMT
etag: "d928997023f6479537940324aeddabb3"
x-amz-version-id: mdUmFuUfVaSHPseoVPRoOKGuUkzWeUhK
accept-ranges: bytes
server: AmazonS3
vary: Origin
x-cache: Miss from cloudfront
via: 1.1 6e4f43c5af08f740d02d21f990dfbe80.cloudfront.net (CloudFront)
x-amz-cf-pop: ICN54-C2
alt-svc: h3=":443"; ma=86400
x-amz-cf-id: 6fy8rrUrtqDMrgoc7iJ73kzzXzHz7LQDg73R0lez7_nEXa3h9uAlCQ==

Customer Stories
Several AWS customers including Snap, Zillow, AC3/Movember, Audible, Skyscanner have already enabled HTTP/3 on their CloudFront distributions. Here are some of their voices:

Snap Inc is a social media company that offers Snapchat, an app that offers a fast and fun way to connect with close friends to its community around the world. On AWS, Snap now supports more than 306 million Snapchat users sending over 5.4 billion Snaps daily with 20 percent less latency than its prior architecture.

Mahmoud Ragab, Software Engineering Manager at Snapchat said:

“Snapchat helps millions of people around the world to share moments with friends. At Snapchat, we strive to be the fastest way to communicate. This is why we have been partnering with Amazon Cloudfront for fast, high-performance, low latency content delivery, leveraging QUIC on Cloudfront.

It offers significant advantages while sending and receiving content, especially in networks with lossy signals and intermittent connectivity. Improvements offered by QUIC, like zero round-trip time (0-RTT) connection setup and improved congestion control enables an average of 10% reduction in time to first byte (TTFB) while lowering overall error rates. Lower network latencies and errors make Snapchat better for people all over the world.

With early access to QUIC, we’ve been able to experiment and quickly iterate and improve server-side implementation and optimize integration between the client and the server. Both companies will continue to collaborate together as QUIC is made more widely available.”

Zillow is a real estate tech company that offer its customers an on-demand experience for selling, buying, renting and financing with transparency and nearly seamless end-to-end service. Since 2015, Zillow has increased the availability of its imaging system by using Amazon S3 and Amazon CloudFront.

Craig Link, Chief Cloud Architect at Zillow said:

“We are excited about the launch of HTTP/3 support for Amazon CloudFront. Enabling HTTP/3 on CloudFront was a seamless transition and our synthetic test and ad-hoc usage continued working without issue.”

AC3 is an Australia-based AWS Managed Services partner and has supported our customer, Movember Foundation, one of the leading charities for men’s health. Running an international charity that handles donations, data, events, and localized websites in 21 countries can pose some technical challenges. Born in the cloud, Movember has leveraged AWS technology in adopting new working models, ensuring a flexible IT platform, and innovating faster.

Greg Cockburn, Head of Hyperscale Cloud at AC3 said:

“AC3 is excited to work with their longtime partner Movember enabling HTTP3 on their CloudFront distributions serving web and API frontends and is encouraged by the performance improvements seen in the initial results.”

Now Available
The HTTP/3 support for Amazon CloudFront is now available in all 410+ CloudFront edge locations worldwide with no additional charge for using this feature. To learn more, see the FAQ and Developer Guide of Amazon CloudFront. Please send feedback to AWS re:Post for Amazon CloudFront or through your usual AWS support contacts.

– Channy

New – AWS Private 5G – Build Your Own Private Mobile Network

2022-08-12 Jeff Barr

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-aws-private-5g-build-your-own-private-mobile-network/

Back in the mid-1990’s, I had a young family and 5 or 6 PCs in the basement. One day my son Stephen and I bought a single box that contained a bunch of 3COM network cards, a hub, some drivers, and some cables, and spent a pleasant weekend setting up our first home LAN.

Introducing AWS Private 5G
Today I would like to introduce you to AWS Private 5G, the modern, corporate version of that very powerful box of hardware and software. This cool new service lets you design and deploy your own private mobile network in a matter of days. It is easy to install, operate, and scale, and does not require any specialized expertise. You can use the network to communicate with the sensors & actuators in your smart factory, or to provide better connectivity for handheld devices, scanners, and tablets for process automation.

The private mobile network makes use of CBRS spectrum. It supports 4G LTE (Long Term Evolution) today, and will support 5G in the future, both of which give you a consistent, predictable level of throughput with ultra low latency. You get long range coverage, indoors and out, and fine-grained access control.

AWS Private 5G runs on AWS-managed infrastructure. It is self-service and API-driven, and can scale with respect to geographic coverage, device count, and overall throughput. It also works nicely with other parts of AWS, and lets you use AWS Identity and Access Management (IAM) to control access to both devices and applications.

Getting Started with AWS Private 5G
To get started, I visit the AWS Private 5G Console and click Create network:

I assign a name to my network (JeffCell) and to my site (JeffSite) and click Create network:

The network and the site are created right away. Now I click Create order:

I fill in the shipping address, agree to the pricing (more on that later), and click Create order:

Then I await delivery, and click Acknowledge order to proceed:

The package includes a radio unit and ten SIM cards. The radio unit requires AC power and wired access to the public Internet, along with basic networking (IPv4 and DHCP).

When the order arrives, I click Acknowledge order and confirm that I have received the desired radio unit and SIMs. Then I engage a Certified Professional Installer (CPI) to set it up. As part of the installation process, the installer will enter the latitude, longitude, and elevation of my site.

Things to Know
Here are a couple of important things to know about AWS Private 5G:

Partners – Planning and deploying a private wireless network can be complex and not every enterprise will have the tools to do this work on their own. In addition, CBRS spectrum in the United States requires Certified Professional Installation (CPI) of radios. To address these needs, we are building an ecosystem of partners that can provide customers with radio planning, installation, CPI certification, and implementation of customer use cases. You can access these partners from the AWS Private 5G Console and work with them through the AWS Marketplace.

Deployment Options – In the demo above, I showed you the cloud–based deployment option, which is designed for testing and evaluation purposes, for time-limited deployments, and for deployments that do not use the network in latency-sensitive ways. With this option, the AWS Private 5G Mobile Core runs within a specific AWS Region. We are also working to enable on-premises hosting of the Mobile Core on a Private 5G compute appliance.

CLI and API Access – I can also use the create-network, create-network-site, and acknowledge-order-receipt commands to set up my AWS Private 5G network from the command line. I still need to use the console to place my equipment order.

Scaling and Expansion – Each network supports one radio unit that can provide up to 150 Mbps of throughput spread across up to 100 SIMs. We are working to add support for multiple radio units and greater number of SIM cards per network.

Regions and Locations – We are launching AWS Private 5G in the US East (Ohio), US East (N. Virginia), and US West (Oregon) Regions, and are working to make the service available outside of the United States in the near future.

Pricing – Each radio unit is billed at $10 per hour, with a 60 day minimum.

To learn more, read about AWS Private 5G.

— Jeff;

New for AWS Global Accelerator – Internet Protocol Version 6 (IPv6) Support

2022-07-27 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-for-aws-global-accelerator-internet-protocol-version-6-ipv6-support/

IPv6 adoption has consistently increased over the last few years, especially among mobile networks. The main reasons to move to IPv6 are:

The limited availability of IPv4 addresses can limit the ability to scale up public-facing web and applications servers.
IPv6 users from mobile networks experience better performance when their network traffic doesn’t need to manage IPv6 to IPv4 translation.
You might need to comply with regulatory rules (such as the Federal Acquisition Regulation in US) to run specific internet traffic over IPv6.

Based on this, we found that we could help improve the network path that your customers use to reach your applications by adding IPv6 support to AWS Global Accelerator. Global Accelerator uses the AWS global network to route network traffic and keep packet loss, jitter, and latency consistently low. Customers like Atlassian, New Relic, and SkyScanner already use Global Accelerator to improve the global availability and performance of their applications.

Global Accelerator provides two global static public IPs that act as a fixed entry point to your application. You can update your application endpoints without making user-facing changes to the IP address. If you configure more than one application endpoint, Global Accelerator automatically reroutes your traffic to your nearest healthy available endpoint to mitigate endpoint failure.

Starting today, you can provide better network performance by routing IPv6 traffic through Global Accelerator to your application endpoints running in AWS Regions. Global Accelerator now supports two types of accelerators: dual-stack and IPv4-only. With a dual-stack accelerator, you are provided with a pair of IPv4 and IPv6 global static IP addresses that can serve both IPv4 and IPv6 traffic.

For existing IPv4-only accelerators, you can update your accelerators to dual-stack to serve both IPv4 and IPv6 traffic. This update enables your accelerator to serve IPv6 traffic and doesn’t impact existing IPv4 traffic served by the accelerator.

Dual-stack accelerators supporting both IPv6 and IPv4 traffic require dual-stack endpoints in the back end. For example, Application Load Balancers (ALBs) can have their IP address type configured as either IPv4-only or dual stack, allowing them to accept both IPv4 or IPv6 client connections. Today, dual-stack ALBs are supported as endpoints for dual-stack accelerators.

Deploying a Dual-Stack Application
To test this new feature, I need a dual-stack application with an ALB entry point. The application must be deployed in Amazon Virtual Private Cloud (Amazon VPC) and support IPv6 traffic. I don’t happen to have IPv6-ready VPCs in my account. I can follow these instructions to migrate an existing VPC that supports IPv4 only to IPv6, or I can create a VPC that supports IPv6 addressing. For this post, I choose to create a VPC.

In the AWS Management Console, I navigate to the Amazon VPC Dashboard. I choose Launch VPC Wizard. In the wizard, I enter a value for the Name tag. This value will be used to auto-generate Name tags for all resources in the VPC. Then, I select the option to associate an Amazon-provided IPv6 CIDR block. I leave all other options to their default values and choose Create VPC.

After less than a minute, the VPC is ready. I edit the settings of both public subnets to enable the Auto-assign IP settings to automatically request both a public IPv4 address and an IPv6 address for new network interfaces in this subnet.

Now, I want to deploy an application in this VPC. The application will be the endpoint for my accelerator. I view and download the WordPress scalable and durable AWS CloudFormation template from the Sample solutions section of the CloudFormation documentation. This template deploys a full WordPress website behind an ALB. The web tier is scalable and implemented as an EC2 Auto Scaling group. The MySQL database is managed by Amazon Relational Database Service (RDS).

Before deploying the stack, I edit the template to make a few changes. First, I add a DBSubnetGroup resource:

"DBSubnetGroup" : {
  "Type": "AWS::RDS::DBSubnetGroup",
  "Properties": {
    "DBSubnetGroupDescription" : "DB subnet group",
    "SubnetIds" : { "Ref" : "Subnets"}
  }
},

Then, I add the DBSubnetGroupName property to the DBInstance resource. In this way, the database created by the template will be deployed in the same subnets (and VPC) as the web servers.

"DBSubnetGroupName" : { "Ref" : "DBSubnetGroup" },

The last change adds the IpAddressType property to the ApplicationLoadBalancer resource to create a dual-stack load balancer that has IPv6 addresses and will be ready to be used with the new dual-stack option of Global Accelerator.

"IpAddressType": "dualstack",

Because IpAddressType is set to dualstack, the ALB created by the stack will also have IPv6 addresses and will be ready to be used with the new dual-stack option of Global Accelerator.

In the CloudFormation console, I create a stack and upload the template I just edited. In the template parameters, I enter a database user and password to use. For the VpcId parameter, I select the IPv6-ready VPC I just created. For the Subnets parameter, I select the two public subnets of the same VPC. After that, I go to the next steps and create the stack.

After a few minutes, the stack creation is complete. To access the website, I need to open network access to the load balancer. In the EC2 console, I create a security group that allows public access using the HTTP and HTTPS protocols (ports 80 and 443).

I choose Load balancers from the navigation pane and select the ALB used by my application. In the Security section, I choose Edit security groups and add the security group I just created to allow web access.

Now, I look for the dual-stack (A or AAAA Record) DNS name of the load balancer. I open a browser and connect using the DNS name to complete the configuration of WordPress.

When connecting again to the endpoint, I see my new (and empty) WordPress website.

Using Dual-Stack Accelerators with Support for Both IPv6 and IPv4 traffic
Now that my application is ready, I add a dual-stack accelerator in front of the dual-stack ALB. In the Global Accelerator console, I choose Create accelerator. I enter a name for the accelerator and choose the Standard accelerator type.

To route both IPv4 and IPv6 through this accelerator, I select the Dual-stack option for the IP address type.

Then I add a listener for port 80 using the TCP protocol.

For that listener, I configure an endpoint group in the AWS Region where I have my application deployed.

I choose Application Load Balancer for the Endpoint type and select the ALB in the CloudFormation stack.

Then, I choose Create accelerator. After a few minutes, the accelerator is deployed, and I have a dual-stack DNS name to reach the ALB using IPv4 or IPv6 depending on the network used by the client.

Now, my customers can use the IPv4 and IPv6 addresses or, even better, the dual-stack DNS name of the accelerator to connect to the WordPress website. If there is a front-end or mobile application my customers use to connect to the WordPress REST APIs, I can use the dual-stack DNS name so that clients will connect using their preferred IPv4 or IPv6 route.

To understand if the communication between Global Accelerator and the ALB is working, I can monitor the new FlowsDrop Amazon CloudWatch metric. This metric tells me if Global Accelerator is unable to route IPv6 traffic through the endpoint. For example, that can happen if, after the creation of the accelerator, the configuration of the ALB is updated to use IPv4 only.

Availability and Pricing
You can configure dual-stack accelerators using the AWS Management Console, the AWS Command Line Interface (CLI), and AWS SDKs. You can use dual-stack accelerators to optimize access to your applications deployed in any commercial AWS Region.

Protocol translation is not supported, neither IPv4 to IPv6 nor IPv6 to IPv4. For example, Global Accelerator will not allow me to configure a dual-stack accelerator with an IPv4-only ALB endpoint. Also, for IPv6 ALB endpoints, client IP preservation must be enabled.

There are no additional costs for using dual-stack accelerators. You pay for the hours and the amount of data transfer in the dominant direction used by traffic to or from the accelerator. Data transfer costs depend on the location of your clients and the AWS Regions where you are running your applications. For more information, see the Global Accelerator pricing page.

Optimize the IPv6 and IPv4 network paths used by your customers to reach your applications with AWS Global Accelerator.

— Danilo

Strategies for efficient IP address consumption

Solutions for network size (IP address) expansion

Expand VPC CIDR ranges with routable addresses

Configure non-routable CIDR with a private NAT gateway

Prerequisites

Deploy the solution

Connect to an AWS Cloud9 instance

Prepare the source MySQL table

Prepare the target MySQL table

Verify the networking setup (Optional)

Run the AWS Glue crawlers

Run the AWS Glue ETL job

Verify the results

Clean up

Conclusion

About the authors

Use resilient architectures

Detecting anomalies in network traffic

Solution overview

How it works

Prerequisites

Deploy the solution

Step 1. Download the latest release

Step 2. Upload the CloudFormation templates and Python code to an S3 bucket

Step 3. Gather information needed for the CloudFormation template parameters

Step 4. Create the CloudFormation stack

Step 5. Monitor traffic mitigation activity using the CloudWatch dashboard

Customize the solution

Limitations

Cost considerations

Conclusion

Learn what powers AWS Compute

Discover the benefits of AWS Silicon

Explore different use cases

High performance computing

Machine Learning

Cost Optimization

Hear from our customers

Get started with hands-on sessions

See you next time!

Other posts in this series

Looking for more architecture content?

The collective thoughts of the interwebz